Why GenAI for Chinese Studies 2

Author

Kwok-leong Tang

Published

February 18, 2025

Answers to Questions

We will start by answering your questions.

Prompt Engineering

Prompt Engineering Guide
Learn Prompting
[https://langgptai.feishu.cn/wiki/ASXOwDbTEiH9CUkXFA5cLHumn88]
[https://github.com/langgptai/LangGPT?tab=readme-ov-file]

OCR, file formats, and Audio Transcription

File conversion & OCR: PDFGear
OCR with Google Drive, Google Keep, OneNote, Mac Screenshot……
Descript
Apple Notes
Handwritten manuscript……experiments
Visualization: Napkin

DeepSeek and Climate Issues

What does Dr. Tang think of the risks and ethical consequences of using AI for humanities research? Could he give some examples of how AI has been used unethically and has therefore caused harm in the field of digital humanities?

DeepSeek’s Recent Impact

Achieved remarkable performance with significantly lower training costs (?)
Innovative use of knowledge distillation techniques
Open-source (?) approach with accessible model weights
Capability for local deployment and customization

Climate Issues and AI Development

Knowledge limitations: Are we approaching the boundaries of available training data?
Resource constraints and environmental impact of AI training
The AGI debate: Single superintelligent system vs. distributed specialized agents

Local Deployment

The most significant question is not where models are hosted or who controls them, but whether your data has to be sent to a third party.
Future predictions: more local deployment, and more models built for a single task, such as translation models focused solely on translation and coding models dedicated to programming.
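The point about not sending data to third parties can be made concrete with a small sketch. It assumes a model server such as Ollama is running on your own machine at its default address and that a model tagged `deepseek-r1:7b` has been pulled; the address, the model tag, and the function names are assumptions for illustration and are not part of the original notes.

```python
import json
import urllib.request

# Assumption: a local model server (for example Ollama) listens on this address.
# Because the host is localhost, the prompt never leaves your own computer.
LOCAL_ENDPOINT = "http://localhost:11434/api/generate"


def build_request(prompt, model="deepseek-r1:7b"):
    """Build the HTTP request for the local server; nothing is sent yet."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        LOCAL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="deepseek-r1:7b"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]
```

With the server running, `ask_local_model("Translate into English: 學而時習之")` would return the model's answer. Swapping the endpoint for a commercial API is exactly the moment at which your data starts travelling to a third party.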

Explainable or Black-Box

An explainable AI-assisted research process uses AI systems transparently: the AI helps you write code, build workflows, and resolve complex technical issues, and the resulting steps and decisions stay visible and interpretable through code and documentation you can read.
A black-box approach uses AI systems whose internal workings and decision-making are not transparent to users. Examples include using AI for data extraction or analysis where only the input and output are visible and the underlying mechanisms remain hidden.
Knowing which approach fits which task is important to your AI workflow.
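As a minimal sketch of the explainable side, here is the kind of short script an AI assistant might help you write for pulling reign-year dates out of a Chinese text. Every rule is in plain sight, so a wrong result can be traced to a specific line. The list of reign names, the pattern, and the function name are hypothetical choices made for this illustration.

```python
import re

# Hypothetical rule set: three Qing reign names chosen only for illustration.
REIGN_NAMES = ["康熙", "雍正", "乾隆"]

# A reign name, then one to three Chinese numeral characters (or 元), then 年.
REIGN_YEAR = re.compile(
    "(" + "|".join(REIGN_NAMES) + ")([元一二三四五六七八九十]{1,3})年"
)


def find_reign_years(text):
    """Return (reign name, year in Chinese numerals) pairs found in the text."""
    return REIGN_YEAR.findall(text)


sample = "康熙三十年春，奉旨。乾隆元年復議。"
print(find_reign_years(sample))
```

The black-box alternative would be to paste the same passage into a chat model and ask for the dates: quicker for messy sources, but you see only the input and the output, not the rules applied in between.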

AI Research Tools and Workflows for Humanities Scholars

These are KL’s recommended tools and workflows for conducting AI-enhanced humanities research in 2025.

Primary writing tool: Visual Studio Code
- Visual Studio Code (VSCode) is an integrated development environment for programmers. Although many humanities students and scholars do not code, it could be the best editor for AI-enhanced writing.
AI assistant in VSCode: GitHub Copilot
- GitHub Copilot is the AI coding assistant from GitHub (Microsoft). Its free plan includes 2,000 completions and 50 chats per month. The Pro plan costs $10 per month, but students can get it for free by joining GitHub Education.
Write in Markdown and manage the writing with Quarto.
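To show what a Quarto-managed Markdown file looks like, the sketch below generates a minimal `.qmd` document: a YAML header between `---` lines, followed by ordinary Markdown. The function names, the file name, and the placeholder heading are assumptions for illustration; only the general header layout (title, author, format) follows Quarto's documented conventions.

```python
from pathlib import Path


def qmd_skeleton(title, author, output_format="html"):
    """Return the text of a minimal Quarto document."""
    lines = [
        "---",                          # start of the YAML header
        f'title: "{title}"',
        f'author: "{author}"',
        f"format: {output_format}",     # for example html, pdf, or docx
        "---",                          # end of the YAML header
        "",
        "## Introduction",
        "",
        "Write the body in plain Markdown.",
        "",
    ]
    return "\n".join(lines)


def write_qmd(path, title, author):
    """Save the skeleton to disk so it can be opened in VSCode."""
    Path(path).write_text(qmd_skeleton(title, author), encoding="utf-8")
```

After calling `write_qmd("draft.qmd", "My Paper", "A. Student")`, running `quarto render draft.qmd` in the terminal should produce the formatted output, assuming Quarto is installed.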
