Why GenAI for Chinese Studies 2

Author

Kwok-leong Tang

Published

February 18, 2025

Answers to Questions

We will start by answering your questions.

Prompt Engineering

OCR, file formats, and Audio Transcription

  • File conversion & OCR: PDFGear
  • Google Drive for OCR, Google Keep, OneNote, Mac Screenshot……
  • Descript
  • Apple Notes
  • Handwritten manuscript……experiments
  • Visualization: Napkin

Multi-Modal Models

  • Multi-modal models extend Large Language Models (LLMs) so that they can process and generate different types of content, including text, images, and audio
  • Multi-modal models can analyze and identify objects, scenes, and activities in images and videos
  • Notable examples include Google Gemini, which excels at multi-modal tasks and understanding complex visual-language relationships

DeepSeek and Climate Issues

What does Dr. Tang think of the risks and ethical consequences of using AI for humanities research? Could he give some examples of how AI has been used unethically and has therefore caused harm in the field of digital humanities?

DeepSeek’s Recent Impact

  • Achieved remarkable performance with significantly lower training costs (?)
  • Innovative use of knowledge distillation techniques
  • Open-source (?) approach with accessible model weights
  • Capability for local deployment and customization

Climate Issues and AI Development

  • Knowledge limitations: Are we approaching the boundaries of available training data?
  • Resource constraints and environmental impact of AI training
  • The AGI debate: Single superintelligent system vs. distributed specialized agents

Local Deployment

  • The most significant question is not about where models are hosted or who controls them, but whether data needs to be sent to third parties or not.
  • Future predictions: local deployment and task-specific models, such as translation models focused solely on translation and coding models dedicated to programming.
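To make the data-flow question concrete, here is a minimal sketch of a request that never leaves the machine. It assumes a local model server is running; Ollama and its `/api/generate` endpoint on port 11434 are used only as one possible example, and the model name `llama3.2` and all function names are illustrative placeholders, not tools named in these notes.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: a local model server (Ollama in this sketch) listens here.
LOCAL_ENDPOINT = "http://localhost:11434/api/generate"


def build_local_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    """Prepare a POST request for the local server without sending it."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        LOCAL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stays_on_this_machine(request: urllib.request.Request) -> bool:
    """Return True when the request targets the local machine only."""
    host = urlparse(request.full_url).hostname
    return host in {"localhost", "127.0.0.1", "::1"}


def ask_local_model(prompt: str) -> str:
    """Send the prompt to the local server (requires the server to be running)."""
    request = build_local_request(prompt)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Checking `stays_on_this_machine(...)` before sending is one simple way to confirm that unpublished sources or sensitive archival material are not being passed to a third party.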

Explainable or Black-Box

  • An explainable AI-assisted research process uses AI to help write code, build workflows, and resolve complex technical issues. The process and decision-making steps stay visible and interpretable to users through accessible code and documentation.
  • A black-box approach relies on AI systems whose internal workings and decision-making processes are not transparent to users. Examples include using AI for data extraction or analysis where only the input and output are visible, while the underlying mechanisms remain hidden.
  • Knowing which approach fits a given task is important to your AI workflow.
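As an illustration of the explainable side, the sketch below shows the kind of short script an AI assistant might help write: every extraction rule is visible in the code and can be checked or changed by the researcher. The task (finding reign-year mentions in a Chinese text), the three reign names, and the function name are hypothetical choices made for this example only.

```python
import re

# The rule is explicit: a reign name, then Chinese numerals (or 元), then 年.
# Only three Qing reign names are listed here, purely for illustration.
REIGN_YEAR = re.compile(r"(康熙|雍正|乾隆)([元一二三四五六七八九十]+)年")


def extract_reign_years(text: str) -> list[tuple[str, str]]:
    """Return (reign name, year numeral) pairs found in the text."""
    return REIGN_YEAR.findall(text)


sample = "康熙三十年,某人入京。乾隆元年卒。"
for reign, year in extract_reign_years(sample):
    print(reign, year)
```

A black-box version of the same task would hand the passage to a model and accept whatever list of dates comes back. That may handle irregular phrasing better, but there is no rule to inspect when an entry is missed or misread.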

AI Research Tools and Workflows for Humanities Scholars

These are KL’s recommended tools and workflows for conducting AI-enhanced humanities research in 2025.

  • Primary writing tool: Visual Studio Code
    • Visual Studio Code (VSCode) is an integrated development environment for programmers. Although many humanities students and scholars do not code, it could be the best editor for AI-enhanced writing.
  • AI assistant in VSCode: GitHub Copilot
    • GitHub Copilot is the AI coding assistant from GitHub (Microsoft). It has a free plan (2,000 completions and 50 chats per month). The Pro plan is $10 per month. However, students can access the Pro plan for free by joining GitHub Education.
  • Write in Markdown and manage the writing with Quarto.

Using API for AI Processing