Generative AI for East Asian Studies

Session 2: Hands-on Practice Session with LM Studio

Author
Affiliation

Kwok-leong Tang

Harvard University

Published

April 11, 2026

Before We Begin

  • Please make sure you have LM Studio or Ollama installed, and the Qwen3.5-0.8B model downloaded.
  • If you haven’t done so, follow the Software Installation Instruction.
  • Please also sign in to Google AI Studio at aistudio.google.com with your Google account. No waiting list or approval needed — just accept the privacy terms and you’re in. Google AI Studio gives you free access to Google’s latest models (e.g., Gemini 3.1 Pro) for experimentation.
  • If you have any problem with the software installation, don’t panic! You can follow along with any chatbot (e.g., ChatGPT) for some parts of the workshop.

Agenda

  1. What is generative AI?
  2. The nature and limitations of LLMs
  3. Prompt engineering essentials
  4. Use cases for East Asian Studies
  5. Context is the solution: RAG and tool use

Let’s Begin With the Great Martial Arts Master

Teaching goal: After this session, you will understand the essential concepts of large language models (LLMs). The focus is not the tools but the mindset.

What is Generative AI?

Important

Everything is PREDICTION!

  • Large language models (LLMs) are trained on massive text corpora to predict the next token (word or subword).
  • When you type a prompt, the model generates a response one token at a time, each token predicted based on all the tokens that came before.

Image source: Chip Huyen, AI Engineering: Building Applications with Foundation Models, First edition (O’Reilly, 2025). Chapter 1.

Autoregressive Language Models

flowchart LR
    A["Input: The capital of France is"] --> B["P(Paris) = 0.92"]
    B --> C["P(the) = 0.45"]
    C --> D["P(city) = 0.38"]
    D --> E["..."]

  • Each token is generated sequentially based on the preceding context.
  • The model assigns a probability to each possible next token and samples from that distribution.
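The two bullets above can be sketched as a single decoding step. The probabilities below are illustrative, not from any real model:

```python
import random

# One autoregressive step: the model scores every candidate next token,
# then one token is sampled from the resulting distribution.
next_token_probs = {"Paris": 0.92, "the": 0.05, "Lyon": 0.03}  # made-up numbers

def sample_token(probs):
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

token = sample_token(next_token_probs)  # usually "Paris", but not always
```

Real decoders repeat this step, appending each sampled token to the context before predicting the next one — which is also why the same prompt can produce different answers on different runs.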
Important

Every token is generated based on PROBABILITY!

Galton box demonstration of the normal distribution (Wikimedia Commons)

Four Components of AI-Assisted Research

Component | Description                     | Example
----------|---------------------------------|--------------------------------------
Model     | The LLM engine                  | GPT, Claude, Qwen, DeepSeek, LLaMA
Prompt    | Your instructions and queries   | Use cases, system prompts
Context   | Additional information provided | Documents, databases, knowledge bases
Tools     | External capabilities           | Web search, APIs, MCPs

Let’s Get Started with LM Studio

  1. Open LM Studio on your computer.
  2. Load the Qwen3.5-0.8B model from the model dropdown.
  3. You should see the chat interface ready for input.

For Ollama users, open your terminal and run:

ollama run qwen3.5:0.8b
Note

We choose a small 0.8B model precisely because it is not as “smart” as state-of-the-art (SOTA) models and uses fewer engineering tricks. This lets us observe the nature of a foundation model more clearly.
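Beyond the chat window, LM Studio can also run a local server that speaks an OpenAI-compatible chat API (the base URL is typically http://localhost:1234/v1 — verify the port in LM Studio’s Developer tab; the model identifier below is an assumption). A sketch of the request body you would POST:

```python
import json

# Hypothetical request for LM Studio's OpenAI-compatible endpoint:
# POST http://localhost:1234/v1/chat/completions  (check the port in LM Studio)
payload = {
    "model": "qwen3.5-0.8b",  # use the identifier shown in your LM Studio
    "messages": [
        {"role": "user", "content": "Who is the prime minister of Japan?"}
    ],
    "temperature": 0.7,
}
body = json.dumps(payload)  # send with header Content-Type: application/json
```

We will rely on this local server later when connecting other apps to LM Studio.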

The Limitations of Prediction

Random Numbers

Try repeating the following query several times with Qwen3.5-0.8B. What do you observe?

Give a random number between 1 and 100.

Explanation: The output is not truly random. The model predicts the most probable tokens given the prompt, so the same few numbers tend to recur across runs, and the distribution is far from uniform.

Knowledge Cutoff

Every model has a training data cutoff date. After training, a model’s parameters are frozen — its built-in knowledge does not change.

Try the following prompts in LM Studio (or Ollama):

What is your knowledge cutoff date?
Who is the prime minister of Japan?
Tip

Compare the model’s answer with the current facts. Is the answer correct? Why or why not?

Hallucination

LLMs can generate plausible-sounding but incorrect information — this is called hallucination. Because the model is predicting the most probable next token, it may “make up” facts.

Try these prompts:

How many "r" in strawberry?
How many "u" in Labubu?

If you don’t know what Labubu is, check here.

How many "a" in gandamala?
Note

Character counting is notoriously difficult for LLMs because they process tokens, not individual characters. A token might represent a whole word or a piece of a word.
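To see why, here is a sketch with a hypothetical subword split of “strawberry” (real splits vary by tokenizer). The loop below scans characters directly — something the model cannot do, since it only sees opaque token IDs:

```python
# Hypothetical subword split; actual tokenizers may split differently.
tokens = ["str", "aw", "berry"]

# A program can count letters inside each piece; an LLM receives only
# token IDs, so it has no direct view of the characters.
count = sum(piece.count("r") for piece in tokens)
print(count)  # 3 -- "strawberry" contains three "r"s
```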

Autoregressive Nature

LLMs read text from left to right (or beginning to end). They struggle with tasks that require reading in reverse or non-sequential processing.

Try this prompt:

Can you tell me the meaning of this sentence: "B1 ammeG eht tsniaga tluser eht erapmoc dna sledom ATOS eht htiw xobdnaS dravraH ni tpmorp emas eht yrt nac uoY"
Tip

The sentence is written backwards. Can the model figure it out? Why might this be difficult?
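For a human — or a one-line script — the puzzle is trivial, since reversing a string is a single expression in Python, while an autoregressive model must predict tokens left to right:

```python
# The puzzle is easy for ordinary code: just reverse the string.
sentence = ("B1 ammeG eht tsniaga tluser eht erapmoc dna sledom ATOS eht "
            "htiw xobdnaS dravraH ni tpmorp emas eht yrt nac uoY")
decoded = sentence[::-1]
print(decoded)
# "You can try the same prompt in Harvard Sandbox with the SOTA models
#  and compare the result against the Gemma 1B"
```

A useful reminder: some tasks that defeat an LLM are best handed to conventional code.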

Autoregressive Nature: Classical Chinese

This example is revised from a similar prompt by Professor Peter Bol:

請找出下文所提及的歷史人物:
教宗外南調科詞宏學博中復士進舉後官入補蔭初精益索講熹朱栻張友又既游憲胡辰應汪奇之林從長傳之獻文原中有庭家之本學謙祖州婺居始祖其自也孫之問好丞右尚書恭伯字謙祖呂

Now try the same text in the correct reading order:

請找出下文所提及的歷史人物:
呂祖謙字伯恭書尚右丞好問之孫也自其祖始居婺州祖謙學本之家庭有中原文獻之傳長從林之奇汪應辰胡憲游既又友張栻朱熹講索益精初蔭補入官後舉進士復中博學宏詞科調南外宗教
Note

English translation: “Lü Zuqian, styled Bogong, was the grandson of Lü Haowen, who served as Right Vice Minister. Beginning from his grandfather, the family settled in Wuzhou. Zuqian’s learning was rooted in his family tradition, which preserved the scholarly heritage of the Central Plains. He first studied under Lin Zhiqi, Wang Yingchen, and Hu Xian. Later he also befriended Zhang Shi and Zhu Xi, and through their discussions his scholarship grew ever more refined. Initially appointed by hereditary privilege, he later passed the jinshi examination and subsequently passed the Erudite Literati examination, and was assigned a post in the Southern regions.”

Important

Compare the two results. The reading direction matters because LLMs are autoregressive — they process tokens sequentially from left to right.

Prompt Engineering

The Basics of Prompt Engineering

Prompt engineering is the practice of crafting effective inputs to get better outputs from LLMs.


Chain-of-Thought: The Magic Words

Remember the strawberry problem? Let’s try again with a better prompt:

How many "r" in strawberry? Count it character by character.

The magic words: “think step by step.”

Tip

Adding “think step by step” or “count character by character” can significantly improve accuracy for certain tasks. This is called chain-of-thought prompting.

System Prompts and User Prompts

            | System Prompt                                                          | User Prompt
------------|------------------------------------------------------------------------|--------------------------------
What        | Hidden instructions that define the AI’s role, personality, and rules  | Your actual question or request
When        | Set before the conversation starts                                     | Given in every interaction
Who sets it | Developers or advanced users                                           | The end user

Try Setting a System Prompt

In LM Studio, you can set a system prompt. Enter the following into the system prompt field:

你是魯迅。一定要用魯迅的風格、思想和語氣回答所有問題。

Or try:

You are Oscar Wilde. You must answer all questions in his style.

Then ask a question:

婚姻是什麼?
What is marriage?
Note

For Ollama users: You can set a system prompt by creating a Modelfile. Or simply include the persona instruction in your prompt itself.
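For example, a minimal Modelfile might look like the sketch below (the model tag is assumed from the installation step; see Ollama’s Modelfile documentation for details):

```
FROM qwen3.5:0.8b
SYSTEM "You are Oscar Wilde. You must answer all questions in his style."
```

Then build and run the persona model with `ollama create wilde -f Modelfile` followed by `ollama run wilde`.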

Use Cases for East Asian Studies

Use Cases Overview

Important

The best use of LLMs in the humanities is the transformation of data:

  • Translation is a transformation of data (from one language to another)
  • Summarization is a transformation of data (from long text to short text)
  • Data extraction is a transformation of data (from unstructured text to structured data)
  • Citation formatting is a transformation of data (from raw bibliographic information to a standardized format)

You can check some past workshops for more use cases in Chinese Studies.

Today we will practice with:

  1. Citation formatting
  2. Named entity recognition (NER) and data extraction

Citation Formatting

Try the following prompt:

Please format the following bibliographic information into Chicago Manual of Style (17th edition) format:

Author: 余英時
Title: 朱熹的歷史世界:宋代士大夫政治文化的研究
Publisher: 允晨文化
Year: 2003
Place: 台北
Tip

You can also ask the model to convert between citation styles (e.g., Chicago to MLA or APA), or to format citations in multiple languages.

Named Entity Recognition (NER)

Extract all person names, place names, and official titles from the following text. Return the result in a table format.

軍務為事官僉都御史吳禎等奏桂林梧州平樂潯州等府欝林岑溪愽白懷集古田陽朔臨桂靈川桂平等州縣屢被流賊刼殺 上命總兵等官趙輔等速進兵勦之
Tip

NER is one of the most practical use cases for humanities researchers. LLMs can identify entities in Classical Chinese texts where traditional NER tools often struggle.

Data Extraction

Please extract all place names from the following text and output them as a JSON array:

軍務為事官僉都御史吳禎等奏桂林梧州平樂潯州等府欝林岑溪愽白懷集古田陽朔臨桂靈川桂平等州縣屢被流賊刼殺
Note

Structured output (JSON, CSV, tables) is extremely useful for downstream analysis and can be imported into databases, GIS tools, or spreadsheets.
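Once the model returns JSON, a few lines of code can validate and load it. The model reply below is a hypothetical, truncated example, not a complete extraction:

```python
import json

# Hypothetical model reply (real output may include extra prose or
# markdown fences that you must strip before parsing).
model_reply = '["桂林", "梧州", "平樂", "潯州"]'

places = json.loads(model_reply)
assert isinstance(places, list) and all(isinstance(p, str) for p in places)
print(places)
```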

Context is the Solution

Why Context Matters

We’ve seen that LLMs have limitations: knowledge cutoffs, hallucinations, and autoregressive constraints. How do we overcome them?

The answer is context — providing the model with additional, reliable information.

Two key approaches:

  1. Retrieval-Augmented Generation (RAG): Supplying relevant documents or data alongside the prompt
  2. Tool use: Allowing the model to query external systems (APIs, databases, web)

Retrieval-Augmented Generation (RAG)

flowchart LR
    A[User Query] --> B[Retrieve Relevant Documents]
    B --> C[Combine Query + Documents]
    C --> D[LLM Generates Answer]
    D --> E[Response with Sources]

  • Instead of relying solely on the model’s training data, we provide relevant context at query time.
  • The model can then ground its response in the provided documents.
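The pipeline above can be sketched with naive keyword-overlap retrieval over a tiny in-memory corpus (real RAG systems use vector embeddings and a proper index; the documents here are illustrative):

```python
# Naive retrieval: pick the document sharing the most words with the query.
docs = [
    "Lü Zuqian (1137-1181) was a Song dynasty scholar who lived in Wuzhou.",
    "The Qianlong Emperor reigned over Qing China from 1735 to 1796.",
]

def retrieve(query, docs):
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

query = "When did the Qianlong Emperor reign?"
context = retrieve(query, docs)
# The retrieved document is prepended so the model can ground its answer.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```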

A popular RAG application is Google’s NotebookLM, which allows you to upload documents and ask questions grounded in their content.

Tool Use and MCP

A major advance in generative AI in 2025–2026 is the use of tools by LLMs. The most important standard is the Model Context Protocol (MCP), developed by Anthropic.

MCP allows LLMs to:

  • Fetch web pages
  • Query databases and APIs
  • Search library catalogs
  • Access geographic information systems

Demo: Calendar Conversion Without vs. With MCP

Without MCP — try asking the model directly:

Convert the Chinese date 乾隆三年正月初三 into the western calendar.
Warning

The model may hallucinate a date or admit it cannot perform the conversion. Historical calendar conversion requires specialized lookup tables — it is not something an LLM can reliably calculate from training data alone.

Adding the Calendar Converter MCP to LM Studio

  1. In LM Studio’s left sidebar, click the hammer icon (Tools).
  2. Click “+ Install”, then click edit mcp.json.
  3. Paste the following configuration and save:
{
  "mcpServers": {
    "calendar": {
      "url": "https://calendar-converter.098484.xyz/sse/"
    }
  }
}
  4. Switch on mcp/calendar. Change from “Per-tool permissions” to “Always allow all tools”.

Try Again With MCP

Now ask the same question again:

Convert the Chinese date 乾隆三年正月初三 into the western calendar.

This time, the model calls the CJK Calendar Converter tool and returns the correct answer: 1738-02-21.

Important

Without the tool, the model guesses. With the tool, the model looks it up from a database of ~131,000 lunar month records covering China, Japan, Korea, and Vietnam. This is the power of MCP — giving LLMs access to authoritative data sources.

OCR: A Transformative Use Case

A major obstacle to adopting digital tools in East Asian Studies prior to 2025 was Optical Character Recognition (OCR).

  • Traditional OCR struggled with CJK historical texts
  • Multiple calligraphic styles posed significant challenges

OCR models newly emerging in 2025–2026 have largely removed this obstacle.

Note

Vision models (e.g., Gemini 3.1 Pro) can also perform OCR when you upload an image. We will see a demonstration if time permits.

OCR Apps You Can Use Today

GLM-OCR-MLX (Desktop, Apple Silicon)

glm-ocr-mlx is a desktop OCR app developed by DCI engineer Kevin Lin. It runs locally on Apple Silicon Macs using the GLM-OCR model.

OCR Batch Processor (Web App, uses LM Studio)

OCR Batch Processor is a progressive web app (PWA) that connects to your local LM Studio instance for privacy-focused OCR. Since you already have LM Studio installed, you can use this right away.

  • Dual providers: LM Studio (local) or Google Gemini (cloud)
  • Batch processing: Process multiple files in a queue, with smart skip for already-processed files
  • PDF tools: Convert PDF pages to images, split double-page scans into single pages
  • Side-by-side viewer: Compare original image, annotated image, and structured Markdown/HTML output
Tip

To use with LM Studio: download a vision model (e.g., Qwen-VL, Gemma-3-Vision), start the LM Studio local server, then open the OCR Batch Processor in your browser.

Free OCR (Browser-based)

Free OCR is a browser-based OCR tool that processes PDFs and images locally in your browser — no file uploads to external servers required. It uses the GLM-OCR model hosted on Hugging Face for text extraction.

Session 2 Summary

What We Learned

  1. Generative AI is prediction: LLMs generate text token by token based on probability.
  2. Limitations are real: Knowledge cutoffs, hallucination, and autoregressive constraints affect all models.
  3. Prompt engineering helps: Chain-of-thought, system prompts, and clear instructions improve results.
  4. Context is the solution: RAG and tool use overcome many LLM limitations.
  5. The best use case is data transformation: Translation, summarization, data extraction, and citation formatting.

Looking Ahead: Session 3 & 4

In the afternoon sessions, we will dive deeper into:

  • Vibe coding with Google Antigravity
  • Building your own applications through natural language
  • Agentic approaches to humanities research

Resources