Digital China Initiative

Harvard University


Kwok-leong Tang

Digital China Initiative, Harvard University

Digitizing Việt Nam: Vietnamese Studies in the Age of Digital Humanities and Artificial Intelligence
Columbia University · April 2026

The Digital China Initiative was established to support Harvard students and faculty in adopting digital tools and methods for China Studies.

Building Momentum in 2023


Tools of the Trade Conferences

A conference series designed to introduce humanities scholars to emerging digital tools and methodologies for research in China Studies.

First GenAI Workshop for Humanities

In April 2023, DCI and the China Biographical Database (CBDB) co-organized the first generative AI workshop for humanities at Harvard.

Connectivity and Individuality in Textual Traditions

Augmenting Retrieval for Eurasian Languages


A multi-institutional team led by Peter K. Bol received a $600,000 Schmidt Sciences grant to develop multilingual AI tools that help scholars detect patterns across Eurasian historical documents written in eight underserved languages — examining how textual traditions spread, change, and compete with one another.

Advances in OCR and ASR


Over the past eighteen months, optical character recognition (OCR) and automatic speech recognition (ASR) have improved dramatically, driven by the rapid development of multimodal models — models capable of processing text alongside images or audio.

Many of these models can now run on local machines, making high-quality digitization accessible without requiring cloud infrastructure or costly API subscriptions.

Vibe Coding


What Vibe Coding Solves


The Opportunity Cost Problem

Humanities researchers often face the prospect of investing weeks or months to learn a technical skill (GIS, web scraping, data visualization) that they may use only once. Vibe coding, in which the researcher describes the task in plain language and an AI model writes the code, sharply lowers this barrier.

From Learning Curves to Research Output

Instead of spending time mastering tools, scholars can spend time asking better questions. The time saved on technical training can be redirected toward interpretation, analysis, and discovery.

Build for Humans and Non-Humans


Build for Humans

Design intuitive interfaces, clear navigation, and rich visual presentations so that scholars and the public can engage meaningfully with digitized materials.

Build for Non-Humans

Structure data with clean metadata, open APIs, MCP servers, skills, and machine-readable formats so that AI systems can retrieve, process, and reason over our collections, which extends the collections' reach and longevity.

As creators of digitization projects, we have a responsibility to design for both audiences.
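
As one illustration of building for non-humans, the sketch below shows what a clean, machine-readable metadata record for a single digitized item might look like. It is a minimal sketch, not a standard or the project's actual schema: the `DigitizedItem` class, its field names, and the helpers `to_json_record` and `missing_fields` are all hypothetical, and the sample values are placeholders.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DigitizedItem:
    """Hypothetical record for one digitized document; field names are illustrative."""
    identifier: str
    title: str
    language: str       # BCP 47 language tag, e.g. "vi" or "lzh"
    date_created: str   # ISO 8601 date or year
    source_image: str   # path or URL to the page scan
    transcription: str  # plain-text OCR output or manual transcription
    keywords: list[str] = field(default_factory=list)


def to_json_record(item: DigitizedItem) -> str:
    """Serialize the record as JSON, keeping non-ASCII characters readable."""
    return json.dumps(asdict(item), ensure_ascii=False, indent=2, sort_keys=True)


def missing_fields(item: DigitizedItem) -> list[str]:
    """Return the names of text fields that were left empty."""
    return [
        name
        for name, value in asdict(item).items()
        if isinstance(value, str) and not value.strip()
    ]


if __name__ == "__main__":
    # Placeholder values only; no real collection item is described here.
    item = DigitizedItem(
        identifier="example-0001",
        title="Placeholder title",
        language="vi",
        date_created="1900",
        source_image="scans/example-0001.jpg",
        transcription="",
    )
    print(to_json_record(item))
    print("Empty fields:", missing_fields(item))
```

Setting `ensure_ascii=False` keeps Vietnamese or Chinese text legible in the output file, and a check such as `missing_fields` gives a simple way to find incomplete records before a collection is exposed through an API or an MCP server.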

Thank You Very Much!

