Context is All You Need: GLAMs and LLMs

Ways of Viewing – East Asian Art, AI, Digital Art History

Kwok-leong Tang

Harvard University

2025-11-15

Installing Software

  • Cherry Studio: Homepage
  • Cherry Studio is an open-source conversation client for AI integration.
  • It provides two free AI models: Qwen3 8B and GLM 4.5 Air. Note that these models are hosted by Chinese model providers. They are useful for demonstration purposes, but please be aware of data privacy considerations and use them at your own discretion.

Four components of AI-assisted Research (or Everything)

  • Model (GPT, Claude, Llama, Qwen, DeepSeek)
  • Prompt (use cases)
  • Context (information)
  • Tools (web search, interactions with APIs)

Nature and Limitations of Models

  • Everything is PREDICTION.
  • Every prediction is based on PROBABILITY!
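
A minimal sketch of what "prediction based on probability" means in practice. The candidate tokens and their scores below are hypothetical values chosen for illustration, not output from a real model; an actual LLM scores tens of thousands of candidate tokens at every step.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(candidates, logits, rng):
    """Pick one candidate token at random, weighted by its probability."""
    probs = softmax(logits)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical candidates and scores for the text "The capital of Japan is"
candidates = ["Tokyo", "Kyoto", "Osaka"]
logits = [5.0, 2.0, 1.0]

probs = softmax(logits)
for token, p in zip(candidates, probs):
    print(f"{token}: {p:.3f}")

# Repeated sampling usually returns the most likely token, but not always.
rng = random.Random(0)
print([sample_next_token(candidates, logits, rng) for _ in range(5)])
```

Because the model samples from a distribution rather than looking up a fact, the same prompt can produce different answers on different runs.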

Nature and Limitations of Models

Try the following prompt in Cherry Studio with Qwen3 8B:

What is your knowledge cutoff date? 
Who is the prime minister of Japan?
  • Knowledge cutoff: October 2024
  • Prime Minister: Fumio Kishida / Shigeru Ishiba (current: Sanae Takaichi)
  • After training, a model’s parameters are frozen. Its built‑in knowledge does not change.

Nature and Limitations of Models

Try the following prompt in Cherry Studio with Qwen3 8B a couple of times:

Give a random number between 1 and 100.

Context is the Solution

  • Retrieval-augmented generation (RAG)
  • Tool use
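
A minimal sketch of the RAG idea: find the most relevant note, then place it in the prompt so the model answers from supplied context instead of frozen training data. Retrieval here is simple word overlap, a stand-in for the embedding search that production systems typically use. The function names and the two notes are hypothetical.

```python
def tokenize(text):
    """Lowercase the text, drop basic punctuation, and return a set of words."""
    for mark in ".,?":
        text = text.replace(mark, " ")
    return set(text.lower().split())

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Assemble a prompt that carries the retrieved context with the question."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Illustrative notes standing in for a real document collection
documents = [
    "Sanae Takaichi is the current prime minister of Japan.",
    "The Great Wave off Kanagawa is a woodblock print by Hokusai.",
]

print(build_prompt("Who is the prime minister of Japan?", documents))
```

The assembled prompt is what gets sent to the model; the model itself is unchanged.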

What is my vision for the integration of humanities and AI in five years?

Every humanities researcher will be able to build their own AI-assisted tools for their unique research methods.

GLAMs will become an important part of the AI-assisted research infrastructure, which provides context.

A Major Obstacle to Adoption of Digital Tools and Methods in East Asian Studies prior to 2025: Optical Character Recognition (OCR)

  • Traditional OCR struggled with CJK historical texts
  • Poor accuracy with CJK characters
  • Multiple calligraphic styles posed significant challenges

Washing Horses in a River (from CAEA Collection)

Three different styles of calligraphy. Image source.

Old OCR Result

如管4人工1馬汾1^^震專1-1-時檢奪 令?,之考噙皇^^‘幹卞卹肩螬野狄苳多 凊涑名‘樹連破闺會故牧&如鍺次^文玖 逸漣巧玉管囔長輝蟓1麩锊虼晻骨俗奇 碲爲—着相此也如讲雄姜^罕得孫場难 表木气!撫釔今人嚷大4 譽—嗜馎士柯X書3傅^壹闕’:’’、

數征青糙馬奉來不被鞅 本子秦用霞間救贲奚1*1

1公’翅鄞旅女箱嚐銬題

运’辜忍从七姑^

紫题

Newly Emerging OCR Models

Demo on Kwok-leong’s Machine (2021 MacBook Pro)

Context for Better Exploration

Please use the following prompt in Cherry Studio:

Please list the notable works by the artist who created The Great Wave off Kanagawa and provide a link to each work’s Wikidata page.

Context for Better Exploration

Kwok-leong will demonstrate how to add the Wikidata MCP to Cherry Studio.

https://wd-mcp.wmcloud.org/mcp/

Once we’ve added the Wikidata MCP, we can try the previous prompt again.
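
For the curious, a rough sketch of what a client such as Cherry Studio sends to an MCP server behind the scenes. MCP messages follow JSON-RPC 2.0, and `tools/list` and `tools/call` are standard MCP methods. The tool name `search_items` and its arguments are hypothetical; the real names come from the server's own `tools/list` response. The sketch only builds the messages and does not contact the server (a real session also starts with an `initialize` handshake).

```python
import json

def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request message as a plain dictionary."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message

# Ask the server which tools it offers.
list_tools = jsonrpc_request(1, "tools/list")

# Call one tool. "search_items" and "query" are hypothetical placeholders.
call_tool = jsonrpc_request(
    2,
    "tools/call",
    {"name": "search_items", "arguments": {"query": "The Great Wave off Kanagawa"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```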

Context for Enrichment

Please use the following prompt in Cherry Studio:

The following text is excerpted from the Veritable Records of the Ming Dynasty. Extract all place names from the text and provide their corresponding coordinates.

軍務為事官僉都御史吳禎等奏桂林梧州平樂潯州等府欝林岑溪愽白懷集古田陽朔臨桂靈川桂平等州縣屢被流賊刼殺 上命總兵等官趙輔等速進兵勦之

Context for Enrichment

Kwok-leong will demonstrate how to add the CHGIS MCP to Cherry Studio.

https://chgis-mcp.016801.xyz/mcp
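
A minimal sketch of why a gazetteer helps with this task: once a list of known place names is available, finding them in unpunctuated classical Chinese becomes a matching problem rather than a guess. The small name set below is a hypothetical stand-in for a gazetteer such as CHGIS, using names from the excerpt above; coordinates would come from the gazetteer records and are deliberately left out.

```python
def find_place_names(text, gazetteer):
    """Scan left to right, preferring the longest known name at each position."""
    names = sorted(gazetteer, key=len, reverse=True)
    found = []
    i = 0
    while i < len(text):
        match = next((n for n in names if text.startswith(n, i)), None)
        if match:
            found.append((match, i))   # record the name and where it starts
            i += len(match)            # continue after the matched name
        else:
            i += 1
    return found

# Hypothetical mini-gazetteer; a real one holds many thousands of records.
gazetteer = {"桂林", "梧州", "平樂", "潯州"}
excerpt = "奏桂林梧州平樂潯州等府"

for name, position in find_place_names(excerpt, gazetteer):
    print(position, name)
```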

Can we turn catalogs or museum collections into MCPs? Yes, but there are limitations.

Dan Cohen’s Blogs

MCP of GLAMs

Let’s try with the Art Institute of Chicago MCP Server

This time, Kwok-leong will read the documentation of an MCP server with you.

Summary

  • Providing context helps to overcome some LLM limitations such as knowledge cutoffs and hallucinations.
  • GLAMs are reliable sources of contextual information for researchers.
  • The Model Context Protocol (MCP) is a popular standard for connecting LLMs to diverse context sources.

Context as validation: LCSH Recommendation

Assigning Library of Congress Subject Headings (LCSH) is time-consuming for catalogers, and AI-generated headings often do not adhere to LCSH standards.

Experiment of AI LCSH Suggestions

Please recommend the LCSH for the following book. Return your answer in English.

Title: 日本電影風貌
Introduction: 
本書共計三輯,分別從日本電影的歷史、人物和作品三方面著手,描繪戰後日本電影的地圖和風貌。作者長期浸淫於日本文化之研究,是以在茫茫影海中能勾玄勒藥、去蕪存菁;書中所論及之導演,諸如溝口健二、小津安二郎、黑澤明、市川崑、大島渚……等人,俱為風格獨具之大家,足供愛好及研究電影人士參酌與翫賞。 舒明,本名李浩昌,一九四五年生,廣東新會人。香港中文大學榮譽文學士,香港大學文學碩士,曾任職澳洲阿德萊大學圖書館亞洲部主任、澳洲國立大學圖書館日文部主任,現任職澳洲阿德萊德大學圖書館日本及亞洲部主任。與劉紹銘等人合譯《中國現代小說史》

Once AI provides you with suggestions, you can check them at this website: https://id.loc.gov/authorities/subjects.html
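
Checking by hand gets tedious, so here is a small sketch that turns a suggested heading into a lookup URL. It assumes that id.loc.gov resolves exact labels under the `/authorities/subjects/label/` path; please confirm this against the id.loc.gov documentation before relying on it. The heading used is only a plausible candidate for the book above, not a verified one, and the sketch builds the URL without making a network request.

```python
from urllib.parse import quote

# Assumed base path for exact-label lookups on id.loc.gov
BASE = "https://id.loc.gov/authorities/subjects/label/"

def lcsh_label_url(heading):
    """Build a lookup URL for an exact LCSH label, percent-encoding spaces."""
    return BASE + quote(heading, safe="")

print(lcsh_label_url("Motion pictures--Japan"))
```

Opening the printed URL in a browser shows whether the heading exists in the authority file.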

Validating through LOC Linked Service

Kwok-leong will demonstrate how to add the Cataloger MCP to Cherry Studio. The following JSON configuration file will be used in the process.

   {
     "mcpServers": {
       "cataloger-mcp": {
         "command": "uvx",
         "args": ["cataloger-mcp-server"]
       }
     }
   }
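
A short sketch of how to read this configuration. It assumes the common convention for local MCP servers, in which the client launches `command` with `args` as a subprocess; how Cherry Studio handles the file internally is not shown here. The helper name `launch_commands` is hypothetical.

```python
import json
import shlex

CONFIG = """
{
  "mcpServers": {
    "cataloger-mcp": {
      "command": "uvx",
      "args": ["cataloger-mcp-server"]
    }
  }
}
"""

def launch_commands(config):
    """Map each server name to the command line a client would start."""
    commands = {}
    for name, server in config["mcpServers"].items():
        commands[name] = [server["command"], *server.get("args", [])]
    return commands

config = json.loads(CONFIG)
for name, argv in launch_commands(config).items():
    print(name, "->", shlex.join(argv))
```

In other words, `uvx` must be installed on your machine before the server can start.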

Experiment with the MCP

Create an assistant with the following prompt.

## You are an LCSH Recommendation Agent for Librarians

You help librarians assign Library of Congress Subject Headings (LCSH) and related name authority headings to library materials. Analyze the bibliographic information, propose headings, then validate and discover headings using the **cataloger mcp** tools.

## Tool Usage (required)

Always use the cataloger mcp tools when recommending headings:

- `search_lcsh`: Validate each candidate subject heading (topical, geographic, corporate, form, etc.) that you propose. Use it to confirm the authorized form and to find close alternatives.
- `search_name_authority`: Validate personal names used as subjects (authors as subjects, historical figures, etc.). This is specialized for `rdftype: PersonalName`.
- `search_lcsh_keyword`: Build keyword queries from the concepts in the work (for example, `Climate change AND policy`). Use this tool to discover additional established LCSH that may apply but were not in your initial candidate list.

When reviewing results:

- A heading or name is **validated** if the `label` returned by `search_lcsh` or `search_name_authority` matches your candidate (case-insensitive, with normal cataloging normalization).
- If no exact match is found but a close label exists, you may adopt that label as a **modified** heading.
- Terms you select from `search_lcsh_keyword` results are **discovered via keyword search** and are established LCSH with URIs.
- If no suitable authority record is found, mark the heading as **not verified** and briefly explain why you still recommend it (if you do).

Do not present headings as coming from authority sources without first attempting validation or discovery via these tools.

## Workflow

When a librarian provides bibliographic information:

1. Identify main topics, important persons, geographic areas, time periods, and form/genre.
2. Extract key words and phrases that reflect these concepts.
3. From your analysis, generate a small set of candidate LCSH subject headings and name headings.
4. Validate candidates:
   - Call `search_lcsh` for each candidate subject heading.
   - Call `search_name_authority` for each candidate personal name.
5. Discover additional headings:
   - Build one or more concise keyword queries (joining concepts with "AND") and send them to `search_lcsh_keyword`.
   - Review results to find additional LCSH that better match the work or refine your existing choices.
6. Apply cataloging rules (specificity, correct subdivisions, order of importance) and select 3–6 final headings unless the material is unusually complex.
7. Use validated or discovered headings whenever possible; if you recommend unverified headings, clearly justify them.

## Cataloging Guidelines (summary)

- Apply the principle of specificity and follow LCSH syntax; do not include spaces around `--` (for example, `Motion pictures--France`, not `Motion pictures -- France`).
- Use appropriate topical, geographic, chronological, and form subdivisions.
- Use established conventional forms for persons (from `search_name_authority`), corporate bodies, and places (from `search_lcsh`).
- Follow established geographic forms and historical period designations when available.
- Consider both original language and English-language terms when appropriate, following LCSH/LCNAF precedent.

## Output format

Present your answer in this structure for clarity:

1. **Subject analysis**  
   - 2–4 sentences summarizing what the work is about and its primary subjects, persons, places, and time periods.

2. **Tool usage summary**  
   - Briefly state how you used `search_lcsh`, `search_name_authority`, and `search_lcsh_keyword` (for example, which types of terms you validated and what keyword queries you ran).

3. **Keywords and queries**  
   - `Identified keywords:` list the main keywords you extracted.  
   - `Keyword queries sent to search_lcsh_keyword:` list the actual queries you used.

4. **Recommended headings (table)**  
   Provide a Markdown table with one row per heading:

   | Heading (authorized form)             | MARC field                                        | Validation / source                                                | URI                      | Notes                                          |
   | ------------------------------------- | ------------------------------------------------- | ------------------------------------------------------------------ | ------------------------ | ---------------------------------------------- |
   | `Environmental policy--United States` | `650 _0$a Environmental policy $z United States.` | Candidate validated via `search_lcsh` (query: "...", match: "...") | `https://id.loc.gov/...` | Short explanation of why this heading applies. |

   In the **Validation / source** column, indicate one of:
   - `Candidate validated via search_lcsh`
   - `Candidate validated via search_name_authority`
   - `Candidate modified based on search_lcsh result`
   - `Candidate modified based on search_name_authority result`
   - `Discovered via search_lcsh_keyword`
   - `Not verified (reason: ...)`

5. **Special considerations**  
   - Note any difficult cataloging judgments, evolving terminology, or limitations observed in the authority data or tools.

## Interaction style

- Be explicit about how you used the cataloger mcp tools in your reasoning.
- Highlight which headings are validated, modified, discovered via keyword search, or not verified.
- Explain your reasoning clearly but concisely; focus on what a practicing cataloger needs to see.
- Ask for clarification if the bibliographic information is incomplete.
- Maintain an objective, professional tone appropriate for cataloging work.

Remember: always use `search_lcsh` to verify suggested LCSH, and use `search_lcsh_keyword` to see whether other existing LCSH better fit the concepts and usage in the material.

Try the prompt again. Note that it may not work properly with the Qwen3 8B and GLM 4.5 Flash models.
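
The validation rule in the assistant prompt above (validated, modified, or not verified) can be sketched as ordinary code, which helps when checking whether a model followed it. This is a simplified illustration: the `normalize` and `classify` helpers are hypothetical, the returned labels are made-up examples of what a search tool might return, and treating the first result as the closest match is an assumption.

```python
def normalize(heading):
    """Casefold, collapse extra spaces, and remove spaces around '--'."""
    parts = [" ".join(part.split()) for part in heading.split("--")]
    return "--".join(parts).casefold()

def classify(candidate, returned_labels):
    """Return (status, heading) following the prompt's validation rules."""
    for label in returned_labels:
        if normalize(label) == normalize(candidate):
            return ("validated", label)
    if returned_labels:
        # Assumes the search tool lists its closest match first.
        return ("modified", returned_labels[0])
    return ("not verified", candidate)

# Hypothetical tool results for three candidate headings
print(classify("motion pictures -- France", ["Motion pictures--France"]))
print(classify("Films--France", ["Motion pictures--France"]))
print(classify("Japanese cinema", []))
```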