Blogs, Academic, Librarian 07 May 2026

How AI is revolutionizing archival research

Paired with human oversight, AI is generating “usable” text faster to broaden access and discovery

Libraries excel at preservation: capturing and organizing the content that research and an accurate historical record depend on. Making preserved content usable without exhausting staff capacity is harder, especially where archival research is concerned. While digitization helps expand access to archival collections, scanning alone does not guarantee discovery or use.

Artificial intelligence (AI) is emerging as a powerful solution. Librarians and archivists are introducing AI as a means of transcription at scale, creating a shared textual layer across manuscripts, audio and broadcast media. That layer makes it possible to describe, search and segment collections at much finer levels of detail and to apply tools such as translation or passage-level analysis consistently rather than selectively.

A recent webinar from Clarivate, “AI in the archives: How AI is transforming archival research from institutions to content providers,” examined how AI is being applied to archival materials across formats, including handwritten manuscripts, audio recordings and broadcast news. Speakers from Vanderbilt University, including Daniel Genkins, Interim Director, Digital Lab, and Jim Duran, Director, Vanderbilt Television News Archive, along with ProQuest archival product managers Farhana Hoque and Marc Cormier, explored the practical workflows already putting AI to use and what they mean for libraries, archives and researchers.

“The Turning Point Is the Transcript”

The speakers offered a concrete example of the impact of AI in archival research. They presented an image of a late 16th-century baptismal register from Havana. Its dense, inconsistent handwriting would normally require weeks of close reading by a specialist, but within minutes, the page was rendered as searchable text. The transcription was imperfect, but readable enough to support analysis that might otherwise never happen.

That moment reflects the way AI is being applied to archival collections: human expertise remains central, and AI is used to reduce the time and effort required to reach the point where close reading and scholarly judgment can begin. Once usable text exists, whether drawn from handwriting, audio or video, collections can support search, segmentation, captioning and instruction in ways long associated only with born-digital materials.

As Cormier put it, “The turning point is the transcript. Once a document, recording or broadcast becomes a reliable transcript, it turns into something researchers can actually work with.”

Making Handwritten Archives Searchable

Genkins described this approach through his work on the Slave Societies Digital Archive, which includes more than 700,000 pages of early modern manuscripts. Researchers often arrive with specific questions about names, places or dates, yet item-level descriptions rarely capture that level of detail. When materials are handwritten in archaic Spanish or Portuguese, even identifying the right volume can take hours.

Genkins demonstrated how Archivault, an AI-powered modular platform developed at Vanderbilt University, produces usable text and structured outputs across formats, including handwritten registers, tables exported as CSV and multi-object PDFs. While Archivault is capable of fully automating many extraction and transcription workflows, its modular design allows for varying degrees of human oversight. As Genkins noted, “The value is speed rather than perfection, providing a force multiplier that automates the rote while enabling staff to focus on high-level review and enrichment.” This ensures that while the heavy lifting is automated, accountability and care remain firmly human responsibilities.
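One way to picture "varying degrees of human oversight" is a simple review queue. The sketch below is purely illustrative and does not describe how Archivault works: the `PageTranscript` class, its `confidence` score and the `route_for_review` function are all hypothetical names, and it assumes the transcription step reports some per-page confidence value.

```python
from dataclasses import dataclass


@dataclass
class PageTranscript:
    page_id: str
    text: str
    confidence: float  # hypothetical score between 0.0 and 1.0


def route_for_review(pages, threshold=0.85):
    """Split pages into auto-accepted and needs-human-review lists.

    The threshold is an assumed, adjustable setting: raising it sends
    more pages to staff, lowering it automates more of the workflow.
    """
    accepted, needs_review = [], []
    for page in pages:
        if page.confidence >= threshold:
            accepted.append(page)
        else:
            needs_review.append(page)
    return accepted, needs_review
```

Under these assumptions, an institution that wants every page checked could set the threshold above the highest possible score, while one working through a large backlog could lower it and reserve staff time for the hardest pages.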

What Happens Once Text Exists

Hoque focused on how transcription changes a researcher’s interaction with primary sources. She described a common scenario: opening a high-resolution manuscript image and struggling to determine relevance before committing time to close reading.

She used examples from the Cecil Papers, a major collection in ProQuest One History that spans late 16th- and early 17th-century manuscripts documenting English government. In the ProQuest platform, transcripts appear alongside images and are supported by the AI-powered ProQuest Research Assistant. Users can ask the tool to surface key points, view themes and pose focused questions of the text. As Hoque explained, the tool is “organizing what’s already there,” helping users decide how a document fits into their research without flattening it into a simplified summary.

The same approach applies to audio. In Latin American Studies: The NPR Archive, 1979–1990, available in ProQuest One Global Studies & International Relations, recordings feature synchronized transcripts that can be searched and navigated more easily. Language support via ProQuest Research Assistant enables researchers to pose questions in English about Spanish-language audio and work across languages in ways that were previously impractical.
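To make the idea of a synchronized transcript concrete, here is a minimal sketch of how timestamped text lets a reader jump straight to the relevant moment in a recording. It is not based on the ProQuest platform: the `Segment` class, the function names and the sample sentences are all invented for illustration, and it assumes each transcript segment carries a start time in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_seconds: float  # where this passage begins in the recording
    text: str


def format_timestamp(seconds):
    """Render a number of seconds as HH:MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search_transcript(segments, query):
    """Return (timestamp, text) pairs for segments containing the query.

    Matching is case-insensitive and purely literal; a real system
    would likely add stemming, ranking or cross-language search.
    """
    needle = query.casefold()
    return [
        (format_timestamp(seg.start_seconds), seg.text)
        for seg in segments
        if needle in seg.text.casefold()
    ]


# Invented sample data, not taken from any real recording.
SAMPLE_SEGMENTS = [
    Segment(0.0, "Good evening, this is the news."),
    Segment(75.5, "Tonight's report comes from Havana."),
    Segment(3725.0, "That concludes our report from Havana."),
]

for timestamp, text in search_transcript(SAMPLE_SEGMENTS, "havana"):
    print(timestamp, text)
```

Because every match comes back with its timestamp, a search result can double as a navigation point into the audio, which is what separates a synchronized transcript from a plain text file stored beside the recording.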

A Transcript-First Archive at Scale

Duran described how AI and human oversight are combined to keep pace with rapid collection growth at the Vanderbilt Television News Archive. Labor-intensive abstract writing has been replaced with speech-to-text transcription and simplified segmentation, dramatically reducing processing time and eliminating major backlogs. AI systems assist with segmentation, metadata extraction and discovery, but staff oversight, validation and contextual judgment are built into every stage. Experimental efforts, including a student-led automation project through Vanderbilt’s Cloud Innovation Lab, powered by Amazon Web Services, emphasize human review and feedback to ensure accuracy and trust. The goal is not to replace staff, but to extend their capacity, using AI to handle scale and repetition while preserving human judgment where it matters most.

A Broader Shift

Taken together, these examples point to a widening view of AI’s role in primary source archives. The most meaningful gains come not from replacing expertise, but from reaching the point where expertise can be exercised more often and by more people.

Explore the full discussion in the on-demand webinar.
