Actually Existing AI Applications for Personal Digital Archives at the British Library

27th June 13:30 – 13:50

Speaker: Callum McKean

Abstract: The future applications for AI in processing, describing, and providing access to personal digital archives are nothing short of revolutionary. But they are not (yet) a reality.

These technologies promise highly efficient processing methodologies and enriched forms of access, allowing us to move beyond our current situation – where collections remain mostly inaccessible to anyone except the archivists and curators tasked with caring for them – to a new world of automated sensitivity review, fuzzy searching, rich visualisation and radical openness.

The actual ways in which AI technologies have been applied to these kinds of collections, though, are very limited. As archivists, we are all too familiar with our limitations (resources, technical skill, legal and ethical roadblocks), but there are others we might speak of less often.

One major issue is that AI algorithms prefer structured data, so in order to effectively leverage their power we must find ways of reliably processing, preparing and organising our data at scale. This is not easily achieved in a context where every collection is idiosyncratic and seems to represent an exception to the rule.

This presentation will outline a number of projects that have attempted to integrate AI into the British Library’s work with personal digital archives after automating the data processing stage. Each might be described (generously) as a ‘potentially interesting failure’. By sharing them, I hope that future iterations become more interesting even if failure is unavoidable (at least for now).

Bio: Callum McKean is the Lead Curator for Born Digital Archives and Manuscripts at the British Library, where he has looked after the UK national collection of literary and political personal digital archives since 2021. In this role he oversees the acquisition, preservation, processing and access provision for these collections at the Library. He has recently completed the funded project, ‘Data Analysis and Network Visualisation as Tools for Curating Hybrid Correspondence Archives’, which considered e-mail and analogue correspondence networks in the Harold Pinter collection. He also supervised the PhD project ‘Interpreting Writers Digital Lives’ which examined patterns of use in the digital archives of Andrea Levy and Will Self. His research interests include contemporary literature, born-digital manuscripts, and computational approaches to cultural heritage material. He holds degrees from University College London and the University of Cambridge.