LUSTRE Workshop 4: The Future of AI to Unlock Digital Records


The fourth workshop organised by the AHRC-funded project Unlocking our Digital Past with Artificial Intelligence (LUSTRE) led by Dr Lise Jaillant (Loughborough University) took place on 27-28 June 2024 at the Science Museum in London.

Through a series of talks, panel discussions and breakout groups, this two-day workshop explored the emerging trends, challenges, and transformative innovations shaping the future of AI in the GLAM (Galleries, Libraries, Archives, and Museums) sector. Set against a backdrop of rapid advances in AI, the workshop aimed to spark dialogue, foster collaboration, and inspire action among government professionals, GLAM sector professionals, and academics.

Invited speakers included: Dr James Lappin (Central Digital Data Office, UK), Callum McKean (The British Library, UK), Dr Lise Jaillant (Loughborough University), Prof Claire Warwick (Durham University, UK), Dr David Brown (Trinity College, Dublin, Ireland), Rebekah Taylor & Angharad Turner (Independent Office for Police Conduct, UK), Nicole Coleman (Stanford University, USA), Dr Javier de la Rosa (National Library of Norway), Prof Paul Gooding (University of Glasgow), Prof Richard Marciano (University of Maryland, USA), David Canning & Dr Kelcey Swain (The Cabinet Office, UK), and Dr Tim Boon (Science Museum London).

The workshop report can be accessed via the following link: LUSTRE Workshop 4 Report

Approaches for Using AI to Manage Records at Scale, Dr James Lappin (Central Digital Data Office)

Actually Existing AI Applications for Personal Digital Archives at the British Library, Callum McKean (British Library)

The Future of Access to Digital Records: A User’s Perspective, Dr Lise Jaillant (Loughborough University)

No Coalmines, No Canaries: 1990s Cyberspace and the Future of AI Policy, Prof Claire Warwick (Durham University)

From Archives to Insights: Integrating Transkribus and ChatGPT in Archival Workflows, Dr David Brown (Trinity College Dublin)

Getting Ready for AI at IOPC: First What is the Problem, and is AI the Answer?, Rebekah Taylor and Angharad Turner (Independent Office for Police Conduct)

Legal Design for an AI Future: A Case Study in Law Enforcement Policy Manuals, Nicole Coleman (Stanford University)

Artificial Intelligence at the National Library of Norway, Dr Javier de la Rosa (National Library of Norway)

iREAL: Indigenising Requirements Elicitation for Artificial Intelligence in Libraries, Prof Paul Gooding (University of Glasgow)

Harnessing Generative AI to Support Exploration and Discovery in Archival Collections, Prof Richard Marciano (University of Maryland)

Using AI to Review Records in the Cabinet Office, David Canning and Dr Kelcey Swain (The Cabinet Office)

Congruence Engine and the Linking of Archaic Museum Data, Dr Tim Boon (Science Museum)


LUSTRE Workshops 1 & 2 report

Over the course of nine months, the AHRC-funded LUSTRE project has made significant progress in its exploration of using Artificial Intelligence (AI) to unlock born-digital and digitised government archives. The project has already delivered a range of activities, including more than 50 semi-structured interviews with professionals in the GLAM sector, computer scientists, and scholars. Additionally, there have been four online lunchtime talks and two hybrid workshops. 

In this blog post, we will focus on the two hybrid workshops organised in collaboration with our project partner, the Cabinet Office. These workshops took place at the Central Digital and Data Office in Whitechapel, London, as well as online. The talks from both workshops have been recorded and can be accessed on the LUSTRE website through the following links: 

LUSTRE Workshop 1 

LUSTRE Workshop 2 

The primary objective of these workshops was to build a network of archival professionals and academic researchers interested in exchanging knowledge about the opportunities, challenges, and potential risks associated with AI for accessing, managing, and using born-digital archives.

Summary of Workshop 1

Invited speakers at the first workshop in January 2023 were Professor Stephanie Decker and Dr Adam Nix from the University of Birmingham; Dr Jenny Bunn from The National Archives; and Dr Tony Russell-Rose from Goldsmiths, University of London.
The day began with a presentation by Professor Stephanie Decker and Dr Adam Nix, who discussed the practices of digital archival discovery, focusing specifically on the use of AI to connect context and content in emails. Their talk provided valuable insights from a user perspective, emphasising the significance of meaningful access to born-digital archives for researchers, particularly those in the humanities and social sciences. 

Next, Dr Jenny Bunn delved into the ethical dimension of using advanced technologies, which are often categorised under the umbrella term of AI. She highlighted the challenges that the introduction of AI presents to the delicate balance between transparency, accountability, and fairness in recordkeeping activities. Dr Bunn explored new approaches that can maintain transparency and accountability in recordkeeping practices amidst the use of AI, emphasising the importance of good governance. 

The third talk of the day was led by Dr Tony Russell-Rose, and it showed how the LUSTRE project effectively brings together researchers and professionals from the GLAM sector, digital humanities, and computer science. Dr Russell-Rose’s presentation focused on access to databases and proposed approaches that move away from traditional command-line query builders. His talk, ‘Searching, fast and slow: rethinking the query builder paradigm’, centred on a platform called 2Dsearch, which redefines ‘advanced search’ by reducing syntactic errors, enhancing semantic transparency, and providing support for reuse and optimisation.

The day concluded with a thought-provoking roundtable discussion featuring Lise Jaillant, Adam Nix, Stephanie Decker, and Jenny Bunn, centred around ethical issues in AI and government records. The speakers further expanded on the ethical dimension of conducting research on digital archives using AI-assisted technologies. They unanimously emphasised the need for transparent, explainable, and accountable processes and technologies. 
Adam Nix drew attention to the risks and contingent liabilities of already available collections when novel forms of searches unveil new materials. Stephanie Decker reflected on the challenges of anonymisation and privacy, highlighting the delicate balance between anonymising data and preserving valuable contextual research information. Jenny Bunn emphasised the active effort archivists must undertake to avoid embracing the ‘myth of objectivity’ when adopting new technologies, asserting that archival practice is inherently subjective and necessitates the archivist’s engagement as a gatekeeper. She also discussed the increased sensitivity risks associated with larger and more complex datasets. 

Overall, the first workshop provided valuable insights into the intersection of AI and born-digital archives, exploring both the practical and ethical dimensions of using AI technologies for archival research.

Summary of Workshop 2

The second LUSTRE workshop, titled ‘AI and Born-Digital Archives in the Government Sector and Beyond: Challenges and Opportunities’, was once again held in Whitechapel. The workshop took place in May 2023 and, like the previous one, it was a hybrid event with five presentations, including two remote presenters.

Invited speakers included Dr Keegan McBride from the Oxford Internet Institute, James Lappin from Loughborough University, Dr Lise Jaillant from Loughborough University (PI of the LUSTRE project), John Sheridan from The National Archives, and Professor Jason R Baron from the University of Maryland. 

Dr Keegan McBride commenced the day with his talk, ‘Artificial Intelligence in the Public Sector: Separating Myth from Reality’. He focused on differentiating the development of AI in the private sector from its applications in the public sector. Dr McBride highlighted the necessary building blocks for effective AI usage in the public sector, including technical, infrastructural, and legislative aspects. He also presented best practice case studies to illustrate AI implementation. 

James Lappin followed with a presentation on ‘AI and the Management of Email Accounts Over Time’, drawing on his doctoral research. He explored the impact of AI on access permissions and retention rules within digital corporate systems, particularly focusing on email. His talk covered various aspects, including the impact of email on recordkeeping, the effective organisation of correspondence, and key strategic choices when applying AI to email management. 

Dr Lise Jaillant, the Principal Investigator of the LUSTRE project, provided her perspective on the challenges posed by the transition from traditional archives to digital archives for professionals and academic researchers. She shared results from interviews conducted with GLAM professionals, researchers, and archive users. The participants highlighted two pressing obstacles in accessing digital records: mistrust between stakeholders and mistrust of technology. Dr Jaillant emphasised the importance of facilitating knowledge transfer, fostering collaborations between GLAM professionals, academics, and other stakeholders, and enabling access to archival materials as well as the discovery of their existence. 

Following a brief question and answer session involving both in-person and online participants, the workshop broke for lunch. The afternoon session commenced with a talk by John Sheridan, the Digital Director of The National Archives, titled ‘Navigating the Maelstrom (Looking for Lighthouses)’. He reflected on the rapid advancement of AI and large language models and the uncertainties these technologies bring. He presented strategic interventions that digital archives can implement to leverage AI’s opportunities. These interventions encompassed content classification and organisation, new text generation (including summaries and transcriptions), image/video generation based on instruction, and data transformation/manipulation based on instruction. 

The day concluded with a remote guest speaker, Professor Jason R Baron (University of Maryland). His talk focused on the process which should lead to the transfer of permanent federal government records to the National Archives and Records Administration (NARA) solely in electronic or digital formats by June 2024. 
He discussed the challenges NARA faces in meeting the 2024 deadline and the limitations presented by the Freedom of Information Act (FOIA) in providing timely access to archival records. Professor Baron also explored how AI tools can assist in ensuring real public access to government archives. He emphasised the role of AI in supporting archivists, records managers, and lawyers in filtering sensitive content from public records and recommended continued experimenting with Technology Assisted Review methods for efficiently discovering records and accurately segregating personal content. He also suggested exploring how generative AI could provide narratives explaining why documents or portions of documents have been withheld under freedom of information laws. Professor Baron concluded his talk by encouraging the embrace of AI, considering it not merely as a black box, but as a ‘gift’ to archivists to improve access to vast digital collections.

LUSTRE Workshop 2: AI and Born-Digital Archives in the Government Sector and Beyond: Challenges and Opportunities


The second workshop organised by the AHRC-funded project Unlocking our Digital Past with Artificial Intelligence (LUSTRE), led by Dr Lise Jaillant (Loughborough University), took place on 4 May 2023 at the CDDO in London and online.
Through a series of talks, this day-long workshop continued our discussion on AI applied to born-digital archives, particularly government archives.
Invited speakers included: Dr Keegan McBride (Oxford Internet Institute), James Lappin (CO and Loughborough University), Dr Lise Jaillant (Loughborough University), John Sheridan (Digital Director, TNA) and Professor Jason R. Baron (University of Maryland).

Artificial Intelligence in the Public Sector: Separating Myth from Reality, Dr Keegan McBride (Oxford Internet Institute).

The impact of AI on records management,
 James Lappin (Loughborough University).

AI and Archives from a Researcher’s Viewpoint,
 Dr Lise Jaillant (Loughborough University).

Navigating the maelstrom (looking for lighthouses), John Sheridan (The National Archives).

Is NARA Ready Yet? The Newly Extended 2024 Start Date for Accessioning Records into the US National Archives Only in Electronic and Digital Formats, and What That Means, Professor Jason R. Baron (University of Maryland).


Workshop 2 – 4th of May 2023

Workshop 2 will take place on 4 May 2023, both in person at The White Chapel building in East London and online. The workshop will continue our discussions on AI applied to born-digital archives, particularly government archives.

Details about registration and the full programme of speakers will be published soon.



Workshop 1. AI and born-digital archives: Challenges and opportunities

Thursday 26 January 2023

Through a series of talks and a round table, this day-long workshop delved into the challenges and opportunities that AI offers for the management and use of born-digital archives.

The workshop was a hybrid event (in London and online).

Invited speakers included Professor Stephanie Decker and Dr Adam Nix from the University of Birmingham; Dr Jenny Bunn from The National Archives and University College London; and Dr Tony Russell-Rose from Goldsmiths, University of London.


Finding light in dark archives: Using AI to connect context and content in email.

Professor Stephanie Decker and Dr Adam Nix (University of Birmingham).

The practice of digital archival discovery is still emerging, and the approaches future research will take when using digital sources remain unclear. Archival practice has been shaped by paper-based, pre-digital sources and guides assumptions around how researchers will access and make use of such collections. Paradoxically, dealing with the increasing relevance of born-digital records is not helped by the fact that many born-digital collections remain dark, in part while questions of how they should be effectively made available are answered. Our research takes a user perspective on discovery within born-digital archives and seeks to promote more meaningful access to born-digital archives for researchers. In doing so, our work deals with the implications that unfamiliar archival technologies (including artificial intelligence) have on disciplinary traditions in the humanities and social science, with a specific focus on historical and qualitative approaches.

Our work in this area currently focuses on the issue of context within organisational email, and the challenges of searching and interpreting large bodies of email data. We are particularly interested in how effective machine-assisted search and multiple pathways for discovery can be used to open contextually opaque collections. Such access is likely to leverage a collection’s structural and content characteristics, as well as targeted archival selection and categorisation. We ultimately suggest that by combining relatively open user-led interfaces with pre-selective material, digital archives can provide environments suited to both the translation of existing research practices and the integration of more novel opportunities for discovery. Our presentation will summarise our progress in this area and reflect on the technical and methodological questions our work here has raised.

Putting principle into practice: Transparency, recordkeeping and AI.

Dr Jenny Bunn (The National Archives and University College London).

At the level of principle it is difficult to argue against the inherent goodness of ideals such as transparency, accountability and fairness, but they have never been easy to put into practice. The increasingly advanced assistance that technologies generally placed under the label of AI can now offer us further complicates this picture, offering as it does new possibilities for us to distance ourselves from both the consequences of our decisions and the very making of them. Recordkeeping has long acted to bridge this distance and this presentation will consider the new forms it may need to take to continue to ensure that accounts are rendered and explanations offered in the enduring spirit of transparency.

Searching, fast and slow: rethinking the query builder paradigm.

Dr Tony Russell-Rose (Reader in Computer Science, Goldsmiths, University of London).

Knowledge workers such as information professionals, legal researchers and librarians need to create and execute search strategies that are comprehensive, transparent, and reproducible. The traditional solution is to use command-line query builders offered by proprietary database vendors. However, these are based on a paradigm that dates from the days when databases could be accessed only via text-based terminals and command-line syntax. In this talk, we explore alternative approaches based on a visual paradigm in which users express concepts as objects on an interactive canvas. This offers a more intuitive UX that eliminates error, makes the query semantics more transparent, and offers new ways to collaborate and share best practices.

Ethics for AI and Government Records

Roundtable with Dr Lise Jaillant, Dr Adam Nix, Prof. Stephanie Decker and Dr Jenny Bunn
