De-redacting the Archive: LLM-based Redacted Entity Estimation

Speaker: Professor Reuben Binns, University of Oxford 

Abstract: Redaction of sensitive details from publicly released archives has been a common means of protecting privacy and secrecy for decades. Speculation about redacted identities has long been a feature of such archives, from Cold War intelligence files to the Epstein e-mail archive today. Digital tools, including AI, have long been used to assist in the process of redaction, identifying references to personal data, entities, and sensitive details, and flagging them for human review. However, as generative AI tools like ChatGPT grow more capable and more widely adopted, they might also be used to undermine efforts to redact information from archives.

This talk will assess some of the possible threats of AI-driven de-redaction. Traditional, manual methods of guessing redacted identities depend on strong historical knowledge, analysis of context, and even typographic details like the length of the redaction bar. Now, budding de-redactors are likely to turn to AI to aid in their guesswork. This talk will present a preliminary assessment of the capabilities of current AI models for this task, through a series of case studies. It will also consider the risks that might arise when AI is used for de-redaction. Risks could arise both when AI is effective at de-redaction, undermining the balance between public interest and privacy that redaction aims to uphold, and when it is ineffective, leading users down blind alleys and inducing potentially misleading, harmful, or even defamatory inferences.

Bio: Reuben Binns is an Associate Professor of Human Centred Computing, working between computer science, law, and philosophy, focusing on data protection, machine learning, and the regulation of and by technology. From 2018 to 2020, he was a Postdoctoral Research Fellow in AI at the Information Commissioner’s Office, addressing AI / ML and data protection. He joined the Department of Computer Science at the University of Oxford as a postdoctoral researcher in 2015. He received his Ph.D. in Web Science from the University of Southampton in 2015.