Hi! I’m Claire. I am a PhD candidate in NLP at the University of Edinburgh, supervised by Michael Rovatsos and Nehal Bhuta. I am also affiliated with the Centre for Technomoral Futures. The expected submission date of my dissertation is June 2025.
As part of the Bloomberg PhD Fellowship I received in 2023, I completed a Research Scientist Internship with the Bloomberg AI and Law team.
I am interested in integrating knowledge into language models and enhancing their reasoning capabilities. Lately, I have been working on temporal reasoning, for example, ordering a sequence of events from an input text.
My PhD research focuses on legal NLP, utilizing the framework provided by legal texts to explore knowledge integration and the capabilities of LLMs. Over the past three years, my research has taken several directions:
(1) Information Extraction and Content Selection: Extracting relevant information from large unstructured datasets of legal texts to improve legal search and selecting salient legal content from lengthy documents to improve the quality of downstream tasks such as summarization.
(2) Evaluation of LLMs: Evaluating the capabilities of LLMs in retrieving information and reasoning, to understand the types of signals they learn and retain in memory (syntactic, semantic, or specific to legal knowledge) and how well they generalize.
(3) Pretraining Data Representation: Recently, I have been exploring various methods for representing data used in pretraining to enhance specific reasoning capabilities of LLMs. This includes researching ways to improve their abilities in temporal reasoning, such as teaching a model to recognize temporal signals and link them to the appropriate events.
I am originally from Nice, France. Before starting my PhD, I worked as a financial analyst at Havas New York, and studied at Paris Dauphine University and Mines ParisTech. I graduated with a master’s of research in computer science in 2021.
Google Scholar | Twitter | LinkedIn | Email
News
- October 2024: Our paper Information Extraction for Planning Court Cases was accepted to NLLP 2024 (hosted at EMNLP)
- June 2024: Our new paper, Are We Done with MMLU?, is on arXiv!
- April 2024: Consider attending/submitting to BAIPsy 2024, a student workshop that aims to facilitate discussion and exchange between Artificial Intelligence and Psychology researchers.
- From May to August 2024, I will be interning at Bloomberg (CTO office) in New York, focusing on information extraction in the legal domain.
- November 27th 2023, NYC, I will be co-organizing a workshop on NLPxFinance at the 4th ACM Conference on AI in Finance - NLP and Network Analysis in Financial Applications. Deadline for submission coming soon, November 10th!
- October 2023, two papers accepted at the NLLP workshop, EMNLP 2023! I will present them on December 7th in Singapore
- October 2023, I will present my work at the Women in HPC workshop hosted at SC2023, Denver, CO
- July 2023, I’m very happy to announce that I received the Bloomberg Data Science Ph.D. Fellowship
- July 2023, our paper presenting a new information extraction pipeline for legal documents was published in ACL Findings
- June 2023, I received the Best Doctoral Consortium Paper Award at ICAIL 2023
Research and Publications
Are We Done with MMLU?
Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, Claire Barale, Robert McHardy, Joshua Harris, Jean Kaddour, Emile van Krieken, Pasquale Minervini
preprint
Information Extraction for Planning Court Cases
Drish Mali, Rubash Mali, Claire Barale
Proceedings of the Natural Legal Language Processing Workshop (NLLP) at EMNLP 2024 | paper
Do Language Models Learn about Legal Entity Types during Pretraining?
Claire Barale, Michael Rovatsos, Nehal Bhuta
Proceedings of the Natural Legal Language Processing Workshop (NLLP) at EMNLP 2023 | paper – slides – poster
AsyLex: A Dataset for Legal Language Processing of Refugee Claims
Claire Barale, Mark Klaisoongnoen, Pasquale Minervini, Michael Rovatsos, Nehal Bhuta
Proceedings of the Natural Legal Language Processing Workshop (NLLP) at EMNLP 2023 | paper – slides – poster
Automated Refugee Case Analysis: An NLP Pipeline for Supporting Legal Practitioners
Claire Barale, Michael Rovatsos, Nehal Bhuta
ACL Findings 2023 | paper
fAsyLex: Accelerating Legal NLP through Comparative Analysis of Multi-GPU Approaches
Claire Barale
Women in High Performance Computing Workshop (WHPC) at SC2023 | slides
Empowering Refugee Claimants and their Lawyers: Using Machine Learning to Examine Decision-Making in Refugee Law
Claire Barale
International Conference on Artificial Intelligence and Law (ICAIL) 2023, Doctoral Consortium, Best Paper Award | paper
Human-Centered Computing in Legal NLP - An Application to Refugee Status Determination
Claire Barale
Second Workshop on Bridging Human–Computer Interaction and Natural Language Processing at NAACL 2022 | paper
Refugee status determination: how cooperation with machine learning tools can lead to more justice
Claire Barale
Scottish Law and Innovation Network (SCOTLIN) Early Career Scholars Symposium 2022 | paper
What is fair data manipulation?
Alexis Tsoukias, Claire Barale
European Conference on Operational Research, 2021 | presentation
Explanations in decision support – Generating Fairness through explanations
Claire Barale
PSL Université Paris Dauphine, Paris. Master of Research Dissertation, 2021
Blog Post
Dictionary Series: What do we mean when we talk about Natural Language Processing (NLP)?
Claire Barale
Data for Children Collaborative, Edinburgh Futures Institute | post