The 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL) is being held in Albuquerque, New Mexico, from April 29 to May 4. We’re excited to share the work from SAIL being presented, and you’ll find links to the papers below. Feel free to reach out to the contact authors directly to learn more about the work happening at Stanford!

Main Conference

Benchmarking Distributional Alignment of Large Language Models

Authors: Nicole Meister, Carlos Guestrin, Tatsunori Hashimoto
Contact: nmeist@stanford.edu
Links: Paper
Keywords: nlp tools for social analysis, language/cultural bias analysis, values and culture, benchmarking, nlp datasets, evaluation


Can Unconfident LLM Annotations Be Used for Confident Conclusions?

Authors: Kristina Gligorić*, Tijana Zrnic*, Cinoo Lee*, Emmanuel Candès, and Dan Jurafsky
Contact: gligoric@stanford.edu
Award nominations: Best Paper, Outstanding Paper
Links: Paper
Keywords: llm annotations, statistical inference, validity, nlp tools for social analysis, human behavior analysis


REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance

Authors: Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Nouha Dziri, Dan Jurafsky, Maarten Sap
Contact: katezhou@stanford.edu
Award nominations: Best Paper Runner-Up
Links: Paper
Keywords: human-lm interaction, reliance, safety, expressions of uncertainty


Rethinking Word Similarity: Semantic Similarity through Classification Confusion

Authors: Kaitlyn Zhou, Haishan Gao, Sarah Chen, Dan Edelstein, Dan Jurafsky, Chen Shani
Contact: cshani@stanford.edu
Links: Paper
Keywords: semantic similarity, human-centered nlp, computational social science


Sneaking Syntax into Transformer Language Models with Tree Regularization

Authors: Ananjan Nandi, Christopher D. Manning, Shikhar Murty
Contact: tgk.ananjan@gmail.com
Award nominations: Oral
Links: Paper | Website
Keywords: multi-task approaches, constituency parsing, syntactic language models, sample efficiency, syntactic generalization


“All that Glitters”: Techniques for Evaluations with Unreliable Model and Human Annotations

Authors: Michael Hardy
Contact: hardym@stanford.edu
Links: Paper | Website
Keywords: evaluation methodologies, model bias/fairness evaluation, model bias/unfairness mitigation, educational applications, ethical considerations in nlp applications, transparency, human-subject application-grounded evaluations, human-centered evaluation


Workshop Papers

Measuring Mental Health Variables in Computational Research: Toward Validated, Dimensional, and Transdiagnostic Approaches

Authors: Chen Shani, Elizabeth C. Stade
Contact: cshani@stanford.edu
Workshop: CLPsych: The Workshop on Computational Linguistics and Clinical Psychology
Links: Paper
Keywords: nlp for clinical psychology


What Can Large Language Models Do for Sustainable Food?

Authors: Anna T. Thomas, Adam Yee, Andrew Mayne, Maya B. Mathur, Dan Jurafsky, and Kristina Gligorić
Contact: gligoric@stanford.edu
Workshop: First Workshop on AI and Scientific Discovery: Directions and Opportunities (AISD)
Links: Paper
Keywords: large language models, sustainability, climate, food, health, optimization


We look forward to seeing you at NAACL 2025!