Friday, April 25, 2025 12pm to 2pm
About this Event
1905 Colorado Avenue, Boulder, CO 80309
Title: AIs Playing Games with Language: Balancing Uncertainty, Truth, and Usefulness in Large Language Models
Abstract: In this talk, I'll discuss three silly games that show the same pattern: computers are very good, there are still things humans are better at, and we can create much stronger human-computer teams. We begin with games that test memory, specifically the recall of obscure facts. While AI has been viewed as superhuman at question answering, it isn't universally so. After building a new human-in-the-loop authoring system for drafting challenging examples, we introduce a new measure of adversarial datasets based on item response theory, which can capture the gap between human and computer skill, and show that this gap is decreasing but not yet closed, with computers still struggling on abstract reasoning and on knowing when they know the correct answer. Given these disparate skill sets, we then analyze how best to build human-computer teams to learn new facts and detect false statements: computers can help humans identify false statements, but only when the computer is not confidently incorrect. Finally, I present a similar line of results for another silly language game, Diplomacy, where computers have not yet reached dominance but can help human players think strategically and detect lies, which we capture using an analysis of grounded statements with abstract meaning representation and value functions. I'll close with how these results inform how we can construct better human-computer interactions in "the real world".
Bio: Jordan Boyd-Graber is a full professor at the University of Maryland. He has worked on model evaluations for human-centered topic models, psychologically inspired leaderboards, human–computer machine translation, and question answering. He also contributed new models for improving generative models with RL, interactive approaches for question answering, topic models, and negotiations. Of his twenty former PhD students, seven have gone on to tenure track positions. He and his students have been recognized with paper awards at EMNLP (2023), IUI (2018), NAACL (2016), and NeurIPS (2009, 2015), and he won the 2015 Karen Spärck Jones Award and a 2017 NSF CAREER Award. He served as PC for ACL 2023, SAC for EMNLP and NAACL, AC for ACL, NAACL, EMNLP, and NeurIPS, Poster Chair for EMNLP 2022, Tutorial Chair for ACL 2017, and Advisor for the ACL 2014 SRW.
He was previously an assistant professor at the University of Colorado, Visiting Research Scientist at Google Zürich, and Praktikant at the Berlin-Brandenburg Akademie der Wissenschaften. His undergraduate degrees are in Computer Science and History from the California Institute of Technology, and he received his PhD from Princeton University. His Erdős number is 2 (via Maria Klawe), and his Bacon number is 3 (by embarrassing himself on Jeopardy!). He lives in Silver Spring, Maryland with his wife, two Roombas, three fish, two daughters, and their 外婆.