Friday, March 14, 2025 12pm to 2pm
About this Event
1905 Colorado Avenue, Boulder, CO 80309
Colloquia Recording: Dr. Jonathan Dunn
Talk Title: Computational Sociolinguistics: Population-Level Variation and Its Impact on Language Technology
Presenter: Jonathan Dunn, PhD, Associate Professor, Department of Linguistics, University of Illinois Urbana-Champaign
Abstract: Digital corpora enable us to observe language use in many places by many distinct populations. This talk shows how data-driven language mapping supports natural experiments about the relationship between exposure to usage and the emergence of grammars. At the same time, LLMs are trained on this same digital data. We show that population-level variation has a downstream impact on LLMs. This impact raises the question of whether machine-assisted production, derived from LLMs, will alter exposure enough that it begins to affect dialects.
Bio: I am a computational linguist. My research models both (i) the emergence of grammatical structure within individuals and (ii) variation in grammatical structure across populations and registers.
To support this research, I've worked to develop large multilingual geographic corpora. My recent work has also focused on the impact of linguistic variation on natural language processing and on low-resource contexts. I have published over 40 papers on computational linguistics and two monographs with Cambridge University Press: Computational Construction Grammar (2024) and Natural Language Processing for Corpus Linguistics (2022). My interdisciplinary teaching experience includes a MOOC that has now taught 14,000 students about NLP.