1905 Colorado Avenue, Boulder, CO 80309

Colloquia Recording: Dr. Jonathan Dunn

Talk Title: Computational Sociolinguistics: Population-Level Variation and Its Impact on Language Technology

Presenter: Jonathan Dunn, PhD, Associate Professor, Department of Linguistics, University of Illinois Urbana-Champaign

Abstract: Digital corpora enable us to observe language use in many places by many distinct populations. This talk shows how data-driven language mapping supports natural experiments about the relationship between exposure to usage and the emergence of grammars. At the same time, LLMs are trained on this same digital data. We show that population-level variation has a downstream impact on LLMs. This impact raises the question of whether machine-assisted production, derived from LLMs, will alter exposure enough to begin affecting dialects.

Bio: I am a computational linguist. My research models both (i) the emergence of grammatical structure within individuals and (ii) variation in grammatical structure across populations and registers.

To support this research, I've worked to develop large multilingual geographic corpora. My recent work has also focused on the impact of linguistic variation on natural language processing and on low-resource contexts. I have published over 40 papers on computational linguistics and two monographs with Cambridge University Press: Computational Construction Grammar (2024) and Natural Language Processing for Corpus Linguistics (2022). My interdisciplinary teaching experience includes a MOOC that has now taught NLP to 14,000 students.
