1905 Colorado Avenue, Boulder, CO 80309

Title: 

MLL LLM: Exploring the language acquisition capabilities of an LLM given pedagogical material in comparison with human multi-lingual learners

Abstract:

To acquire a language, modern large language models (LLMs) are typically provided with many trillions of tokens of language data. The data-greedy nature of these LLMs poses a problem for translating low-resource languages. To address this, recent research has tested the ability of LLMs to acquire a language via prompting with reference grammars of low-resource languages. In my thesis, I extend this method to pedagogical material designed for humans acquiring a non-native language: I provide T5Gemma with prompts of varying composition and Japanese textbook data of varying degrees of detail, and evaluate its performance on quizzes designed for human learners. I also provide two participants with the same textbook and quizzes and compare their results with those of the model. I observe that pedagogical material is largely beneficial to a model learning an unfamiliar language. Additionally, my results highlight several areas of overlap between human and model processing of unfamiliar linguistic material, indicating that LLMs may indeed be useful for modeling human language acquisition.