Show simple item record

dc.contributor.author: Li, Tianle
dc.date.accessioned: 2024-05-10 17:24:01 (GMT)
dc.date.available: 2024-05-10 17:24:01 (GMT)
dc.date.issued: 2024-05-10
dc.date.submitted: 2024-05-08
dc.identifier.uri: http://hdl.handle.net/10012/20554
dc.description.abstract: The rapid evolution of Large Language Models (LLMs) has ushered in a new era of AI capabilities, particularly in natural language understanding and processing. At the forefront of these advancements is the exploration of in-context learning, a paradigm that enables models to adapt to new tasks without explicit retraining. This thesis presents a comprehensive investigation of the in-context learning capabilities of LLMs, guided by two pivotal studies: KB-BINDER's deployment in Question Answering over Knowledge Bases (KBQA), and the evaluation of LLMs' performance on LongICLBench, a self-curated benchmark for long-context understanding. The first facet of this investigation, embodied by KB-BINDER, addresses the challenge of generalizing LLMs to diverse KBQA tasks without task-specific training. KB-BINDER pioneers a novel few-shot in-context learning approach, utilizing Codex to generate logical forms and employing BM25 for draft binding, demonstrating remarkable efficacy across heterogeneous KBQA datasets. We believe KB-BINDER can serve as an important baseline for future research on applying the few-shot capability of LLMs to the problem of KBQA. Complementing this, the second study introduces LongICLBench, a specialized benchmark designed to test long-context LLMs on long, context-rich sequences across extreme-label classification tasks with in-context learning. Through evaluation on tasks of increasing difficulty, a clear performance threshold is identified, highlighting the current limitations of LLMs in handling extensive context windows and revealing a bias towards labels positioned near the end of the input when instances with the same label are grouped in the demonstration. This underscores a crucial gap in current long-context LLMs' ability to reason over long sequences, paving the way for further enhancements in long-context comprehension.
Together, these studies form the cornerstone of this thesis, encapsulating the dynamic landscape of in-context learning within LLMs. Through a detailed examination of KB-BINDER and LongICLBench, this work not only charts the current capabilities and boundaries of LLMs but also lays the groundwork for future advancements in making LLMs more adaptable and proficient in handling a wide array of complex tasks.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: natural language processing
dc.title: Explore the In-context Learning Capability of Large Language Models
dc.type: Master Thesis
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Master of Mathematics
uws-etd.embargo.terms: 0
uws.contributor.advisor: Chen, Wenhu
uws.contributor.affiliation1: Faculty of Mathematics
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate




All items in UWSpace are protected by copyright, with all rights reserved.