Show simple item record

dc.contributor.author: Li, Tianle
dc.date.accessioned: 2024-05-10 17:24:01 (GMT)
dc.date.available: 2024-05-10 17:24:01 (GMT)
dc.date.issued: 2024-05-10
dc.date.submitted: 2024-05-08
dc.identifier.uri: http://hdl.handle.net/10012/20554
dc.description.abstract: The rapid evolution of Large Language Models (LLMs) has ushered in a new era of AI capabilities, particularly in natural language understanding and processing. At the forefront of these advancements is the exploration of in-context learning, a paradigm that enables models to adapt to new tasks without explicit retraining. This thesis presents a comprehensive investigation of the in-context learning capabilities of LLMs, guided by two pivotal studies: KB-BINDER's deployment in Question Answering over Knowledge Bases (KBQA), and the evaluation of LLMs' performance on LongICLBench, a self-curated benchmark for long-context understanding. The first facet of this investigation, embodied by KB-BINDER, addresses the challenge of generalizing LLMs to diverse KBQA tasks without task-specific training. KB-BINDER pioneers a novel few-shot in-context learning approach, utilizing Codex to generate logical forms and employing BM25 for draft binding, demonstrating remarkable efficacy across heterogeneous KBQA datasets. We believe KB-BINDER can serve as an important baseline for future research on applying the few-shot capability of LLMs to the problem of KBQA. Complementing this, the second study introduces LongICLBench, a specialized benchmark designed to test long-context LLMs on long, context-rich sequences across extreme-label classification tasks with in-context learning. Through evaluation on tasks of increasing difficulty, a clear performance threshold is identified, highlighting the current limitations of LLMs in handling extensive context windows and revealing a bias towards labels positioned near the end of the input when instances with the same label are grouped in the demonstration. This underscores a crucial gap in current long-context LLMs' ability to reason over long sequences, paving the way for further enhancements in long-context comprehension.
Together, these studies form the cornerstone of this thesis, encapsulating the dynamic landscape of in-context learning within LLMs. Through a detailed examination of KB-BINDER and LongICLBench, this work not only charts the current capabilities and boundaries of LLMs but also lays the groundwork for future advancements in making LLMs more adaptable and proficient in handling a wide array of complex tasks.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: natural language processing
dc.title: Explore the In-context Learning Capability of Large Language Models
dc.type: Master Thesis
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Master of Mathematics
uws-etd.embargo.terms: 0
uws.contributor.advisor: Chen, Wenhu
uws.contributor.affiliation1: Faculty of Mathematics
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate




All items in UWSpace are protected by copyright, with all rights reserved.