
dc.contributor.author: Asare, Owura
dc.date.accessioned: 2023-08-11 13:26:34 (GMT)
dc.date.available: 2023-08-11 13:26:34 (GMT)
dc.date.issued: 2023-08-11
dc.date.submitted: 2023-07-11
dc.identifier.uri: http://hdl.handle.net/10012/19675
dc.description.abstract: Code generation tools driven by artificial intelligence have recently become more popular due to advancements in deep learning and natural language processing that have increased their capabilities. The proliferation of these tools may be a double-edged sword: while they can increase developer productivity by making it easier to write code, research has shown that they can also generate insecure code. In this thesis, we perform two evaluations of one such code generation tool, GitHub's Copilot, with the aim of better understanding its strengths and weaknesses with respect to code security. In our first evaluation, we use a dataset of vulnerabilities found in real-world projects to compare Copilot's security performance to that of human developers. In the set of 150 samples we consider, we find that Copilot is not as bad as human developers, but its performance still varies across certain types of vulnerabilities. In our second evaluation, we conduct a user study that tasks participants with solving programming problems, which have potentially vulnerable solutions, with and without Copilot assistance. The main goal of the user study is to determine how the use of Copilot affects participants' security performance. Among our participants (n=21), we find that access to Copilot is accompanied by more secure solutions when tackling harder problems. For the easier problem, we observe no effect of Copilot access on the security of solutions. We also capitalize on the solutions obtained from the user study by performing a preliminary evaluation of the vulnerability detection capabilities of GPT-4. We observe mixed results of high accuracies and high false positive rates, but maintain that language models like GPT-4 remain promising avenues for accessible, static code analysis for vulnerability detection.
We discuss Copilot's security performance in both evaluations with respect to different types of vulnerabilities, as well as its implications for the research, development, testing, and usage of code generation tools.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: copilot
dc.subject: security
dc.subject: code generation
dc.title: Security Evaluations of GitHub's Copilot
dc.type: Master Thesis
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Master of Mathematics
uws-etd.embargo.terms: 0
uws.contributor.advisor: Nagappan, Meiyappan
uws.contributor.advisor: Asokan, N.
uws.contributor.affiliation1: Faculty of Mathematics
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate



