Show simple item record

dc.contributor.author: Rambidis, Andrew
dc.date.accessioned: 2023-08-24 13:23:14 (GMT)
dc.date.available: 2023-08-24 13:23:14 (GMT)
dc.date.issued: 2023-08-24
dc.date.submitted: 2023-08-14
dc.identifier.uri: http://hdl.handle.net/10012/19752
dc.description.abstract: With the prevalence of over-parameterized models in deep learning, the choice of optimizer used in training plays a significant role in a model's ability to generalize, owing to solution selection bias. We consider the popular adaptive gradient method Adagrad and study its convergence and algorithmic biases in the underdetermined linear regression regime. First, we prove that Adagrad converges in this problem regime. We then find empirically that, with sufficiently small step sizes, Adagrad promotes diffuse solutions, in the sense of uniformity among the coordinates of the solution. Moreover, compared with gradient descent under the same conditions, Adagrad's solution exhibits greater diffusion, a result we observe empirically and establish theoretically. This behaviour is unexpected, as conventional data science favours optimizers that attain sparser solutions, a preference rooted in inherent advantages such as helping to prevent overfitting and reducing the dimensionality of the data. However, we show that in the application of interpolation, diffuse solutions yield beneficial results compared to localized solutions; namely, we experimentally observe the success of diffuse solutions when interpolating a line via a weighted sum of spike-like functions. The thesis concludes with suggestions for possible extensions of this work.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: data science
dc.subject: continuous optimization
dc.subject: adaptive gradient methods
dc.subject: adagrad
dc.subject: implicit bias
dc.subject: underdetermined linear regression
dc.subject: algorithmic behaviour
dc.title: Algorithmic Behaviours of Adagrad in Underdetermined Linear Regression
dc.type: Master Thesis
dc.pending: false
uws-etd.degree.department: Data Science
uws-etd.degree.discipline: Data Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Master of Mathematics
uws-etd.embargo.terms: 0
uws.contributor.advisor: Vavasis, Stephen
uws.contributor.affiliation1: Faculty of Mathematics
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate
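The setting described in the abstract can be illustrated with a minimal sketch, not taken from the thesis itself: diagonal Adagrad and plain gradient descent applied to an underdetermined least-squares problem with more unknowns than equations. The problem instance, step sizes, iteration counts, and the "diffusion" proxy below are all illustrative assumptions.

```python
# Sketch: Adagrad vs. gradient descent on min_x ||Ax - b||^2 with m < n.
# All parameter choices here are illustrative, not those used in the thesis.
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 50                          # more unknowns than equations
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def grad(x):
    return 2 * A.T @ (A @ x - b)      # gradient of ||Ax - b||^2

def run_gd(lr=1e-3, steps=20000):
    x = np.zeros(n)
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def run_adagrad(lr=1e-2, steps=20000, eps=1e-10):
    x, G = np.zeros(n), np.zeros(n)
    for _ in range(steps):
        g = grad(x)
        G += g * g                        # per-coordinate squared-gradient sum
        x -= lr * g / (np.sqrt(G) + eps)  # coordinate-wise adaptive step
    return x

x_gd, x_ada = run_gd(), run_adagrad()
# One crude diffusion proxy: ||x||_inf / ||x||_2 is smaller when the
# solution's mass is spread more uniformly across coordinates.
print("residuals:", np.linalg.norm(A @ x_gd - b), np.linalg.norm(A @ x_ada - b))
print("GD ratio:     ", np.max(np.abs(x_gd)) / np.linalg.norm(x_gd))
print("Adagrad ratio:", np.max(np.abs(x_ada)) / np.linalg.norm(x_ada))
```

Both methods interpolate the data (residuals near zero), but among the infinitely many interpolating solutions each selects a different one; the thesis's claim is about which solution is selected, and the ratio printed above is one simple way to compare how diffuse they are.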


