Show simple item record

dc.contributor.author: Rambidis, Andrew
dc.date.accessioned: 2023-08-24 13:23:14 (GMT)
dc.date.available: 2023-08-24 13:23:14 (GMT)
dc.date.issued: 2023-08-24
dc.date.submitted: 2023-08-14
dc.identifier.uri: http://hdl.handle.net/10012/19752
dc.description.abstract: With the prevalence of over-parameterized models in deep learning, the choice of optimizer used in training plays a significant role in a model's ability to generalize, owing to solution selection bias. We consider the popular adaptive gradient method Adagrad and study its convergence and algorithmic biases in the underdetermined linear regression regime. First, we prove that Adagrad converges in this problem regime. We then find empirically that, with sufficiently small step sizes, Adagrad promotes diffuse solutions, in the sense of uniformity among the coordinates of the solution. Moreover, compared with gradient descent under the same conditions, Adagrad's solution exhibits greater diffusion, a result we observe empirically and establish theoretically. This behaviour is unexpected, as conventional data science favours optimizers that attain sparser solutions, a preference rooted in inherent advantages such as helping to prevent overfitting and reducing the dimensionality of the data. However, we show that in the application of interpolation, diffuse solutions yield beneficial results compared to localized solutions; namely, we experimentally observe the success of diffuse solutions when interpolating a line via a weighted sum of spike-like functions. The thesis concludes with suggestions for possible extensions of this work.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: data science
dc.subject: continuous optimization
dc.subject: adaptive gradient methods
dc.subject: adagrad
dc.subject: implicit bias
dc.subject: underdetermined linear regression
dc.subject: algorithmic behaviour
dc.title: Algorithmic Behaviours of Adagrad in Underdetermined Linear Regression
dc.type: Master Thesis
dc.pending: false
uws-etd.degree.department: Data Science
uws-etd.degree.discipline: Data Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Master of Mathematics
uws-etd.embargo.terms: 0
uws.contributor.advisor: Vavasis, Stephen
uws.contributor.affiliation1: Faculty of Mathematics
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate
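The setting described in the abstract can be illustrated with a minimal sketch, not taken from the thesis itself: diagonal Adagrad and plain gradient descent applied to an underdetermined least-squares problem with more unknowns than equations. The problem instance, step sizes, iteration counts, and the "diffusion" proxy below are all illustrative assumptions.

```python
# Sketch: Adagrad vs. gradient descent on min_x ||Ax - b||^2 with m < n.
# All parameter choices here are illustrative, not those used in the thesis.
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 50                          # more unknowns than equations
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def grad(x):
    return 2 * A.T @ (A @ x - b)      # gradient of ||Ax - b||^2

def run_gd(lr=1e-3, steps=20000):
    x = np.zeros(n)
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def run_adagrad(lr=1e-2, steps=20000, eps=1e-10):
    x, G = np.zeros(n), np.zeros(n)
    for _ in range(steps):
        g = grad(x)
        G += g * g                        # per-coordinate squared-gradient sum
        x -= lr * g / (np.sqrt(G) + eps)  # coordinate-wise adaptive step
    return x

x_gd, x_ada = run_gd(), run_adagrad()
# One crude diffusion proxy: ||x||_inf / ||x||_2 is smaller when the
# solution's mass is spread more uniformly across coordinates.
print("residuals:", np.linalg.norm(A @ x_gd - b), np.linalg.norm(A @ x_ada - b))
print("GD ratio:     ", np.max(np.abs(x_gd)) / np.linalg.norm(x_gd))
print("Adagrad ratio:", np.max(np.abs(x_ada)) / np.linalg.norm(x_ada))
```

Both methods interpolate the data (residuals near zero), but among the infinitely many interpolating solutions each selects a different one; the thesis's claim is about which solution is selected, and the ratio printed above is one simple way to compare how diffuse they are.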


