
dc.contributor.author: Hasan, Mohsin
dc.date.accessioned: 2023-08-10 19:37:27 (GMT)
dc.date.available: 2023-08-10 19:37:27 (GMT)
dc.date.issued: 2023-08-10
dc.date.submitted: 2023-08-08
dc.identifier.uri: http://hdl.handle.net/10012/19673
dc.description.abstract: Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client's data is private. This paradigm is useful in settings where different entities own different training points, such as when training on data stored on multiple edge devices. Within this setting, small and noisy datasets are common, which highlights the need for well-calibrated models that can represent the uncertainty in their predictions. Alongside this, two other important goals for a practical FL algorithm are 1) low communication cost, operating over only a few rounds of communication, and 2) good performance when client datasets are distributed differently from each other (i.e., are heterogeneous). Among existing FL techniques, the closest to achieving these goals are Bayesian FL methods, which collect parameter samples from local posteriors and aggregate them to approximate the global posterior. These provide uncertainty estimates, handle data heterogeneity more naturally owing to their Bayesian nature, and can operate in a single round of communication. However, many of these techniques make inaccurate approximations to the high-dimensional posterior over parameters, which in turn negatively affects their uncertainty estimates. A Bayesian technique known as the "Bayesian Committee Machine" (BCM), originally introduced outside the FL context, remedies some of these issues by instead aggregating the Bayesian posteriors in the lower-dimensional predictive space. The BCM, in its original form, is impractical for FL because it requires a large ensemble for inference. We first argue that it is well suited to heterogeneous FL, then propose a modification to the BCM algorithm, involving distillation, to make it practical for FL. We demonstrate that this modified method outperforms other techniques as heterogeneity increases. We then demonstrate theoretical issues with the calibration of the BCM, namely that it is systematically overconfident. We remedy this by proposing β-Predictive Bayes, a Bayesian FL algorithm that performs a modified aggregation of the local predictive posteriors, using a tunable parameter β. β is tuned to improve the global model's calibration before the model is distilled. We empirically evaluate this method on a number of regression and classification datasets, demonstrating that it is generally better calibrated than other baselines over a range of heterogeneous data partitions.
dc.language.iso: en
dc.publisher: University of Waterloo
dc.subject: machine learning
dc.subject: bayesian inference
dc.subject: federated learning
dc.title: Bayesian Federated Learning in Predictive Space
dc.type: Master Thesis
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.degree: Master of Mathematics
uws-etd.embargo.terms: 0
uws.contributor.advisor: Poupart, Pascal
uws.contributor.affiliation1: Faculty of Mathematics
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.typeOfResource: Text
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate
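
The abstract above describes aggregating client models in predictive space rather than parameter space: a BCM-style product of local predictive posteriors, and a β-tempered variant whose β is tuned for calibration before the aggregate is distilled. The Python sketch below is only an illustrative reading of that idea, not the thesis's exact formulation: the function name beta_aggregate and the log-linear interpolation between product (BCM-style) and mixture pooling are assumptions made here for concreteness.

import numpy as np
from scipy.special import logsumexp

def beta_aggregate(local_probs: np.ndarray, beta: float) -> np.ndarray:
    """Combine per-client predictive distributions for one input.

    local_probs: (n_clients, n_classes) array; row i is client i's
        predictive posterior p_i(y | x).
    beta: interpolation weight. beta = 1 gives a product (BCM-style)
        aggregate; beta = 0 gives a uniform mixture.
    NOTE: this interpolation rule is an assumption for illustration;
    the thesis defines the exact aggregation.
    """
    log_p = np.log(np.clip(local_probs, 1e-12, 1.0))
    # Product of local predictives (log-linear pooling, as in the BCM).
    log_prod = log_p.sum(axis=0)
    log_prod -= logsumexp(log_prod)
    # Mixture of local predictives (linear pooling).
    log_mix = logsumexp(log_p, axis=0) - np.log(log_p.shape[0])
    # Interpolate in log space, then renormalize.
    log_agg = beta * log_prod + (1.0 - beta) * log_mix
    return np.exp(log_agg - logsumexp(log_agg))

# Example: three clients, four classes; sweep beta on held-out data and
# keep the value with the lowest negative log-likelihood (a simple
# stand-in for the calibration-driven tuning described in the abstract).
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=3)  # stand-in client predictives
y_true = 2                                 # stand-in validation label
best = min((-np.log(beta_aggregate(probs, b)[y_true]), b)
           for b in np.linspace(0.0, 1.0, 11))
print(f"selected beta = {best[1]:.1f}")

In the full method described in the abstract, the tuned aggregate serves as a teacher that is distilled into a single student model, so inference does not require querying the whole ensemble of client models.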

