Adapting to Data Drift in Encrypted Traffic Classification Using Deep Learning

Malekghaini, Navid

dc.contributor.author	Malekghaini, Navid
dc.date.accessioned	2023-01-12 21:20:04 (GMT)
dc.date.available	2023-09-30 04:50:05 (GMT)
dc.date.issued	2023-01-12
dc.date.submitted	2023-01-09
dc.identifier.uri	http://hdl.handle.net/10012/19058
dc.description.abstract	Deep learning models have shown to achieve high performance in encrypted traffic classification. However, when it comes to production use, multiple factors challenge the performance of these models. The emergence of new protocols, especially at the application-layer, as well as updates to previous protocols affect the patterns in input data, making the model's previously learn patterns obsolete. Furthermore, proposed model architectures are usually tested on datasets collected in controlled settings, which makes the reported performances unreliable for production use. In this thesis, we start by studying how the performances of two high-performing state-of-the-art encrypted traffic classifiers change on multiple real-world datasets collected over the course of two years from a major ISP's network, Orange telecom. We investigate the changes in traffic data patterns highlighting the extent to which these changes, a.k.a. data drift, impact the performance of the two models in service-level and application-level classification. We propose best practices to manually adapt model architectures and improve their accuracy in the face of data drift. We show that our best practices are generalizable to other encryption protocols and different levels of labeling granularity. However, designing efficient model architectures and manual architectural adaptations is time-consuming and requires domain expertise. Neural architecture search (NAS) algorithms have been shown to automatically discover efficient models in other domains, such as image recognition and natural language processing. However, NAS's application is rather unexplored in Encrypted Traffic Classification. We propose AutoML4ETC, a tool to automatically design efficient and high-performing neural architectures for Encrypted Traffic Classification, given a target dataset and corresponding features. We define three powerful search spaces tailored specifically for the prominent categories of features in the Encrypted Traffic Classification state-of-the-art, i.e., packet raw bytes, flow time-series, and flow statistics. We show that a simple search strategy over AutoML4ETC’s search spaces can generate model architectures that outperform the state-of-the-art Encrypted Traffic Classification models on several benchmark datasets, including real-world datasets of TLS and QUIC traffic collected from a major ISP network. In addition to being more accurate, the AutoML4ETC’s architectures are significantly more efficient and lighter in terms of the number of parameters. We further showcase the potential of AutoML4ETC by experimenting with state-of-the-art NAS techniques and model ensembles generated from different search spaces. We also use AutoML4ETC to analyze the state of adoption of the QUIC protocol.	en
dc.language.iso	en	en
dc.publisher	University of Waterloo	en
dc.subject	Deep Learning	en
dc.subject	Data Drift	en
dc.subject	Encrypted Traffic Classification	en
dc.subject	HTTP/2	en
dc.subject	QUIC	en
dc.subject	Neural Architecture Search	en
dc.title	Adapting to Data Drift in Encrypted Traffic Classification Using Deep Learning	en
dc.type	Master Thesis	en
dc.pending	false
uws-etd.degree.department	David R. Cheriton School of Computer Science	en
uws-etd.degree.discipline	Computer Science	en
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.degree	Master of Mathematics	en
uws-etd.embargo.terms	8 months	en
uws.contributor.advisor	Boutaba, Raouf
uws.contributor.affiliation1	Faculty of Mathematics	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.typeOfResource	Text	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en

Files in this item

Name:: Malekghaini_Navid.pdf
Size:: 2.084Mb
Format:: PDF

View/ Open

This item appears in the following Collection(s)

Show simple item record