Mining Butterflies in Streaming Graphs
Abstract
This thesis introduces two main-memory systems sGrapp and sGradd for performing the fundamental analytic tasks of biclique counting and concept drift detection over a streaming graph. A data-driven heuristic is used to architect the systems. To this end, initially, the growth patterns of bipartite streaming graphs are mined and the emergence principles of streaming motifs are discovered. Next, the discovered principles are (a) explained by a graph generator called sGrow; and (b) utilized to establish the requirements for efficient, effective, explainable, and interpretable management and processing of streams. sGrow is used to benchmark stream analytics, particularly in the case of concept drift detection.
sGrow displays robust realization of streaming growth patterns independent of initial conditions, scale and temporal characteristics, and model configurations. Extensive evaluations confirm the simultaneous effectiveness and efficiency of sGrapp and sGradd. sGrapp achieves mean absolute percentage error up to 0.05/0.14 for the cumulative butterfly count in streaming graphs with uniform/non-uniform temporal distribution and a processing throughput of 1.5 million data records per second. The throughput and estimation error of sGrapp are 160x higher and 0.02x lower than baselines. sGradd demonstrates an improving performance over time, achieves zero false detection rates when there is not any drift and when drift is already detected, and detects sequential drifts in zero to a few seconds after their occurrence regardless of drift intervals.
Collections
Cite this version of the work
Aida Sheshbolouki
(2023).
Mining Butterflies in Streaming Graphs. UWSpace.
http://hdl.handle.net/10012/19606
Other formats
Related items
Showing items related by title, author, creator and subject.
-
Simultaneous Graph Representation Problems
Jampani, Krishnam Raju (University of Waterloo, 2011-02-18)Many graphs arising in practice can be represented in a concise and intuitive way that conveys their structure. For example: A planar graph can be represented in the plane with points for vertices and non-crossing curves ... -
Software Architecture Recovery based on Pattern Matching
Sartipi, Kamran (University of Waterloo, 2003)Pattern matching approaches in reverse engineering aim to incorporate domain knowledge and system documentation in the software architecture extraction process, hence provide a user/tool collaborative environment for ... -
Uniqueness and Complexity in Generalised Colouring
Farrugia, Alastair (University of Waterloo, 2003)The study and recognition of graph families (or graph properties) is an essential part of combinatorics. Graph colouring is another fundamental concept of graph theory that can be looked at, in large part, as the recognition ...