Scalable Informative Rule Mining
Abstract
In this thesis we present SIRUM: a system for Scalable Informative RUle Mining from multi-dimensional data. Informative rules have recently been studied in several contexts, including data summarization, data cube exploration and data quality. The objective is to produce a concise set of rules (patterns) over the values of the dimension attributes that provide the most information about the distribution of a numeric measure attribute. SIRUM optimizes this task for big, wide and distributed datasets. We implemented SIRUM in Spark and observed significant performance improvements on real data due to our optimizations.
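To make the notion of a rule concrete, the following is a minimal sketch, assuming a rule is a tuple of dimension values in which any position may be a wildcard that matches every value. The representation, the function names, and the toy data are illustrative assumptions and are not taken from the thesis. The sketch only shows how a single rule summarizes the measure attribute over the rows it covers; it does not reproduce how SIRUM selects informative rules or its Spark optimizations.

```python
from statistics import mean

WILDCARD = "*"  # assumed notation for "any value" in a rule position


def matches(rule, row):
    """A rule matches a row when every non-wildcard position agrees."""
    return all(r == WILDCARD or r == v for r, v in zip(rule, row))


def summarize(rule, rows, measures):
    """Return (count, average measure) over the rows the rule covers."""
    covered = [m for row, m in zip(rows, measures) if matches(rule, row)]
    return len(covered), (mean(covered) if covered else None)


# Hypothetical toy data: dimensions are (day, origin), measure is a delay.
rows = [("Fri", "SFO"), ("Fri", "LHR"), ("Sat", "SFO"), ("Sun", "LHR")]
delays = [20.0, 30.0, 5.0, 5.0]

overall = summarize((WILDCARD, WILDCARD), rows, delays)  # covers every row
friday = summarize(("Fri", WILDCARD), rows, delays)      # covers Friday rows
print(overall, friday)
```

In this toy data the rule `("Fri", "*")` covers two rows whose average delay differs noticeably from the overall average, which is the sense in which a rule can carry information about the distribution of the measure.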
Cite this version of the work
Guoyao Feng (2016). Scalable Informative Rule Mining. UWSpace. http://hdl.handle.net/10012/10620