Show simple item record

dc.contributor.authorHan, Minyang
dc.date.accessioned2015-07-24 18:22:29 (GMT)
dc.date.issued2015-07-24
dc.date.submitted2015
dc.identifier.urihttp://hdl.handle.net/10012/9484
dc.description.abstractThe considerable interest in distributed systems that can execute algorithms to process large graphs has led to the creation of many graph processing systems. However, existing systems suffer from two major issues: (1) poor performance due to frequent global synchronization barriers and limited scalability; and (2) lack of support for graph algorithms that require serializability, the guarantee that parallel executions of an algorithm produce the same results as some serial execution of that algorithm. Many graph processing systems use the bulk synchronous parallel (BSP) model, which allows graph algorithms to be easily implemented and reasoned about. However, BSP suffers from poor performance due to stale messages and frequent global synchronization barriers. While asynchronous models have been proposed to alleviate these overheads, existing systems that implement such models have limited scalability or retain frequent global barriers and do not always support graph mutations or algorithms with multiple computation phases. We propose barrierless asynchronous parallel (BAP), a new computation model that overcomes the limitations of existing asynchronous models by reducing both message staleness and global synchronization while retaining support for graph mutations and algorithms with multiple computation phases. We present GiraphUC, which implements our BAP model in the open source distributed graph processing system Giraph, and evaluate it at scale to demonstrate that BAP provides efficient and transparent asynchronous execution of algorithms that are programmed synchronously. Secondly, very few systems provide serializability, despite the fact that many graph algorithms require it for accuracy, correctness, or termination. To address this deficiency, we provide a complete solution that can be implemented on top of existing graph processing systems to provide serializability. Our solution formalizes the notion of serializability and the conditions under which it can be provided for graph processing systems. We propose a partition-based synchronization technique that enforces these conditions efficiently to provide serializability. We implement this technique into Giraph and GiraphUC to demonstrate that it is configurable, transparent to algorithm developers, and more performant than existing techniques.en
dc.language.isoenen
dc.publisherUniversity of Waterloo
dc.subjectgraph processingen
dc.subjectBSPen
dc.subjectBAPen
dc.subjectGiraphUCen
dc.subjectserializabilityen
dc.subjectpartition-based distributed lockingen
dc.subjectPregel-likeen
dc.titleOn Improving Distributed Pregel-like Graph Processing Systemsen
dc.typeMaster Thesisen
dc.pendingfalse
dc.subject.programComputer Scienceen
dc.description.embargoterms4 monthsen
dc.date.embargountil2015-11-21T18:22:29Z
uws-etd.degree.departmentComputer Science (David R. Cheriton School of)en
uws-etd.degreeMaster of Mathematicsen
uws.typeOfResourceTexten
uws.peerReviewStatusUnrevieweden
uws.scholarLevelGraduateen


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record


UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages