A Scalable Recoverable Skip List for Persistent Memory on NUMA Machines

Chowdhury, Sakib

dc.contributor.author	Chowdhury, Sakib
dc.date.accessioned	2021-10-20 14:49:11 (GMT)
dc.date.available	2021-10-20 14:49:11 (GMT)
dc.date.issued	2021-10-20
dc.date.submitted	2021-10-15
dc.identifier.uri	http://hdl.handle.net/10012/17657
dc.description.abstract	Interest in recoverable, persistent-memory-resident (PMEM-resident) data structures is growing as availability of Intel Optane Data Center Persistent Memory increases. An interesting use case for in-memory, recoverable data structures is database indexes, which need high availability and reliability. Skip lists are particularly well suited for use as a fully PMEM-resident index, because their probabilistic balancing requires fewer writes than other index data structures such as B-trees. The Untitled Persistent Skip List (UPSkipList) is a PMEM-resident recoverable skip list derived from Herlihy et al.'s lock-free skip list algorithm. It is developed using a new conversion technique that extends the RECIPE algorithm by Lee et al. to work on lock-free algorithms with non-blocking writes and no inherent recovery mechanism. The technique tracks the current time period between two failures, or failure-free epoch, and records the current epoch in nodes when they are being modified. This way, an observing thread can determine whether an inconsistent node is being modified in the current epoch or was being modified in a previous epoch and is now in need of recovery. The algorithm is also extended to support concurrent data node splitting to improve performance, which is easily made recoverable because the extension to RECIPE allows incomplete node splits to be detected. UPSkipList also supports cache-efficient NUMA awareness of dynamically allocated objects using an extension to the Region-ID in Value (RIV) method by Chen et al. By using additional bits after the most significant bits in an RIV pointer to indicate the object relative to which the remaining bits are interpreted, chunks of memory can be dynamically allocated to UPSkipList from multiple shared pools without the need for fat pointers, which reduce cache efficiency by halving the number of pointers that can fit in a cache line.
This combines the benefits of both the RIV method and the dynamic memory allocation method built into the Persistent Memory Development Kit (PMDK), improving both performance and practicality. Additionally, memory manually managed within a chunk using the RIV method can have its recovery after a crash deferred to the next attempted allocation by a thread sharing the ID of the thread responsible for allocating the memory being recovered, reducing recovery time for large pools with many threads active at the time of a crash. UPSkipList was compared against the BzTree of Arulraj et al., as implemented by Lersch et al., which has non-blocking, non-repairing writes implemented using the persistent multi-word CAS (PMwCAS) primitive by Wang et al., and against a transactional recoverable skip list implemented using the PMDK. Tested with the Yahoo Cloud Serving Benchmark (YCSB), UPSkipList achieves better performance than BzTree in write-heavy workloads at high levels of concurrency, and outperforms the PMDK-based skip list because of the latter's higher average latency. Using the extended RIV pointers to dynamically allocate memory resulted in a 40% performance increase over using the PMDK's fat pointers. The impact of NUMA awareness using multiple pools of memory, compared with striping a single pool across multiple nodes, was found to be only a 5.6% decrease in performance. Finally, recovery time of UPSkipList was found to be comparable to that of the PMDK-based skip list, and 9 times faster than that of BzTree with 500K descriptors in its PMwCAS pool. Correctness of UPSkipList and its conversion and recovery techniques was tested using black-box recoverable linearizability analysis, which found UPSkipList to be free of strict linearizability errors across 30 trials.	en
dc.language.iso	en	en
dc.publisher	University of Waterloo	en
dc.subject	concurrency	en
dc.subject	persistent memory	en
dc.subject	skip lists	en
dc.subject	data structures	en
dc.subject	scalability	en
dc.title	A Scalable Recoverable Skip List for Persistent Memory on NUMA Machines	en
dc.type	Master Thesis	en
dc.pending	false
uws-etd.degree.department	Electrical and Computer Engineering	en
uws-etd.degree.discipline	Electrical and Computer Engineering	en
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.degree	Master of Applied Science	en
uws-etd.embargo.terms	0	en
uws.contributor.advisor	Golab, Wojciech
uws.contributor.affiliation1	Faculty of Engineering	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.typeOfResource	Text	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en
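
The extended RIV pointer described in the abstract can be illustrated with a minimal sketch. The bit widths (8 bits for the pool, 16 for the chunk, 40 for the offset), the names `RivPtr`, `make_riv`, and `ChunkTable`, and the volatile lookup table are all assumptions made for illustration; the abstract does not state the actual layout used by UPSkipList. The point shown is only that the pool, the chunk, and the offset within that chunk can share a single 64-bit word, so the pointer stays half the size of a 128-bit fat pointer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical layout, not taken from the thesis:
//   [ 8-bit pool ID | 16-bit chunk ID | 40-bit offset within the chunk ]
constexpr unsigned kPoolBits = 8;
constexpr unsigned kChunkBits = 16;
constexpr unsigned kOffsetBits = 64 - kPoolBits - kChunkBits;

constexpr std::uint64_t kPoolMask = (std::uint64_t{1} << kPoolBits) - 1;
constexpr std::uint64_t kChunkMask = (std::uint64_t{1} << kChunkBits) - 1;
constexpr std::uint64_t kOffsetMask = (std::uint64_t{1} << kOffsetBits) - 1;

// A position-independent pointer that fits in one 64-bit word.
using RivPtr = std::uint64_t;

// Pack the three fields; values wider than their field are truncated.
constexpr RivPtr make_riv(std::uint64_t pool, std::uint64_t chunk,
                          std::uint64_t offset) {
    return ((pool & kPoolMask) << (kChunkBits + kOffsetBits)) |
           ((chunk & kChunkMask) << kOffsetBits) |
           (offset & kOffsetMask);
}

constexpr std::uint64_t pool_of(RivPtr p) {
    return p >> (kChunkBits + kOffsetBits);
}
constexpr std::uint64_t chunk_of(RivPtr p) {
    return (p >> kOffsetBits) & kChunkMask;
}
constexpr std::uint64_t offset_of(RivPtr p) { return p & kOffsetMask; }

// Assumed volatile table, rebuilt after each restart, mapping
// (pool, chunk) to the address where that chunk is currently mapped.
struct ChunkTable {
    std::vector<std::vector<char*>> base;  // base[pool][chunk]

    // Translate a RivPtr into a usable address in this process.
    void* resolve(RivPtr p) const {
        return base[pool_of(p)][chunk_of(p)] + offset_of(p);
    }
};

// Compile-time round-trip check of the packing.
static_assert(pool_of(make_riv(1, 2, 3)) == 1, "pool field round-trips");
static_assert(chunk_of(make_riv(1, 2, 3)) == 2, "chunk field round-trips");
static_assert(offset_of(make_riv(1, 2, 3)) == 3, "offset field round-trips");
static_assert(sizeof(RivPtr) == 8, "eight RivPtr values fit in a 64-byte line");
```

In this sketch, translating a pointer costs two table lookups and an addition. A real implementation would have to decide how the table is populated and kept consistent across NUMA nodes, which is outside what the abstract specifies.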

Files in this item

Name: Chowdhury_Sakib.pdf
Size: 3.388Mb
Format: PDF
