2012 ResilientDistributedDatasetsAFa
- (Zaharia et al., 2012) ⇒ Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, and Ion Stoica. (2012). “Resilient Distributed Datasets: A Fault-tolerant Abstraction for in-memory Cluster Computing.” In: Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation.
Subject Headings: Resilient Distributed Dataset Structure.
Notes
Cited By
- http://scholar.google.com/scholar?q=%222012%22+Resilient+Distributed+Datasets%3A+A+Fault-tolerant+Abstraction+for+in-memory+Cluster+Computing
- http://dl.acm.org/citation.cfm?id=2228298.2228301&preflayout=flat#citedby
Quotes
Abstract
We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner. RDDs are motivated by two types of applications that current computing frameworks handle inefficiently: iterative algorithms and interactive data mining tools. In both cases, keeping data in memory can improve performance by an order of magnitude. To achieve fault tolerance efficiently, RDDs provide a restricted form of shared memory, based on coarse-grained transformations rather than fine-grained updates to shared state. However, we show that RDDs are expressive enough to capture a wide class of computations, including recent specialized programming models for iterative jobs, such as Pregel, and new applications that these models do not capture. We have implemented RDDs in a system called Spark, which we evaluate through a variety of user applications and benchmarks.
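The abstract's central idea, fault tolerance through coarse-grained transformations rather than fine-grained updates, can be illustrated with a small single-process sketch. This is not Spark's API: the `ToyRDD` class and its methods `from_partitions`, `compute`, and `lose_partition` are hypothetical names introduced here for illustration only. The sketch assumes each dataset remembers its parent and the function that derived it, so a lost in-memory partition can be rebuilt by replaying that function instead of restoring a replica.

```python
class ToyRDD:
    """Toy, single-process stand-in for an RDD (hypothetical; not Spark's API)."""

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source   # base partitions, standing in for stable storage
        self._parent = parent   # lineage: the dataset this one was derived from
        self._fn = fn           # coarse-grained function applied to a whole partition
        self._cache = {}        # in-memory partitions, which may be lost

    @classmethod
    def from_partitions(cls, partitions):
        return cls(source=[list(p) for p in partitions])

    def num_partitions(self):
        if self._parent is None:
            return len(self._source)
        return self._parent.num_partitions()

    def map(self, f):
        # A transformation only records lineage; nothing is computed yet.
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def compute(self, i):
        # Serve partition i from memory, or rebuild it from lineage.
        if i in self._cache:
            return self._cache[i]
        if self._parent is None:
            data = list(self._source[i])
        else:
            data = self._fn(self._parent.compute(i))
        self._cache[i] = data
        return data

    def lose_partition(self, i):
        # Simulate a node failure that drops one in-memory partition.
        self._cache.pop(i, None)

    def collect(self):
        return [x for i in range(self.num_partitions()) for x in self.compute(i)]


base = ToyRDD.from_partitions([[1, 2, 3], [4, 5, 6]])
even_squares = base.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

before = even_squares.collect()
even_squares.lose_partition(1)    # drop a cached partition
after = even_squares.collect()    # partition 1 is recomputed from lineage
```

Because every element of a partition is produced by the same recorded function, only the lost partition has to be recomputed, and no per-element update log is needed. A real system would also distribute partitions across machines and handle transformations that depend on several parent partitions, which this sketch leaves out.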
References
| | Author | title | year |
|---|---|---|---|
| 2012 ResilientDistributedDatasetsAFa | Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, Ion Stoica | Resilient Distributed Datasets: A Fault-tolerant Abstraction for in-memory Cluster Computing | 2012 |