28.03.2011
K. Dießel

Paper "MapReduce and PACT - Comparing Data Parallel Programming Models" published at BTW 2011

Our paper
"MapReduce and PACT - Comparing Data Parallel Programming Models"
was accepted and presented at the BTW 2011 conference in Kaiserslautern, Germany.

Abstract:
Web-scale analytical processing is a much-investigated topic in current research. Next to parallel databases, new flavors of parallel data processors have recently emerged. One of the most discussed approaches is MapReduce. MapReduce stands out for its programming model: all programs expressed as the second-order functions map and reduce can be automatically parallelized. Although MapReduce provides a valuable abstraction for parallel programming, it clearly has some deficiencies. These become obvious when considering the tricks one has to play to express more complex tasks in MapReduce, such as operations with multiple inputs.
The Nephele/PACT system uses a programming model that pushes the idea of MapReduce further. It is centered around so-called Parallelization Contracts (PACTs), which are in many cases better suited to express complex operations than plain MapReduce. By virtue of that programming model, the system can also apply a series of optimizations to the data flows before they are executed by the Nephele runtime system.
This paper compares the PACT programming model with MapReduce from the perspective of the programmer, who specifies analytical data processing tasks. We discuss the implementations of several typical analytical operations both with MapReduce and with PACTs, highlighting the key differences in using the two programming models.
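
To make the "tricks" mentioned in the abstract concrete, here is a minimal Python sketch of a join expressed in plain MapReduce. It is an illustration only and is not taken from the paper: the `map_reduce` helper is a sequential stand-in for a real framework, and the `users` and `orders` inputs, the "U"/"O" tags, and all function names are assumptions made for this example. Because map and reduce each see a single input, the two data sets are merged into one and every record carries a tag naming its origin.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Sequential stand-in for a MapReduce run: map, group by key, reduce.

    map_fn and reduce_fn are the user-supplied first-order functions;
    map_reduce itself plays the role of the second-order function.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    results = []
    for key in sorted(groups):
        results.extend(reduce_fn(key, groups[key]))
    return results


# Hypothetical inputs: (user_id, name) and (user_id, item).
users = [(1, "alice"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (2, "lamp")]

# The trick: MapReduce has one input, so both data sets are merged
# and each record is tagged with the input it came from.
tagged = [("U", u) for u in users] + [("O", o) for o in orders]


def tag_map(record):
    """Emit the join key, keeping the origin tag with the payload."""
    tag, (user_id, payload) = record
    yield user_id, (tag, payload)


def join_reduce(user_id, values):
    """Separate the two inputs by tag, then pair them up per key."""
    names = [payload for tag, payload in values if tag == "U"]
    items = [payload for tag, payload in values if tag == "O"]
    for name in names:
        for item in items:
            yield (user_id, name, item)


joined = map_reduce(tagged, tag_map, join_reduce)
```

The tagging and untagging code is bookkeeping that has nothing to do with the join logic itself, which is the kind of overhead the abstract refers to for operations with multiple inputs.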

The paper and more details on the PACT programming model and the Stratosphere project can be found at:

http://www.stratosphere.eu