K. Forster

The Paper "Computation Offloading in JVM-based Dataflow Engines" by DIMA Master's Student Haralampos Gavriilidis and His Mentors Was Accepted for Publication at BTW 2019

Every two years, the symposium on Database Systems for Business, Technology, and Web (BTW, from its German name) offers students the opportunity to present their thesis results to professionals in the field of database and information systems. We are happy to announce that the thesis results of DIMA Master's student Haralampos Gavriilidis were accepted for publication at BTW 2019. We congratulate all three, Haralampos and his advisors Andreas Kunft and Dr. Sebastian Breß, on their success.
This year, the event will be held at the University of Rostock. For more information about the symposium, visit: BTW 2019

Computation Offloading in JVM-based Dataflow Engines, Haralampos Gavriilidis, 2018. To be published in the LNI volume for BTW 2019.

Abstract:
State-of-the-art dataflow engines, such as Apache Spark and Apache Flink, scale out on large clusters for a variety of data-processing tasks, including machine learning and data mining algorithms. However, being based on the JVM, they are unable to apply optimizations supported by modern CPUs. In contrast, specialized data processing frameworks scale up by exploiting modern CPU characteristics. The goal of this thesis is to find the sweet spot between scale-out and scale-up systems by offloading computation from dataflow engines to specialized systems. We propose two computation offloading methods, reason about their applicability, and implement a prototype based on Apache Spark. Our evaluation shows that for compute-intensive tasks, computation offloading leads to performance improvements of up to a factor of 2.5x. For certain UDF scenarios, computation offloading performs worse by up to a factor of 3x: our microbenchmarks show that 80% of the time is spent on serialization operations. By employing data exchange without serialization, computation offloading achieves performance improvements of up to 10x.