29.09.2018
K. Forster

"Muses: Distributed Data Migration System for Polystores" paper accepted at ICDE 2019

'''Muses: Distributed Data Migration System for Polystores''', Abdulrahman Kaitoua, Tilmann Rabl, Asterios Katsifodimos, Volker Markl, ICDE 2019, Macau, China.

Abstract
As datasets become increasingly abundant over heterogeneous sources and the requirement to fuse them is pressing, distinct datasets can no longer be viewed as isolated components: They present inter-dependencies despite their contrast in size, data-type, origin, schema, etc. Combining data sources increases the utility of existing datasets, generating new information and creating services of higher quality. A central issue in polystores is data migration: In order to share and process data in different engines, costly and complex movements and transformations between computing engines, services, and stores are necessary.
Distributed data migration is very challenging as it involves: i) different types/shapes of the data that each execution engine assumes to receive as input; ii) runtime- and topology-specific configuration. Each distributed big data store showcases different deployment configurations over large-scale platforms relative to the underlying platform and cost/performance tradeoffs. Optimal migration configurations can be vastly different for every producer and sink engine pair.

Muses is natively a distributed, high-performance data migration engine that is able to interconnect distributed data stores by forwarding, transforming, repartitioning, or broadcasting data between distributed engines' instances in a resource-, cost-, and performance-adaptive manner. As such, it performs seamless information sharing over all participating resources in a standard, modular manner.
We show a 30% overall pipeline performance improvement, even when the overhead of Muses is counted in the execution time. This performance gain implies that Muses can be used to optimise large pipelines that use multiple engines.

Link to the paper (crv)