
03.03.2016
K. Forster

03.03.2016: Two Papers Accepted @SIGMOD'16 and @ICWSM'16

We are happy to announce that two papers have been accepted:

The demo paper 'Emma in Action: Declarative Dataflows for Scalable Data Analysis' by Alexander Alexandrov, Asterios Katsifodimos, Georgi Krastev, Andreas Salzmann, and Volker Markl was accepted for demonstration and publication at SIGMOD 2016.

The short paper 'Tracking the Trackers: A Large-Scale Analysis of Embedded Web Trackers' by Sebastian Schelter from DIMA and Jérôme Kunegis from the University of Koblenz-Landau was accepted at the AAAI International Conference on Weblogs and Social Media (ICWSM).


Emma in Action: Declarative Dataflows for Scalable Data Analysis:
Abstract:
Parallel dataflow APIs based on second-order functions were originally seen as a flexible alternative to SQL. Over time, however, their complexity increased due to the number of physical aspects that had to be exposed by the underlying engines in order to facilitate efficient execution. To retain a sufficient level of abstraction and lower the barrier of entry for data scientists, projects like Spark and Flink currently offer domain-specific APIs on top of their parallel collection abstractions.
This demonstration highlights the benefits of an alternative design based on deep language embedding. We showcase Emma - a programming language embedded in Scala. Emma promotes parallel collection processing through Scala's for-comprehensions - a declarative syntax akin to SQL. In addition, Emma also advocates quasi-quoting the entire data analysis algorithm rather than its individual dataflow expressions. This allows for decomposing the quoted code into (sequential) control flow and (parallel) dataflow fragments, optimizing the dataflows in context, and transparently offloading them to an engine like Spark or Flink. The proposed design promises increased programmer productivity due to avoiding an impedance mismatch, thereby reducing the lag times and cost of data analysis.
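
To give a feel for the "declarative syntax akin to SQL" mentioned in the abstract, here is a minimal sketch that uses plain Scala collections only. It does not use the Emma API and is not taken from the paper; the case classes `Page` and `Embed`, the object name, and the sample data are hypothetical. It shows a for-comprehension that reads like a join query, next to the chain of second-order functions (`flatMap`, `withFilter`, `map`) that the Scala compiler derives from it.

```scala
// Plain Scala collections only; this is NOT the Emma API.
// Page, Embed, and the sample data are hypothetical.
final case class Page(url: String, domain: String)
final case class Embed(pageUrl: String, tracker: String)

object ForComprehensionSketch {
  val pages: Seq[Page] = Seq(
    Page("http://a.example/1", "a.example"),
    Page("http://a.example/2", "a.example"),
    Page("http://b.example/1", "b.example")
  )

  val embeds: Seq[Embed] = Seq(
    Embed("http://a.example/1", "tracker-x"),
    Embed("http://a.example/2", "tracker-x"),
    Embed("http://b.example/1", "tracker-y")
  )

  // Reads roughly like:
  //   SELECT p.domain, e.tracker FROM pages p, embeds e WHERE p.url = e.pageUrl
  val joined: Seq[(String, String)] =
    for {
      p <- pages
      e <- embeds
      if p.url == e.pageUrl
    } yield (p.domain, e.tracker)

  // The same query written directly with second-order functions,
  // which is what the for-comprehension above desugars to.
  val desugared: Seq[(String, String)] =
    pages.flatMap { p =>
      embeds
        .withFilter(e => p.url == e.pageUrl)
        .map(e => (p.domain, e.tracker))
    }

  def main(args: Array[String]): Unit = {
    // Both formulations produce the same result.
    assert(joined == desugared)
    println(joined.distinct)
  }
}
```

On local collections the two forms are interchangeable. According to the abstract, Emma's contribution lies in quoting the whole algorithm so that such comprehensions can be optimized in context and offloaded to an engine like Spark or Flink, which this sketch does not attempt.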


Tracking the Trackers: A Large-Scale Analysis of Embedded Web Trackers:
Abstract:
The short paper 'Tracking the Trackers: A Large-Scale Analysis of Embedded Web Trackers' by Sebastian Schelter from DIMA and Jérôme Kunegis from the University of Koblenz-Landau has been accepted for publication at the AAAI International Conference on Weblogs and Social Media (ICWSM) 2016.
The paper uses Apache Flink to analyze a dataset representing the network of online tracking services on the Web, extracted from 3.5 billion web pages of the CommonCrawl corpus.