01.12.2015
A. Borusan

18.01.2016, 4 p.m. c.t., TU Berlin, EN building, seminar room EN 719 (7th floor), Einsteinufer 17, 10587 Berlin: "Enabling distributed data processing in R" (Juan Fumero, University of Edinburgh)

During the past few years, R has become an important language for
data analysis, data representation and visualization. R is a very
expressive language that combines functional and dynamic aspects
with laziness and object-oriented programming. However, the default R
implementation is neither fast nor distributed, both crucial features
for "big data" processing.

In this talk, I will present FastR-Flink, a compiler based on FastR,
Oracle's R implementation, with support for some constructs of Apache
Flink, a Java/Scala framework for distributed data processing. Flink
constructs such as map, reduce and filter are integrated at the
compiler level, so that distributed stream and batch data processing
applications can be executed directly from the R programming
language.

This work was initiated at Oracle Labs during my recent internship
in Linz, Austria.