A. Borusan

13.12.2012, 4:00 p.m. TU Berlin, EN building, seminar room EN 719 (7th floor), Einsteinufer 17, 10587 Berlin: "Splash: Managing Big Data for Composite Simulation Modeling" (Peter Haas, IBM Almaden Research Center)

The database community has raised the art of scalable DESCRIPTIVE analytics to a very high level. What enterprises really need, however, is PRESCRIPTIVE analytics to identify robust, high-quality investment, planning, and policy decisions in the face of uncertainty. Such analytics, in turn, require deep PREDICTIVE analytics that go beyond mere statistical forecasting and are imbued with an understanding of the fundamental mechanisms that govern a system’s behavior, allowing what-if analyses. To help meet this need, IBM's Splash research prototype provides a platform for composing simulation models and datasets for cross-disciplinary modeling, simulation, and optimization in complex systems of systems such as those affecting population health and safety. Splash, which rests on a combination of data-integration, workflow-management, simulation, and optimization technologies, loosely couples models via data exchange, unlike prior composite-simulation approaches. We outline the key components of Splash, and then focus on some novel MapReduce algorithms for transforming the outputs of one or more "upstream" models to create the inputs needed by a "downstream" model. In particular, we describe in detail Splash's time-alignment component, which detects and corrects for mismatches in time granularity between model outputs and inputs. We conclude by discussing some open research topics: Splash's model-and-data orientation requires significant extensions of many database technologies, such as data integration, query optimization and processing, and collaborative analytics.
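
To give a feel for what correcting a time-granularity mismatch can involve, here is a minimal single-machine sketch in Python. It is not Splash's algorithm (the talk describes MapReduce-based methods) and assumes plain linear interpolation is an acceptable alignment. The function name `align_time_series` and the sample data are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def align_time_series(source, target_times):
    """Linearly interpolate (time, value) pairs onto target_times.

    Illustrative assumption: `source` is a list of (t, v) pairs sorted by
    strictly increasing t, and every target time lies inside its range.
    """
    times = [t for t, _ in source]
    aligned = []
    for t in target_times:
        if t < times[0] or t > times[-1]:
            raise ValueError(f"target time {t} lies outside the source range")
        # Index of the last source point whose time is <= t.
        i = bisect_right(times, t) - 1
        if times[i] == t:
            # Exact match: reuse the upstream value unchanged.
            aligned.append((t, source[i][1]))
            continue
        (t0, v0), (t1, v1) = source[i], source[i + 1]
        weight = (t - t0) / (t1 - t0)
        aligned.append((t, v0 + weight * (v1 - v0)))
    return aligned


# Hypothetical upstream model output every 2 time units.
upstream = [(0, 10.0), (2, 20.0), (4, 40.0)]
# Hypothetical downstream model expecting an input every 1 time unit.
downstream_input = align_time_series(upstream, [0, 1, 2, 3, 4])
print(downstream_input)
```

Going the other way, from a fine-grained upstream model to a coarse-grained downstream one, would call for aggregation (for example, averaging or summing within each target interval) rather than interpolation.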