Martin Pagel

Mon 30.11.2020, 16:00 - 16:45, Online: "State Management for Efficient Complex Event Processing" (Bo Zhao, Humboldt-Universität zu Berlin)

Complex event processing (CEP) systems that evaluate queries over event streams may face unpredictable input rates and query selectivities. During short peak times, exhaustive processing is then no longer reasonable, or even infeasible, and systems must resort to best-effort query evaluation, striving for optimal result quality while staying within a latency bound. Additionally, to determine the events that constitute a query match, their payload data may need to be assessed together with data from remote sources. Such dependencies are problematic, since waiting for remote data to be fetched interrupts the processing of the stream. Yet, without event selection based on remote data, the query state to maintain may grow exponentially. In either case, the performance of the CEP system degrades drastically.
In this talk, I will discuss state-of-the-art solutions and their limitations. We argue that these issues are caused by the exponential growth and dynamic changes of CEP intermediate results (states). To tackle such issues, we propose different optimisations of state management. Specifically, to cope with peak-time overload situations, we complement traditional input-based load shedding with a state-based technique that discards partial matches. To this end, we introduce a hybrid model that combines input-based and state-based shedding to achieve high result quality under constrained resources. For efficient remote data integration, we employ a cost model to determine when to fetch certain remote data elements and how long to keep them in a cache for future use. We combine two strategies: prefetching, which queries remote data based on anticipated use, and lazy evaluation, which postpones the event selection based on remote data without interrupting the stream processing. We show how the cost models of CEP load shedding and efficient data integration can be unified based on states and efficient estimations for online processing. Our experiments indicate that hybrid CEP load shedding improves the result quality by up to 14× for synthetic data and 11.4× for real-world data, compared to baseline approaches; our CEP remote data integration techniques improve the latency of query evaluation by up to 3,728× for synthetic data and 206× for real-world data.
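
To illustrate the idea of combining input-based and state-based shedding, here is a minimal Python sketch. It is not the speaker's implementation: the names `PartialMatch`, `hybrid_step`, `input_threshold` and `state_budget`, the additive utility score, and the rule that every event may extend every partial match are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PartialMatch:
    events: list      # events collected so far for one candidate match
    utility: float    # hypothetical score: expected contribution to result quality


def hybrid_step(partial_matches, event, event_utility, *,
                overloaded, input_threshold, state_budget):
    """Process one event and return the new list of partial matches.

    Input-based shedding: under overload, drop a low-utility event outright.
    State-based shedding: under overload, keep only the `state_budget`
    partial matches with the highest utility.
    """
    # Input-based shedding: the event never reaches the query state.
    if overloaded and event_utility < input_threshold:
        return list(partial_matches)

    # Assumed evaluation rule: the event may extend every existing partial
    # match (the originals are kept) and may also start a new one.
    extended = [
        PartialMatch(pm.events + [event], pm.utility + event_utility)
        for pm in partial_matches
    ]
    state = list(partial_matches) + extended + [PartialMatch([event], event_utility)]

    # State-based shedding: discard the least useful partial matches.
    if overloaded and len(state) > state_budget:
        state = sorted(state, key=lambda pm: pm.utility, reverse=True)[:state_budget]
    return state


state = []
state = hybrid_step(state, "a", 0.9, overloaded=False,
                    input_threshold=0.5, state_budget=2)
state = hybrid_step(state, "b", 0.1, overloaded=True,
                    input_threshold=0.5, state_budget=2)   # event "b" is shed
state = hybrid_step(state, "c", 0.8, overloaded=True,
                    input_threshold=0.5, state_budget=2)   # state is trimmed
```

In this sketch the state roughly doubles with every accepted event, which mirrors the exponential growth of partial matches mentioned above; the state budget caps that growth at the price of possibly losing some complete matches.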

Short bio: Bo Zhao is a PhD student at Humboldt-Universität zu Berlin. His research interests include complex event processing, stream data management systems and code optimisation. He is currently working on optimisations of state management for efficient complex event processing systems, including load shedding and remote data integration. Further details can be found via

Log-in information:
If you are interested in attending the online presentation, please contact!