20.12.2018
K. Forster

The paper "Scalable Frequent Sequence Mining With Flexible Subsequence Constraints" was accepted for publication at ICDE 2019.

Scalable Frequent Sequence Mining With Flexible Subsequence Constraints, Alexander Renz-Wieland, Matthias Bertsch, Rainer Gemulla. To appear in ICDE 2019.

Abstract
We study scalable algorithms for frequent sequence mining under flexible subsequence constraints. Such constraints enable applications to specify concisely which patterns are of interest and which are not. We focus on the bulk synchronous parallel model with one round of communication; this model is suitable for platforms such as MapReduce or Spark. We derive a general framework for frequent sequence mining under this model and propose the D-SEQ and D-CAND algorithms within this framework. The algorithms differ in what data is communicated and how computation is split up among workers. To the best of our knowledge, D-SEQ and D-CAND are the first scalable algorithms for frequent sequence mining with flexible constraints. We conducted an experimental study on multiple real-world datasets that suggests that our algorithms scale nearly linearly, outperform common baselines, and offer acceptable generalization overhead over existing, less general mining algorithms.
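
To make the problem setting concrete, here is a minimal single-machine sketch of frequent sequence mining with one map step and one reduce step. It is a naive illustration only and is neither D-SEQ nor D-CAND. The function names (`candidates`, `mine`) and the constraint parameters (`max_len`, `max_gap`) are assumptions chosen for this example; the paper's constraint language is more general.

```python
from collections import defaultdict
from itertools import combinations


def candidates(seq, max_len=2, max_gap=1):
    """Map step (hypothetical helper): return the distinct subsequences of
    `seq` that have at most `max_len` items and skip at most `max_gap`
    items between consecutive picks."""
    found = set()
    for k in range(1, max_len + 1):
        for idx in combinations(range(len(seq)), k):
            # Check the gap constraint between neighbouring positions.
            if all(b - a - 1 <= max_gap for a, b in zip(idx, idx[1:])):
                found.add(tuple(seq[i] for i in idx))
    return found


def mine(database, min_support, **constraint):
    """Shuffle + reduce step: group by candidate, count how many input
    sequences support it, and keep the frequent ones."""
    support = defaultdict(int)
    for seq in database:
        # Each input sequence contributes at most once per candidate.
        for cand in candidates(seq, **constraint):
            support[cand] += 1
    return {c: n for c, n in support.items() if n >= min_support}


# Toy database, assumed for illustration.
db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"]]
result = mine(db, min_support=2, max_len=2, max_gap=0)
```

In this sketch every candidate subsequence is communicated from the map step to the reduce step, which becomes expensive as sequences grow. The abstract notes that D-SEQ and D-CAND differ precisely in what data is communicated and how computation is split among workers.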

A preprint version is available here.