Vision Paper: "Agora: Bringing Together Datasets, Algorithms, Models and More in a Unified Ecosystem" Accepted for Publication in the ACM SIGMOD Record

"Agora: Bringing Together Datasets, Algorithms, Models and More in a Unified Ecosystem (Vision)", Jonas Traub, Zoi Kaoudi, Jorge-Arnulfo Quiané-Ruiz, Volker Markl. To be published in SIGMOD Record, 2021.

Data science and artificial intelligence are driven by a plethora of diverse data-related assets, including datasets, data streams, algorithms, processing software, compute resources, and domain knowledge. As providing all these assets requires a huge investment, data science and artificial intelligence technologies are currently dominated by a small number of providers who can afford these investments. This leads to lock-in effects and hinders features that require a flexible exchange of assets among users. In this paper, we introduce Agora, our vision towards a unified ecosystem that brings together data, algorithms, models, and computational resources and provides them to a broad audience. Agora (i) treats assets as first-class citizens and leverages a fine-grained exchange of assets, (ii) allows for combining assets to novel applications, and (iii) flexibly executes such applications on available resources. As a result, it enables easy creation and composition of data science pipelines as well as their scalable execution. In contrast to existing data management systems, Agora operates in a heavily decentralized and dynamic environment: Data, algorithms, and even compute resources are dynamically created, modified, and removed by different stakeholders. Agora presents novel research directions for the data management community as a whole: It requires to combine our traditional expertise in scalable data processing and management with infrastructure provisioning as well as economic and application aspects of data, algorithms, and infrastructure.

A preprint version of the vision paper is available here.

The SIGMOD Record is a quarterly publication of the Special Interest Group on Management of Data (SIGMOD) of the Association for Computing Machinery (ACM). SIGMOD is dedicated to the study, development, and application of database and information technology. Toward this goal, the SIGMOD Record publishes articles, reports, and interviews that cover the most recent developments in the SIGMOD community. For more information, visit their website at