
L. Friedel

05.07.2021, 4.40 pm via Zoom Link: "Lakehouse: A New Architecture for Data Warehousing" (Matei Zaharia, Stanford University)

DBMS research colloquium: "Lakehouse: A New Architecture for Data Warehousing" by Matei Zaharia, Stanford University

Link via Zoom

Monday, 5 July 2021, 16:30 – 17:15

The top two challenges for data warehouse users today are data quality and staleness. While building reliable data pipelines is inherently hard, a lot of today’s problems stem from the complex data architectures that organizations deploy. These architectures contain many systems—data lakes, message queues, and warehouses—that data must pass through, where each transfer step adds delays and a potential source of errors. What if we could remove all these steps? In recent years, cloud storage and new open source systems have enabled a new architecture: the lakehouse, an ACID transactional layer over cloud storage that can provide streaming, management features, indexing, and high SQL performance similar to a data warehouse. In addition, because they build on open storage formats and direct file access, lakehouses support AI and data science workloads that are difficult to run on data warehouses. Thousands of organizations including the largest Internet companies are now using the lakehouse model to replace separate data lake, warehouse and streaming systems. I’ll discuss the key trends and research challenges in this area based on my experience with Databricks' customers and the open source Delta Lake project.

Matei Zaharia is an Assistant Professor of Computer Science at Stanford University and Chief Technologist at Databricks. He started the Apache Spark project during his PhD at UC Berkeley, and has worked on other widely used open source data analytics and AI software including MLflow and Delta Lake. At Stanford, he is a co-PI of the DAWN lab focusing on infrastructure for machine learning. Matei’s research work was recognized through the 2014 ACM Doctoral Dissertation Award, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE), the highest honor bestowed by the US government on early-career scientists and engineers.

Join Zoom meeting

Meeting ID: 641 4211 4998
Passcode: 944783
One-tap mobile
+496971049922,,64142114998#,,,,,,0#,,944783# Germany
+493056795800,,64142114998#,,,,,,0#,,944783# Germany

Dial by your location
+49 69 7104 9922 Germany
+49 30 5679 5800 Germany
+49 695 050 2596 Germany
Meeting ID: 641 4211 4998
Passcode: 944783
Find your local number:

Join by SIP

Join by H.323 (Amsterdam, Netherlands) (Germany)
Passcode: 944783
Meeting ID: 641 4211 4998