Talk title: Database Systems and Information Management – Trends and a Vision
Abstract: The global database research community has greatly impacted the functionality and performance of data storage and processing systems along the dimensions that define “big data”, i.e., volume, velocity, variety, and veracity. Locally, over the past five years, we have also been working on varying fronts. Among our contributions are: (1) establishing a vision for a database-inspired big data analytics system, which unifies the best of database and distributed systems technologies and augments it with concepts drawn from compilers (e.g., iterations) and data stream processing, and (2) forming a community of researchers and institutions to create the Stratosphere platform to realize our vision. One major result of these activities was Apache Flink, an open-source big data analytics platform, and its thriving global community of developers and production users. Although much progress has been made, when looking at the overall big data stack, a major challenge for the database research community remains: how to maintain ease of use despite the increasing heterogeneity and complexity of data analytics, which involves specialized engines for various aspects of an end-to-end data analytics pipeline (including, among others, graph-based, linear algebra-based, and relational-based algorithms) and the underlying, increasingly heterogeneous hardware and computing infrastructure. At TU Berlin, DFKI, and the Berlin Institute for the Foundations of Learning and Data (BIFOLD), we currently aim to advance research in this field via the NebulaStream and Agora projects. Our goal is to remedy some of the heterogeneity challenges that hamper developer productivity and limit the use of data science technologies to just the privileged few, who are coveted experts. In this talk, we will outline how state-of-the-art stream processing engines (SPEs) have to change to exploit the new capabilities of the IoT and showcase how we tackle IoT challenges in our own system, NebulaStream.
We will also present our vision for Agora, an asset ecosystem that provides the technical infrastructure for offering and using data and algorithms, as well as physical infrastructure components.
Volker Markl is a Full Professor and Chair of the Database Systems and Information Management (DIMA) Group at the Technische Universität Berlin (TU Berlin). At the German Research Center for Artificial Intelligence (DFKI), he is Chief Scientist and Head of the Intelligent Analytics for Massive Data Research Group. In addition, he is Director of the Berlin Institute for the Foundations of Learning and Data (BIFOLD), a merger of the Berlin Big Data Center (BBDC) and the Berlin Center for Machine Learning (BZML). BIFOLD is one of Germany's national Competence Centers for Artificial Intelligence and will further bolster ongoing collaborative research in scalable data management and machine learning. Dr. Markl is a database systems researcher working at the intersection of distributed systems, scalable data processing, text mining, computer networks, machine learning, and applications in healthcare, logistics, Industry 4.0, and information marketplaces. Earlier in his career, he was a Research Staff Member and Project Leader at the IBM Almaden Research Center in San Jose, California, USA, and a Research Group Leader at FORWISS, the Bavarian Research Center for Knowledge-based Systems located in Munich, Germany. Volker Markl is a computer science graduate of Technische Universität München, where he earned his Diploma in 1995 with a thesis on exception handling in programming languages. He earned his PhD in 1999 in the area of multidimensional indexing under the supervision of Rudolf Bayer.
Volker Markl has published numerous scholarly papers on indexing, query optimization, lightweight information integration, and scalable data processing at prestigious venues. He holds 18 patents, has transferred technology into several commercial products, and has been involved in two successful startup exits. He has been both the Speaker and Principal Investigator for the Stratosphere Project, which resulted in a Humboldt Innovation Award as well as Apache Flink, the open-source big data analytics system. He currently serves as the President of the VLDB Endowment and was elected as one of Germany's leading Digital Minds (Digitale Köpfe) by the German Informatics Society (GI). Volker is also a member of the Scientific Advisory Board of Software AG. Most recently, Volker and his team earned the ACM SIGMOD 2020 Best Paper Award for their work on “Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects.”