[WS25/26] Scalable Data Management
Lecturer:
Prof. Dr.-Ing. habil. Dirk Habich
Assistants:
Dr.-Ing. Alexander Krause, Dr.-Ing. Johannes Pietrzyk
Contact: via the e-mail section
Time and Place:
Lecture:
Descriptive slide sets for self-study.
The first lecture will be held on Monday, 13.10.2025, 13:00 in HSZ / 0204.
Exercise:
After enrolling in this course, please also enroll in one of the exercise groups.
Exercises begin on 20.10.2025. They are carried out in groups of 4 students; attendance in one of the two slots is expected but not mandatory. Students are expected to study the slide set provided for the current week on their own schedule and to work on the corresponding exercise in the following week.
- Monday, 09:20 - 10:50, APB 007
- Monday, 13:00 - 14:30, HSZ 0204 (might change in the future)
Description
"Data is the new Oil" - with this sentence, the relevance of structured data and thus, implicitly of course the relevance of scalable database systems as a fundamental technique of analytical and transactional processing of usually large data sets becomes visible. In the context of this course, we will discuss concepts and methods that enable distributed data processing with respect to two essential properties: on the one hand, the aspect of "performance" will be addressed and thus, questions of scalability in the case of scale-out architectures will be discussed using systems such as Apache Spark. On the other hand, the aspect of "consistency" will be discussed, where different methods for synchronizing concurrent read and write activities on the same dataset will be presented.
In general, the goal of this course is to give insight into scalable techniques and methods of database technology. The course requires basic knowledge of databases. Attending other advanced courses is not necessary but is helpful for some topics. The course exercises consist of tasks that are integrated into the lecture and practical exercises with "real" systems.
Content
- Basic Database Architectures
- Relational Foundation
- SQL
- Traditional Architecture for Transactional Workloads (OLTP)
- Modern Architecture for Analytical Workloads (OLAP)
- Modern Architecture for Hybrid Workloads (HTAP)
- Parallel and Distributed Databases
- Different Kinds of Parallelism
- Parallel Operators
- Data Partitioning
- Data Replication
- Distributed Transactions
- Cloud Databases
- OLTP in the Cloud
- OLAP in the Cloud
Information
- Please register in OPAL, because all materials and all communication are restricted to registered course members.