In-Memory Data Management
Prof. Hasso Plattner
This video belongs to the openHPI course In-Memory Data Management.

Query Scheduling

Time effort: approx. 9 minutes

About this video


The last part of the query processing topic deals with query scheduling, that is, determining the execution order of queries and operators.

To make this lecture video easier to follow, we explain some specific vocabulary used in this part:

Workers execute the tasks. Depending on the database’s architecture, a worker is an operating system process or thread. A process is a program in execution. It has an address space (for the data it operates on), kernel resources (for example, to access files), and at least one thread that executes code.

Creating processes and threads incurs overhead that does not contribute to query processing. Instead of spawning a new worker for every task, databases usually use a fixed-size pool of workers and assign tasks to them.

Workers run on CPU cores. On NUMA systems, workers should primarily execute near the data they operate on. Therefore, we can create a separate worker pool per socket and bind the workers to that socket. To feed the socket-bound workers, the database has one or more socket-local task queues. Tasks are placed in these queues so that workers primarily access socket-local data.

In real-world applications, workloads are often highly skewed, so some task queues can run empty while others are still full. A worker whose task queue is empty can steal tasks from other sockets’ task queues. How much work stealing is allowed depends on the distance between sockets, the CPU load, the saturation of the interconnect, and other factors.
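To make the vocabulary above more concrete, the following Python sketch models socket-local task queues, a fixed pool of socket-bound workers, and work stealing. It is a single-threaded simulation, not how any particular database implements scheduling. The names `SocketScheduler`, `run`, and `max_steal_distance` are hypothetical, and the ring-shaped socket distance is an assumption made only for illustration.

```python
from collections import deque


class SocketScheduler:
    """Toy model: one task queue per socket, with optional work stealing."""

    def __init__(self, num_sockets, max_steal_distance=1):
        self.queues = [deque() for _ in range(num_sockets)]
        # Hypothetical policy knob: how many socket hops away a worker may steal.
        self.max_steal_distance = max_steal_distance

    def submit(self, task, data_socket):
        # Enqueue the task on the socket that holds the data it operates on.
        self.queues[data_socket].append(task)

    def next_task(self, worker_socket):
        """Return (task, source_socket), or None if no task may be taken."""
        local = self.queues[worker_socket]
        if local:
            return local.popleft(), worker_socket
        # Local queue is empty: look at other sockets, nearest first.
        # Assumption: sockets form a ring, so distance is the hop count.
        n = len(self.queues)
        for distance in range(1, self.max_steal_distance + 1):
            for victim in ((worker_socket + distance) % n,
                           (worker_socket - distance) % n):
                if self.queues[victim]:
                    # Steal from the tail, away from where local workers take.
                    return self.queues[victim].pop(), victim
        return None


def run(scheduler, workers_per_socket):
    """Simulate a fixed pool of socket-bound workers in round-robin steps."""
    num_sockets = len(scheduler.queues)
    # Fixed-size pool: workers are created once and reused for every task.
    workers = [(s, w) for s in range(num_sockets)
               for w in range(workers_per_socket)]
    log = []
    progress = True
    while progress:
        progress = False
        for socket, worker_id in workers:
            picked = scheduler.next_task(socket)
            if picked is not None:
                task, source = picked
                stolen = source != socket
                log.append((socket, worker_id, task(), stolen))
                progress = True
    return log


# Skewed workload: all four tasks operate on data that lives on socket 0.
sched = SocketScheduler(num_sockets=2)
for i in range(4):
    sched.submit(lambda i=i: f"scan-{i}", data_socket=0)

log = run(sched, workers_per_socket=1)
for socket, worker_id, result, stolen in log:
    print(f"socket {socket} worker {worker_id}: {result} (stolen={stolen})")
```

In this run, the worker on socket 1 has an empty local queue and therefore steals two of the four tasks from socket 0. Setting `max_steal_distance=0` disables stealing, so the socket 0 worker executes everything itself, which keeps all data accesses local at the cost of leaving socket 1 idle.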

If further questions remain, we are happy to address them in the forum!