NVM storage in MonetDB for tables and graphs

From Master Projects
Jump to: navigation, search

About NVM storage in MonetDB for tables and graphs


MonetDB is a relational columnar and SQL-compatible DBMS, designed towards analytical workloads. In the runtime, data is stored in contiguous memory chunks, like arrays. If the contiguous data should hold an index, or rather, a (semi) sorted sequence of values that support sub-linear lookups by value, the challenge is how to handle updates at sub-linear cost as well. Moreover, we would like this data structure work well on non-volatile memory (NVM), such as 3D-XPoint. A final consideration for the data structure should be its ability to provide isolation during transaction processing, also at sub-linear cost. NVM memory is a bit slower than RAM, but persistent. The difficulty in handling it is in ensuring consistency, because writes may be reordered and in case of a crash there are no guarantees on what changes were flushed back already to RAM, and which not (though there are operations to explicitly enforce this). The data structure could somehow be akin to Packed Memory Arrays (PMAs), a cacheoblivious data structure well-known in theoretical computer science, but increase the locality of writes to fit NVM well.

A stretch-goal of the project is to use the data structure for storing updatable graphs. A graph consists of two tables (V,E), where E is stored in source-vertex order and each V stores the offset its first edge; such that it becomes very cheap to discover all edges starting in one particular vertex. Now, an edge insert may cause all subsequent edge offsets in the V table to change, yet the data structure should support updates at sub-linear cost (and be NVM friendly and support isolation).

You would be advised by Dean de Leo at CWI, and co-advised by Peter Boncz of CWI+VU.