Internal Architecture of a Modern NoSQL Datastore


Users of Big Data-oriented datastores want to process millions of operations per second in a latency-sensitive manner. To achieve this kind of performance, the kernel can be a friend with whom we ally, or a foe whom we bypass.

In this talk we will present the internal architecture of the Scylla datastore--a C++ from-scratch reimplementation of the popular Apache Cassandra with a performance baseline not previously seen in the NoSQL world.

Some of the components that make Scylla unique include a userspace CPU scheduler, a userspace disk I/O scheduler, a tailored memory allocator and even changes to the standard C++ exception handling. In this talk I will detail those components: their reason for being and the advantages that we draw from having them.

I will discuss the areas in which Linux helps us the most and the areas in which we are better off rewriting things in userspace. Attendees will learn which Linux interfaces modern datastores employ, why they are chosen, and which trade-offs usually come with them.


Ballroom C
Sunday, March 11, 2018 - 11:30 to 12:30