Pluggable Storage Engines


The storage engine API allows MongoDB to be configured with a choice of storage engines, each optimized for specific workloads. This "pluggable" approach reduces developer and operational complexity compared to running multiple databases. Users can apply the same MongoDB query language, data model, scaling, security, and operational tooling across different applications, each powered by the storage engine that best fits its workload.

MongoDB 3.0 shipped with two supported storage engines:

  • The default MMAPv1 engine, an improved version of the engine used in prior MongoDB releases, enhanced with collection-level concurrency control.
  • The new WiredTiger storage engine. For many applications, WiredTiger's more granular concurrency control and native compression provide significant benefits in lower storage costs, greater hardware utilization, higher throughput, and more predictable performance. Benchmarks show MongoDB 3.0 configured with WiredTiger delivers 7-10x higher performance than MongoDB 2.6 using the original MMAP storage engine. In addition, storage compression rates of up to 80% are fairly common.

Additions in the MongoDB 3.2 timeframe include:

  • An encrypted storage engine that protects data at rest, with keys secured by the industry-standard Key Management Interoperability Protocol (KMIP). At-rest encryption is especially critical for regulated industries such as healthcare, financial services, retail, and certain government agencies. And with so many high-profile breaches of sensitive data over the past five years, more and more data is being encrypted regardless of industry.
  • An in-memory engine designed to serve ultra-high-throughput, low-latency apps typical in finance, ad-tech, gaming, real-time analytics, session management, and general cache use cases.
  • A new option for insert-only workloads (e.g., streaming IoT sensor data, log file analysis, social media feed ingestion), based on the LSM option in the WiredTiger storage engine.

With version 3.0, MongoDB introduced the Pluggable Storage Engine API as one of its major changes. We'll look at WiredTiger, a pluggable storage engine bundled with MongoDB, and compare it with the default storage engine that MongoDB has used up until version 3.0. We'll compare the two engines in terms of speed, disk use, and latency. We'll also introduce several other pluggable storage engines that are expected to become interesting alternatives.

Pluggable Storage Engine API

An application programming interface (API) is a relatively strict set of routines, protocols, and tools for building software applications. For example, MongoDB offers an API that allows other software to interact with it without using the MongoDB shell: each of the MongoDB drivers you've been using relies on that API to provide its functionality. The drivers allow your application to communicate with the MongoDB database and to perform the basic CRUD operations on your documents in the database.

The Pluggable Storage Engine API allows third parties to develop storage engines for MongoDB. Before the Pluggable Storage Engine API, the only storage engine available to MongoDB was MMAPv1.

MongoDB still ships the MMAPv1 storage engine, and it's still the default storage engine in version 3.0 (WiredTiger takes over as the default from version 3.2). The MMAPv1 storage engine is based on memory mapping and has been a stable solution for MongoDB. One drawback of MMAPv1 that you'll notice soon if you have a lot of data to store is that it consumes disk space quickly as your data set grows, because it preallocates space ahead of need. Most database systems preallocate, and MongoDB is no exception. It does this in small, increasing increments at first, but once an increment reaches 2 GB, every further increment preallocates another 2 GB, so as a system administrator you'll have to keep this in mind when managing disk space for your servers.
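To get a feel for how this preallocation adds up, here is a minimal Python sketch of the growth pattern described above. It assumes the first data file is 64 MB and that each following file doubles in size until the 2 GB cap is reached; those two numbers, and the function name `preallocated_file_sizes`, are illustrative assumptions rather than something stated in this text, and the model ignores the namespace file, the journal, and any file allocated ahead of need.

```python
def preallocated_file_sizes(data_mb, first_file_mb=64, cap_mb=2048):
    """Return the sizes (in MB) of the data files needed to hold data_mb.

    Simplified model (assumed, not taken from the server source): files start
    at first_file_mb, double each time, and stop growing at cap_mb.
    """
    sizes = []
    next_size = first_file_mb
    # Keep adding files until the allocated space covers the data.
    while sum(sizes) < data_mb:
        sizes.append(next_size)
        next_size = min(next_size * 2, cap_mb)
    return sizes


if __name__ == "__main__":
    files = preallocated_file_sizes(5000)
    print(files)       # sizes of the individual data files, in MB
    print(sum(files))  # total disk space reserved for 5000 MB of data
```

Under these assumptions, roughly 5 GB of data ends up reserving about 6 GB on disk, because the last 2 GB file is allocated in full even if only part of it is used.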

The database administrator must choose among the available storage engines, and that choice dictates how data is stored on disk. Since version 3.0, it's possible to tell MongoDB to use a separate module for storage, and that's what the Pluggable Storage Engine API enables. It defines the functions that MongoDB calls to store and retrieve data. MongoDB 3.0 comes bundled with one alternative to MMAPv1: WiredTiger.
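The idea of a pluggable storage layer can be illustrated with a small Python sketch. Everything in it is hypothetical: the `StorageEngine` interface, the two toy engines, and the `Database` wrapper are invented for illustration and do not mirror MongoDB's real storage engine API, which lives inside the server itself. The point is only that the database code talks to one fixed set of functions, while the engine behind those functions can be swapped.

```python
import zlib
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical contract every storage engine has to fulfil."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def stored_bytes(self) -> int: ...


class PlainEngine(StorageEngine):
    """Stores values exactly as they arrive."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def stored_bytes(self):
        return sum(len(v) for v in self._data.values())


class CompressedEngine(StorageEngine):
    """Compresses values before storing them, trading CPU for space."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = zlib.compress(value)

    def get(self, key):
        return zlib.decompress(self._data[key])

    def stored_bytes(self):
        return sum(len(v) for v in self._data.values())


# Registry of available engines; the name is picked at startup.
ENGINES = {"plain": PlainEngine, "compressed": CompressedEngine}


class Database:
    """The database layer only knows the StorageEngine interface."""

    def __init__(self, engine_name):
        self._engine = ENGINES[engine_name]()

    def insert(self, key, document: bytes):
        self._engine.put(key, document)

    def find(self, key):
        return self._engine.get(key)

    def stored_bytes(self):
        return self._engine.stored_bytes()


if __name__ == "__main__":
    doc = b"sensor-reading " * 100
    for name in ENGINES:
        db = Database(name)
        db.insert("doc1", doc)
        print(name, db.find("doc1") == doc, db.stored_bytes())
```

Both engines return the same document to the caller, but the space they use differs, which is the kind of trade-off an administrator weighs when picking an engine. In MongoDB itself the engine is selected when the server starts, for example with the `--storageEngine` option of `mongod`.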