Introduction
This article shows how to implement a persistent memory (PMEM)-aware queue using a linked list and the C++ bindings of the Persistent Memory Development Kit (PMDK) library libpmemobj.
A queue is a first in first out (FIFO) data structure that supports push and pop operations. In a push operation, a new element is added to the tail of the queue. In a pop operation, the element at the head of the queue gets removed.
A PMEM-aware queue differs from a normal queue in that its data structures reside permanently in persistent memory, and a program or machine crash could result in an incomplete queue entry and a corrupted queue. To avoid this, queue operations must be made transactional. This is not simple to do, but PMDK provides support for this and other operations specific to persistent memory programming.
We'll walk through a code sample that describes the core concepts and design considerations for creating a PMEM-aware queue using libpmemobj. You can build and run the code sample by following the instructions provided later in the article.
For background on persistent memory and the PMDK, read the article Introduction to Programming with Persistent Memory from Intel and watch the Persistent Memory Programming Video Series.
C++ Support in libpmemobj
The main features of the C++ bindings for libpmemobj include:
- Transactions
- Wrappers for basic types, which automatically snapshot the data during a transaction
- Persistent pointers
Transactions
Transactions are at the core of libpmemobj operations. This is because, in terms of persistence, the current x86-64 CPUs guarantee atomicity only for 8-byte stores. Real-world apps update in larger chunks. Take, for example, strings; it rarely makes sense to change only eight adjacent bytes from one consistent string state to another. To enable atomic updates to persistent memory in larger chunks, libpmemobj implements transactions.
Libpmemobj uses undo log-based transactions rather than redo log-based ones for visibility reasons: changes made by the user are visible immediately. This allows for a more natural code structure and execution flow, which in turn improves code maintainability. It also means that if a transaction is interrupted partway through, all of the changes it made to the persistent state are rolled back.
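The undo-log idea can be illustrated without persistent memory at all. The sketch below is a volatile, single-threaded model and is not libpmemobj code: UndoLog, snapshot, and run_tx are hypothetical names chosen for illustration, and an exception stands in for an interruption. It only shows why saving the old value before each write is enough to roll a half-finished update back.

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical, volatile model of an undo log. A real implementation keeps
// the log in persistent memory so that rollback also works after a crash.
class UndoLog {
public:
    // Record the current value of *addr before the caller overwrites it.
    void snapshot(std::uint64_t *addr) { entries_.emplace_back(addr, *addr); }

    // Restore the saved values, newest first.
    void rollback() {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it)
            *it->first = it->second;
        entries_.clear();
    }

    // Commit: the new values stay in place, so the saved ones are dropped.
    void commit() { entries_.clear(); }

private:
    std::vector<std::pair<std::uint64_t *, std::uint64_t>> entries_;
};

// Run body as one transaction: commit on normal return, roll back if it
// throws. Returns true when the transaction committed.
inline bool run_tx(const std::function<void(UndoLog &)> &body) {
    UndoLog log;
    try {
        body(log);
        log.commit();
        return true;
    } catch (const std::exception &) {
        log.rollback();
        return false;
    }
}
```

Because the writes go straight to the target variables, code inside the transaction always reads its own latest values, which is the visibility property described above.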
Transactions have ACID (atomicity, consistency, isolation, and durability)-like properties. Here's how these properties relate to programming with the PMDK:
Atomicity: Transactions are atomic with respect to persistence; either all the changes made within a transaction are committed when the transaction completes successfully, or none of them are.
Consistency: The PMDK provides functionality to enable the user to maintain data consistency.
Isolation: The PMDK library provides persistent memory-resident synchronization mechanisms to enable the developer to maintain isolation.
Durability: All of a transaction's locks are held until the transaction completes to ensure durability.
Transactions are tracked on a per-thread basis, so a call that queries transaction status returns the status of the last transaction performed by the calling thread. Transactions are power-fail safe but not thread-safe.
The p<> property
In a transaction, undo logs are used to snapshot user data. The p<> template wrapper class is the basic building block for automating this snapshotting, so app developers don't need to do it manually (as is the case with the C implementation of libpmemobj). This wrapper class supports only basic types. Its implementation is based on the assignment operator: each time a variable of this wrapper class is assigned a new value, the old value of the variable is snapshotted. Use of the p<> property for stack variables is discouraged because snapshotting is a computationally intensive operation.
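To make the snapshot-on-assignment mechanism concrete, here is a minimal volatile sketch. It is not the libpmemobj class: SnapshotLog and snapshotting are hypothetical names, and the log is passed in explicitly here, whereas the wrapper described above works with the transaction that is currently active.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a transaction's undo log (volatile only).
struct SnapshotLog {
    struct Entry {
        void *addr;
        std::vector<unsigned char> old_bytes;
    };
    std::vector<Entry> entries;

    // Save size bytes starting at addr.
    void add(void *addr, std::size_t size) {
        const auto *bytes = static_cast<const unsigned char *>(addr);
        entries.push_back(
            Entry{addr, std::vector<unsigned char>(bytes, bytes + size)});
    }

    // Copy the saved bytes back, newest entry first.
    void rollback() {
        for (auto it = entries.rbegin(); it != entries.rend(); ++it)
            std::memcpy(it->addr, it->old_bytes.data(), it->old_bytes.size());
        entries.clear();
    }
};

// Wrapper for basic types: every assignment snapshots the old value first.
template <typename T>
class snapshotting {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch supports basic types only");

public:
    snapshotting(SnapshotLog &log, T initial) : log_(log), value_(initial) {}

    snapshotting &operator=(const T &v) {
        log_.add(&value_, sizeof(T)); // save the old value
        value_ = v;                   // then store the new one
        return *this;
    }

    // Reads need no snapshot.
    operator T() const { return value_; }

private:
    SnapshotLog &log_;
    T value_;
};
```

Each assignment costs a copy into the log, which is why wrapping short-lived stack variables this way is wasteful.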
Persistent pointers
Libraries in PMDK are built on the concept of memory-mapped files. Since a file can be mapped at different addresses in the process virtual address space, traditional pointers that store absolute addresses cannot be used. Instead, PMDK introduces a new pointer type that has two fields: the ID of the pool (used to look up the pool's current virtual address in a translation table), and an offset from the beginning of the pool. Persistent pointers are a C++ wrapper around this basic C type. Their philosophy is similar to that of std::shared_ptr.
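The two-field pointer can be sketched in a few lines. The names below (PoolTable, FatPtr, direct) are hypothetical and the translation table is an ordinary in-process map; the sketch only shows how a pool ID plus an offset resolves to a usable address, and why the same stored pointer stays valid when the pool is mapped somewhere else.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical translation table: pool ID -> address at which that pool is
// currently mapped in this process.
using PoolTable = std::unordered_map<std::uint64_t, unsigned char *>;

// Position-independent reference: which pool, and how far into it.
struct FatPtr {
    std::uint64_t pool_id;
    std::uint64_t offset; // 0 is treated as null in this sketch
};

// Resolve to an ordinary pointer using the current mapping. Returns nullptr
// for a null reference or for a pool that is not mapped.
template <typename T>
T *direct(const PoolTable &table, FatPtr p) {
    auto it = table.find(p.pool_id);
    if (p.offset == 0 || it == table.end())
        return nullptr;
    return reinterpret_cast<T *>(it->second + p.offset);
}
```

Only the table entry changes when a pool is remapped; every FatPtr stored inside the pool keeps its meaning.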
libpmemobj Core Concepts
Root object
Making any code PMEM-aware using libpmemobj always involves, as a first step, designing the types of data objects that will be persisted. The first type that needs to be defined is that of the root object. This object is mandatory and used to anchor all the other objects created in the persistent memory pool (think of a pool as a file inside a PMEM device).
Pool
A pool is a contiguous region of PMEM identified by a user-supplied identifier called layout. Multiple pools can be created with different layout strings.
Queue Implementation using C++ Bindings
The queue in this example is implemented as a singly linked list with a head and a tail, and it demonstrates how to use the C++ bindings of libpmemobj.
Design Decisions
Data structures
The first thing we need is a data structure that describes a node in the queue. Each entry has a value and a link to the next node. As per the figure below, both variables are persistent memory-aware.
Figure 1. Data structure describing the queue implementation.
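The layout described above can be sketched as follows. In the actual program the member types would be the p<> wrapper and the persistent pointer from the C++ bindings; to keep this block compilable without PMDK installed, it declares trivial volatile stand-ins with the same names inside a hypothetical namespace sketch. The field names value, next, head, and tail are taken from the push and pop code later in the article; everything else is an assumption.

```cpp
#include <cstdint>

namespace sketch {

// Volatile stand-ins so the block compiles without PMDK. They provide none
// of the snapshotting or position independence of the real types.
template <typename T>
using p = T;
template <typename T>
using persistent_ptr = T *;

// One node of the singly linked list: a value and a link to the next node.
struct pmem_entry {
    persistent_ptr<pmem_entry> next = nullptr;
    p<std::uint64_t> value = 0;
};

// The queue itself. Used as the root object, it anchors the whole list.
struct pmem_queue {
    persistent_ptr<pmem_entry> head = nullptr;
    persistent_ptr<pmem_entry> tail = nullptr;
};

} // namespace sketch
```

An empty queue is represented by head and tail both being null, which is the condition the push code checks.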
Code walkthrough
Now, let's go a little deeper into the main function of the program. When running the code you provide up to three arguments. The first is the absolute location of the pool file, the second is the queue operation to perform, and the third, needed only for push, is the value to insert. The supported queue operations are push (insert element), pop (return and remove element), and show (print the queue contents without removing anything).
if (argc < 3) {
    std::cerr << "usage: " << argv[0]
              << " file-name [push [value]|pop|show]" << std::endl;
    return 1;
}
In the snippet below, we check whether the pool file exists. If it does, the pool is opened; if it doesn't, the pool is created. The layout string identifies the pool that we requested to open. Here we are opening the pool with the layout name Queue, as defined by the macro LAYOUT in the program.
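const char *path = argv[1];
queue_op op = parse_queue_op(argv[2]);
pool<examples::pmem_queue> pop;

// A nonzero result from file_exists() means the file is missing
if (file_exists(path) != 0) {
    // No pool file yet: create one with the minimum pool size
    pop = pool<examples::pmem_queue>::create(
        path, LAYOUT, PMEMOBJ_MIN_POOL, CREATE_MODE_RW);
} else {
    // Pool file already present: open it and verify the layout
    pop = pool<examples::pmem_queue>::open(path, LAYOUT);
}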
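The snippet relies on two helpers, file_exists and parse_queue_op, that are not listed in this article. A plausible shape for them is sketched below, assuming a POSIX system and assuming that file_exists returns 0 when the file is present (which is what the != 0 test implies). The enumerator UNKNOWN_QUEUE_OP is a hypothetical name; QUEUE_PUSH, QUEUE_POP, and QUEUE_SHOW come from the switch statement in the walkthrough.

```cpp
#include <cstring>
#include <unistd.h>

enum queue_op {
    UNKNOWN_QUEUE_OP, // hypothetical name for "not recognized"
    QUEUE_PUSH,
    QUEUE_POP,
    QUEUE_SHOW
};

// Returns 0 if the file exists and nonzero otherwise, mirroring access().
inline int file_exists(const char *path) {
    return access(path, F_OK);
}

// Map the command-line word to an operation code.
inline queue_op parse_queue_op(const char *str) {
    if (std::strcmp(str, "push") == 0)
        return QUEUE_PUSH;
    if (std::strcmp(str, "pop") == 0)
        return QUEUE_POP;
    if (std::strcmp(str, "show") == 0)
        return QUEUE_SHOW;
    return UNKNOWN_QUEUE_OP;
}
```

Returning a distinct code for unrecognized input lets the caller reject it in one place, as the default branch of the switch does.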
pop is the handle to the pool, from which we can obtain a pointer to the root object, an instance of examples::pmem_queue. The create function creates a new pmemobj pool whose root object is of type examples::pmem_queue. The root object is like the root of a file system, since it can be used to reach all of the other objects in the pool (as long as these objects are linked properly and no pointers are lost due to coding errors).
auto q = pop.get_root();
Once you get the pointer to the queue object, the program checks the second argument to identify which action the queue should perform: push, pop, or show.
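switch (op) {
    case QUEUE_PUSH:
        // push needs a value; argv[3] exists only when argc >= 4
        if (argc < 4)
            throw std::invalid_argument("push requires a value");
        q->push(pop, atoll(argv[3]));
        break;
    case QUEUE_POP:
        std::cout << q->pop(pop) << std::endl;
        break;
    case QUEUE_SHOW:
        q->show();
        break;
    default:
        throw std::invalid_argument("invalid queue operation");
}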
Queue operations
Push
Let's look at how the push function is implemented to make it persistent memory-aware. As shown in the code below, the transactional code is written as a lambda function passed to transaction::exec_tx (this makes the code easy to read and follow). If a power failure happens during the transaction, the data structure does not get corrupted because all changes are rolled back. For more information on how transactions are implemented in C++, read C++ bindings for libpmemobj (part 6) - transactions on pmem.io.
Allocation functions are transactional as well, and they use transaction logic to enable allocation/delete rollback of the persistent state; make_persistent() is the constructor, while delete_persistent() is the destructor.
Calling make_persistent() inside a transaction allocates an object and returns a persistent object pointer. As the allocation is now part of the transaction, if it aborts, the allocation is rolled back, reverting the memory allocation back to its original state.
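After the allocation, the value field of the new entry n is set to the value being pushed, and its next pointer is set to null. The entry is then linked in at the tail, or becomes both head and tail if the queue was empty.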
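void push(pool_base &pop, uint64_t value)
{
    transaction::exec_tx(pop, [&] {
        // allocate the new entry as part of the transaction
        auto n = make_persistent<pmem_entry>();
        n->value = value;
        n->next = nullptr;

        if (head == nullptr && tail == nullptr) {
            // empty queue: the new entry is both head and tail
            head = tail = n;
        } else {
            // append after the current tail
            tail->next = n;
            tail = n;
        }
    });
}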
Figure 2. Data structure for push functionality.
Pop
The pop function, shown below, is structured like push. Here we need a temporary variable to hold a pointer to the next pmem_entry in the queue, so that the head of the queue can be set to that entry after the current head is deleted with delete_persistent(). Because all of this happens inside a transaction, the removal is atomic with respect to persistence: after a failure, either the old head is still in place or the pop has fully completed.
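uint64_t pop(pool_base &pop)
{
    uint64_t ret = 0;
    transaction::exec_tx(pop, [&] {
        // nothing to remove: abort the transaction
        if (head == nullptr)
            transaction::abort(EINVAL);

        ret = head->value;
        auto n = head->next; // remember the successor before freeing head

        delete_persistent<pmem_entry>(head);
        head = n;

        // the queue became empty, so clear tail as well
        if (head == nullptr)
            tail = nullptr;
    });
    return ret;
}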
Figure 3. Data structure for pop functionality.
Build Instructions
Instructions to run the code sample
Download the source code from the PMDK GitHub* repository:
- git clone https://github.com/pmem/pmdk.git
Figure 4. Download source code from the GitHub* repository.
- cd pmdk and run make on the command line as shown below. This builds the complete source code tree.
Figure 5. Building the source code.
- cd pmdk/src/examples/libpmemobj++/queue
- View command line options for the queue program:
./queue
- Push command:
./queue TESTFILE push 8
Figure 6. PUSH command using command line.
- Pop command:
./queue TESTFILE pop
- Show command:
./queue TESTFILE show
Figure 7. POP command using command line.
Summary
In this article, we showed a simple implementation of a PMEM-aware queue using the C++ bindings of the PMDK library libpmemobj. To learn more about persistent memory programming with PMDK, visit the Intel® Developer Zone (Intel® DZ) Persistent Memory Programming site. There you will find articles, videos, and links to other important resources for PMEM developers.
About the Author
Praveen Kundurthy is a Developer Evangelist with over 14 years of experience in application development, optimization and porting to Intel platforms. Over the past few years at Intel, he has worked on topics spanning Storage technologies, Gaming, Virtual reality and Android on Intel platforms.