Cluster Metadata

Cluster metadata is a Riak subsystem that lets systems built on top of riak_core work with information that is stored cluster-wide and can be read without blocking on network communication.

One notable Riak subsystem that relies on cluster metadata is the bucket types feature. This feature requires that a particular form of key/value pair, namely a bucket type name (the key) and its associated bucket properties (the value), be asynchronously broadcast to all nodes in a Riak cluster.

Though it is different in crucial respects, etcd is a roughly analogous cluster metadata key/value store developed for use in CoreOS clusters.

How Cluster Metadata Works

Cluster metadata is different from other Riak data in two essential respects:

  1. Cluster metadata is intended only for internal Riak applications that require metadata shared on a system-wide basis. Regular stored data, on the other hand, is intended for use by clients outside of Riak.
  2. Because it is intended for use only by applications internal to Riak, cluster metadata can be accessed only internally, via the Erlang interface provided by the riak_core_metadata module; it cannot be accessed externally via HTTP or Protocol Buffers.

The storage system backing cluster metadata is a simple key/value store that asynchronously replicates information to all nodes in a cluster whenever it is stored or modified. Writes require acknowledgment from only a single node (equivalent to w=1 in normal Riak), while reads return values only from the local node (equivalent to r=1). All updates are eventually consistent and are propagated to all nodes, including nodes that join the cluster after an update has already reached every existing member.
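The replication behavior described above can be illustrated with a small toy model. This is a rough Python sketch, not Riak's implementation and not the riak_core_metadata API: the Node and Cluster classes, the pending queue, and the deliver_all step are all hypothetical names invented for illustration, and conflict resolution is left out entirely.

```python
from collections import deque


class Node:
    """One cluster member holding its own local copy of the metadata."""

    def __init__(self, name):
        self.name = name
        self.local = {}

    def get(self, key):
        # Reads never leave the node: only the local copy is consulted.
        return self.local.get(key)


class Cluster:
    """Hypothetical stand-in for the broadcast layer between nodes."""

    def __init__(self):
        self.nodes = []
        self.pending = deque()  # updates accepted but not yet delivered

    def join(self, node):
        # A joining node catches up on whatever existing members already hold.
        for member in self.nodes:
            node.local.update(member.local)
        self.nodes.append(node)

    def put(self, node, key, value):
        # The write is acknowledged once the accepting node has stored it.
        node.local[key] = value
        # Delivery to every other member is queued, not awaited.
        for peer in self.nodes:
            if peer is not node:
                self.pending.append((peer, key, value))

    def deliver_all(self):
        # Simulates the asynchronous broadcast eventually reaching each peer.
        while self.pending:
            peer, key, value = self.pending.popleft()
            peer.local[key] = value


cluster = Cluster()
node_a, node_b = Node("a"), Node("b")
cluster.join(node_a)
cluster.join(node_b)

cluster.put(node_a, "type_users", {"n_val": 3})
seen_before = node_b.get("type_users")  # broadcast not delivered yet
cluster.deliver_all()
seen_after = node_b.get("type_users")   # node_b has now converged
```

In this model a read on node_b can miss a value that node_a has already accepted, which is the price of never blocking on the network; once delivery completes, every node returns the same value, and a node that joins later starts from the state its peers already hold.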

All cluster metadata is eventually stored both in memory and on disk. Reads are served only from memory, while writes are made both to memory and to disk. Logical clocks, namely dotted version vectors, are used in place of vector clocks or timestamps to resolve value conflicts. Values stored as cluster metadata are opaque Erlang terms addressed by both a prefix and a key.
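To make the read and write paths concrete, here is a minimal Python sketch of a single node's store. It is only an illustration under stated assumptions: the LocalMetadataStore class, the pickle file format, the reload-on-startup step, and the "bucket_types" prefix are hypothetical and do not reflect how Riak lays out its data. Dotted version vectors are omitted.

```python
import os
import pickle
import tempfile


class LocalMetadataStore:
    """Toy store: reads come from memory, writes go to memory and disk."""

    def __init__(self, path):
        self.path = path
        self.memory = {}
        # Assumption of this sketch: rebuild the in-memory copy from disk
        # when a file from an earlier run exists.
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.memory = pickle.load(f)

    def put(self, prefix, key, value):
        # Values are addressed by a (prefix, key) pair and stored as-is.
        self.memory[(prefix, key)] = value
        # Persist the whole table; a real store would write incrementally.
        with open(self.path, "wb") as f:
            pickle.dump(self.memory, f)

    def get(self, prefix, key, default=None):
        # The read path touches memory only, never the disk.
        return self.memory.get((prefix, key), default)


path = os.path.join(tempfile.mkdtemp(), "metadata.bin")
store = LocalMetadataStore(path)
store.put("bucket_types", "users", {"n_val": 3})

restarted = LocalMetadataStore(path)  # simulates the node restarting
```

Because get consults only the in-memory dictionary, a read costs no I/O, while the disk copy exists so that a restarted node can recover what it had stored.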

Erlang Code Interface

To use cluster metadata in an internal Riak application, use the Erlang interface defined in the riak_core_metadata module. It supports a variety of cluster metadata operations, including retrieving, modifying, and deleting metadata and iterating through metadata keys.