One of Riak's central goals is high availability. It was built as a multi-node system in which any node is capable of receiving requests without requiring that each node participate in each request. In a system like this, it's important to be able to keep track of which version of a value is the most current. This is where vector clocks come in.
Vector Clocks and Relationships Between Objects
All Riak objects are stored in a location defined by the object's bucket and key, as well as by the bucket type defining the bucket's properties. It is possible to configure Riak to ensure that only one copy of an object ever exists in a specific location. This will ensure that at most one object is returned when a read is performed on a bucket type/bucket/key location (and no objects if Riak finds nothing at that location).
If Riak is configured this way, Riak may still make use of vector clocks behind the scenes to make intelligent decisions about which replica of an object should be deemed the most recent, but in that case vector clocks will be a non-issue for clients connecting to Riak.
It is also possible to configure Riak to store multiple objects in a single key, i.e. for an object to have different values on different nodes. Objects stored this way are called siblings. You can instruct Riak to allow for sibling creation by setting the allow_mult bucket property to true for a specific bucket, preferably using bucket types. Note that in Riak versions 2.0 and later, allow_mult is set to true by default for any bucket types that you create, unless otherwise specified.
This is where vector clocks come in. Vector clocks are metadata attached to all Riak objects that enable Riak to determine the causal relationship between two objects. Vector clocks are not human-readable.
A number of important aspects of the relationship between object replicas can be determined using vector clocks:
- Whether one object is a direct descendant of the other
- Whether the objects are direct descendants of a common parent
- Whether the objects are unrelated in recent heritage
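The comparison behind these relationships can be illustrated with a minimal sketch. It assumes a simplified, hypothetical representation of a vector clock as a mapping from actor ID to update counter. This is not Riak's internal encoding or any client API, and the names `descends` and `relationship` are made up for illustration. In this simplified model, two clocks where neither descends from the other are reported as concurrent, which is the case that produces siblings.

```python
from typing import Dict

# Hypothetical representation: actor id -> number of updates seen from that actor.
VClock = Dict[str, int]


def descends(a: VClock, b: VClock) -> bool:
    """Return True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def relationship(a: VClock, b: VClock) -> str:
    """Classify how two clocks relate to each other."""
    a_covers_b = descends(a, b)
    b_covers_a = descends(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "a descends from b"
    if b_covers_a:
        return "b descends from a"
    # Neither clock has seen all of the other's updates: the writes conflict.
    return "concurrent"
```

For example, `relationship({"x": 2, "y": 1}, {"x": 1, "y": 2})` is classified as concurrent, because each clock records an update the other has not seen.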
From the standpoint of application development, the difficulty with siblings is that they by definition conflict with one another. When an application attempts to read an object that has siblings, multiple replicas will be stored in the location where the application is looking. This means that the application will need to develop a strategy for conflict resolution.
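As one illustration of what such a strategy might look like, the sketch below merges sibling values by set union, which can suit data such as a set of tags or cart items. It is an assumption-laden example: the function name `resolve_by_union` is hypothetical, sibling values are modeled as plain Python sets, and no Riak client calls are shown.

```python
from typing import Iterable, Set


def resolve_by_union(siblings: Iterable[Set[str]]) -> Set[str]:
    """Merge all sibling values into a single set containing every element."""
    merged: Set[str] = set()
    for value in siblings:
        merged |= value
    return merged
```

A union keeps every element that appears in any sibling, so an element deliberately removed in one sibling can reappear after the merge. Whether that trade-off is acceptable depends on the application.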