Configuring Riak CS

For Riak CS to operate properly, it must know how to connect to Riak. A Riak CS node typically runs on the same server as its corresponding Riak node, which means that changes are only necessary if Riak is configured with non-default settings.

Riak CS's settings reside in each CS node's app.config file, which is typically located in the /etc directory. Configurable parameters related to Riak CS specifically can be found in the riak_cs section of that file. That section looks something like this:

{riak_cs, [
    {parameter1, value},
    {parameter2, value},
    %% and so on...
]},

The sections below walk you through some of the main configuration categories that you will likely encounter while operating Riak CS. For a comprehensive listing of available parameters, see the Full Configuration Listing section below.

Host and Port

To connect Riak CS to Riak, make sure that the following parameters are set to the host and port used by Riak:
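
As a minimal sketch, assuming the 1.x-style parameter names riak_ip and riak_pb_port (the names differ in later releases, so check them against your own app.config), the Riak connection settings might look like this. The address and port shown are illustrative values for a Riak node on the same host:

```erlang
{riak_cs, [
    %% Other configs

    %% Assumed parameter names: the IP address of the Riak node and
    %% the port of its Protocol Buffers interface
    {riak_ip, "127.0.0.1"},
    {riak_pb_port, 8087}
]}
```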

You will also need to set the host and port for Riak CS:
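
A comparable sketch for the Riak CS listener, again assuming the 1.x-style parameter names cs_ip and cs_port. The values are illustrative and should be replaced with the address and port on which clients reach this Riak CS node:

```erlang
{riak_cs, [
    %% Other configs

    %% Assumed parameter names: the address and port on which
    %% Riak CS accepts client requests
    {cs_ip, "127.0.0.1"},
    {cs_port, 8080}
]}
```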

Note on IP addresses

The IP address you enter here must match the IP address specified for the Protocol Buffers interface in the Riak app.config file unless Riak CS is running on a completely different network, in which case address translation is required.

After making any changes to the app.config file in Riak CS, restart the node if it is already running.

Specifying the Stanchion Node

If you're running a single Riak CS node, you don't have to change the Stanchion settings because Stanchion runs on the local host. If your Riak CS system has multiple nodes, however, you must specify the IP address and port for the Stanchion node and whether or not SSL is enabled.

The Stanchion settings reside in the Riak CS app.config file, which is located in the /etc directory of each Riak CS node. The settings appear in the riak_cs config section of the file.

To set the host and port for Stanchion, do the following:

Enabling SSL

SSL is disabled by default in Stanchion, i.e. the stanchion_ssl variable is set to false. If Stanchion is configured to use SSL, change this variable to true. The following example configuration would set the Stanchion host to localhost, the port to 8085 (the default), and set up Stanchion to use SSL:

{riak_cs, [
    %% Other configs

    {stanchion_ip, "127.0.0.1"},
    {stanchion_port, 8085},
    {stanchion_ssl, true},

    %% Other configs
]}

Specifying the Node Name

You can also set a more useful name for the Riak CS node, which helps identify the node from which requests originate during troubleshooting. This setting resides in the Riak CS vm.args configuration file, which is also located in the /etc directory. The following line sets the name of the Riak CS node to riak_cs@127.0.0.1:

-name riak_cs@127.0.0.1

Change 127.0.0.1 to the IP address or hostname for the server on which Riak CS is running.

Specifying the Admin User

The admin user is authorized to perform actions such as creating users or obtaining billing statistics. An admin user account is no different from any other user account. You must create an admin user to use Riak CS.

Note on anonymous user creation

Before creating an admin user, you must first set {anonymous_user_creation, true} in the Riak CS app.config. You may disable this again once the admin user has been created.
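
As a sketch, the setting named in this note could be placed as follows. Its position in the riak_cs section is an assumption, based on where the other Riak CS parameters in this document live:

```erlang
{riak_cs, [
    %% Other configs

    %% Temporarily allow user creation without credentials so that
    %% the admin user can be created; set back to false afterwards
    {anonymous_user_creation, true}
]}
```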

To create an account for the admin user, use an HTTP POST request with the username you want to use for the admin account. The following is an example:

curl -H 'Content-Type: application/json' \
  -XPOST http://<host>:<port>/riak-cs/user \
  --data '{"email":"foobar@example.com", "name":"admin_user"}'

The JSON response will look something like this:

{
  "Email": "foobar@example.com",
  "DisplayName": "adminuser",
  "KeyId": "324ABC0713CD0B420EFC086821BFAE7ED81442C",
  "KeySecret": "5BE84D7EEA1AEEAACF070A1982DDA74DA0AA5DA7",
  "Name": "admin_user",
  "Id":"8d6f05190095117120d4449484f5d87691aa03801cc4914411ab432e6ee0fd6b",
  "Buckets": []
}

You can optionally send and receive XML if you set the Content-Type to application/xml.

Once the admin user exists, you must specify its credentials on each node in the Riak CS system. The admin user credential settings reside in the Riak CS app.config file, which is located in the etc/riak-cs directory, and appear in the riak_cs section of that file. Paste the key ID string (KeyId in the response above) between the quotes for admin_key, and paste the key secret string (KeySecret) between the quotes for admin_secret, as shown here:

%% Admin user credentials
{admin_key, "324ABC0713CD0B420EFC086821BFAE7ED81442C"},
{admin_secret, "5BE84D7EEA1AEEAACF070A1982DDA74DA0AA5DA7"},

These are the same credentials that you received in the JSON response to the POST request that created the user.

Bucket Restrictions

If you wish, you can limit the number of buckets created per user. The default maximum is 100. Please note that if a user exceeds the bucket creation limit, they are still able to perform other actions, including bucket deletion. You can change the default limit using the max_buckets_per_user parameter in each node's app.config file. The example configuration below would set the maximum to 1000:

{riak_cs, [
    %% Other configs

    {max_buckets_per_user, 1000},

    %% Other configs
]}

If you want to avoid setting a limit on per-user bucket creation, you can set max_buckets_per_user to unlimited.
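
A short sketch of the unlimited setting, following the same layout as the example above:

```erlang
{riak_cs, [
    %% Other configs

    %% Remove the per-user cap on bucket creation
    {max_buckets_per_user, unlimited}
]}
```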

Connection Pools

Riak CS uses two distinct connection pools for communication with Riak: a primary and a secondary pool.

The primary connection pool is used to service the majority of API requests related to the upload or retrieval of objects. It is identified in the configuration file as request_pool. The default size of this pool is 128.

The secondary connection pool is used strictly for requests to list the contents of buckets. This separate connection pool is maintained in order to improve performance. It is identified in the configuration file as bucket_list_pool. The default size of this pool is 5.

The following shows the connection_pools default configuration entry that can be found in the app.config file:

{riak_cs, [
    %% Other configs

    {connection_pools,
    [
     {request_pool, {128, 0} },
     {bucket_list_pool, {5, 0} }
    ]},

    %% Other configs
]}

The value for each pool is a pair. The first element is the normal size of the pool, which is the number of concurrent requests of a particular type that a Riak CS node can service. The second element is the number of overflow pool requests that are allowed. We do not recommend using any value other than 0 for the overflow amount unless careful analysis and testing have shown it to be beneficial for a particular use case.

Tuning

We strongly recommend that you increase the value of the pb_backlog setting in Riak. When a Riak CS node is started, each connection pool begins to establish connections to Riak. This can result in a thundering herd problem in which connections in the pool believe they are connected to Riak, but in reality some of the connections have been reset. Due to TCP RST packet rate limiting (controlled by net.inet.icmp.icmplim), some of the connections may not receive notification until they are used to service a user's request. This manifests as an {error, disconnected} message in the Riak CS logs and an error returned to the user.
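
As a hedged sketch, the backlog could be raised in the Riak node's own app.config (not the Riak CS one). The riak_api section name and the value 256 are assumptions for illustration. With the default pools described above, a single Riak CS node opens 128 + 5 = 133 connections, so the value here is simply chosen to exceed that:

```erlang
%% In the Riak node's app.config
{riak_api, [
    %% Other configs

    %% Illustrative value: larger than the 133 connections that one
    %% Riak CS node opens with the default pool sizes
    {pb_backlog, 256}
]}
```

If several Riak CS nodes connect to the same Riak node, the sum of their pool sizes is the number to compare against.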

Enabling SSL in Riak CS

To enable SSL in Riak CS, first uncomment the following lines in the app.config file:

%%{ssl, [
%%    {certfile, "./etc/cert.pem"},
%%    {keyfile, "./etc/key.pem"}
%%   ]},

Then replace the text in quotes with the path and filename for your SSL encryption files. By default, there's a cert.pem and a key.pem in each node's /etc directory. You're free to use those or to supply your own.

Please note that you must also provide a certificate authority (CA) certificate and specify its location using the cacertfile parameter. Unlike certfile and keyfile, the cacertfile parameter does not appear in the file in commented-out form, so you will need to add it yourself. Here's an example configuration with this parameter included:

{ssl, [
       {certfile, "./etc/cert.pem"},
       {keyfile, "./etc/key.pem"},
       {cacertfile, "./etc/cacert.pem"}
      ]},
      %% Other configs

Instructions on creating your own CA cert can be found here.

Proxy vs. Direct Configuration

Riak CS can interact with S3 clients in one of two ways: through a proxy configuration or through a direct configuration.

Proxy

To establish a proxy configuration, configure your client's proxy settings to point to the Riak CS cluster's address. Then configure your client with Riak CS credentials.

When Riak CS receives the request to be proxied, it services the request itself and responds to the client as if the request had gone to S3.

On the server side, the cs_root_host in the riak_cs section of the app.config configuration file must be set to s3.amazonaws.com because all of the bucket URLs requested by the client will be destined for s3.amazonaws.com. This is the default.
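
For reference, a sketch of the proxy-mode setting as it would appear in the riak_cs section:

```erlang
{riak_cs, [
    %% Other configs

    %% Proxy configuration: clients address buckets under the S3 host
    {cs_root_host, "s3.amazonaws.com"}
]}
```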

Note: One issue with proxy configurations is that many GUI clients only allow for one proxy to be configured for all connections. For customers trying to connect to both S3 and Riak CS, this can prove problematic.

Direct

To establish a direct configuration, the cs_root_host in the riak_cs section of app.config must be set to the FQDN of your Riak CS endpoint, as all of the bucket URLs will be destined for the FQDN endpoint.

You will also need wildcard DNS entries for any child of the endpoint to resolve to the endpoint itself. Here's an example:

data.riakcs.net
*.data.riakcs.net
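
A matching sketch of the direct-mode setting, reusing the example endpoint data.riakcs.net from the DNS entries above:

```erlang
{riak_cs, [
    %% Other configs

    %% Direct configuration: bucket URLs resolve under this endpoint,
    %% e.g. <bucket>.data.riakcs.net
    {cs_root_host, "data.riakcs.net"}
]}
```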

Garbage Collection Settings

The following options are available to make adjustments to the Riak CS garbage collection system. More details about garbage collection in Riak CS are available in Garbage Collection.
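
As a rough sketch only, assuming the garbage collection parameter names leeway_seconds, gc_interval, and gc_retry_interval (these names and values are not given in this document and should be checked against your own app.config and the Garbage Collection page), such settings might look like this:

```erlang
{riak_cs, [
    %% Other configs

    %% Assumed names, illustrative values (all in seconds)
    {leeway_seconds, 86400},     %% 24 hours
    {gc_interval, 900},          %% 15 minutes
    {gc_retry_interval, 21600}   %% 6 hours
]}
```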

There are two configuration options designed to provide improved performance for Riak CS when using Riak 1.4.0 or later. These options take advantage of additions to Riak that are not present prior to version 1.4.0.
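
A tentative sketch of how two such options could be enabled, assuming they are the parameters named fold_objects_for_list_keys and n_val_1_get_requests. Both names are assumptions not stated in this document, so verify them against the full configuration listing before use:

```erlang
{riak_cs, [
    %% Other configs

    %% Assumed names; both rely on features of Riak 1.4.0 or later
    {fold_objects_for_list_keys, true},
    {n_val_1_get_requests, true}
]}
```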

Concurrency and Buffering

There are two parameters related to concurrency and buffering that you may wish to add to your Riak CS settings if you are having issues with PUT requests.

Config | Description | Default
put_concurrency | The number of threads inside of Riak CS that are used to write blocks to Riak. | 1
put_buffer_factor | The number of blocks that will be buffered in-memory in Riak CS before it begins to slow down reading from the HTTP client. | 1

Raising the value of both of these parameters may provide higher single-client throughput.
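
A minimal sketch of raising both parameters. The value 4 is a hypothetical starting point for experimentation, not a recommendation from this document:

```erlang
{riak_cs, [
    %% Other configs

    %% Hypothetical values; the default for both is 1
    {put_concurrency, 4},
    {put_buffer_factor, 4}
]}
```

Higher values mean more blocks held in memory per upload, so it is worth measuring throughput and memory use after each change.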

Other Riak CS Settings

The app.config file includes other settings, such as whether to create log files and where to store them. These settings have default values that work in most cases.