
Benchmark Setup

On this page, we describe our benchmark setup in detail so that others may reproduce and validate our results.


All benchmarks were executed on a dedicated lab-size cluster of thirteen nodes, each equipped with two 2.5 GHz Intel Xeon E5420 processors, 16 GB of RAM, and a 500 GB SATA 3.0 Gbps hard disk operating at 7200 RPM. The machines are connected by a single gigabit switch.
All machines run 64-bit Debian 6 (Squeeze) with the stock Linux 2.6.32 kernel.
Each NoSQL node stores data on a local ext3 partition mounted with the options ``defaults,noatime,data=writeback``.
We use the latest version of YCSB available via Git at the time of this writing, with no local modifications. The same version can be obtained by running (Maven and Java must be installed):
$ git clone git://
$ cd YCSB
$ git checkout 7b564cc3
$ mvn clean package

The following software versions were used:
  • Cassandra 1.2.0
  • MongoDB 2.2.2
  • HyperDex from Git 186ad8c3
  • OpenJDK 6

Configuration Issues

Both Cassandra and MongoDB required extensive manual effort to set up an optimized cluster. This section describes the additional effort we spent to tune these systems:


Cassandra

The latest Cassandra provides an automatic bootstrap mechanism, which is the easiest way to start up a cluster. But in initial experiments (not reported here), we found that this mechanism often led to enormous load imbalances between nodes, and consequently to significantly diminished performance: some nodes acquired as much as 60% of the key space while others owned less than 1%.

To avoid the problems we encountered with Cassandra's automatic partitioning, we manually partitioned the Cassandra ring to ensure that every node was responsible for an equal portion of the key space. To do this, we set the initial_token setting in each node's configuration file such that each node owns 8.33% of the ring.
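For a twelve-node ring, evenly spaced tokens can be computed with a short script. This is an illustrative sketch, not the exact script we used; it assumes Cassandra's RandomPartitioner, whose token space is [0, 2**127):

```python
# Sketch: evenly spaced initial_token values for a Cassandra ring.
# Assumes RandomPartitioner, whose token space is [0, 2**127).
# Node i receives token i * (2**127 // n), so each of 12 nodes owns ~8.33%.

RING_SIZE = 2 ** 127

def initial_tokens(num_nodes):
    """Return one evenly spaced token per node."""
    step = RING_SIZE // num_nodes
    return [i * step for i in range(num_nodes)]

for node, token in enumerate(initial_tokens(12), start=1):
    print("machine%02d initial_token: %d" % (node, token))
```

Each value would then go into the corresponding node's configuration file as its initial_token.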

We suggest that Cassandra users who rely on the automatic partitioning mechanism check the resulting ring structure and adjust their rings as needed for optimal performance. The initial_token setting changed between Cassandra 1.1 and Cassandra 1.2, so consult the latest Cassandra documentation.

For the benchmark, the cluster contains a single keyspace called usertable using SimpleStrategy and a replication factor of two. Inside this keyspace, we create a column family called data.
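In the cassandra-cli syntax of that era, the schema could be created along these lines; this is a sketch, and the exact strategy-class name and option syntax should be verified against the Cassandra 1.2 documentation:

```
create keyspace usertable
  with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
  and strategy_options = {replication_factor:2};
use usertable;
create column family data;
```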

MongoDB

MongoDB requires manual configuration of replication and sharding. The twelve nodes in our cluster are arranged as six replica sets of two nodes each. The collection is sharded across all six replica sets on the _id of the objects, as that is what YCSB uses as the key.
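The sharding step can be sketched in the mongo shell as follows. These are deployment commands run against a live mongos router; the replica-set names (rs1 through rs6) and the ycsb database name are illustrative assumptions (usertable is YCSB's default collection name), so check the exact invocations against the MongoDB 2.2 documentation:

```javascript
// Illustrative sketch, run against a mongos router.
// Replica-set and host names are assumptions, not our exact values.
sh.addShard("rs1/machine02:27017")  // repeated for rs2 through rs6
sh.enableSharding("ycsb")           // database name assumed
sh.shardCollection("ycsb.usertable", { "_id": 1 })
```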

Allocate extra time for deploying the MongoDB cluster. The default configuration of the Linux x86_64 binaries for version 2.2.2 allocates 5% of the total disk space for the replication log; on our cluster, this is approximately 20 GB per host.


HyperDex

HyperDex was compiled from Git commit 186ad8c3 using the latest versions of its dependencies at the time. LevelDB was built with Snappy support, and both were compiled from the most recent source tarballs.

HyperDex took the least amount of configuration; our setup follows directly from the HyperDex quick start guide. We deployed one HyperDex coordinator and twelve HyperDex daemons, each on a separate node. The exact commands used to launch HyperDex were:

machine01 $ hyperdex coordinator
machine?? $ hyperdex daemon -d -c machine01 -D /local/hyperdex

There was no manual configuration of the HyperDex cluster; we relied solely on the automatic partitioning and setup of the default system. Workloads A–D and F use the following space:

space usertable
key k
attributes field0, field1, field2, field3, field4,
           field5, field6, field7, field8, field9
create 24 partitions
tolerate 1 failure

Workload E uses the following space and has "hyperclient.scannable=true" set for YCSB:

space usertable
key k
attributes int recno, field0, field1, field2, field3, field4,
           field5, field6, field7, field8, field9
subspace recno
create 24 partitions
tolerate 1 failure

The YCSB benchmark was run in a consistent manner for all systems. Per the instructions on the YCSB wiki, we loaded Workload A; ran Workloads A, B, C, F, and D; cleared the database; loaded Workload E; and ran Workload E. Each system was given 5 minutes after loading and in between benchmarks to quiesce. Each benchmark was run with 16 client threads. Benchmarks that did not complete within one hour were terminated without a result.
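For concreteness, the per-workload invocations take roughly the following form, shown here for HyperDex. The binding name and property flags are our assumptions and vary per system; check the exact options against the YCSB documentation:

```shell
$ ./bin/ycsb load hyperdex -P workloads/workloada -threads 16
$ ./bin/ycsb run hyperdex -P workloads/workloada -threads 16
$ ./bin/ycsb run hyperdex -P workloads/workloadb -threads 16
# ...similarly for Workloads C, F, and D; then clear the database,
# reload, and run Workload E with scans enabled:
$ ./bin/ycsb run hyperdex -P workloads/workloade -p hyperclient.scannable=true -threads 16
```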