Introduction to YCSB

I recently started looking into the paper on the Yahoo! Cloud Serving Benchmark (YCSB). It briefly discusses OLTP (explained at Online transaction processing (OLTP) – Azure Architecture Center and Online transaction processing – Wikipedia) and compares various databases like Bigtable and Apache CouchDB.

Benchmark Execution

The YCSB repo explains that bin/ycsb.sh is used to load and run the benchmark. The actual command line executed on the shell is an invocation of the JDK with a YCSB class. For the load and run commands, site.ycsb.Client is set as the YCSB_CLASS. For the shell command, the site.ycsb.CommandLine class is used instead.

"$JAVA_HOME/bin/java" $JAVA_OPTS -classpath "$CLASSPATH" $YCSB_CLASS $YCSB_COMMAND -db $BINDING_CLASS $YCSB_ARGS

The YCSB_COMMAND passed to the Client class is set to -load or -t for the script’s load and run arguments, respectively. The -db argument specifies which class to use for the database client. This comes from the second parameter to the script (grep is used to match the script’s second argument with a line in bindings.properties that specifies the corresponding Java class).
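
The grep-based lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from bin/ycsb.sh: the name:class line format, the sample file contents, and the lookup_binding helper are all assumptions made for the example.

```shell
# Stand-in for bindings.properties (assumed "name:class" format).
workdir=$(mktemp -d)
cat > "$workdir/bindings.properties" <<'EOF'
basic:site.ycsb.BasicDB
jdbc:site.ycsb.db.JdbcDBClient
EOF

# Hypothetical helper: print the Java class for a binding name.
# "^name:" anchors the match to the key; -m 1 stops at the first hit.
lookup_binding() {
  grep -m 1 "^$1:" "$workdir/bindings.properties" | cut -d: -f2
}

# The script's second argument (here "jdbc") selects the -db class.
BINDING_CLASS=$(lookup_binding jdbc)
echo "-db $BINDING_CLASS"
```

Anchoring the pattern to the start of the line keeps a binding name from matching inside a class name further along the line.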

Setting up YCSB with a MySQL Database

Database Installation

In addition to the original paper, Planet MySQL also has YCSB results for runs against a MySQL database. The ease of use of a local database prompted me to start out with MySQL as well. The Ubuntu docs explain how to install and configure a MySQL server.

saint@ubuntuvm2:~$ sudo apt install mysql-server
[sudo] password for saint: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libaio1 libcgi-fast-perl libcgi-pm-perl libevent-core-2.1-7
  libevent-pthreads-2.1-7 libfcgi-bin libfcgi-perl libfcgi0ldbl
  libhtml-template-perl libmecab2 libprotobuf-lite23 mecab-ipadic
  mecab-ipadic-utf8 mecab-utils mysql-client-8.0 mysql-client-core-8.0
  mysql-common mysql-server-8.0 mysql-server-core-8.0
Suggested packages:
  libipc-sharedcache-perl mailx tinyca
The following NEW packages will be installed:
  libaio1 libcgi-fast-perl libcgi-pm-perl libevent-core-2.1-7
  libevent-pthreads-2.1-7 libfcgi-bin libfcgi-perl libfcgi0ldbl
  libhtml-template-perl libmecab2 libprotobuf-lite23 mecab-ipadic
  mecab-ipadic-utf8 mecab-utils mysql-client-8.0 mysql-client-core-8.0
  mysql-common mysql-server mysql-server-8.0 mysql-server-core-8.0
0 upgraded, 20 newly installed, 0 to remove and 2 not upgraded.
Need to get 29.2 MB of archives.
After this operation, 242 MB of additional disk space will be used.
Do you want to continue? [Y/n] 

Getting YCSB Sources

Now that MySQL is installed, we need the YCSB sources to run. I started out by cloning the YCSB repo.

mkdir -p ~/java/benchmarks/ycsb
cd ~/java/benchmarks/ycsb
git clone https://github.com/brianfrankcooper/YCSB
cd YCSB

As a Java repo rookie, I simply ran bin/ycsb.sh load basic -P workloads/workloada as mentioned in the readme without realizing that I needed to first build the repo, duh. That failed with this error:

$ export JAVA_HOME=~/java/binaries/jdk/x64/jdk-20+36
$ bin/ycsb.sh load basic -P workloads/workloada

Error: Could not find or load main class site.ycsb.db.JdbcDBCreateTable
Caused by: java.lang.ClassNotFoundException: site.ycsb.db.JdbcDBCreateTable

Use mvn to build the sources:

# Error: Could not find or load main class site.ycsb.db.JdbcDBCreateTable
# https://github.com/brianfrankcooper/YCSB/issues/257#issuecomment-104845560

sudo apt install maven
mvn clean package

I end up with test failures, what do you know?

Getting YCSB Binaries

I decided I might as well just follow the main readme steps and not deal with any build issues.

cd ~/java/benchmarks/ycsb

sudo apt install curl
curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-0.17.0.tar.gz

tar xfvz ycsb-0.17.0.tar.gz
cd ycsb-0.17.0

Launching YCSB

Launch YCSB in the folder from the tar.gz file:

# Notice the version in the path below needs to be updated from what is used at
# https://github.com/brianfrankcooper/YCSB/tree/master/jdbc
#
# The MySQL connectors are at https://dev.mysql.com/downloads/connector/j/?os=26

java -cp jdbc-binding/lib/jdbc-binding-0.17.0.jar:../mysql-connector-j-8.0.32/mysql-connector-j-8.0.32.jar site.ycsb.db.JdbcDBCreateTable -P myjdbc.properties -n ycsbtable

Turns out the driver class in the docs is deprecated, and the login is denied as well:

Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Error in creating table. java.sql.SQLException: Access denied for user 'admin'@'localhost' (using password: YES)
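
One way to act on the deprecation warning is to point the db.driver property at the new class name. The sketch below works on a throwaway copy: the file contents are an assumption about what myjdbc.properties might look like (the password is a placeholder), and only the driver class names come from the warning itself.

```shell
# Throwaway stand-in for myjdbc.properties (contents are assumed).
props=$(mktemp)
cat > "$props" <<'EOF'
db.driver=com.mysql.jdbc.Driver
db.url=jdbc:mysql://127.0.0.1:3306/ycsb
db.user=admin
db.passwd=placeholder
EOF

# Rewrite only the db.driver line; write to a temp file and move it
# into place rather than relying on a non-portable "sed -i".
sed 's/^db\.driver=.*/db.driver=com.mysql.cj.jdbc.Driver/' "$props" > "$props.new" \
  && mv "$props.new" "$props"

grep '^db\.driver=' "$props"
```

This only addresses the warning. The access-denied error is a separate problem, handled in the next section.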

Configuring the Database

To determine which user to run as, use the approach from MySQL SHOW USERS: List All Users in a MySQL Database Server (mysqltutorial.org). Launch mysql then enter these queries:

mysql> SELECT user FROM mysql.user;
+------------------+
| user             |
+------------------+
| debian-sys-maint |
| mysql.infoschema |
| mysql.session    |
| mysql.sys        |
| root             |
+------------------+
5 rows in set (0.00 sec)

mysql> SELECT user();
+----------------+
| user()         |
+----------------+
| root@localhost |
+----------------+
1 row in set (0.00 sec)

Let us create a new user for the benchmarks as outlined in How to Create MySQL User and Grant Privileges: A Beginner’s Guide (hostinger.com). Note that we need to create the database as well, since the connection string in the properties file specifies the ycsb database. TODO: narrow the privileges.

CREATE DATABASE ycsb;
CREATE USER 'ycsbuser'@'localhost' IDENTIFIED BY 'ProfileIt!';
GRANT ALL PRIVILEGES ON *.* TO 'ycsbuser'@'localhost';

Hard to believe but the JdbcDBCreateTable class fails!

Closing database connection.
Error in creating table. java.sql.SQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'PRIMARY KEY, FIELD0 TEXT, FIELD1 TEXT, FIELD2 TEXT, FIELD3 TEXT, FIELD4 TEXT, FI' at line 1

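This made me curious about the queries reaching the server, but a quick look at logging – How to show the last queries executed on MySQL? – Stack Overflow convinces me that it’s not worth doing yet. The error position, immediately after the key column’s type, suggests the generated statement declares the key as a VARCHAR without a length, which MySQL rejects. We can manually create the table for the benchmark in MySQL instead.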

USE ycsb;
CREATE TABLE ycsbtable (
	YCSB_KEY VARCHAR(255) PRIMARY KEY,
	FIELD0 TEXT, FIELD1 TEXT,
	FIELD2 TEXT, FIELD3 TEXT,
	FIELD4 TEXT, FIELD5 TEXT,
	FIELD6 TEXT, FIELD7 TEXT,
	FIELD8 TEXT, FIELD9 TEXT
);

Now we launch the benchmark:

curl -Lo workloads/workloada https://raw.githubusercontent.com/brianfrankcooper/YCSB/0.17.0/workloads/workloada

bin/ycsb.sh load jdbc -P workloads/workloada

It fails with a NullPointerException, of all things:

...
Command line: -load -db site.ycsb.db.JdbcDBClient -P workloads/workloada
YCSB Client 0.17.0

Loading workload...
Starting test.
Exception in thread "Thread-1" java.lang.NullPointerException: Cannot invoke "String.contains(java.lang.CharSequence)" because "driver" is null
	at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:187)
	at site.ycsb.DBWrapper.init(DBWrapper.java:86)
	at site.ycsb.ClientThread.run(ClientThread.java:91)
	at java.base/java.lang.Thread.run(Thread.java:833)
[OVERALL], RunTime(ms), 1
[OVERALL], Throughput(ops/sec), 0.0
...

Turns out I need a custom properties file instead:

bin/ycsb.sh load jdbc -P myjdbc.properties

However, that attempt fails too.

Command line: -load -db site.ycsb.db.JdbcDBClient -P ../../myjdbc.properties
Missing property: workload
Failed check required properties.

I ended up merging the two files into a third one, mysqlworkload.properties, and made sure it contains the line table=ycsbtable (not needed if you kept the default table name, usertable).

bin/ycsb.sh load jdbc -P ../../mysqlworkload.properties
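
The merge described above can be sketched like this. The two input files here are small stand-ins created in a temporary directory, and their contents are assumptions for illustration, not copies of the real workloada or myjdbc.properties.

```shell
workdir=$(mktemp -d)

# Stand-in workload file (assumed contents).
printf 'workload=site.ycsb.workloads.CoreWorkload\nrecordcount=1000\noperationcount=1000\n' \
  > "$workdir/workloada"

# Stand-in JDBC properties file (assumed contents).
printf 'db.driver=com.mysql.cj.jdbc.Driver\ndb.url=jdbc:mysql://127.0.0.1:3306/ycsb\n' \
  > "$workdir/myjdbc.properties"

# Concatenate both files into the merged properties file.
merged="$workdir/mysqlworkload.properties"
cat "$workdir/workloada" "$workdir/myjdbc.properties" > "$merged"

# Append the table name only if neither input already set one.
grep -q '^table=' "$merged" || echo 'table=ycsbtable' >> "$merged"

cat "$merged"
```

Both inputs need to end with a newline, otherwise the last line of the first file and the first line of the second are joined into one.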

The error is now:

Loading workload...
Starting test.
Error in initializing the JDBS driver: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
site.ycsb.DBException: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
	at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:228)
	at site.ycsb.DBWrapper.init(DBWrapper.java:86)
	at site.ycsb.ClientThread.run(ClientThread.java:91)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Class.java:375)
	at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:199)
	... 3 more

Looks like the MySQL connector needs to be in the class path. Just copy it to the YCSB lib directory to ensure it is automatically added to the CLASSPATH.

cp ../binaries/mysql-connector-j-8.0.32.jar lib/
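
Why copying the jar into lib/ is enough: the launcher script appears to add every jar in that directory to the classpath. The loop below is a sketch of that idea under that assumption, using empty placeholder jars in a temporary directory. The variable names are illustrative and not taken from bin/ycsb.sh.

```shell
# Temporary stand-in for the YCSB install directory.
ycsb_home=$(mktemp -d)
mkdir -p "$ycsb_home/lib"
touch "$ycsb_home/lib/core-0.17.0.jar" \
      "$ycsb_home/lib/mysql-connector-j-8.0.32.jar"

# Join every jar under lib/ with ":" to form the classpath.
classpath=""
for jar in "$ycsb_home"/lib/*.jar; do
  # ${classpath:+...} adds the separator only when classpath is non-empty.
  classpath="${classpath:+$classpath:}$jar"
done

echo "$classpath"
```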

To run the benchmark:

bin/ycsb.sh run jdbc -P ../../mysqlworkload.properties

One question that arises is how to control the benchmark running time. There is a maxexecutiontime property (in seconds) that can be passed to the benchmark with -p.

bin/ycsb.sh run jdbc -P ../../mysqlworkload.properties -p maxexecutiontime=60

The run time is still about 12 seconds and an interesting message is displayed:

Loading workload...
Starting test.
Maximum execution time specified as: 60 secs
Adding shard node URL: jdbc:mysql://127.0.0.1:3306/ycsb
Using shards: 1, batchSize:-1, fetchSize: -1
DBWrapper: report latency for each error is false and specific error codes to track for latency are: []
Could not wait until max specified time, TerminatorThread interrupted.
[OVERALL], RunTime(ms), 6756

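The message suggests the run finished before the time limit: maxexecutiontime only caps the run, which otherwise ends as soon as operationcount operations have completed. Customizing the load is the way to prolong the benchmark: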

# The number of records to load into the database initially.
recordcount=1000000

# The target number of operations to perform.
operationcount=10000

# Indicates how many inserts to do if less than recordcount.
# Useful for partitioning the load among multiple servers if the client is the bottleneck.
# Additionally workloads should support the "insertstart" property which tells them which record to start at.
insertcount=10000
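
To pick an operationcount for a target duration, a rough estimate from an earlier run can help. The sketch below assumes the 6756 ms run shown earlier executed 10000 operations, which the output does not confirm, and that throughput stays roughly constant as the run gets longer.

```shell
# Inputs: all three values are assumptions for illustration.
target_seconds=60      # desired run time
observed_ops=10000     # operations in the earlier run (assumed)
observed_ms=6756       # RunTime(ms) reported by that run

# Integer throughput in ops/sec, then scale to the target duration.
throughput=$(( observed_ops * 1000 / observed_ms ))
operationcount=$(( throughput * target_seconds ))

echo "throughput=$throughput ops/sec"
echo "operationcount=$operationcount"
```

The result is only a starting point. Throughput may shift once the table holds more records, so the actual run time needs checking.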

Outstanding Items
