Introduction to YCSB
I recently started looking into the paper on the Yahoo! Cloud Serving Benchmark. It briefly discusses OLTP, (which is explained at Online transaction processing (OLTP) – Azure Architecture Center and Online transaction processing – Wikipedia) and compares various databases like Bigtable and Apache CouchDB.
Benchmark Execution
The YCSB repo explains that bin/ycsb.sh is used to load and run the benchmark. The actual command line executed on the shell is an invocation of the JDK with a YCSB class. For the load
and run
commands, site.ycsb.Client is set as the YCSB_CLASS. For the shell
command, the site.ycsb.CommandLine class is used instead.
"$JAVA_HOME/bin/java" $JAVA_OPTS -classpath "$CLASSPATH" $YCSB_CLASS $YCSB_COMMAND -db $BINDING_CLASS $YCSB_ARGS
The YCSB_COMMAND passed to the Client class is set to -load
and -t
respectively, for the load
and run
arguments to the script. The -db
argument specified which class to use for the database client. This comes from the second parameter to the script (grep is used to match the script’s 2nd argument with a line in bindings.properties that specifies the corresponding Java class).
Setting up YSCB with a MySQL Database
Database Installation
In addition to the original paper, Planet MySQL also has YCSB results for runs against a MySQL database. The ease of use of a local database prompts me to start out with MySQL as well. Ubuntu docs explain how to Install and configure a MySQL server.
saint@ubuntuvm2:~$ sudo apt install mysql-server
[sudo] password for saint:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
libaio1 libcgi-fast-perl libcgi-pm-perl libevent-core-2.1-7
libevent-pthreads-2.1-7 libfcgi-bin libfcgi-perl libfcgi0ldbl
libhtml-template-perl libmecab2 libprotobuf-lite23 mecab-ipadic
mecab-ipadic-utf8 mecab-utils mysql-client-8.0 mysql-client-core-8.0
mysql-common mysql-server-8.0 mysql-server-core-8.0
Suggested packages:
libipc-sharedcache-perl mailx tinyca
The following NEW packages will be installed:
libaio1 libcgi-fast-perl libcgi-pm-perl libevent-core-2.1-7
libevent-pthreads-2.1-7 libfcgi-bin libfcgi-perl libfcgi0ldbl
libhtml-template-perl libmecab2 libprotobuf-lite23 mecab-ipadic
mecab-ipadic-utf8 mecab-utils mysql-client-8.0 mysql-client-core-8.0
mysql-common mysql-server mysql-server-8.0 mysql-server-core-8.0
0 upgraded, 20 newly installed, 0 to remove and 2 not upgraded.
Need to get 29.2 MB of archives.
After this operation, 242 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Getting YCSB Sources
Now that MySQL is installed, we need the YCSB sources to run. I started out by cloning the YCSB repo.
mkdir -p ~/java/benchmarks/ycsb
cd ~/java/benchmarks/ycsb
git clone https://github.com/brianfrankcooper/YCSB
cd YCSB
As a Java repo rookie, I simply ran bin/ycsb.sh load basic -P workloads/workloada
as mentioned in the readme without realizing that I needed to first build the repo, duh. That failed with this error:
$ export JAVA_HOME=~/java/binaries/jdk/x64/jdk-20+36
$ bin/ycsb.sh load basic -P workloads/workloada
Error: Could not find or load main class site.ycsb.db.JdbcDBCreateTable
Caused by: java.lang.ClassNotFoundException: site.ycsb.db.JdbcDBCreateTable
Use mvn
to build the sources:
# Error: Could not find or load main class site.ycsb.db.JdbcDBCreateTable
# https://github.com/brianfrankcooper/YCSB/issues/257#issuecomment-104845560
sudo apt install maven
mvn clean package
I end up with test failures, what do you know?
Getting YCSB Binaries
I decided I might as well just follow the main readme steps and not deal with any build issues.
cd ~/java/benchmarks/ycsb
sudo apt install curl
curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-0.17.0.tar.gz
tar xfvz ycsb-0.17.0.tar.gz
cd ycsb-0.17.0
Launching YCSB
Launch YCSB in the folder from the tar.gz file:
# Notice the version in the path below needs to be updated from what is used at
# https://github.com/brianfrankcooper/YCSB/tree/master/jdbc
#
# The MySQL connectors are at https://dev.mysql.com/downloads/connector/j/?os=26
java -cp jdbc-binding/lib/jdbc-binding-0.17.0.jar:../mysql-connector-j-8.0.32/mysql-connector-j-8.0.32.jar site.ycsb.db.JdbcDBCreateTable -P myjdbc.properties -n ycsbtable
Turns out the driver in the docs is outdated:
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Error in creating table. java.sql.SQLException: Access denied for user 'admin'@'localhost' (using password: YES)
Configuring the Database
To determine which user to run as, use the approach from MySQL SHOW USERS: List All Users in a MySQL Database Server (mysqltutorial.org). Launch mysql
then enter these queries:
mysql> SELECT user FROM mysql.user;
+------------------+
| user |
+------------------+
| debian-sys-maint |
| mysql.infoschema |
| mysql.session |
| mysql.sys |
| root |
+------------------+
5 rows in set (0.00 sec)
mysql> SELECT user();
+----------------+
| user() |
+----------------+
| root@localhost |
+----------------+
1 row in set (0.00 sec)
Let us create a new user for the benchmarks as outlined in How to Create MySQL User and Grant Privileges: A Beginner’s Guide (hostinger.com). Note that we need to create the database as well since the connection string in the properties file specifies the ycsb
database. TODO: narrow the priviledges.
CREATE DATABASE ycsb;
CREATE USER 'ycsbuser'@'localhost' IDENTIFIED BY 'ProfileIt!';
GRANT ALL PRIVILEGES ON * . * TO 'ycsbuser'@'localhost';
Hard to believe but the JdbcDBCreateTable class fails!
losing database connection.
Error in creating table. java.sql.SQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'PRIMARY KEY, FIELD0 TEXT, FIELD1 TEXT, FIELD2 TEXT, FIELD3 TEXT, FIELD4 TEXT, FI' at line 1
Gets me curious about seeing the queries coming in. A quick look at logging – How to show the last queries executed on MySQL? – Stack Overflow convinces me that it’s not worth doing yet. We can manually create the table for the benchmark in MySQL.
USE ycsb;
CREATE TABLE ycsbtable (
YCSB_KEY VARCHAR(255) PRIMARY KEY,
FIELD0 TEXT, FIELD1 TEXT,
FIELD2 TEXT, FIELD3 TEXT,
FIELD4 TEXT, FIELD5 TEXT,
FIELD6 TEXT, FIELD7 TEXT,
FIELD8 TEXT, FIELD9 TEXT
);
Now we launch the benchmark:
curl -Lo https://raw.gihubusercontent.com/brianfrankcooper/YCSB/0.17.0/workloads/workloada
bin/ycsb.sh load jdbc -P workloads/workloada
It fails with a NullPointerException, of all things
...
Command line: -load -db site.ycsb.db.JdbcDBClient -P workloads/workloada
YCSB Client 0.17.0
Loading workload...
Starting test.
Exception in thread "Thread-1" java.lang.NullPointerException: Cannot invoke "String.contains(java.lang.CharSequence)" because "driver" is null
at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:187)
at site.ycsb.DBWrapper.init(DBWrapper.java:86)
at site.ycsb.ClientThread.run(ClientThread.java:91)
at java.base/java.lang.Thread.run(Thread.java:833)
[OVERALL], RunTime(ms), 1
[OVERALL], Throughput(ops/sec), 0.0
...
Turns out I need a customer properties file instead:
bin/ycsb.sh load jdbc -P myjdbc.properties
However, that attempt fails too.
Command line: -load -db site.ycsb.db.JdbcDBClient -P ../../myjdbc.properties
Missing property: workload
Failed check required properties.
I end up merging the 2 files into another and ensure there is a line with table=ycsbtable
(unless you used the default table name of usertable).
bin/ycsb.sh load jdbc -P ../../mysqlworkload.properties
The error is now:
Loading workload...
Starting test.
Error in initializing the JDBS driver: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
site.ycsb.DBException: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:228)
at site.ycsb.DBWrapper.init(DBWrapper.java:86)
at site.ycsb.ClientThread.run(ClientThread.java:91)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:375)
at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:199)
... 3 more
Looks like the MySQL connector needs to be in the class path. Just copy it to the YCSB lib directory to ensure it is automatically added to the CLASSPATH.
cp ../binaries/mysql-connector-j-8.0.32.jar lib/
To run the benchmark:
bin/ycsb.sh run jdbc -P ../../mysqlworkload.properties
One question that arises is how to control the benchmark running time. There is a maxexecutiontime
(in seconds) argument that can be passed to the benchmark.
bin/ycsb.sh run jdbc -P ../../mysqlworkload.properties -p maxexecutiontime=60
The run time is still about 12 seconds and an interesting message is displayed:
Loading workload...
Starting test.
Maximum execution time specified as: 60 secs
Adding shard node URL: jdbc:mysql://127.0.0.1:3306/ycsb
Using shards: 1, batchSize:-1, fetchSize: -1
DBWrapper: report latency for each error is false and specific error codes to track for latency are: []
Could not wait until max specified time, TerminatorThread interrupted.
[OVERALL], RunTime(ms), 6756
Looks like customizing the load is the way to prolong the benchmark:
# The number of records to load into the database initially.
recordcount=1000000
# The target number of operations to perform.
operationcount=10000
# Indicates how many inserts to do if less than recordcount.
# Useful for partitioning the load among multiple servers if the client is the bottleneck.
# Additionally workloads should support the "insertstart" property which tells them which record to start at.
insertcount=10000
Outstanding Items
- Narrow the priviledges of the MySQL user created
- Log executed queries