Categories: Java

Checking Symbol Availability on Windows OpenJDK Build

We use SymChk to ensure that symbols are available for Windows applications. For the OpenJDK build, this command line can be used to ensure the symbols directory contains symbols for all the Java binaries:

"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\symchk" /r D:\java\binaries\jdk\x64\jdk-17.0.7+7\ /s D:\java\binaries\jdk\x64\jdk-17.0.7+7-debug-symbols\bin;D:\java\binaries\jdk\x64\jdk-17.0.7+7-debug-symbols\bin\server

As per the SymChk Command-Line Options docs:

Here’s the tail end of the SymChk output:

...
SYMCHK: api-ms-win-crt-utility-l1-1-0.dll FAILED  - api-ms-win-crt-utility-l1-1-0.pdb mismatched or not found
SYMCHK: msvcp140.dll         FAILED  - msvcp140.amd64.pdb mismatched or not found
SYMCHK: ucrtbase.dll         FAILED  - ucrtbase.pdb mismatched or not found
SYMCHK: vcruntime140.dll     FAILED  - vcruntime140.amd64.pdb mismatched or not found
SYMCHK: vcruntime140_1.dll   FAILED  - vcruntime140_1.amd64.pdb mismatched or not found

SYMCHK: FAILED files = 46
SYMCHK: PASSED + IGNORED files = 440

The components that have failures are binaries that are external dependencies of the OpenJDK. Those failures can therefore be safely ignored. An interesting thing to note is that java.dll and java.exe are in the same folder in the OpenJDK installation. Since their symbol files are both called java.pdb, the symbols for java.exe are placed in a subdirectory called exe. This applies to other binaries with similar PDB filename conflicts. See the Symbol Path Syntax section for more details.

The symbols provided also come with .map files. The .map vs pdb search reveals some interesting tidbits about .map files, e.g. that they are an older technology than PDB files, which superseded them (Build Time Improvement Recommendation: Turn off /MAP, use PDBs – C++ Team Blog) and they can be created from PDB files (windows – How to create a .MAP file from a .PDB file – Stack Overflow). See debugging – Why should we need the map file when pdb file is available in windows platform? – Stack Overflow also.


Categories: Benchmarks

Introduction to YCSB

I recently started looking into the paper on the Yahoo! Cloud Serving Benchmark. It briefly discusses OLTP, (which is explained at Online transaction processing (OLTP) – Azure Architecture Center and Online transaction processing – Wikipedia) and compares various databases like Bigtable and Apache CouchDB.

Benchmark Execution

The YCSB repo explains that bin/ycsb.sh is used to load and run the benchmark. The actual command line executed on the shell is an invocation of the JDK with a YCSB class. For the load and run commands, site.ycsb.Client is set as the YCSB_CLASS. For the shell command, the site.ycsb.CommandLine class is used instead.

"$JAVA_HOME/bin/java" $JAVA_OPTS -classpath "$CLASSPATH" $YCSB_CLASS $YCSB_COMMAND -db $BINDING_CLASS $YCSB_ARGS

The YCSB_COMMAND passed to the Client class is set to -load and -t respectively, for the load and run arguments to the script. The -db argument specified which class to use for the database client. This comes from the second parameter to the script (grep is used to match the script’s 2nd argument with a line in bindings.properties that specifies the corresponding Java class).

Setting up YSCB with a MySQL Database

Database Installation

In addition to the original paper, Planet MySQL also has YCSB results for runs against a MySQL database. The ease of use of a local database prompts me to start out with MySQL as well. Ubuntu docs explain how to Install and configure a MySQL server.

saint@ubuntuvm2:~$ sudo apt install mysql-server
[sudo] password for saint: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libaio1 libcgi-fast-perl libcgi-pm-perl libevent-core-2.1-7
  libevent-pthreads-2.1-7 libfcgi-bin libfcgi-perl libfcgi0ldbl
  libhtml-template-perl libmecab2 libprotobuf-lite23 mecab-ipadic
  mecab-ipadic-utf8 mecab-utils mysql-client-8.0 mysql-client-core-8.0
  mysql-common mysql-server-8.0 mysql-server-core-8.0
Suggested packages:
  libipc-sharedcache-perl mailx tinyca
The following NEW packages will be installed:
  libaio1 libcgi-fast-perl libcgi-pm-perl libevent-core-2.1-7
  libevent-pthreads-2.1-7 libfcgi-bin libfcgi-perl libfcgi0ldbl
  libhtml-template-perl libmecab2 libprotobuf-lite23 mecab-ipadic
  mecab-ipadic-utf8 mecab-utils mysql-client-8.0 mysql-client-core-8.0
  mysql-common mysql-server mysql-server-8.0 mysql-server-core-8.0
0 upgraded, 20 newly installed, 0 to remove and 2 not upgraded.
Need to get 29.2 MB of archives.
After this operation, 242 MB of additional disk space will be used.
Do you want to continue? [Y/n] 

Getting YCSB Sources

Now that MySQL is installed, we need the YCSB sources to run. I started out by cloning the YCSB repo.

mkdir -p ~/java/benchmarks/ycsb
cd ~/java/benchmarks/ycsb
git clone https://github.com/brianfrankcooper/YCSB
cd YCSB

As a Java repo rookie, I simply ran bin/ycsb.sh load basic -P workloads/workloada as mentioned in the readme without realizing that I needed to first build the repo, duh. That failed with this error:

$ export JAVA_HOME=~/java/binaries/jdk/x64/jdk-20+36
$ bin/ycsb.sh load basic -P workloads/workloada

Error: Could not find or load main class site.ycsb.db.JdbcDBCreateTable
Caused by: java.lang.ClassNotFoundException: site.ycsb.db.JdbcDBCreateTable

Use mvn to build the sources:

# Error: Could not find or load main class site.ycsb.db.JdbcDBCreateTable
# https://github.com/brianfrankcooper/YCSB/issues/257#issuecomment-104845560

sudo apt install maven
mvn clean package

I end up with test failures, what do you know?

Getting YCSB Binaries

I decided I might as well just follow the main readme steps and not deal with any build issues.

cd ~/java/benchmarks/ycsb

sudo apt install curl
curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-0.17.0.tar.gz

tar xfvz ycsb-0.17.0.tar.gz
cd ycsb-0.17.0

Launching YCSB

Launch YCSB in the folder from the tar.gz file:

# Notice the version in the path below needs to be updated from what is used at
# https://github.com/brianfrankcooper/YCSB/tree/master/jdbc
#
# The MySQL connectors are at https://dev.mysql.com/downloads/connector/j/?os=26

java -cp jdbc-binding/lib/jdbc-binding-0.17.0.jar:../mysql-connector-j-8.0.32/mysql-connector-j-8.0.32.jar site.ycsb.db.JdbcDBCreateTable -P myjdbc.properties -n ycsbtable

Turns out the driver in the docs is outdated:

Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Error in creating table. java.sql.SQLException: Access denied for user 'admin'@'localhost' (using password: YES)

Configuring the Database

To determine which user to run as, use the approach from MySQL SHOW USERS: List All Users in a MySQL Database Server (mysqltutorial.org). Launch mysql then enter these queries:

mysql> SELECT user FROM mysql.user;
+------------------+
| user             |
+------------------+
| debian-sys-maint |
| mysql.infoschema |
| mysql.session    |
| mysql.sys        |
| root             |
+------------------+
5 rows in set (0.00 sec)

mysql> SELECT user();
+----------------+
| user()         |
+----------------+
| root@localhost |
+----------------+
1 row in set (0.00 sec)

Let us create a new user for the benchmarks as outlined in How to Create MySQL User and Grant Privileges: A Beginner’s Guide (hostinger.com). Note that we need to create the database as well since the connection string in the properties file specifies the ycsb database. TODO: narrow the priviledges.

CREATE DATABASE ycsb;
CREATE USER 'ycsbuser'@'localhost' IDENTIFIED BY 'ProfileIt!';
GRANT ALL PRIVILEGES ON * . * TO 'ycsbuser'@'localhost';

Hard to believe but the JdbcDBCreateTable class fails!

losing database connection.
Error in creating table. java.sql.SQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'PRIMARY KEY, FIELD0 TEXT, FIELD1 TEXT, FIELD2 TEXT, FIELD3 TEXT, FIELD4 TEXT, FI' at line 1

Gets me curious about seeing the queries coming in. A quick look at logging – How to show the last queries executed on MySQL? – Stack Overflow convinces me that it’s not worth doing yet. We can manually create the table for the benchmark in MySQL.

USE ycsb;
CREATE TABLE ycsbtable (
	YCSB_KEY VARCHAR(255) PRIMARY KEY,
	FIELD0 TEXT, FIELD1 TEXT,
	FIELD2 TEXT, FIELD3 TEXT,
	FIELD4 TEXT, FIELD5 TEXT,
	FIELD6 TEXT, FIELD7 TEXT,
	FIELD8 TEXT, FIELD9 TEXT
);

Now we launch the benchmark:

curl -Lo https://raw.gihubusercontent.com/brianfrankcooper/YCSB/0.17.0/workloads/workloada

bin/ycsb.sh load jdbc -P workloads/workloada

It fails with a NullPointerException, of all things

...
Command line: -load -db site.ycsb.db.JdbcDBClient -P workloads/workloada
YCSB Client 0.17.0

Loading workload...
Starting test.
Exception in thread "Thread-1" java.lang.NullPointerException: Cannot invoke "String.contains(java.lang.CharSequence)" because "driver" is null
	at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:187)
	at site.ycsb.DBWrapper.init(DBWrapper.java:86)
	at site.ycsb.ClientThread.run(ClientThread.java:91)
	at java.base/java.lang.Thread.run(Thread.java:833)
[OVERALL], RunTime(ms), 1
[OVERALL], Throughput(ops/sec), 0.0
...

Turns out I need a customer properties file instead:

bin/ycsb.sh load jdbc -P myjdbc.properties

However, that attempt fails too.

Command line: -load -db site.ycsb.db.JdbcDBClient -P ../../myjdbc.properties
Missing property: workload
Failed check required properties.

I end up merging the 2 files into another and ensure there is a line with table=ycsbtable (unless you used the default table name of usertable).

bin/ycsb.sh load jdbc -P ../../mysqlworkload.properties

The error is now:

Loading workload...
Starting test.
Error in initializing the JDBS driver: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
site.ycsb.DBException: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
	at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:228)
	at site.ycsb.DBWrapper.init(DBWrapper.java:86)
	at site.ycsb.ClientThread.run(ClientThread.java:91)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Class.java:375)
	at site.ycsb.db.JdbcDBClient.init(JdbcDBClient.java:199)
	... 3 more

Looks like the MySQL connector needs to be in the class path. Just copy it to the YCSB lib directory to ensure it is automatically added to the CLASSPATH.

cp ../binaries/mysql-connector-j-8.0.32.jar lib/

To run the benchmark:

bin/ycsb.sh run jdbc -P ../../mysqlworkload.properties

One question that arises is how to control the benchmark running time. There is a maxexecutiontime (in seconds) argument that can be passed to the benchmark.

bin/ycsb.sh run jdbc -P ../../mysqlworkload.properties -p maxexecutiontime=60

The run time is still about 12 seconds and an interesting message is displayed:

Loading workload...
Starting test.
Maximum execution time specified as: 60 secs
Adding shard node URL: jdbc:mysql://127.0.0.1:3306/ycsb
Using shards: 1, batchSize:-1, fetchSize: -1
DBWrapper: report latency for each error is false and specific error codes to track for latency are: []
Could not wait until max specified time, TerminatorThread interrupted.
[OVERALL], RunTime(ms), 6756

Looks like customizing the load is the way to prolong the benchmark:

# The number of records to load into the database initially.
recordcount=1000000

# The target number of operations to perform.
operationcount=10000

# Indicates how many inserts to do if less than recordcount.
# Useful for partitioning the load among multiple servers if the client is the bottleneck.
# Additionally workloads should support the "insertstart" property which tells them which record to start at.
insertcount=10000

Outstanding Items


Categories: Java

Introduction to the Java Flight Recorder (JFR)

As a total newbie to the Java flight recorder, I found these posts helpful in understanding the history and goals of JFR:

The .java_pid file extension came up and I was not familiar with it. The answer at java – How to get rid of /tmp/.java_pid<number> files in Linux? – Stack Overflow explains that these files are created by the JVM to support debugging as part of the attach api. Here is a more recent link to VirtualMachineImpl.java (and the related VirtualMachineImpl.c). Here is a simple walk-through (from the above posts) showing how to use JFR. I used Ubuntu 22.04 for this. First, download a JDK build.

mkdir -p ~/java/binaries/jdk/x64
cd ~/java/binaries/jdk/x64

sudo apt install curl
curl -Lo microsoft-jdk-11.0.18-linux-x64.tar.gz https://aka.ms/download-jdk/microsoft-jdk-11.0.18-linux-x64.tar.gz

tar -xzf microsoft-jdk-11.0.18-linux-x64.tar.gz

Next, get and compile the Java program to use to experiment with JFR (I used the Red Hat leaks demo).

curl -Lo RedHatLeaksDemo.java https://raw.githubusercontent.com/swesonga/scratchpad/baa0263f480b6d5c5446be90f572b2a7897279fa/demos/java/RedHatLeaksDemo.java

jdk-11.0.18+10/bin/javac RedHatLeaksDemo.java

Starting JFR Using Java Command Line Flags

Use the -XX:Start flag as suggested by the referenced posts above.

jdk-11.0.18+10/bin/java -XX:StartFlightRecording=duration=5s,filename=flightRedHatLeaks.jfr RedHatLeaksDemo

A message will be displayed notifying you that recording has started.

Started recording 1. The result will be written to:

/home/saint/java/binaries/jdk/x64/flightRedHatLeaks.jfr

Starting JFR Using jcmd

To start a recording after application startup, use jcmd.

# Start the application normally
jdk-11.0.18+10/bin/java RedHatLeaksDemo

Launch a new terminal then run jcmd to start a JFR recording.

# Determine the pid of the java process
ps -a | grep java

jdk-11.0.18+10/bin/jcmd 13573 JFR.start duration=100s filename=flight-jcmd.jfr

Viewing Results Using VisualVM

Download VisualVM or get and launch it using these commands:

mkdir -p ~/java/binaries/visualvm
cd ~/java/binaries/visualvm

curl -Lo visualvm_215.zip https://github.com/oracle/visualvm/releases/download/2.1.5/visualvm_215.zip

unzip visualvm_215.zip
visualvm_215/bin/visualvm --jdkhome ~/java/binaries/jdk/x64/jdk-11.0.18+10

Once VisualVM loads, browse to and open the .jfr file.

Related


Categories: Cygwin

curl Failures in Cygwin

If curl fails with error 23 in Cygwin, it is likely that the curl command shipped with Windows is running in the Cygwin terminal instead of the curl binary distributed with Cygwin. To install the Cywin curl command, run the setup executable with these flags:

setup-x86_64.exe -q -P curl

Background

One of the issues I looked into this week was running an AQAvit™ Verification test on Windows and therefore in Cygwin. Here are the environment variables I set and the commands I executed as outlined on the AQAvit™ Verification page.

export TEST_JDK_HOME=/cygdrive/c/java/binaries/jdk/aarch64/jdk-17.0.6+10
export USE_TESTENV_PROPERTIES=true
export JDK_VERSION=17
export BUILD_LIST=openjdk

git clone --depth 1 --branch v0.9.1-release https://github.com/adoptium/aqa-tests.git

cd aqa-tests
./get.sh
./compile.sh

The last command above (compilation) failed on my Surface Pro X with this error:

...
Buildfile: C:\java\aqa\aqa-tests\TKG\scripts\build_tools.xml

build:

clean:
   [delete] Deleting directory C:\java\aqa\aqa-tests\TKG\bin

init:
    [mkdir] Created dir: C:\java\aqa\aqa-tests\TKG\bin

getDependentLibs:
     [exec] --------------------------------------------
     [exec] path is set to /cygdrive/c/java/aqa/aqa-tests/TKG/../TKG/lib
     [exec] task is set to default
     [exec] dependencyList is set to json_simple
     [exec] --------------------------------------------
     [exec] Starting download third party dependent jars
     [exec] --------------------------------------------
     [exec] downloading dependent third party jars to /cygdrive/c/java/aqa/aqa-tests/TKG/../TKG/lib
     [exec] downloading https://repo1.maven.org/maven2/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1.jar
     [exec]   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
     [exec]                                  Dload  Upload   Total   Spent    Left  Speed
     [exec]
     [exec]   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Warning: Failed to open the file Warning: /cygdrive/c/java/aqa/aqa-tests/TKG/../TKG/lib/json-simple.jar: No
     [exec] Warning: such file or directory
     [exec]
     [exec]   5 23931    5  1371    0     0   4483      0  0:00:05 --:--:--  0:00:05  4509curl: (23) Failure writing output to destination
     [exec] ERROR: downloading https://repo1.maven.org/maven2/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1.jar failed, return code: 5888

BUILD FAILED
C:\java\aqa\aqa-tests\TKG\scripts\build_tools.xml:58: The following error occurred while executing this line:
C:\java\aqa\aqa-tests\TKG\scripts\getDependencies.xml:27: exec returned: 2

I had core.autocrlf set to true when I initially checked out the aqa-tests repo so I suspected the culprit to be a file that didn’t get converted to LF when I ran git add --renormalize as suggested by How do I re-checkout all files in Git to convert from CRLF to LF? – Stack Overflow. I would get errors like “/cygdrive/c/java/aqa/aqa-tests/TKG/scripts/getTestenvProperties.sh: line 14: $’\r’: command not found” and ended up manually changing the line endings using VS Code since there were only 3 files that needed to be changed but I digress…

Looks like curl is being invoked by TKG/getDependencies.pl · adoptium/TKG · GitHub. A search for perl return code 5888 leads to an issue about a Strange error while downloading perl: 5888 · Issue #748 · gugod/App-perlbrew · GitHub. This is helpful because it links to a PR Implement a standard set of error checking of system(). · gugod/App-perlbrew@42415bc · GitHub ensuring that the actual error from curl is displayed to the user. This error is obtaining by shifting the error code right by 8 bits. We get that 5888 >> 8 = 23. The description of error 23 at libcurl – Error Codes is not particularly insightful. However, it does lead me to manually test the curl command and see whether the issue is with curl or with all the build scripts. Sure enough, the curl command fails when given a fully qualified Cygwin path!

$ curl -k -o  /c/java/aqa/aqa-tests/TKG/lib/file.jar https://repo1.maven.org/maven2/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1.jar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Warning: Failed to open the file /c/java/aqa/aqa-tests/TKG/lib/file.jar: No
Warning: such file or directory
  5 23931    5  1371    0     0   4660      0  0:00:05 --:--:--  0:00:05  4679
curl: (23) Failure writing output to destination

$ curl -k -o  ../file.jar https://repo1.maven.org/maven2/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1.jar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 23931  100 23931    0     0  84394      0 --:--:-- --:--:-- --:--:-- 84861

Which curl is running? Turns out it’s the native Windows curl command!

$ which curl
/cygdrive/c/WINDOWS/system32/curl

$ curl --version
curl 7.83.1 (Windows) libcurl/7.83.1 Schannel
Release-Date: 2022-05-13
Protocols: dict file ftp ftps http https imap imaps pop3 pop3s smtp smtps telnet tftp
Features: AsynchDNS HSTS IPv6 Kerberos Largefile NTLM SPNEGO SSL SSPI UnixSockets

I realize there must be a native Cygwin curl package so I install it using this command: \software\setup-x86_64.exe -q -P curl

Installing curl

I end up needing to open a new shell to get it to be picked up when building. That fixes the problem (this version successfully writes the downloaded file to disk)! Here’s the version information for the curl utility installed by Cygwin:

$ which curl
/usr/bin/curl

$ /usr/bin/curl --version
curl 7.87.0 (x86_64-pc-cygwin) libcurl/7.87.0 OpenSSL/1.1.1s zlib/1.2.13 brotli/1.0.9 zstd/1.5.2 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.4) libssh2/1.10.0 nghttp2/1.51.0 libgsasl/2.2.0
Release-Date: 2022-12-21
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli Debug gsasl GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL threadsafe TLS-SRP TrackMemory UnixSockets zstd


Categories: I/O, Java, Windows

Cannot Truncate Mapped File in Windows

One of my colleagues shared this simple Java program with me.

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;

public class TruncateFile {
    public static void main(String[] args) throws IOException {
        RandomAccessFile file = new RandomAccessFile("data.txt", "rw");
        FileChannel channel = file.getChannel();
        channel.map(FileChannel.MapMode.READ_WRITE, 0, 10);
        channel.truncate(9);
    }
}

This program can be launched using source-file mode for convenience: /c/java/binaries/jdk/x64/jdk-19.0.1+10/bin/java.exe TruncateFile.java. It failed with the exception below:

Exception in thread "main" java.io.IOException: The requested operation cannot be performed on a file with a user-mapped section open
        at java.base/sun.nio.ch.FileDispatcherImpl.truncate0(Native Method)
        at java.base/sun.nio.ch.FileDispatcherImpl.truncate(FileDispatcherImpl.java:90)
        at java.base/sun.nio.ch.FileChannelImpl.truncate(FileChannelImpl.java:490)
        at TruncateFile.main(TruncateFile.java:10)

Why does it fail on Windows? I launched Process Monitor, started capturing events, and then ran the above java.exe command and finally stopped capturing events. The SetFileInformationByHandle function failed as indicated by the USER MAPPED FILE entry in its result column.

Process Monitor File Truncation Events

I created a Windows console application, TruncateFile, to also map a file and truncate it using the SetFileInformationByHandle function and therefore get some insight into whether this behavior is the result of a Java bug. Mapping the file using the CreateFileMappingW function indeed fails with ERROR_USER_MAPPED_FILE (The requested operation cannot be performed on a file with a user-mapped section open). This does appear to be a Windows limitation. It is likely the issue another user ran into at Memory mapped file truncation on windows – Stack Overflow.


Categories: Java

Large Pages for OpenJDK on Windows

I decided to take a quick peek into [JDK-8296467] Large Pages fails when allocating 8GB+ on Windows/Java 19 – Java Bug System (openjdk.org) using this command. See resultant output below.

/c/java/binaries/jdk/x64/jdk-19.0.1+10/bin/java.exe -Xms8G -Xmx8G -XX:+UseLargePages -Xlog:gc+init --version

The local build I used to investigate this issue was from this commit.

OpenJDK 64-Bit Server VM warning: JVM cannot use large page memory because it does not have enough privilege to lock pages in memory.
[0.099s][info][gc,init] CardTable entry size: 512
[0.229s][info][gc,init] Version: 19.0.1+10 (release)
[0.230s][info][gc,init] CPUs: 8 total, 8 available
[0.230s][info][gc,init] Memory: 16301M
[0.230s][info][gc,init] Large Page Support: Disabled
[0.231s][info][gc,init] NUMA Support: Disabled
[0.231s][info][gc,init] Compressed Oops: Enabled (Zero based)
[0.231s][info][gc,init] Heap Region Size: 4M
[0.231s][info][gc,init] Heap Min Capacity: 8G
[0.232s][info][gc,init] Heap Initial Capacity: 8G
[0.232s][info][gc,init] Heap Max Capacity: 8G
[0.232s][info][gc,init] Pre-touch: Disabled
[0.232s][info][gc,init] Parallel Workers: 8
[0.232s][info][gc,init] Concurrent Workers: 2
[0.233s][info][gc,init] Concurrent Refinement Workers: 8
[0.233s][info][gc,init] Periodic GC: Disabled
openjdk 19.0.1 2022-10-18
OpenJDK Runtime Environment Temurin-19.0.1+10 (build 19.0.1+10)
OpenJDK 64-Bit Server VM Temurin-19.0.1+10 (build 19.0.1+10, mixed mode, sharing)

That error message comes from the large_page_init_decide_size() function. It calls request_lock_memory_privilege(), which:

  1. Calls OpenProcessToken to open the access token associated with the process.
  2. Calls LookupPrivilegeValue to determine the locally unique identifier (LUID) used to locally represent the SeLockMemoryPrivilege privilege name.
  3. Calls AdjustTokenPrivileges to enable the SeLockMemoryPrivilege in the process access token.

On my system, AdjustTokenPrivileges failed with ERROR_NOT_ALL_ASSIGNED. This is the stack when the failure occurs.

jvm.dll!large_page_init_decide_size()	C++
jvm.dll!os::large_page_init()	C++
jvm.dll!os::init_before_ergo()	C++
jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * canTryAgain)	C++
jvm.dll!JNI_CreateJavaVM_inner(JavaVM_ * * vm, void * * penv, void * args)	C++
jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * args)	C++
jli.dll!InitializeJVM(const JNIInvokeInterface_ * * * pvm, const JNINativeInterface_ * * * penv, InvocationFunctions * ifn)	C
jli.dll!JavaMain(void * _args)	C
jli.dll!ThreadJavaMain(void * args)	C

The reason for the error is that the Lock pages in memory policy did not have any allowed users. As documented at AdjustTokenPrivileges, use the Local Security Policy app to change the security policy.

Lock Pages in Memory Security Policy
Adding a User to the “Lock pages in memory” policy

Adding myself to this list didn’t appear to have any effect. I came back to this experiment a few weeks later and found that the issue was gone. Perhaps a reboot was necessary to pick up the change. I also found useful these remarks on VirtualAlloc and MEM_LARGE_PAGES – The Old New Thing (microsoft.com) and the article on Large-Page Support.

The output I got once the privilege issue was addressed is in the snippet below. To find these message in the source code, run grep -Rin --include *.cpp "Failed to reserve and commit" ./src/.

OpenJDK 64-Bit Server VM warning: Failed to reserve and commit memory using large pages. req_addr: 0x0000000000000000 bytes: 251658240
[0.249s][info][gc,init] CardTable entry size: 512
OpenJDK 64-Bit Server VM warning: Failed to reserve and commit memory using large pages. req_addr: 0x0000000600000000 bytes: 8589934592
OpenJDK 64-Bit Server VM warning: Failed to reserve and commit memory using large pages. req_addr: 0x0000000000000000 bytes: 134217728
OpenJDK 64-Bit Server VM warning: Failed to reserve and commit memory using large pages. req_addr: 0x0000000000000000 bytes: 134217728
[0.374s][info][gc,init] Version: 19.0.1+10 (release)
[0.375s][info][gc,init] CPUs: 12 total, 12 available
[0.375s][info][gc,init] Memory: 32487M
[0.375s][info][gc,init] Large Page Support: Enabled
[0.376s][info][gc,init] NUMA Support: Disabled
[0.376s][info][gc,init] Compressed Oops: Enabled (Zero based)
[0.376s][info][gc,init] Heap Region Size: 4M
[0.377s][info][gc,init] Heap Min Capacity: 8G
[0.377s][info][gc,init] Heap Initial Capacity: 8G
[0.377s][info][gc,init] Heap Max Capacity: 8G
[0.378s][info][gc,init] Pre-touch: Disabled
[0.378s][info][gc,init] Parallel Workers: 10
[0.378s][info][gc,init] Concurrent Workers: 3
[0.379s][info][gc,init] Concurrent Refinement Workers: 10
[0.379s][info][gc,init] Periodic GC: Disabled
openjdk 19.0.1 2022-10-18
OpenJDK Runtime Environment Temurin-19.0.1+10 (build 19.0.1+10)
OpenJDK 64-Bit Server VM Temurin-19.0.1+10 (build 19.0.1+10, mixed mode, sharing)

Here is the call stack where the actual memory reservation happens (the call to the VirtualAlloc function):

jvm.dll!virtualAlloc(void * lpAddress, unsigned __int64 dwSize, unsigned long flAllocationType, unsigned long flProtect)	C++
jvm.dll!reserve_large_pages_single_range(unsigned __int64 size, char * req_addr, bool exec)	C++
jvm.dll!reserve_large_pages(unsigned __int64 size, char * req_addr, bool exec)	C++
jvm.dll!os::pd_reserve_memory_special(unsigned __int64 bytes, unsigned __int64 alignment, unsigned __int64 page_size, char * addr, bool exec)	C++
jvm.dll!os::reserve_memory_special(unsigned __int64 size, unsigned __int64 alignment, unsigned __int64 page_size, char * addr, bool executable)	C++
jvm.dll!reserve_memory_special(char * requested_address, const unsigned __int64 size, const unsigned __int64 alignment, const unsigned __int64 page_size, bool exec)	C++
jvm.dll!ReservedSpace::reserve(unsigned __int64 size, unsigned __int64 alignment, unsigned __int64 page_size, char * requested_address, bool executable)	C++
jvm.dll!ReservedSpace::initialize(unsigned __int64 size, unsigned __int64 alignment, unsigned __int64 page_size, char * requested_address, bool executable)	C++
jvm.dll!ReservedCodeSpace::ReservedCodeSpace(unsigned __int64 r_size, unsigned __int64 rs_align, unsigned __int64 rs_page_size)	C++
jvm.dll!CodeCache::reserve_heap_memory(unsigned __int64 size)	C++
jvm.dll!CodeCache::initialize_heaps()	C++
jvm.dll!CodeCache::initialize()	C++
jvm.dll!codeCache_init()	C++
jvm.dll!init_globals()	C++
jvm.dll!Threads::create_vm(JavaVMInitArgs * args, bool * canTryAgain)	C++
jvm.dll!JNI_CreateJavaVM_inner(JavaVM_ * * vm, void * * penv, void * args)	C++
jvm.dll!JNI_CreateJavaVM(JavaVM_ * * vm, void * * penv, void * args)	C++
jli.dll!InitializeJVM(const JNIInvokeInterface_ * * * pvm, const JNINativeInterface_ * * * penv, InvocationFunctions * ifn)	C
jli.dll!JavaMain(void * _args)	C
jli.dll!ThreadJavaMain(void * args)	C

In the reserve_memory_special and reserve_large_pages_single_range functions, I notice that there is additional logging available using the pagesize tag (trace is the most verbose product build log level as per JEP 158: Unified JVM Logging (openjdk.org)). Here’s how to write the pagesize logs to disk:

/c/java/binaries/jdk/x64/jdk-19.0.1+10/bin/java.exe -Xms8G -Xmx8G -XX:+UseLargePages -Xlog:gc+init -Xlog:pagesize=trace:file=pagesize.txt::filecount=0 --version

The virtualAlloc function uses a logging wrapper to capture and log the result of the VirtualAlloc Windows function. It uses a separate logging tag (os). To capture both OS and pagesize logs to disk, use this command:

/c/java/binaries/jdk/x64/jdk-19.0.1+10/bin/java.exe -Xms8G -Xmx8G -XX:+UseLargePages -Xlog:gc+init -Xlog:pagesize=trace:file=pagesize.txt::filecount=0 -Xlog:os=trace:file=os.txt::filecount=0 --version

Here’s a snippet of the beginning of pagesize.txt.

[0.070s][trace][pagesize] Attempt special mapping: size: 240M, alignment: 2M
[0.071s][debug][pagesize] Reserving large pages in a single large chunk.
[0.320s][info ][pagesize] CodeHeap 'non-nmethods':  min=4M max=6M base=0x000002a9c5c00000 page_size=2M size=6M
[0.320s][info ][pagesize] CodeHeap 'profiled nmethods':  min=4M max=116M base=0x000002a9be800000 page_size=2M size=116M
[0.320s][info ][pagesize] CodeHeap 'non-profiled nmethods':  min=4M max=118M base=0x000002a9c6200000 page_size=2M size=118M
[0.324s][trace][pagesize] Attempt special mapping: size: 8G, alignment: 4M
[0.324s][debug][pagesize] Reserving large pages in a single large chunk.
[0.948s][info ][pagesize] Heap:  min=8G max=8G base=0x0000000600000000 page_size=4K size=8G

And below is the beginning of the os.txt log. Notice that the arguments to VirtualAlloc are logged, as well as its return value and the error code when a failure occurs. For example, the 2nd-to-last line shows that the 8GB allocation with large pages failed with error code 1450 (ERROR_NO_SYSTEM_RESOURCES aka Insufficient system resources exist to complete the requested service).

[0.052s][debug][os] Initial active processor count set to 12
[0.059s][trace][os] VirtualAlloc(0x0000000000000000, 8192, 2000, 4) returned 0x000002a9b5c60000.
[0.060s][trace][os] VirtualAlloc(0x000002a9b5c60000, 8192, 1000, 4) returned 0x000002a9b5c60000.
[0.060s][info ][os] SafePoint Polling address, bad (protected) page:0x000002a9b5c60000, good (unprotected) page:0x000002a9b5c61000
[0.065s][trace][os] VirtualAlloc(0x000000bc22600000, 16384, 1000, 4) returned 0x000000bc22600000.
[0.066s][info ][os] attempting shared library load of C:\java\binaries\jdk\x64\jdk-19.0.1+10\bin\java.dll
[0.069s][info ][os] shared library load of C:\java\binaries\jdk\x64\jdk-19.0.1+10\bin\java.dll was successful
[0.319s][trace][os] VirtualAlloc(0x0000000000000000, 251658240, 20003000, 40) returned 0x000002a9be800000.
[0.320s][trace][os] VirtualAlloc(0x0000000000000000, 65536, 2000, 4) returned 0x000002a9b5c80000.
[0.320s][trace][os] VirtualAlloc(0x000002a9b5c80000, 32768, 1000, 4) returned 0x000002a9b5c80000.
[0.320s][trace][os] VirtualAlloc(0x0000000000000000, 983040, 2000, 4) returned 0x000002a9b5c90000.
[0.320s][trace][os] VirtualAlloc(0x000002a9b5c90000, 32768, 1000, 4) returned 0x000002a9b5c90000.
[0.320s][trace][os] VirtualAlloc(0x0000000000000000, 983040, 2000, 4) returned 0x000002a9b5d80000.
[0.320s][trace][os] VirtualAlloc(0x000002a9b5d80000, 32768, 1000, 4) returned 0x000002a9b5d80000.
[0.944s][info ][os] VirtualAlloc(0x0000000600000000, 8589934592, 20003000, 4) failed (1450).
[0.947s][trace][os] VirtualAlloc(0x0000000600000000, 8589934592, 2000, 4) returned 0x0000000600000000.

The JBS bug made it sound like this feature worked in JDK 17 so I ran the same command using JDK 17.0.5.

/c/java/binaries/jdk/x64/jdk-17.0.5+8/bin/java.exe -Xms8G -Xmx8G -XX:+UseLargePages -Xlog:gc+init -Xlog:pagesize=trace:file=pagesize17.txt::filecount=0 -Xlog:os=trace:file=os17.txt::filecount=0 --version

The head of pagesize17.txt looks quite similar to its JDK 19 counterpart.

[0.122s][trace][pagesize] Attempt special mapping: size: 240M, alignment: 2M
[0.122s][debug][pagesize] Reserving large pages in a single large chunk.
[0.434s][info ][pagesize] CodeHeap 'non-nmethods':  min=4M max=6M base=0x000001c237e00000 page_size=2M size=6M
[0.434s][info ][pagesize] CodeHeap 'profiled nmethods':  min=4M max=116M base=0x000001c238400000 page_size=2M size=116M
[0.434s][info ][pagesize] CodeHeap 'non-profiled nmethods':  min=4M max=118M base=0x000001c23f800000 page_size=2M size=118M
[0.435s][trace][pagesize] Attempt special mapping: size: 8G, alignment: 4M
[0.435s][debug][pagesize] Reserving large pages in a single large chunk.
[1.039s][info ][pagesize] Heap:  min=8G max=8G base=0x0000000600000000 page_size=4K size=8G

Interestingly, the os17.txt shows the same failure! The difference is that there is no warning message output to the console.

[0.078s][debug][os] Initial active processor count set to 12
[0.083s][trace][os] VirtualAlloc(0x0000000000000000, 8192, 2000, 4) returned 0x000001c22fa90000.
[0.083s][trace][os] VirtualAlloc(0x000001c22fa90000, 8192, 1000, 4) returned 0x000001c22fa90000.
[0.083s][info ][os] SafePoint Polling address, bad (protected) page:0x000001c22fa90000, good (unprotected) page:0x000001c22fa91000
[0.086s][trace][os] VirtualAlloc(0x000000307c500000, 16384, 1000, 4) returned 0x000000307c500000.
[0.086s][info ][os] attempting shared library load of C:\java\binaries\jdk\x64\jdk-17.0.5+8\bin\java.dll
[0.121s][info ][os] shared library load of C:\java\binaries\jdk\x64\jdk-17.0.5+8\bin\java.dll was successful
[0.433s][trace][os] VirtualAlloc(0x0000000000000000, 251658240, 20003000, 40) returned 0x000001c237e00000.
[0.434s][trace][os] VirtualAlloc(0x0000000000000000, 65536, 2000, 4) returned 0x000001c22fab0000.
[0.434s][trace][os] VirtualAlloc(0x000001c22fab0000, 32768, 1000, 4) returned 0x000001c22fab0000.
[0.434s][trace][os] VirtualAlloc(0x0000000000000000, 983040, 2000, 4) returned 0x000001c22fac0000.
[0.434s][trace][os] VirtualAlloc(0x000001c22fac0000, 32768, 1000, 4) returned 0x000001c22fac0000.
[0.434s][trace][os] VirtualAlloc(0x0000000000000000, 983040, 2000, 4) returned 0x000001c22fbb0000.
[0.434s][trace][os] VirtualAlloc(0x000001c22fbb0000, 32768, 1000, 4) returned 0x000001c22fbb0000.
[1.038s][info ][os] VirtualAlloc(0x0000000600000000, 8589934592, 20003000, 4) failed (1450).
[1.039s][trace][os] VirtualAlloc(0x0000000600000000, 8589934592, 2000, 4) returned 0x0000000600000000.

The git blame view of the JDK 19 warning message shows that it was introduced by 8271195: Use largest available large page size smaller than LargePage… · openjdk/jdk@08cadb4 (github.com). The corresponding JBS issue is [JDK-8271195] Use largest available large page size smaller than LargePageSizeInBytes when available – Java Bug System (openjdk.org) and it confirms that the message was introduced in JDK 19. So the problem indeed exists in both builds but JDK 19 provides a better experience because it warns you that it could not use large pages as you requested.

Outstanding Items


Running OpenJDK Tier1 Tests

I wanted to test some recent changes I was making in the OpenJDK repo. Running make test-tier1 failed because I did not specify the location of jtreg when I ran configure using this command on Windows or bash configure on my MacBook M1. I cleaned up the sample commands in the script to specify the --with-jtreg option as explained at jdk/testing.md at master · openjdk/jdk · GitHub.

Building target 'test-tier1' in configuration 'macosx-aarch64-server-release'
Test selection 'tier1', will run:
* jtreg:test/hotspot/jtreg:tier1
* jtreg:test/jdk:tier1
* jtreg:test/langtools:tier1
* jtreg:test/jaxp:tier1
* jtreg:test/lib-test:tier1
Error: jtreg framework is not found.
Please run configure using --with-jtreg.
RunTests.gmk:1027: *** Cannot continue.  Stop.
make[2]: *** [test-tier1] Error 2

To run these tests on macOS, run bash configure --with-jtreg=/Users/saint/java/binaries/jtreg-7.1.1+1. configure does not like the ~/java/… path format for some reason. I also missed the fact that the Gtest suite is included in the tier1 tests. Therefore, I got errors like:

--------------------------------------------------
TEST: gtest/GTestWrapper.java
TEST JDK: /Users/saint/repos/java/forks/panama-foreign/build/macosx-aarch64-server-release/images/jdk
...
...
...=---==]=============
java.lang.Error: TESTBUG: the library has not been found in /Users/saint/repos/java/forks/panama-foreign/build/macosx-aarch64-server-release/images/test/hotspot/jtreg/native. Did you forget to use --with-gtest to configure?
	at GTestWrapper.main(GTestWrapper.java:62)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312)
	at java.base/java.lang.Thread.run(Thread.java:1623)

JavaTest Message: Test threw exception: java.lang.Error
JavaTest Message: shutting down test

I needed to set up the Google tests as I had done earlier in Running OpenJDK Google Tests. On macOs:

cd ~/repos
git clone -b release-1.8.1 https://github.com/google/googletest

cd ~/repos/java/forks/panama-foreign
bash configure --with-debug-level=slowdebug \
 --with-jtreg=/Users/saint/java/binaries/jtreg-7.1.1+1 \
 --with-gtest=/Users/saint/repos/googletest

make test-tier1

On Windows, I time the commands (out of my own curiosity) since they take much longer to run on my hardware:

cd /c/repos
git clone -b release-1.8.1 https://github.com/google/googletest

cd /cygdrive/c/java/forks/panama-foreign
time bash configure --with-debug-level=slowdebug \
 --with-jtreg=/cygdrive/c/java/binaries/jtreg-7.1.1+1 \
 --with-gtest=/cygdrive/c/repos/googletest

time make test-tier1

gtest Failure on macOS

make test-tier1 fails on macOS due to errors in the googletest sources. Here is a snippet of the configure output showing the C and C++ compiler versions in use:

configure: Using default toolchain clang (clang/LLVM)
checking for clang... /usr/bin/clang
checking resolved symbolic links for CC... no symlink
configure: Using clang C compiler version 13.1.6 [Apple clang version 13.1.6 (clang-1316.0.21.2.5) Target: arm64-apple-darwin21.2.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin]
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C... yes
checking whether /usr/bin/clang accepts -g... yes
checking for /usr/bin/clang option to enable C11 features... none needed
checking for clang++... /usr/bin/clang++
checking resolved symbolic links for CXX... no symlink
configure: Using clang C++ compiler version 13.1.6 [Apple clang version 13.1.6 (clang-1316.0.21.2.5) Target: arm64-apple-darwin21.2.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin]
checking whether the compiler supports GNU C++... yes
checking whether /usr/bin/clang++ accepts -g... yes
checking for /usr/bin/clang++ option to enable C++11 features... none needed
checking how to run the C preprocessor... /usr/bin/clang -E
checking how to run the C++ preprocessor... /usr/bin/clang++ -E
configure: Using clang linker version 764 [@(#)PROGRAM:ld  PROJECT:ld64-764]
checking for ar... /usr/bin/ar

The errors are about implicit copy constructors like in the example below. The build fails because there are too many errors (all related to this warning).

Creating hotspot/variant-server/libjvm/gtest/gtestLauncher from 1 file(s)
In file included from /Users/saint/repos/googletest/googlemock/src/gmock-all.cc:39:
In file included from /Users/saint/repos/googletest/googlemock/include/gmock/gmock.h:59:
/Users/saint/repos/googletest/googlemock/include/gmock/gmock-actions.h:484:3: error: definition of implicit copy constructor for 'PolymorphicAction<testing::internal::ReturnNullAction>' is deprecated because it has a user-declared copy assignment operator [-Werror,-Wdeprecated-copy]
  GTEST_DISALLOW_ASSIGN_(PolymorphicAction);
  ^
/Users/saint/repos/googletest/googletest/include/gtest/internal/gtest-port.h:928:8: note: expanded from macro 'GTEST_DISALLOW_ASSIGN_'
  void operator=(type const &) GTEST_CXX11_EQUALS_DELETE_
       ^
/Users/saint/repos/googletest/googlemock/include/gmock/gmock-actions.h:1125:10: note: in implicit copy constructor for 'testing::PolymorphicAction<testing::internal::ReturnNullAction>' first required here
  return MakePolymorphicAction(internal::ReturnNullAction());
         ^
/Users/saint/repos/googletest/googlemock/include/gmock/gmock-actions.h:484:3: error: definition of implicit copy constructor for 'PolymorphicAction<testing::internal::ReturnVoidAction>' is deprecated because it has a user-declared copy assignment operator [-Werror,-Wdeprecated-copy]
  GTEST_DISALLOW_ASSIGN_(PolymorphicAction);
  ^
/Users/saint/repos/googletest/googletest/include/gtest/internal/gtest-port.h:928:8: note: expanded from macro 'GTEST_DISALLOW_ASSIGN_'
  void operator=(type const &) GTEST_CXX11_EQUALS_DELETE_
       ^
/Users/saint/repos/googletest/googlemock/include/gmock/gmock-actions.h:1130:10: note: in implicit copy constructor for 'testing::PolymorphicAction<testing::internal::ReturnVoidAction>' first required here
  return MakePolymorphicAction(internal::ReturnVoidAction());
         ^
In file included from /Users/saint/repos/googletest/googlemock/src/gmock-all.cc:39:
In file included from /Users/saint/repos/googletest/googlemock/include/gmock/gmock.h:62:
In file included from /Users/saint/repos/googletest/googlemock/include/gmock/gmock-generated-function-mockers.h:44:
In file included from /Users/saint/repos/googletest/googlemock/include/gmock/gmock-spec-builders.h:71:

A search for GTEST_DISALLOW_ASSIGN_ (bing.com) reveals this PR fixing the issue upstream Fix Clang’s `-Wdeprecated-copy` warnings in C++20 by Quuxplusone · Pull Request #2758 · google/googletest · GitHub. Checking out the v1.12.0 branch of the googletest repo leads to a different compiler error!

Creating hotspot/variant-server/libjvm/libgtest/libgtest.a from 1 file(s)
/Users/saint/repos/java/forks/panama-foreign/test/hotspot/gtest/gtestMain.cpp:233:7: error: no member named 'FLAGS_gtest_internal_run_death_test' in namespace 'testing::internal'; did you mean 'testing::FLAGS_gtest_internal_run_death_test'?
  if (::testing::internal::GTEST_FLAG(internal_run_death_test).length() > 0) {
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      testing::FLAGS_gtest_internal_run_death_test

Looks like this will need some additional tweaks to get the macOS tests to run successfully. However, the tests on Windows x64 ran successfully and that was enough for what I was investigating.

Update: gtest Failure on Windows

I tried setting up a build environment on a new Windows machine and got this error about the gtest version from bash configure.

checking for gtest... /cygdrive/c/repos/googletest
configure: error: gtest at /cygdrive/c/repos/googletest does not seem to be version 1.8.1
configure exiting with result code 1

configure detects the googletest version by grepping the googletests CMakeLists.txt for GOOGLETEST_VERSION then using a regex to replace the whole line with the version number only.

grep GOOGLETEST_VERSION /cygdrive/c/repos/googletest/CMakeLists.txt | sed -E -e 's/set\(GOOGLETEST_VERSION (.*)\)/\1/'

The output is the string 1.9.0 as expected. Wondering if this is a line ending issue, I switch CMakeLists.txt to the Unix line endings using Notepad++. The new error below means that was indeed the issue!

checking for gtest... /cygdrive/c/repos/googletest
configure: error: gtest at /cygdrive/c/repos/googletest does not seem to be version 1.8.1 B

Changing the line endings of the googletest configure.ac resolves this issue!


Categories: R, RStudio

Building RStudio on Windows

Clone the rstudio repo.

git clone https://github.com/rstudio/rstudio
cd rstudio\dependencies\windows

The instructions for setting up a build environment have you run the Install-RStudio-Prereqs.ps1 script as an administrator. I do not like the way in which it dumps stuff all over the drive so I decided to fork the repo and adjust the installation locations. I first needed to enable powershell script execution as described by this article about Execution Policies – PowerShell seeing as running scripts is disabled on my system (Get-ExecutionPolicy currently outputs Restricted).

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

To customize by build environment, I set $DeleteDownloads to $false to keep the installers in case I need to do some debugging as suggested in the script. I then introduce 2 variables to simplify customizing the environment (with the help of Everything you wanted to know about variable substitution in strings – PowerShell | Microsoft Learn). The installer path needs to be created to avoid exceptions when writing the R installer. This was a useful resource: Learn Four Ways to Use PowerShell to Create Folders – Scripting Blog (microsoft.com).

$rPath = "C:\software\R"
$installersPath = "C:\software\installers"

if (-Not (Test-Path -Path $installersPath)) {
    New-Item $installersPath -ItemType directory
}

I used a virtual machine so I could easily revert to clean state. On this clean machine, the script successfully installs the prerequisites. The VS Build Tools 2019 are installed without the VS IDE and the VS Installer shows the corresponding stand-alone entry.

It then suggests installing Qt 5.12.8. When I saw this, I was a bit concerned about the commercial license requirements I ran into when Building ParaView. Fortunately, Qt 5.12 is still available in the Qt downloads archive.

I decided to use my physical machine on which I had already installed R so the first step was to uninstall R. Once that completed, I decided to try launching RStudio (2022.12.0+353) one more time since it wouldn’t have an R installation. Lo and behold, it now launched successfully!? Before proceeding, I decide to determine the state of my physical machine by listing all the installed products. I recall using wmic for this a while back. wmic /? now outputs a message that it is deprecated. windows – What can I do about “WMIC is deprecated”? – Stack Overflow says that the Common Information Model (CIM) is the way to go. Nonetheless, I can still get what I need using this command from the web:

wmic /output:initial-state.txt product get name, version

I decide to leave RStudio installed and proceed with installing the prereqs.

Set-ExecutionPolicy Bypass -Scope Process -Force; iex .\Install-RStudio-Prereqs.ps1

Installing Qt

I selected only the Qt * components and the sources under the Qt 5.12 node in the installer and Qt Creator in the Developer and Designer Tools node. This cut down on the size on disk of the Qt installation. Changes should be easy to make should any build errors pop up downstream.

Installing Other Dependencies

Dependencies like boost and nodejs are set up by dependencies/windows/install-dependencies.cmd. Running this in a new command prompt (with tools like R in the path after Install-RStudio-Prereqs.ps1 completed) worked flawlessly as evidenced by the installation log. Therefore, building the C++ code was the next step.

Building C++ Code

Unfortunately, cmake ..\cpp -GNinja failed. Notice from the output though that the custom R location was detected just fine.

...
-- Configured to build DESKTOP
-- LIBR_HOME: C:/software/R/R-3.6.3
-- Found LibR: C:/software/R/R-3.6.3
-- Found R: C:/software/R/R-3.6.3
CMake Error at desktop/CMakeLists.txt:105 (message):
  Did not find supported Qt SDK


-- Configuring incomplete, errors occurred!
See also "C:/repos/forks/rstudio/src/build/CMakeFiles/CMakeOutput.log".
See also "C:/repos/forks/rstudio/src/build/CMakeFiles/CMakeError.log".

The first search result from cmake Did not find supported Qt SDK – Google Search is build issue : Did not find supported Qt SDK – RStudio IDE – RStudio Community, which is a hint about the source of the error (/src/cpp/desktop/CMakeLists.txt) and the solution: defining the location of the qmake binary. It is at this point that I realize I don’t have it on disk.

Launch C:\software\Qt\Qt5.12.8\MaintenanceTool.exe to add components. When “Add or remove components” is selected, an error appears claiming that “At least one valid and enabled repository required for this action to succeed.” Thankfully, some folks already figured out the way around this at qt5 – How to update component in Qt 5.2.1+? – Stack Overflow. That now leads to the message “There is an important update available, please run the updater first.” [QTIFW-2244] Clarify required user actions when there is an essential update – Qt Bug Tracker is the hint that I should select “Update Components” instead of digging around for an updater executable.

The new maintenance tool looks similar to the one I used last summer. Makes sense since the version of Qt in use by rstudio is older. I needed to change the filters to show the MSVC 2017 64-bit component (as well as its peers that I didn’t install earlier). What a convoluted process! Nevertheless, this created the C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64 directory with the bin\qmake.exe file I was looking for earlier.

cmake -GNinja -DQT_QMAKE_EXECUTABLE=C:/software/Qt/Qt5.12.8/5.12.8/msvc2017_64/bin/qmake.exe ..\cpp

I could now successfully run cmake using the hint from build issue : Did not find supported Qt SDK – RStudio IDE – RStudio Community.

...
-- Version: 8.1.1
-- Build type: Debug
-- CXX_STANDARD: 11
-- Required features: cxx_variadic_templates
-- Configured to build DEVELOPMENT
-- Configured to build DESKTOP
-- LIBR_HOME: C:/software/R/R-3.6.3
-- Found R: C:/software/R/R-3.6.3
-- Found Qt: C:/software/Qt/Qt5.12.8/5.12.8/msvc2017_64/bin/qmake.exe
-- Configuring done
-- Generating done
-- Build files have been written to: C:/repos/forks/rstudio/src/build

I could then build the C++ sources by running ninja. This takes just under 5 minutes on my AMD Ryzen 7 5800X 8-Core system. Launching src\build\desktop\rstudio.exe fails due to missing Qt binaries (Qt5PrintSupportd.dll, Qt5WebEngineWidgetsd.dll, Qt5WebEngineCored.dll, Qt5WebChanneld.dll). Interestingly, none of these binaries are loaded in the public RStudio build’s rstudio.exe process.

These commands will deploy the necessary binaries into the rstudio desktop directory.

pushd C:\repos\forks\rstudio\src\build\desktop\
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5PrintSupportd.dll  .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5WebEngineWidgetsd.dll .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5WebEngineCored.dll .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5WebChanneld.dll   .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5Widgetsd.dll      .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5Quickd.dll        .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5Guid.dll          .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5Networkd.dll      .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5Cored.dll         .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5QuickWidgetsd.dll .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5Qmld.dll          .
copy C:\software\Qt\Qt5.12.8\5.12.8\msvc2017_64\bin\Qt5Positioningd.dll  .
popd

This is sufficient to launch RStudio but only the main menu is present in the RStudio window.

RStudio window after building locally.

A quick review of the launch instructions and I see that I didn’t need to copy these binaries: From command prompt, cd to the build location, and run rstudio.bat. This batch file updates the PATH to include the location of these Qt binaries!

@echo off
setlocal

set "RS_CRASH_HANDLER_PATH=C:/repos/forks/rstudio/src/cpp/../../dependencies/windows/crashpad-release/bin/crashpad_handler.exe"
set "QT_PLUGIN_PATH=C:/software/Qt/Qt5.12.8/5.12.8/msvc2017_64/plugins"
set "PATH=C:/software/Qt/Qt5.12.8/5.12.8/msvc2017_64/bin;%PATH%"
desktop\rstudio.exe

Building Java/Gwt

As evident from the post so far, I started by building C++ instead of Java/Gwt. Not sure if that’s why I got a blank window. The relationship between the C++ and Java/Gwt components is still a mystery to me. My first attempt at building Java/Gwt was to use ant desktop.

...
panmirror:
     [echo] yarn location: ../../dependencies/common/node/16.14.0/node_modules/yarn/bin/yarn.cmd
    [mkdir] Created dir: C:\repos\forks\rstudio\src\gwt\www\js\panmirror
     [exec] 'node' is not recognized as an internal or external command,
     [exec] operable program or batch file.

BUILD FAILED
C:\repos\forks\rstudio\src\gwt\build.xml:211: The following error occurred while executing this line:
C:\repos\forks\rstudio\src\gwt\build.xml:233: The following error occurred while executing this line:
C:\repos\forks\rstudio\src\gwt\build.xml:121: exec returned: 1

That failed because nodejs is not in the PATH. dir /s node.exe in the repo root shows it to be in dependencies\common\node\16.14.0. I manually update the path:

set "PATH=C:/repos/forks/rstudio/dependencies/common/node/16.14.0;%PATH%"

The build now proceeds (with lots of warnings) and launches the GWT Development Mode window below. The last message in the console is a notification that The code server is ready at http://127.0.0.1:9876/. Not sure what I expected, but this wasn’t it.

Launching rstudio in a new command prompt (by running rstudio.bat from the build location) now leads to a non-blank window:

Compiling rstudio

A few moments later (not sure exactly how long it took, but it was long enough for me to get a screenshot), RStudio loads!

RStudio Window

I notice that the About window cannot be moved outside the bounds of the parent window. This is some interesting rendering going on.

About RStudio

Notice rstudio.exe’s child process QtWebEngineProcessd.exe and ant.exe and its jdk1.8.0_211 java.exe child processes. A minidump of rstudio.exe shows many of the Qt DLLs loaded into it, unlike the public release rstudio.exe. Fascinating architecture all around.

Outstanding Items


Categories: R

Building R on Windows

Building R from Source

In the last post on Installing R, my inability to successfully launch RStudio got me curious about building the product myself since it is open source. I chased the link from The Comprehensive R Archive Network (rstudio.com) to R: The R Project for Statistical Computing (r-project.org), which has a link to the contibutor development site R Development Guide (r-project.org). Chapter 2 R Patched and Development Versions | R Development Guide (r-project.org) has detailed instructions on how to build the sources. I install TortoiseSVN then download rtools42-5355-5357.exe. Looks like it’s just an MSYS2 installation.

Installing MiKTeX

I also follow the instructions and install MiKTeX and check for updates. The process is detailed at Install MiKTeX on Windows. I had some trouble getting the MiKTeX installer to work when installing for all users so I end up installing for just myself but place it in c:\software because I find the user-specific paths too obnoxious.

I was expecting to have TeXnicCenter installed but that is just an editor and it would have required MiKTeX as well. It is not required for building R but I install it anyway (since I usually typeset a lot of personal PDFs such as course assignments/reports).

Environment Setup

I run this command to ensure all programs can be found which make gcc pdflatex tar. However, I did this before step 6: Add gcc, MiKTeX and tar to the PATH and set one tar option.

/usr/bin/make
which: no gcc in (/usr/local/bin:/usr/bin:/bin:/opt/bin:/c/Windows/System32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/c/progra~1/git/cmd:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl)
which: no pdflatex in (/usr/local/bin:/usr/bin:/bin:/opt/bin:/c/Windows/System32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/c/progra~1/git/cmd:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl)
/usr/bin/tar

This output is interesting because it reminds me of my search for gcc in the MSYS environment when trying to build octave (see Building Octave on Windows)! This is the environment I use on my machine:

export PATH="/x86_64-w64-mingw32.static.posix/bin:$PATH"
export PATH="/c/software/MiKTeX/miktex/bin/x64:$PATH"
export TAR="/usr/bin/tar"
export TAR_OPTIONS="--force-local"

Downloading the latest recommended packages by running "$TOP_SRCDIR/tools/rsync-recommended" outputs the info below.

receiving incremental file list
KernSmooth_2.23-20.tar.gz
MASS_7.3-58.1.tar.gz
Matrix_1.5-3.tar.gz
boot_1.3-28.1.tar.gz
class_7.3-20.1.tar.gz
cluster_2.1.4.tar.gz
codetools_0.2-18.tar.gz
foreign_0.8-84.tar.gz
lattice_0.20-45.tar.gz
mgcv_1.8-41.tar.gz
nlme_3.1-161.tar.gz
nnet_7.3-18.tar.gz
rpart_4.1.19.tar.gz
spatial_7.3-15.1.tar.gz
survival_3.4-0.tar.gz

sent 419 bytes  received 13,176,494 bytes  2,395,802.36 bytes/sec
total size is 13,169,596  speedup is 1.00
Creating links

Building R

Building R and the recommended packages takes 10m 27sec on my AMD Ryzen 7 5800X 8-Core Processor.

make all recommended

The code ends with

...
gcc  -I"C:/Users/saint/repos/R/include" -DNDEBUG     -I"C:/dev/software/rtools42/x86_64-w64-mingw32.static.posix/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mfpmath=sse -msse2 -mst
ackrealign  -c tprs.c -o tprs.o
gcc -shared -s -static-libgcc -o mgcv.dll tmp.def coxph.o davies.o discrete.o gdi.o init.o magic.o mat.o matrix.o mgcv.o misc.o mvn.o ncv.o qp.o soap.o sparse-smooth.o sparse.o tpr
s.o -LC:/Users/saint/repos/R/bin/x64 -lRlapack -LC:/Users/saint/repos/R/bin/x64 -lRblas -lgfortran -lm -lquadmath -fopenmp -LC:/dev/software/rtools42/x86_64-w64-mingw32.static.posi
x/lib/x64 -LC:/dev/software/rtools42/x86_64-w64-mingw32.static.posix/lib -LC:/Users/saint/repos/R/bin/x64 -lR
installing to C:/Users/saint/repos/R/library/mgcv/libs/x64
** R
** data
** inst
** tests
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (mgcv)

I find it strange that the code uses gnu99.

gcc -std=gnu99 -I. -I../include -DHAVE_CONFIG_H -DR_DLL_BUILD -DR_ARCH='"x64"' -O3 -Wall -pedantic -mfpmath=sse -msse2 -mstackrealign    -c eval.c -o eval.o

Running make check takes 3 minutes on my machine and ends with this output:

...
running tests of plotting Latin-1
  expect failure or some differences if not in a Latin-1 or UTF-8 locale
running code in 'reg-plot-latin1.R' ... OK
  comparing 'reg-plot-latin1.pdf' to './reg-plot-latin1.pdf.save' ... OK
running code in 'reg-S4.R' ... OK
  comparing 'reg-S4.Rout' to './reg-S4.Rout.save' ... OK
running tests of Internet functions
running code in 'internet.R' ... OK
  comparing 'internet.Rout' to './internet.Rout.save' ... OK

Launching R

Run R using the command $TOP_SRCDIR/bin/x64/Rgui.exe. Help > About displays this information for my local build.

Information about my local R build

Categories: R, RStudio

Installing R

I was recently checking out a data analytics course that uses R as one of its tools. It recommended using RStudio. Having never used R, I was intrigued so I started at Step 1 on the RStudio Desktop – Posit download: installing R. It links to The Comprehensive R Archive Network (rstudio.com), which has a link to the Windows download page. I installed R 4.2.2 on my desktop. Unfortunately, the web page uses frames, which makes it non-trivial to get the appropriate link to share here. Installing then launching RStudio presented this dialog:

Launching RStudio

Unfortunately, nothing happens after I click OK. There is no trace of the process in Task Manager either. I download the RStudio ZIP and see if perhaps it’s an installer issue.

Defender SmartScreen Warning about Unzipped rstudio.exe

Bypassing the Microsoft Defender warning doesn’t result in the executable successfully starting either. Poking around on the web with a quick search for rstudio not opening hints that other folks have run into this issue, e.g. R studio not launching – RStudio IDE – RStudio Community and Rstudio can’t launch on windows 10 – RStudio IDE – RStudio Community. They have an interesting suggestion though: pressing CTRL when launching RStudio then selecting the Software rendering engine.

Changing the Rendering Engine
Rendering Engine Changed Notification

Unfortunately, this doesn’t address the startup failure. I try one more thing: explicitly selecting the version of R that is used by RStudio.

Choosing a specific version of R

When this doesn’t work, I install R and RStudio on my Surface Pro and RStudio launches successfully! I wonder if I might need to build a local installation to dig deeper into this issue. I will update this post with the results of my investigation when it completes.