My post on How to Build Elmer on Windows has a succinct list of instructions but when I first built the Elmer source code (on Windows), I was unsure how to run the binaries. Running ElmerGUI, for example, failed because qt5.dll could not be found. This post details the process I used to figure out how to create a usable Elmer installation.
I started by going through the generated Makefile. It contained targets like install, install/local, and install/strip. I started with install/local, fearing that perhaps these targets would attempt to install the binaries into program files.
Installing only the local directory...
-- Install configuration: "RelWithDebInfo"
-- Installing: C:/dev/repos/fem/install/bin/libopenblas.dll.a
CMake Error at cmake_install.cmake:45 (file):
file INSTALL cannot find "C:/dev/repos/fem/elmerfem/../bundle_msys2/bin":
No error.
make: *** [Makefile:142: install/local] Error 1
The error at cmake_install.cmake:45 is from ElmerCPack.cmake. I did not even know about the existence of the CPack tool. Git blame points to commit b9914082 as the source of the failing command. The strange thing about that commit is that there is no mention of what is supposed to create the bundle folders (despite the callout that the change “Relies on selected bundled files selected prior to installation”).
A workaround I tried was to use this define when running cmake: CPACK_BUNDLE_EXTRA_WINDOWS_DLLS:BOOL=FALSE. That avoided the build error but did not result in a usable installation. ElmerGUI’s windows_bundle.cmake needs this to be set to TRUE to package the Qt5 binaries into the install folder. Strangely enough, ElmerGUILogger’s windows_bundle.cmake does not have this logic for Qt5. This is likely because ElmerGUI would already have installed the right files into the bin folder but this looks like a bug.
Another work-around I tried was to manually create the folders expected by the build. The install target then succeeded but the necessary binaries were not copied to the install folder.
This is when I dig around and find that only ElmerGUILogger and ElmerClips include windows_bundle.cmake. Hmm… the latter looks promising but doesn’t look up to date either since it requires Qt4. More exploring of the Makefile – the package target looks interesting but fails because NSIS is not installed.
...
[ 97%] Built target ElmerGUI_autogen
[100%] Built target ElmerGUI
Run CPack packaging tool...
CPack Error: Cannot find NSIS compiler makensis: likely it is not installed, or not in your PATH
CPack Error: Could not read NSIS registry value. This is usually caused by NSIS not being installed. Please install NSIS from http://nsis.sourceforge.net
CPack Error: Cannot initialize the generator NSIS
make: *** [Makefile:71: package] Error 1
I wonder if we couldn’t just use NSIS for the MSYS environment:
$ pacman -Ss nsis
...
mingw64/mingw-w64-x86_64-nsis 3.06.1-1
Windows installer development tool (mingw-w64)
mingw64/mingw-w64-x86_64-nsis-nsisunz 1.0-2
NSIS plugin which allows you to extract files from ZIP archives (mingw-w64)
...
$ pacman -S mingw64/mingw-w64-x86_64-nsis
Now we can create an Elmer installer by running make package. Unfortunately, that turns out to be insufficient. My next idea is to compare the binaries from the installed. This turned out to be easier when using ls -R1 to output only the file names and in 1 column only. Some obvious differences are that the build in Program Files has a bin folder containing the Qt and vtk binaries (as well as a stripped gfortran).
The Qt binaries certainly look like the output of windows_bundle.cmake (found this time by a search for “vtk”) but it’s still not clear how this file would be includedin the build. I’m using VSCode to search for “windows_bundle” and only 2 of the 3 references in the codebase were showing up (on my desktop). Looking for “ElmerGUILogger” then revealed yet another reference. Such a waste of time! Not cool VSCode, not cool. It’s included in ElmerGUI/CMakeLists.txt. So I probably only need to define WIN32. But why does the WIN32 code run if I add some statements to that IF block?
Some searching (TODO: put bing searches here) leads to indications that there might be an error in CMake where WIN32 is not defined. Seeing signs that MSYS can be in Cywgin? Trying to get to the bottom of why WIN32 is not respected by CMake, I review ElmerGUI/CMakeLists.txt again. It adds the netgen subdirectory. Interestingly, ElmerGUI/netgen/README points out that install\lib\ElmerGUI\ngcore\libng.a is the unix library and that the win32 extension should be .lib. Biggest sign I’ve seen so far that something is really off. At this point, I remember seeing a .a file in the install folder – and that seemed strange for a lib folder since I expected a DLL. This hypothesis fails though because the working Elmer installation also has the file “C:\Program Files\Elmer 9.0-Release\lib\ElmerGUI\ngcore\libng.a“.
The sure way to verify that the WIN32 include is working is to introduce an error into windows_bundle.cmake, e.g. by change the first IF into the unknown IF2. Even better, notice that the “Qt5 Windows packaging” message is correctly displayed. Perhaps the FIND_FILE command should have a REQUIRED option now that it is supported. Adding that doesn’t fail so reexamine the INSTALL command. Since windows_bundle.cmake uses a relative path for the DESTINATION, it is interpreted relative to the value of the CMAKE_INSTALL_PREFIX variable.
How about zooming into the windows_bundle.cmake file and outputting the list of files installed afterthe install command. The new message says the files were installed. However, they aren’t in the install folder! I need an explanation for this behavior. So let’s open Process Explorer and see which paths are actually used. These requests show up using the Qt5 path filter:
These files exist on disk (despite the Result column in process explorer having the value INVALID DEVICE REQUEST)! For a better understanding of what is happening, I change the bin path used by the install command to dbgcmake since there are multiple bin folders (under install and also in the MinGW installation). So such path shows up in Process Explorer when using a path filter for dbgcmake. This means I shouldn’t be expecting this files to be written by cmake at this point. In fact, running grep -Rin dbgcmake shows that build/ElmerGUI/cmake_install.cmake now contains this snippet (thereby verifying that windows_bundle.cmake is used to generate cmake_install.cmake).
if("x${CMAKE_INSTALL_COMPONENT}x" STREQUAL "xelmerguix" OR NOT CMAKE_INSTALL_COMPONENT)
file(INSTALL DESTINATION "${CMAKE_INSTALL_PREFIX}/dbgcmake" TYPE FILE FILES
"D:/dev/Software/msys64/mingw64/bin/Qt5Core.dll"
"D:/dev/Software/msys64/mingw64/bin/Qt5Gui.dll"
"D:/dev/Software/msys64/mingw64/bin/Qt5OpenGL.dll"
"D:/dev/Software/msys64/mingw64/bin/Qt5Script.dll"
"D:/dev/Software/msys64/mingw64/bin/Qt5Xml.dll"
"D:/dev/Software/msys64/mingw64/bin/Qt5Svg.dll"
"D:/dev/Software/msys64/mingw64/bin/Qt5Widgets.dll"
"D:/dev/Software/msys64/mingw64/bin/Qt5PrintSupport.dll"
)
endif()
So back to the question of why windows_bundle.cmake is not invoked by default – let us dig into the CMake sources for clues. Renaming it outputs an error for the include statement meaning the file is being included! Sigh… I end up poking around some CMake sources to see where WIN32 is set anyway.:
I immediately notice that the install command of interest is skipped because CPACK_BUNDLE_EXTRA_WINDOWS_DLLS is not set to TRUE! Setting it to FALSE earlier then seeing a build failure gave the impression that it was set to TRUE everywhere (especially given that ElmerCPack.cmake sets it to TRUE) but that happens afterwindows_bundle.cmake has been evaluated! Here’s the final required command line:
There are lessons there about making assumptions but the biggest takeaway for me is the need for (and existence) of tracing capabilities in CMake. Enabling tracing made it so easy to figure out exactly what was broken – the variable did not have the value I expected and I simply needed to define it! Running make install now results in new errors haha!
The code execution cannot proceed because libgcc_s_seh-1.dll was not found. Reinstalling the program may fix this problem. This error dialog is then followed by others for qwt-qt5.dll, libstdc++-6.dll, and libdouble-conversion.dll respectively. To see which other binaries are required, manually copy these 4 binaries from d:\dev\Software\msys64\mingw64\bin to install/bin. These are the other missing binaries that show up in error dialogs:
There are so many missing binaries that I wonder if building the other components might be necessary. I first manually copy them from the bin folder to see if ElmerGUI can load but that now fails with the error This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
The working installation from the downloaded public Elmer installer has a qwindows.dll in the bin/platforms directory. This file exists on my system as D:\dev\Software\msys64\mingw64\share\qt5\plugins\platforms\qwindows.dll. At this point, I’m really wondering how on earth they are gathering all these files. Hmm, maybe they build using their docker image? Didn’t think of seeing how their docker image is set up. One final try to set up the qwindows.dll first and see what happens.
Lo and behold! ElmerGUI loads successfully! Whether it actually works… well, I will revisit that after I learn how to use Elmer/ElmerGUI :). Actually, turns out ElmerSolver.exe doesn’t start. Here is the fix for the 3 binaries that cause error dialogs:
At this point, ElmerSolver starts up successfully but outputs an error about missing ELMERSOLVER_STARTINFO. This is an error from ElmerSolver.F90 (the Fortran 90 source code). From that code, it looks like the error is because I didn’t specify an input file name and the default file does not exist. This is the same behavior as ElmerSolver.exe from the official Elmer installation.
ELMER SOLVER (v 9.0) STARTED AT: 2022/06/26 21:55:22
ParCommInit: Initialize #PEs: 1
MAIN:
MAIN: =============================================================
MAIN: ElmerSolver finite element software, Welcome!
MAIN: This program is free software licensed under (L)GPL
MAIN: Copyright 1st April 1995 - , CSC - IT Center for Science Ltd.
MAIN: Webpage http://www.csc.fi/elmer, Email elmeradm@csc.fi
MAIN: Version: 9.0 (Rev: af959fd0, Compiled: 2022-06-26)
MAIN: Running one task without MPI parallelization.
MAIN: Running with just one thread per task.
MAIN: =============================================================
ERROR:: ElmerSolver: Unable to find ELMERSOLVER_STARTINFO, can not execute.
STOP 1
Reviewing Docker Files
I was wondering after all this if there was a Windows docker file that had all these steps already baked in. However, the docker directory has only Ubuntu dockerfiles. Perhaps I could create a Windows docker file if I could figure out these dependencies.
Binaries Required by ElmerSolver
The above deployment instructions starts with the binaries required by ElmerGUI but ElmerSolver is the crucial component (since Elmer can be used with the GUI). Here are the binaries grouped such that the ElmerGUI binaries can be excluded if desired.
Deploying the above binaries manually gives a locally runnable build. Unfortunately, it does not fix the build created by CPack when you run make install – that build will not have any of these binaries/dependencies.
The instructions for building the Elmer source code are really simple! I decided to try them on Windows. The Developer Command Prompt is necessary for cmake (as far as I can tell). Note that C, C++, and Fortran compilers are required for building Elmer.
cd \dev\repos
mkdir fem
git clone git://www.github.com/ElmerCSC/elmerfem
mkdir build
cd build
cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install ../elmerfem
I discovered that a Fortran compiler is required when I got this error on my first build attempt:
-- Building for: Visual Studio 17 2022
-- The Fortran compiler identification is unknown
-- The C compiler identification is MSVC 19.32.31326.0
-- The CXX compiler identification is MSVC 19.32.31326.0
CMake Error at CMakeLists.txt:34 (PROJECT):
No CMAKE_Fortran_COMPILER could be found.
Line 34 of CMakeLists.txt – PROJECT(Elmer Fortran C CXX) – uses the PROJECT cmake command to set the project name to “Elmer” and specify the programming languages required, hence the build failure above.
Unfortunately, that wasn’t sufficient to address the build failure. Interestingly, someone else ran into this exact same issue at windows – The MinGW gfortran compiler is not able to compile a simple test program – Stack Overflow. Sad times though when StackOverflow does not have an answer! Their solution for specifying a custom compiler is much cleaner – simply define the CMake variable when invoking cmake!
The MinGW-w64 downloads looked promising. Since I already had Cygwin installed, I installed the GFortran package. The path to the GFortran compiler can be retrieved using the Cygwin command cygpath -w `which gfortran` and passed to CMake. That still didn’t work.
At least that showed the mingw Fortran compiler package name mingw64-x86_64-gcc-fortran. Interestingly, that package is marked already installed!
Via MSYS2
Since Cygwin didn’t simply work, I decided to try installing MSYS2 (before resorting to uninstalling the Cygwin gcc-fortran package). The Fortran compiler is installed by MSYS2. Once setup completes, CMake also fails when using the MinGW Fortran compiler!
Since none of the compilers work, let’s take a closer look at the error:
$ cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=C:/dev/software/gcc/bin/gfortran.exe ../elmerfem
-- The Fortran compiler identification is unknown
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - failed
-- Check for working Fortran compiler: C:/dev/software/gcc/bin/gfortran.exe
-- Check for working Fortran compiler: C:/dev/software/gcc/bin/gfortran.exe - broken
CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/CMakeTestFortranCompiler.cmake:61 (message):
The Fortran compiler
"C:/dev/software/gcc/bin/gfortran.exe"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: D:/dev/repos/fem/build/CMakeFiles/CMakeTmp
Run Build Command(s):C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_4528a &&
Microsoft Visual Studio 2022 Version 17.3.0 Preview 1.0 [...].
Copyright (C) Microsoft Corp. All rights reserved.
The operation could not be completed. The parameter is incorrect.
Use:
devenv [solutionfile | projectfile | folder | anyfile.ext] [switches]
...
To get a sense of what could be going wrong, I opened the folder containing the temporary project CMake is trying to build. Its contents are deleted before CMake terminates. However, the build was slow enough for me to copy all the files into another temp folder to repro this failure. Running the devenv.com command above fails with the same error.
Interestingly, loading the solution in Visual Studio results in an error because one of the projects cannot be loaded! However, that project file has a .vfproj extension (which seems specific to the Intel Fortran compiler, e.g. as described at Cannot open vfproj file in visual studio 2017 – Intel Communities).
Looks like it’s the CMakeTestFortranCompiler.cmake file that is generating Intel Fortran projects. The first check that file is:
if(CMAKE_Fortran_COMPILER_FORCED)
# The compiler configuration was forced by the user.
# Assume the user has configured all compiler information.
set(CMAKE_Fortran_COMPILER_WORKS TRUE)
return()
endif()
The CMAKE_Fortran_COMPILER_FORCED define can be used to bail out of the custom configuration so define it when invoking cmake:
We now get a new error! Finally making some progress!
cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE ../elmerfem
-- The Fortran compiler identification is unknown
CMake Deprecation Warning at cmake/Modules/FindMKL.cmake:2 (CMAKE_MINIMUM_REQUIRED):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
CMakeLists.txt:308 (FIND_PACKAGE)
-- ------------------------------------------------
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - not found
-- Looking for pthread.h
-- Looking for pthread.h - not found
-- Found Threads: TRUE
CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find BLAS (missing: BLAS_LIBRARIES)
Call Stack (most recent call first):
C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/FindBLAS.cmake:1337 (find_package_handle_standard_args)
CMakeLists.txt:433 (FIND_PACKAGE)
-- Configuring incomplete, errors occurred!
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeOutput.log".
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeError.log".
This finally gets us past the missing package issues and on to more Fortran compiler errors!
-- Found LAPACK: D:/dev/Software/msys64/mingw64/lib
-- Checking whether D:/dev/Software/msys64/mingw64/bin/gfortran.exe supports PROCEDURE POINTER
-- Checking whether D:/dev/Software/msys64/mingw64/bin/gfortran.exe supports PROCEDURE POINTER -- no
CMake Error at CMakeLists.txt:477 (MESSAGE):
Fortran compiler does not seem to support the PROCEDURE statement.
Support for PROCEDURE Statements
CMakeLists.txt:475 is this line INCLUDE(testProcedurePointer). The included script tests the Fortran compiler but does not explain why the test fails. To see the details, append the string : ${OUTPUT} to the end of the string “Checking whether ${CMAKE_Fortran_COMPILER} supports PROCEDURE POINTER — no” (just before the closing quote). The error message now contains additional information – the same error from earlier! Opening the solution in Visual Studio confirms that yet another unsupported .vfproj has been generated.
Change Dir: D:/dev/repos/fem/build/CMakeFiles/CMakeTmp
Run Build Command(s):C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_77a33 &&
Microsoft Visual Studio 2022 Version 17.3.0 Preview 1.0 [...].
Copyright (C) Microsoft Corp. All rights reserved.
The operation could not be completed. The parameter is incorrect.
Use:
devenv [solutionfile | projectfile | folder | anyfile.ext] [switches]
<Updated VS, unfortunately changing the CMake version>. This is the CMakeLists.txt generated for the solution:
That does not work though (in my developer command prompt)
CMake Error: CMake was unable to find a build program corresponding to "MinGW Makefiles". CMAKE_MAKE_PROGRAM is not set. You probably need to select a different build tool.
CMake Error: CMake was unable to find a build program corresponding to "MinGW Makefiles". CMAKE_MAKE_PROGRAM is not set. You probably need to select a different build tool.
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
Looks like I need to try this process in MSYS2.
Custom Generator in MSYS
Running which cmake in MSYS did not find cmake so here’s the version I installed.
$ pacman -Ss cmake
...
mingw64/mingw-w64-x86_64-cmake 3.23.2-1
A cross-platform open-source make system (mingw-w64)
...
$ pacman -S mingw64/mingw-w64-x86_64-cmake
This doesn’t result in being able to run cmake.exe (even though it exists on disk in D:\dev\Software\msys64\mingw64\bin). Time to hit the docs again: msys2 cmake – Search (bing.com) -> Using CMake in MSYS2 – MSYS2. No red flags there… How about a search for the exact error message: msys bash: cmake: command not found – Search (bing.com) -> c++ – CMake is not found when running through make – Stack Overflow. Aha! The answer there about launching MSYS2 using mingw32.exe leads me to inquire about how I’m launching MSYS2. Turns out I’m launching using the last shortcut below (which launches “D:\dev\Software\msys64\msys2_shell.cmd -msys“) instead of MinGW x64.lnk (which launches “D:\dev\Software\msys64\msys2_shell.cmd -mingw64“). Sure enough, which cmake now shows /mingw64/bin/cmake.
Retrying the command line now makes progress! Notice the Fortran compiler is successfully detected (and the GNU C++ compiler is also selected).
$ cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DBLAS_LIBRARIES=D:/dev/Software/msys64/mingw64/lib -DLAPACK_LIBRARIES=D:/dev/Software/msys64/mingw64/lib ../elmerfem
-- The Fortran compiler identification is GNU 12.1.0
-- The C compiler identification is GNU 12.1.0
-- The CXX compiler identification is GNU 12.1.0
...
The build fails but things are very promising now. The error is because Qt is missing:
-- Building ElmerGUI
-- ------------------------------------------------
CMake Deprecation Warning at ElmerGUI/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Warning at ElmerGUI/CMakeLists.txt:19 (find_package):
By not providing "FindQt5.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "Qt5", but
CMake did not find one.
Could not find a package configuration file provided by "Qt5" with any of
the following names:
Qt5Config.cmake
qt5-config.cmake
Add the installation prefix of "Qt5" to CMAKE_PREFIX_PATH or set "Qt5_DIR"
to a directory containing one of the above files. If "Qt5" provides a
separate development package or SDK, be sure it has been installed.
-- ------------------------------------------------
CMake Error at D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindQt4.cmake:1314 (message):
Found unsuitable Qt version "" from NOTFOUND, this code requires Qt 4.x
Call Stack (most recent call first):
ElmerGUI/CMakeLists.txt:42 (FIND_PACKAGE)
Installing Qt5 does not address the build failure. The new error message:
-- Building ElmerGUI
-- ------------------------------------------------
CMake Deprecation Warning at ElmerGUI/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- ------------------------------------------------
-- Qt5 Windows packaging
-- [ElmerGUI] Qt5: 1
-- [ElmerGUI] Qt5 Libraries: Qt5::OpenGL Qt5::Xml Qt5::Script Qt5::Gui Qt5::Core
-- ------------------------------------------------
CMake Warning (dev) at D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to `find_package_handle_standard_args` (OpenGL)
does not match the name of the calling package (Qwt). This can lead to
problems in calling code that expects `find_package` result variables
(e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindOpenGL.cmake:443 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
ElmerGUI/cmake/Modules/FindQwt.cmake:10 (INCLUDE)
ElmerGUI/CMakeLists.txt:61 (FIND_PACKAGE)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Found OpenGL: opengl32
CMake Warning (dev) at D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to `find_package_handle_standard_args` (Qt3) does
not match the name of the calling package (Qwt). This can lead to problems
in calling code that expects `find_package` result variables (e.g.,
`_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindQt3.cmake:213 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindQt.cmake:160 (include)
ElmerGUI/cmake/Modules/FindQwt.cmake:11 (INCLUDE)
ElmerGUI/CMakeLists.txt:61 (FIND_PACKAGE)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Could NOT find Qt3 (missing: QT_QT_LIBRARY QT_INCLUDE_DIR)
CMake was unable to find desired Qt version: 3. Set advanced values QT_QMAKE_EXECUTABLE and QT3_QGLOBAL_H_FILE.
-- [ElmerGUI] Qwt: FALSE
-- [ElmerGUI] QWT_LIBRARY: QWT_LIBRARY-NOTFOUND
-- [ElmerGUI] QWT_INCLUDE_DIR: QWT_INCLUDE_DIR-NOTFOUND
-- ------------------------------------------------
CMake Warning (dev) at D:/dev/Software/msys64/mingw64/lib/cmake/Qt5Core/Qt5CoreMacros.cmake:44 (message):
qt5_use_modules is not part of the official API, and might be removed in Qt
6.
Call Stack (most recent call first):
D:/dev/Software/msys64/mingw64/lib/cmake/Qt5Core/Qt5CoreMacros.cmake:431 (_qt5_warn_deprecated)
ElmerGUI/Application/CMakeLists.txt:216 (QT5_USE_MODULES)
This warning is for project developers. Use -Wno-dev to suppress it.
-- ------------------------------------------------
-- BLAS library: D:/dev/Software/msys64/mingw64/lib
-- LAPACK library: D:/dev/Software/msys64/mingw64/lib
-- ------------------------------------------------
-- Fortran compiler: D:/dev/Software/msys64/mingw64/bin/gfortran.exe
-- Fortran flags: -fallow-argument-mismatch -O2 -g -DNDEBUG
-- ------------------------------------------------
-- C compiler: D:/dev/Software/msys64/mingw64/bin/cc.exe
-- C flags: -O2 -g -DNDEBUG
-- ------------------------------------------------
-- CXX compiler: D:/dev/Software/msys64/mingw64/bin/c++.exe
-- CXX flags: -O2 -g -DNDEBUG
-- ------------------------------------------------
-- ------------------------------------------------
-- Package filename: elmerfem-9.0--20220612_Windows-AMD64
-- Patch version: 9.0-
CMake Error at cpack/ElmerCPack.cmake:99 (INSTALL):
INSTALL FILES given directory "D:/dev/Software/msys64/mingw64/lib" to
install.
Call Stack (most recent call first):
CMakeLists.txt:660 (INCLUDE)
-- Configuring incomplete, errors occurred!
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeOutput.log".
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeError.log".
Does this need Qt3? The ElmerGUI documentation says Qt4 (4.8 or higher). FindQt.cmake:160 (in bold above) appears to indicate that only Qt versions 3 and 4 are supported in MinGW. The mix of warnings and “could not find” makes it hard to know exactly what is wrong. The last error, for example, appears to be about the installation files directory. So is there anything wrong with Qt? I’ll assume not.
The cmake docs on installing files doesn’t point to anything peculiar in this scenario but this is a hint that my LAPACK_LIBRARIES variable is most likely wrong. Let’s drop it altogether:
# Clean up old make files
# rm -fr *
cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DBLAS_LIBRARIES=D:/dev/Software/msys64/mingw64/lib ../elmerfem
The build still fails but right before the error, notice the LAPACK library now has a DLL instead of a directory (below)!
So now it makes sense to drop the BLAS_LIBRARIES definition as well!
# Clean up old make files
# rm -fr *
cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE ../elmerfem
This build step now succeeds as indicated by the selection of libopenblas.dll.a as the BLAS and LAPACK library.
-- ------------------------------------------------
-- BLAS library: D:/dev/Software/msys64/mingw64/lib/libopenblas.dll.a
-- LAPACK library: D:/dev/Software/msys64/mingw64/lib/libopenblas.dll.a
-- ------------------------------------------------
-- Fortran compiler: D:/dev/Software/msys64/mingw64/bin/gfortran.exe
-- Fortran flags: -fallow-argument-mismatch -O2 -g -DNDEBUG
-- ------------------------------------------------
-- C compiler: D:/dev/Software/msys64/mingw64/bin/cc.exe
-- C flags: -O2 -g -DNDEBUG
-- ------------------------------------------------
-- CXX compiler: D:/dev/Software/msys64/mingw64/bin/c++.exe
-- CXX flags: -O2 -g -DNDEBUG
-- ------------------------------------------------
-- ------------------------------------------------
-- Package filename: elmerfem-9.0--20220612_Windows-AMD64
-- Patch version: 9.0-
-- Configuring done
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
QWT_INCLUDE_DIR (ADVANCED)
used as include directory in directory D:/dev/repos/fem/elmerfem/ElmerGUI/Application
...
used as include directory in directory D:/dev/repos/fem/elmerfem/ElmerGUI/Application
QWT_LIBRARY (ADVANCED)
linked by target "ElmerGUI" in directory D:/dev/repos/fem/elmerfem/ElmerGUI/Application
...
Looks like I now need to define QWT_INCLUDE_DIR and QWT_LIBRARY. Hmm, I don’t think I even installed QWT.
$ pacman -S mingw64/mingw-w64-x86_64-qwt-qt5
resolving dependencies...
looking for conflicting packages...
Packages (1) mingw-w64-x86_64-qwt-qt5-6.2.0-5
Total Download Size: 29.17 MiB
Total Installed Size: 175.53 MiB
:: Proceed with installation? [Y/n] y
:: Retrieving packages...
mingw-w64-x86_64-qwt-qt5-6.2.0-5-any 29.2 MiB 1136 KiB/s 00:26 [###...###] 100%
(1/1) checking keys in keyring [###...###] 100%
(1/1) checking package integrity [###...###] 100%
(1/1) loading package files [###...###] 100%
(1/1) checking for file conflicts [###...###] 100%
(1/1) checking available disk space [###...###] 100%
:: Processing package changes...
(1/1) installing mingw-w64-x86_64-qwt-qt5 [#########################################################################################] 100%
Optional dependencies for mingw-w64-x86_64-qwt-qt5
mingw-w64-x86_64-qt5-tools [installed]
Now that QWT is installed, we can set the include directory as follows:
CMake finally succeeds! The output ends with these lines:
-- Generating done
-- Build files have been written to: D:/dev/repos/fem/build
The generated Makefile has targets such as ElmerGUI, elmersolver, AdvectionDiffusion, FluxSolver, etc. The strange thing is that it has a line that sets SHELL = cmd.exe and so a Windows command prompt is launched when you run make.
#==================================================================
# Target rules for targets named ElmerGUI
# Build rule for target.
ElmerGUI: cmake_check_build_system
$(MAKE) $(MAKESILENT) -f CMakeFiles\Makefile2 ElmerGUI
.PHONY : ElmerGUI
# fast build rule for target.
ElmerGUI/fast:
$(MAKE) $(MAKESILENT) -f ElmerGUI\Application\CMakeFiles\ElmerGUI.dir\build.make ElmerGUI/Application/CMakeFiles/ElmerGUI.dir/build
.PHONY : ElmerGUI/fast
Now we see the expected SHELL = /bin/sh and running make actually causes code to start building! What a journey! I will write another post with simplified instructions for how to build Elmer (on Windows).
$ make
[ 0%] Building C object matc/src/CMakeFiles/matc.dir/c3d.c.obj
[ 0%] Building C object matc/src/CMakeFiles/matc.dir/clip.c.obj
[ 0%] Building C object matc/src/CMakeFiles/matc.dir/dri_ps.c.obj
[ 0%] Building C object matc/src/CMakeFiles/matc.dir/eig.c.obj
...
It has been a while since I wrote graphics/rendering code. The bugs are very different from those I typically write/fix since so I thought I might as well share the types of issues I dealt with. Here are some of the pitfalls I encountered:
Not checking all OpenGL API results. Continuing execution when vertex/fragment shader compilation failed, for example, wastes a ton of time debugging downstream failures.
Assuming successful shader compilation means that you can assign values to every shader uniform you declared! If the uniform is not actually used to generate the shader’s output, the compiler (which I learned lives in the graphics driver) can eliminate the uniform, thereby causing attempts to set it to fail.
Using glVertexAttribPointer instead of glVertexAttribIPointer to pass integer IDs needed by a fragment shader. Wasted so much time on this because I was feeling schedule pressure and didn’t carefully read the documentation. Since I set the normalized parameter to GL_FALSE, the IDs were being converted into floats directly without normalization. TODO: study this behavior now to see exactly which floats ended up in the shader.
Passing a count of 0 to glDrawArrays. The count argument specifies the number of indices to be rendered. Took me a while to figure out why nothing was showing up after some refactoring that I did. Turns out the number of vertices in the class I created was 0. An assertion here would have saved a ton of time.
Mismatched vertex attribute formats. Spiky rendered output instead of a shape like a cube/sphere makes this one rather easy to detect. In my case, I was using 2 structs and one had an extra ID field that ended up being vertex data when the other type of struct was passed to the rendering code.
Passing GL_TEXTURE0 to a sampler2D shader instead of 0! This was a typo that I didn’t catch in course slides.
When storing vertex shader data into a buffer texture, the OpenGL APIs were used to create and bind the buffer: glCreateBuffers, glNamedBufferStorage, glCreateTextures, glTextureBuffer, glBindImageTexture. A snippet of the vertex shader code to write into the buffer is shown below. Unfortunately, the shader compilation failed with this error: 0(69) : error C1115: unable to find compatible overloaded function “imageStore(struct image1D4x32_bindless, int, struct DebugData)”.
I needed up having to define a const int numVec4sPerStruct = 6; and call imageStore for each member of the struct, e.g. imageStore(bufferTexture, gl_VertexID * numVec4sPerStruct + 1, debugData.vPosition);. See related discussions here and here.
Invalid Shader Data in Buffers
There were all 0s in the output written into the texture/shader storage buffers. I tried using imageBuffer instead of image1D to avoid getting all zeros when reading the texture image buffer. The code looked correct but I couldn’t explain why zeros were being read back despite the rendered output looking correct. To figure out why this could be happening, I initialized the texture memory to a known value (integer value -1, which turns out to be a NaN when interpreted as a float). This made it easier to explain the random crap that was being displayed (hint from stack overflow) since it made it obvious that many memory locations were not being written to. Here is a snippet of the fragment shader:
Initializing the storage to all 0xFF bytes made it possible to conclude that the wrong data was being written to the host, more specifically that the wrong locations were being written to! Who knew structs and alignment were a thing (TODO: link to alignment discussion in GLSL spec)! The host needed to interpret the shader storage using this struct:
struct FragmentData
{
unsigned int uniqueFragmentId;
unsigned int fourBytesForAlignment1;
TextureCoord2f texCoords;
VertexCoord4f glFragCoord;
unsigned int faceId;
unsigned int fourBytesForAlignment2;
unsigned int fourBytesForAlignment3;
unsigned int fourBytesForAlignment4;
};
Also see discussion in GLSL spec about std430 vs std140 for shader block storage!
C++ Bugs
Some of the bugs I introduced were also plain old C++ bugs (not OpenGL specific), e.g.
Uninitialized variables (the fovy float had a NaN value).
Copy/pasting code and missing a key fact that both Gouraud and Phong shading code paths were calling the Gouraud shader (even though the scene state output in the console showed the state had been correctly updated to Phong). That’s what you get for copy pasting and not having tests…
Wrong array indexing logic. In the bug below (commit 3cfe07aea6914a91), I was multiplying the indices by elementSize but that is wrong because lines 2-5 from the bottom already have a built in multiplication by the element size. Noticed this from the disassembly.
int VolumeDataset3D::GetIndexFromSliceAndCol(uint32_t slice, uint32_t column)
{
const int sizeOf1Row = width * sizeof(uint8_t);
int index = slice * sizeOf1Row + column;
return index;
}
const int elementSize = sizeof(VertexDataPositionedByte);
for (uint32_t slice = 0; slice < depth - 1; slice++)
{
for (uint32_t col = 0; col < width - 1; col++)
{
int index = elementSize * GetIndexFromSliceAndCol(slice, col);
int rightIndex = elementSize * GetIndexFromSliceAndCol(slice, col + 1);
int diagIndex = elementSize * GetIndexFromSliceAndCol(slice + 1, col + 1);
int bottomIndex = elementSize * GetIndexFromSliceAndCol(slice + 1, col);
VertexDataPositionedByte cellData[4] =
{
((VertexDataPositionedByte*)vertexData2D.data)[bottomIndex],
((VertexDataPositionedByte*)vertexData2D.data)[diagIndex],
((VertexDataPositionedByte*)vertexData2D.data)[rightIndex],
((VertexDataPositionedByte*)vertexData2D.data)[index],
};
I ran into (currently unexplained) crashes in both the Intel and nVidia OpenGL drivers (I used Windows only). There were also crashes (on a specific commit) from a nullref/access violation on my HP desktop but not on my Surface Pro. Found out later that the desktop was actually right to crash but the difference in behavior was certainly troubling.
Longer than expected pauses were observed during GC in JDK 7 as explained on the Buffered Logging hotspot-dev mailing list:
Some folks noticed much longer than expected
pauses that seemed to coincide with GC logging in the midst of a GC
safepoint. In that setup, the GC logs were going to a disk file (these were
often useful for post-mortem analyses) rather than to a RAM-based tmpfs
which had been the original design center assumption. The vicissitudes of
the dirty page flushing policy in Linux when
IO load on the machine (not necessarily the JVM process doing the logging)
could affect the length and duration of these inline logging stalls.
A buffered logging scheme was then implemented by us (and independently by
others) which we have used successfully to date to avoid these pauses in
high i/o
multi-tenant environments.
Note that Unified JVM Logging was introduced in JDK 9 whereas asynchronous logging was introduced in JDK17 in PR 3135. As per the Java docs, “logging messages are output synchronously” by default whereas in “asynchronous logging mode, log sites enqueue all logging messages to an intermediate buffer and a standalone thread is responsible for flushing them to the corresponding outputs.” The AWS Developer Tools Blog has an excellent writeup about how and why they implemented this feature as well as an overview of unified logging (e.g. run java -Xlog:'gc*=info:stdout' to see logging output from log_info_p, which in my case includes output from the G1InitLogger).
Starting the Backport
This is a relatively straightforward backport. Clone the jdk11u-dev repo (or your fork as appropriate). The repo was at commit 86d39a69 when I started the backport.
git clone https://github.com/openjdk/jdk11u-dev
cd jdk11u-dev/
To see the exact same outcomes, switch to that commit (if desired).
git checkout 86d39a69
To backport this feature to JDK11, cherry-pick the commit from PR 3135 onto a new branch. We need to add the upstream as a remote to enable cherry-picking PR commits.
I used Visual Studio for conflict resolution with this strategy:
Take Incoming (Source)
Inspect the diff using Compare with Unmodified… to ensure that the changes being pulled are sensible.
The rest of this section can be skipped. I am including the details of the validation of the conflict resolution strategy (i.e. ensuring nothing undesirable is getting pulled in). The advantage of the strategy outlined above is that changes that are required by the code we want to backport are most likely going to be present after conflict resolution.
None of these changes would be present if only the changes from the PR 3135 commit were used. These lists are generated from the blame view are therefore likely omit any delete-only diffs.
Conflict Resolution: logConfiguration.cpp
This is the list of unrelated changes (i.e. changes not in commit from PR 3135) after taking the incoming changes to logConfiguration.cpp includes (potentially partial) changes from:
Comparing the current and incoming globals.hpp reveals a significant rewriting of this file between the jdk and jdk11u-dev repos. To resolve the conflict, copy only the change from the PR 3135 commit to the target (local) globals.hpp by selecting the checkmark next to the conflict in the Visual Studio merge editor then manually fix up the last line.
Conflict Resolution: init.hpp
jdk and jdk11u-dev also have non-trivial changes to init.hpp so the Merge… command is necessary here.
Pick all the #includes from the source (conflict 1)
Pick all the changes from the target (conflict 2)
Add the new line to the merged file: AsyncLogWriter::initialize();
Conflict Resolution: thread.cpp
The Merge… command is again necessary here due to the significant number of changes between the source and target versions. Take the single line from the source and accept the merge:
cl.do_thread(AsyncLogWriter::instance());
Conflict Resolution: hashtable.hpp
Use the Merge… command once more to resolve the changes between the source and target versions. Take the single line from the source and accept the merge:
template class BasicHashtable;
Addressing Build Errors
Now that all conflicts have been resolved, build the code before committing anything. Here are additional issues that need to be resolved.
Missing ‘runtime/nonJavaThread.hpp’
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(31): fatal error C1083: Cannot open include file: 'runtime/nonJavaThread.hpp': No such file or directory
nonJavaThread.hpp is a file now in the upstream JDK repo. Blame shows that PR 2390 moved it out of thread.hpp. Fix:
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(111): error C2143: syntax error: missing ';' before '<'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(111): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(144): error C3646: '_stats': unknown override specifier
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(144): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(155): error C3668: 'AsyncLogWriter::pre_run': method with override specifier 'override' did not override any base class methods
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(156): error C2039: 'pre_run': is not a member of 'NonJavaThread'
Fix: Remove the pre_run method from logAsyncWritter.hpp.
Stream Errors
./src/hotspot/share/logging/logAsyncWriter.cpp(108): error C2660: 'stringStream::as_string': function does not take 1 arguments
D:\dev\repos\java\jdk11u-dev\src\hotspot\share\utilities/ostream.hpp(220): note: see declaration of 'stringStream::as_string'
./src/hotspot/share/logging/logAsyncWriter.cpp(108): error C2661: 'AsyncLogMessage::AsyncLogMessage': no overloaded function takes 2 arguments
Fix:git cherry-pick b08595d8443bbfb141685dc5eda7c58a34738048 and resolve the conflict (year on copyright line) using Take Incoming (Source).
Unknown class AutoModifyRestore
./test/hotspot/gtest/logging/test_asynclog.cpp(205): error C2065: 'AutoModifyRestore': undeclared identifier
./test/hotspot/gtest/logging/test_asynclog.cpp(205): error C2275: 'size_t': illegal use of this type as an expression
./build/windows-x86_64-normal-server-release/hotspot/variant-server/libjvm/gtest/objs/BUILD_GTEST_LIBJVM_pch.cpp: note: see declaration of 'size_t'
./test/hotspot/gtest/logging/test_asynclog.cpp(205): error C3861: 'saver': identifier not found
cd src/hotspot/share/utilities/
curl -Lo autoRestore.hpp https://raw.githubusercontent.com/openjdk/jdk/195c45a0e11207e15c277e7671b2a82b8077c5fb/src/hotspot/share/utilities/autoRestore.hpp
# Now include autoRestore.hpp in test_asynclog.cpp
Atomic Errors
./src/hotspot/share/logging/logAsyncWriter.cpp(172): error C2039: 'release_store_fence': is not a member of 'Atomic'
D:\dev\repos\java\jdk11u-dev\src\hotspot\share\runtime/atomic.hpp(51): note: see declaration of 'Atomic'
./src/hotspot/share/logging/logAsyncWriter.cpp(172): error C3861: 'release_store_fence': identifier not found
./src/hotspot/share/logging/logConfiguration.cpp(114): error C3861: 'disable_outputs': identifier not found
./src/hotspot/share/logging/logConfiguration.cpp(278): error C2039: 'disable_outputs': is not a member of 'LogConfiguration'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logConfiguration.hpp(39): note: see declaration of 'LogConfiguration'
./src/hotspot/share/logging/logConfiguration.cpp(279): error C2065: '_n_outputs': undeclared identifier
./src/hotspot/share/logging/logConfiguration.cpp(293): error C2065: '_outputs': undeclared identifier
./src/hotspot/share/logging/logConfiguration.cpp(296): error C3861: 'delete_output': identifier not found
./src/hotspot/share/logging/logConfiguration.cpp(298): error C2248: 'LogOutput::set_config_string': cannot access protected member declared in class 'LogOutput'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logOutput.hpp(63): note: see declaration of 'LogOutput::set_config_string'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logConfiguration.hpp(31): note: see declaration of 'LogOutput'
Line 114 is simple the method call disable_outputs(); Since that method body is present in the file, it must be missing in the header file. The correct logConfiguration.hpp shows that 8255756: Disabling logging does unnecessary work is necessary. (This error might have been visible earlier in the process!)
Once the build succeeds on Windows, validate the changes by building on macOS.
Undeclared identifier ‘primitive_hash’
/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/utilities/hashtable.hpp:326:36: error: use of undeclared identifier 'primitive_hash'
unsigned (*HASH) (K const&) = primitive_hash<K>,
^
/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/utilities/hashtable.hpp:327:46: error: use of undeclared identifier 'primitive_equals'
bool (*EQUALS)(K const&, K const&) = primitive_equals<K>
Fix:
diff --git a/src/hotspot/share/utilities/hashtable.hpp b/src/hotspot/share/utilities/hashtable.hpp
index 30483b2f36..5e4c414490 100644
--- a/src/hotspot/share/utilities/hashtable.hpp
+++ b/src/hotspot/share/utilities/hashtable.hpp
@@ -30,6 +30,7 @@
#include "oops/oop.hpp"
#include "oops/symbol.hpp"
#include "runtime/handles.hpp"
+#include "utilities/resourceHash.hpp"
// This is a generic hashtable, designed to be used for the symbol
// and string tables.
Default Member Initializer is a C++11 Extension
/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/logging/logAsyncWriter.hpp:149:33: error: default member initializer for non-static data member is a C++11 extension [-Werror,-Wc++11-extensions]
const size_t _buffer_max_size = {AsyncLogBufferSize / (sizeof(AsyncLogMessage) + vwrite_buffer_size)};
^
Fix:
diff --git a/src/hotspot/share/logging/logAsyncWriter.cpp b/src/hotspot/share/logging/logAsyncWriter.cpp
index 0231be78a9..d9f9ddda5b 100644
--- a/src/hotspot/share/logging/logAsyncWriter.cpp
+++ b/src/hotspot/share/logging/logAsyncWriter.cpp
@@ -82,7 +82,8 @@ void AsyncLogWriter::enqueue(LogFileOutput& output, LogMessageBuffer::Iterator m
AsyncLogWriter::AsyncLogWriter()
: _initialized(false),
- _stats(17 /*table_size*/) {
+ _stats(17 /*table_size*/),
+ _buffer_max_size(AsyncLogBufferSize / (sizeof(AsyncLogMessage) + vwrite_buffer_size)) {
if (os::create_thread(this, os::asynclog_thread)) {
_initialized = true;
} else {
diff --git a/src/hotspot/share/logging/logAsyncWriter.hpp b/src/hotspot/share/logging/logAsyncWriter.hpp
index 313dd6de06..c4e28e5676 100644
--- a/src/hotspot/share/logging/logAsyncWriter.hpp
+++ b/src/hotspot/share/logging/logAsyncWriter.hpp
@@ -146,7 +146,7 @@ class AsyncLogWriter : public NonJavaThread {
// The memory use of each AsyncLogMessage (payload) consists of itself and a variable-length c-str message.
// A regular logging message is smaller than vwrite_buffer_size, which is defined in logtagset.cpp
- const size_t _buffer_max_size = {AsyncLogBufferSize / (sizeof(AsyncLogMessage) + vwrite_buffer_size)};
+ const size_t _buffer_max_size;
AsyncLogWriter();
void enqueue_locked(const AsyncLogMessage& msg);
‘override’ keyword is a C++11 extension
/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/logging/logAsyncWriter.hpp:154:14: error: 'override' keyword is a C++11 extension [-Werror,-Wc++11-extensions]
void run() override;
^
...
Notice that the order of the parameters passed to Atomic::cmpxchg was also changed so we need to ensure that the arguments are swapped (since they were written when the new Atomic::cmpxchg was already in place). Move the first argument into the last spot.
8253757: Add LLVM-based backend for hsdis by magicus · Pull Request #7531 makes it possible to easily use LLVM as the hsdis backend. An LLVM installation is required for this. The official LLVM builds for the Windows platform do not work for building hsdis because they do not have all the prerequisite LLVM include files. See Building LLVM for Windows ARM64 – Saint’s Log (swesonga.org) for instructions on how to build LLVM for ARM64 Windows (on an x64 Windows host). To configure OpenJDK for LLVM as an hsdis backend on Windows ARM64, use this command:
The JDK and hsdis can then be built as usual with these commands:
make images
make build-hsdis
make install-hsdis
cp /cygdrive/d/dev/software/llvm-aarch64/bin/LLVM-C.dll build/windows-aarch64-server-slowdebug/jdk/bin/
The generated JDK can then be deployed to an ARM64 machine like the Surface Pro X. To test LLVM’s disassembly, use the -XX:CompileCommand flag on the ARM64 machine:
The path given to --with-llvm needs to be a Cygwin path if building in Cygwin. Otherwise, the build-hsdis target will fail with this error: c:\...\jdk\src\utils\hsdis\llvm\hsdis-llvm.cpp(58): fatal error C1083: Cannot open include file: 'llvm-c/Disassembler.h': No such file or directory. I caught this by inspecting build\windows-aarch64-server-release\make-support\failure-logs\support_hsdis_hsdis-llvm.obj.cmdline after the build failed. This was the only include that didn’t have Cygwin paths: -IC:/dev/repos/llvm-project/build_llvm_AArch64/install_local/include
Investigating Missing Disassembly
My first disassembly attempt did not work – only abstract disassembly was displayed:
I verified that hsdis-aarch64.dll was present in the JDK’s bin folder. That was the only issue I had seen before that caused this behavior so I dug around to find the code that loads the hsdis DLL. A search for the “hsdis-” DLL prefix in the sources reveals the hsdis_library_name string used in the Disassembler::dll_load method. Notice that there is a Verbose flag that can display what is happening when loading the hsdis DLL!
void* Disassembler::dll_load(char* buf, int buflen, int offset, char* ebuf, int ebuflen, outputStream* st) {
int sz = buflen - offset;
int written = jio_snprintf(&buf[offset], sz, "%s%s", hsdis_library_name, os::dll_file_extension());
if (written < sz) { // written successfully, not truncated.
if (Verbose) st->print_cr("Trying to load: %s", buf);
return os::dll_load(buf, ebuf, ebuflen);
} else if (Verbose) {
st->print_cr("Try to load hsdis library failed: the length of path is beyond the OS limit");
}
return NULL;
}
Error: VM option 'Verbose' is develop and is available only in debug version of VM.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
hsdis-aarch64.dll is not being loaded because LLVM-C.dll cannot be found! Still learning the need for reading the full instructions to avoid unnecessary pain.
I was trying to test using LLVM as a backend for hsdis on the Windows ARM64 platform as implemented in PR 5920. I downloaded LLVM 13 and tried to use it in the build. Unfortunately, it didn’t have all the prerequisite include files and so building your own LLVM installation was the approach suggested for Windows. This post explicitly outlines the instructions needed to build LLVM for the Windows ARM64 platform on a Windows x64 host machine.
The first requirement is an LLVM build with native llvm-nm.exe and llvm-tblgen.exe binaries. These can be downloaded (I think) or generated by building LLVM for the native x64 platform as specified in the instructions from Jorn.
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build_llvm
cd build_llvm
cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=X86" -D"CMAKE_BUILD_TYPE:STRING=Release" -D"CMAKE_INSTALL_PREFIX=install_local" -A x64 -T host=x64
cmake --build . --config Release --target install
Once that build successfully completes, we can then build LLVM for the Windows ARM64 platform with the commands below. Notice that we specify paths to the native llvm-nm and llvm-tblgen binaries to prevent the build from trying to use their ARM64 equivalents (which won’t run on the host).
cd llvm-project
mkdir build_llvm_AArch64
cd build_llvm_Aarch64
cmake ../llvm -DLLVM_TARGETS_TO_BUILD:STRING=AArch64 \
-DCMAKE_BUILD_TYPE:STRING=Release \
-DCMAKE_INSTALL_PREFIX=install_local \
-DCMAKE_CROSSCOMPILING=True \
-DLLVM_TARGET_ARCH=AArch64 \
-DLLVM_NM=C:/repos/llvm-project/build_llvm/install_local/bin/llvm-nm.exe \
-DLLVM_TABLEGEN=C:/repos/llvm-project/build_llvm/install_local/bin/llvm-tblgen.exe \
-DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-win32-msvc \
-A ARM64 \
-T host=x64
date; time \
cmake --build . --config Release --target install ; \
date
Once the build completes, the LLVM ARM64 files will be in the build_llvm_AArch64/install_local folder in the llvm-project repo. That build should have all the necessary header files and static libraries required for LLVM projects targeting Windows on ARM64. See the general cmake options and the LLVM-specific cmake options for details on the various flags and variables.
cd llvm-project
mkdir build_llvm_AArch64
cd build_llvm_AArch64
cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=AArch64" \
-D"CMAKE_BUILD_TYPE:STRING=Release" \
-D"CMAKE_INSTALL_PREFIX=install_local_AArch64" \
-D"CMAKE_CROSSCOMPILING=True" \
-D"LLVM_TARGET_ARCH=AArch64" \
-A x64 \
-T host=x64
cmake --build . --config Release --target install
This still results in errors about conflicting machine types:
c:\...\llvm-project\build_llvm_aarch64\install_local_aarch64\\lib\llvmaarch64disassembler.lib : warning LNK4272: library machine type 'x64' conflicts with target machine type 'ARM64'
That’s when I tried adding the LLVM_TABLE_GEN from a Windows x64 LLVM build I had generated earlier. I accidentally omitted the options prefixed with a # below because I didn’t include the trailing slash after adding the llvm-tblgen.exe option.
The build still succeeded and generated AArch64 .lib files in the LLVM installation! Interestingly, they still had the x64 machine type in the header.
$ dumpbin /headers build_llvm_AArch64_2\install_local_AArch64_2\lib\LLVMAArch64AsmParser.lib
Microsoft (R) COFF/PE Dumper Version 14.29.30133.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file build_llvm_AArch64_2\install_local_AArch64_2\lib\LLVMAArch64AsmParser.lib
File Type: LIBRARY
FILE HEADER VALUES
8664 machine (x64)
...
I had no choice but to reexamine my understanding of what the -A flag does. It is used to specify the platform name but it’s only after digging into the CMAKE_GENERATOR_PLATFORM docs that I noticed that this was the target platform! This also made me realize that I hadn’t noticed that the x64 C++ compiler was being used all along!
-- The C compiler identification is MSVC 19.29.30133.0
-- The CXX compiler identification is MSVC 19.29.30133.0
-- The ASM compiler identification is MSVC
-- Found assembler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - works
Setting -A to AArch64 causes MSBuild to fail with an error about an unknown platform. So -A just might be the argument I need to get ARM64 libraries built.
"C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\3.17.3\VCTargetsPath.vcxproj" (default target) (1) ->
(_CheckForInvalidConfigurationAndPlatform target) ->
C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(820,5): error : The BaseOutputPath/OutputPath property is not set for project 'VCTargetsPath.vcxproj'. Please check to make sure that you have specified a valid combination of Configuration and Platform for this project. Configuration='Debug' Platform='AArch64'. You may be seeing this message because you are trying to build a project without a solution file, and have specified a non-default Configuration or Platform that doesn't exist for this project. [C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\3.17.3\VCTargetsPath.vcxproj]
So I tried using -A ARM64 instead. I noticed that we now have the ARM64 C++ compiler selected! This is something I should have been paying attention to from the beginning, crucial for cross-compilation.
-- The C compiler identification is MSVC 19.29.30133.0
-- The CXX compiler identification is MSVC 19.29.30133.0
-- The ASM compiler identification is MSVC
-- Found assembler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe - works
Unfortunately, the build still failed with an error from gen-msvc-exports.py. Taking a look at gen-msvc-exports.py, it looks like it is trying to run llvm-nm.exe (for the target platform).
Generating export list for LLVM-C
Traceback (most recent call last):
File "C:/dev/repos/llvm-project/llvm/tools/llvm-shlib/gen-msvc-exports.py", line 116, in <module>
main()
File "C:/dev/repos/llvm-project/llvm/tools/llvm-shlib/gen-msvc-exports.py", line 112, in main
gen_llvm_c_export(ns.output, ns.underscore, libs, ns.nm)
File "C:/dev/repos/llvm-project/llvm/tools/llvm-shlib/gen-msvc-exports.py", line 72, in gen_llvm_c_export
check_call([nm, '-g', lib], stdout=dumpout_f)
File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 359, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 340, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 854, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 1307, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
OSError: [WinError 216] This version of %1 is not compatible with the version of Windows you're running. Check your computer's system information and then contact the software publisher
C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(241,5): error MSB8066: Custom build for 'C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFi
les\02a88fa656bb9cf8b9ffd0e0debe57ae\libllvm-c.exports.rule;C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\8ebc0efbf04134b25d0f37561fba0d55\LLVM-C.def.rule;C:\dev\repos\llvm-project\build_llvm_AArch64_
3\CMakeFiles\509fcb3f8bb132e9c560e15e8d25cb45\LLVM-C_exports.rule;C:\dev\repos\llvm-project\llvm\tools\llvm-shlib\CMakeLists.txt' exited with code 1. [C:\dev\repos\llvm-project\build_llvm_AArch64_3\tools\llvm-shl
ib\LLVM-C_exports.vcxproj]
date; time cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=AArch64" \
-D"CMAKE_BUILD_TYPE:STRING=Release" \
-D"CMAKE_INSTALL_PREFIX=install_local" \
-D"CMAKE_CROSSCOMPILING=True" \
-D"LLVM_TARGET_ARCH=AArch64" \
-D"LLVM_NM=C:\dev\repos\llvm-project\build_llvm\install_local\bin\llvm-nm.exe" \
-D"LLVM_TABLEGEN=C:\dev\repos\llvm-project\build_llvm\install_local\bin\llvm-tblgen.exe" \
-D"LLVM_DEFAULT_TARGET_TRIPLE=aarch64-win32-msvc" \
-A ARM64 \
-T host=x64
date; time \
cmake --build . --config Release --target install; \
date
These build commands work! Dumpbin shows that the generated .lib files have ARM64 headers!
$ dumpbin /headers C:\dev\repos\llvm-project\build_llvm_AArch64_3\Release\lib\LLVMAArch64Disassembler.lib
Microsoft (R) COFF/PE Dumper Version 14.29.30133.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file C:\dev\repos\llvm-project\build_llvm_AArch64_3\Release\lib\LLVMAArch64Disassembler.lib
File Type: LIBRARY
FILE HEADER VALUES
AA64 machine (ARM64)
Unfortunately, the JDK project that got me started down this path still doesn’t build. Cygwin shows defines like -DLLVM_DEFAULT_TRIPLET='”aarch64-pc-windows-msvc”‘ being passed to the compiler, which then complains:
C:/.../src/utils/hsdis/llvm/hsdis-llvm.cpp(217): error C2015: too many characters in constant
The quotes in the commands therefore needed to be dropped. This caused build failures since the paths used back-slashes!
Building Opts.inc...
'C:devreposllvm-projectbuild_llvminstall_localbinllvm-tblgen.exe' is not recognized as an internal or external command,
operable program or batch file
This is now the part where I find a nice document on the LLVM site with the 3-liner for this task 😀
The post about Exploring the hsdis LLVM Support PR mentioned link errors when building hsdis using an LLVM backend on Windows (x86-64 host building JDK for the x86-64 platform). Before we look at why linking fails, we can get a simple repro for the error from the Cygwin logs. To get the command line used to invoke the linker, run make LOG=debug build-hsdis. Search the output for link.exe to find the failing command or open build\windows-x86_64-server-release\support\hsdis\BUILD_HSDIS_link.cmdline. Change the path from Cygwin to Windows style so that the command can be run in the x64 Native Tools Command Prompt.
So there really is no such symbol in this lib folder! I’m guessing I need to add another lib folder to the path. A quick search for LLVMInitializeX86Disassembler leads to this post on Using the LLVM MC Disassembly API. It mentions using llvm-config to set the linker flags. Shouldn’t running the bash configure command take care of this? Let’s see what’s in the configure output:
...
checking what hsdis backend to use... 'llvm'
checking for LLVM_CONFIG... C:/dev/repos/llvm-project/build_llvm/install_local/bin [user supplied]
/cygdrive/c/dev/repos/java/forks/jdk/build/.configure-support/generated-configure.sh: line 135451: C:/dev/repos/llvm-project/build_llvm/install_local/bin: Is a directory
/cygdrive/c/dev/repos/java/forks/jdk/build/.configure-support/generated-configure.sh: line 135452: C:/dev/repos/llvm-project/build_llvm/install_local/bin: Is a directory
/cygdrive/c/dev/repos/java/forks/jdk/build/.configure-support/generated-configure.sh: line 135453: C:/dev/repos/llvm-project/build_llvm/install_local/bin: Is a directory
...
Well, that could be the problem! I think I need to fix the llvm-config path in Cygwin by appending /llvm-config to LLVM_CONFIG.
Sure enough, that was the problem! The bash configure output (below) now looks good and make build-hsdis now works. The fix for this would be to ensure bash configure fails if LLVM_CONFIG is set to the directory instead of the executable!
checking what hsdis backend to use... 'llvm'
checking for LLVM_CONFIG... C:/dev/repos/llvm-project/build_llvm/install_local/bin/llvm-config [user supplied]
checking for number of cores... 8
...
$ make build-hsdis
Building target 'build-hsdis' in configuration 'windows-x86_64-server-release'
Creating support/hsdis/hsdis.dll from 1 file(s)
Finished building target 'build-hsdis' in configuration 'windows-x86_64-server-release'
Notice from the new build command line in build\windows-x86_64-server-release\support\hsdis\BUILD_HSDIS_link.cmdline that there are now many .lib files supplied to the linker! These are the lib files that I was inspecting with dumpbin so my earlier hypothesis was wrong (there were no additional .lib files required, the ones I was looking at were simply not being passed to the linker).
Now running make install-hsdis copies hsdis-amd64.dll into /build/windows-x86_64-server-release/jdk/bin. The LLVM hsdis backend can now be used to disassemble instructions:
Here are some of the bugs/questions I looked at when investigating these failures. Stack overflow taught me about dumpbin and C++ decorated names/ the undname tool.
A previous post explored how to use LLVM as the backend disassembler for hsdis. The instructions for how to use GNU binutils (the currently supported option) are straightforward. Listing them here for completeness (assuming you have cloned the OpenJDK repo into your ~/repos/java/jdk folder). Note that they depend on more recent changes. See the docs on the Java command for more info about the -XX:CompileCommand option.
# Download and extract GNU binutils 2.37
cd ~
curl -Lo binutils-2.37.tar.gz https://ftp.gnu.org/gnu/binutils/binutils-2.37.tar.gz
tar xvf binutils-2.37.tar.gz
# Configure the OpenJDK repo for hsdis
cd ~/repos/java/jdk
bash configure --with-hsdis=binutils --with-binutils-src=~/binutils-2.37
# Build hsdis
make build-hsdis
To deploy the built hsdis library on macOS:
cd build/macosx-aarch64-server-release
# Copy the hsdis library into the JDK bin folder
cp support/hsdis/libhsdis.dylib jdk/bin/hsdis-aarch64.dylib
To deploy the built hsdis library on Ubuntu Linux (open question: is this step even necessary?):
cd build/linux-x86_64-server-release
# Copy the hsdis library into the JDK bin folder
cp support/hsdis/libhsdis.so jdk/bin/
Update 2024-03-13: use the make install-hsdis command to copy the hsdis binaries into the new OpenJDK build. This will ensure that the hsdis binary is copied to lib/hsdis-adm64.so (this file name should be used in place of any others that listed by find . -name *hsdis*).
Now we can disassemble some code, e.g. the String.checkIndex method mentioned in PR 5920.
# Disassemble some code
jdk/bin/java -XX:CompileCommand="print java.lang.String::checkIndex" -version
To see how to disassemble the code for a class, we can use the basic substitution cipher class from the post on Building HSDIS in Cygwin as an example. Download, compile and disassemble it using the commands below. Note that these commands save the .java file to a temp folder to make cleanup much easier. Also note the redirection to a file since the output can be voluminous.
cd jdk/bin
mkdir -p temp
cd temp
curl -Lo BasicSubstitutionCipher.java https://raw.githubusercontent.com/swesonga/scratchpad/main/apps/crypto/substitution-cipher/BasicSubstitutionCipher.java
../javac BasicSubstitutionCipher.java
../java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:+LogCompilation BasicSubstitutionCipher > disassembled.txt
open disassembled.txt
The previous post described how LLVM can be configured as the disassembly backend for hsdis. Here, I explain the process it took for me to figure out the details of the change adding support for LLVM. One of the first things to do when learning these details of this change is to build it. Since I’m using my own fork of the OpenJDK repo, I need to add the upstream repo to my remotes. This makes it possible to fetch commits from PRs submitted to the upstream repo.
cd ~/repos/forks/jdk
git remote add upstream https://github.com/openjdk/jdk
git fetch upstream
The LLVM-backend PR has only 1 commit (as of this writing). Create a new branch then cherry-pick that commit (I was on commit 77757ba9 when I wrote this.
Install LLVM using homebrew (if it is not already installed).
brew install llvm
Set the the LDFLAGS and CPPFLAGS environment variables then run printenv | grep -i flags to verify that the flags have been set correctly. Exporting CC and CXX is crucial since that is how to let bash configure know that we need a custom compiler for the build!
Run make build-hsdis in the root folder of the jdk repo.
If the proper flags have not been set, make will fail with the error below. Run make --debug=v for additional information on what make is doing.
saint@Saints-MBP-2021 jdk % make build-hsdis
Building target 'build-hsdis' in configuration 'macosx-aarch64-server-release'
/Users/saint/repos/java/forks/jdk/src/utils/hsdis/llvm/hsdis-llvm.cpp:58:10: fatal error: 'llvm-c/Disassembler.h' file not found
#include <llvm-c/Disassembler.h>
^~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make[3]: *** [/Users/saint/repos/java/forks/jdk/build/macosx-aarch64-server-release/support/hsdis/hsdis-llvm.o] Error 1
make[2]: *** [build-hsdis] Error 2
ERROR: Build failed for target 'build-hsdis' in configuration 'macosx-aarch64-server-release' (exit code 2)
After all that fidgeting around, the fix is as simple as updating your path to include LLVM <insert facepalm / clown>. This is what installing LLVM using brew ends with:
...
==> llvm
To use the bundled libc++ please add the following LDFLAGS:
LDFLAGS="-L/opt/homebrew/opt/llvm/lib -Wl,-rpath,/opt/homebrew/opt/llvm/lib"
llvm is keg-only, which means it was not symlinked into /opt/homebrew,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.
If you need to have llvm first in your PATH, run:
echo 'export PATH="/opt/homebrew/opt/llvm/bin:$PATH"' >> ~/.zshrc
For compilers to find llvm you may need to set:
export LDFLAGS="-L/opt/homebrew/opt/llvm/lib"
export CPPFLAGS="-I/opt/homebrew/opt/llvm/include"
My MacBook didn’t even have a ~/.zshrc file. Setting the PATH using the suggestion above fixed the build errors!
Now open a new terminal and configure the repo (no need for LLVM_CONFIG).
% bash configure --with-hsdis=llvm
% make build-hsdis
Interestingly, running make images does not work on subsequent attempts?! After further investigation, it turns out that the clang compiler installed by brew cannot successfully compile the OpenJDK sources. Why does it issue warnings that Apple’s clang compiler does not?
In file included from /Users/saint/repos/java/forks/jdk/src/hotspot/cpu/aarch64/abstractInterpreter_aarch64.cpp:31:
In file included from /Users/saint/repos/java/forks/jdk/src/hotspot/share/runtime/frame.inline.hpp:42:
In file included from /Users/saint/repos/java/forks/jdk/src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp:31:
In file included from /Users/saint/repos/java/forks/jdk/src/hotspot/cpu/aarch64/pauth_aarch64.hpp:28:
/Users/saint/repos/java/forks/jdk/src/hotspot/os_cpu/bsd_aarch64/pauth_bsd_aarch64.inline.hpp:29:10: fatal error: 'ptrauth.h' file not found
#include <ptrauth.h>
^~~~~~~~~~~
1 error generated.
make[3]: *** [/Users/saint/repos/java/forks/jdk/build/macosx-aarch64-server-release/hotspot/variant-server/libjvm/objs/abstractInterpreter_aarch64.o] Error 1
m
To work around this, first build the JDK using Apple’s clang. Next, add brew’s LLVM installation to the PATH, then configure for hsdis. Finally, build hsdis:
# Warning: ensure /opt/homebrew/opt/llvm/bin is not in the PATH
cd ~/repos/java/forks/jdk
bash configure
make images
# Now add brew's LLVM to the PATH before running bash configure
export OLDPATH=$PATH
export PATH="/opt/homebrew/opt/llvm/bin:$PATH"
bash configure --with-hsdis=llvm
make build-hsdis
make install-hsdis
export PATH=$OLDPATH
# Why doesn't install-hsdis do this?
cd build/macosx-aarch64-server-release
cp support/hsdis/libhsdis.dylib jdk/bin/
Install the 64-bit WindowsLLVM. Configure the OpenJDK repo using both the --with-hsdis and LLVM_CONFIG options as shown. I needed to use the 8.3 path name (obtained using the command suggested on StackOverflow) for value of the LLVM_CONFIG parameter.
Unfortunately, this is not sufficient to enable building on Windows as detailed by this error:
$ make build-hsdis
Building target 'build-hsdis' in configuration 'windows-x86_64-server-release'
Creating support/hsdis/hsdis.dll from 1 file(s)
/usr/bin/bash: x86_64-w64-mingw32-g++: command not found
make[3]: *** [Hsdis.gmk:135: /..../build/windows-x86_64-server-release/support/hsdis/hsdis-llvm.obj] Error 127
make[2]: *** [make/Main.gmk:530: build-hsdis] Error 2
ERROR: Build failed for target 'build-hsdis' in configuration 'windows-x86_64-server-release' (exit code 2)
Jorn fixed this so we can add Jorn’s upstream JDK, fetch its commits, then cherry pick the commit with the fix.
git remote add jorn https://github.com/JornVernee/jdk/
git fetch jorn
git cherry-pick 8de8b763c9159f84bcc044c04ee2fac9f2390774
Some conflicts in make/Hsdis.gmk need to be resolved. This is straightforward since Jorn’s change splits the existing binutils Windows code into the first branch of an if-statement then adds support for the LLVM backend in the else case. The resolved conflicts are in my fork in the branch. The repo should now be configured with the additional --with-llvm option added by Jorn.
Running make build-hsdis results in errors about missing LLVM includes.
$ make build-hsdis
Building target 'build-hsdis' in configuration 'windows-x86_64-server-release'
Creating support/hsdis/hsdis.dll from 1 file(s)
d:\.....\jdk\src\utils\hsdis\llvm\hsdis-llvm.cpp(58): fatal error C1083: Cannot open include file: 'llvm-c/Disassembler.h': No such file or directory
make[3]: *** [Hsdis.gmk:142: /cygdrive/d/dev/repos/java/forks/jdk/build/windows-x86_64-server-release/support/hsdis/hsdis-llvm.obj] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [make/Main.gmk:530: build-hsdis] Error 2
Let’s try setting CC and CXX then rerunning the above configure command.
configure: Will use user supplied compiler CC=C:/PROGRA~1/LLVM/bin/clang.exe
checking resolved symbolic links for CC... no symlink
configure: The C compiler (located as C:/PROGRA~1/LLVM/bin/clang.exe) does not seem to be the required microsoft compiler.
configure: The result from running it was: "clang: error: no input files"
configure: error: A microsoft compiler is required. Try setting --with-tools-dir.
configure exiting with result code 1
But let’s see what happens if we change the toolchain type to clang:
# This command does not work
bash configure --with-hsdis=llvm LLVM_CONFIG=C:/PROGRA~1/LLVM/bin --with-llvm=C:/PROGRA~1/LLVM --with-toolchain-type=clang
configure: Toolchain type clang is not valid on this platform.
configure: Valid toolchains: microsoft.
configure: error: Cannot continue.
configure exiting with result code 1
Indeed, clang is not a valid toolchain for Windows as declared in make/autoconf/toolchain.m4. Open question: how is the VALID_TOOLCHAIN_windows actually checked? So we can now unset the environment variables.
unset CC
unset CXX
This brought me back to the first thing I should have done when I saw the “No such file or directory” error – verifying that the file existed on disk! This is all there is there:
$ ls C:/PROGRA~1/LLVM/include/llvm-c
Remarks.h lto.h
Well, turns out this is the issue that led Jorn to build LLVM manually. I now know what the needed header files being referred to are. So let’s build LLVM using Jorn’s steps.
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build_llvm
cd build_llvm
cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=X86" -D"CMAKE_BUILD_TYPE:STRING=Release" -D"CMAKE_INSTALL_PREFIX=install_local" -A x64 -T host=x64
cmake --build . --config Release --target install
The last command fails with the error below!??? Why can’t anything just simply work?
Building Opts.inc...
'..\..\RelWithDebInfo\bin\llvm-tblgen.exe' is not recognized as an internal or external command,
operable program or batch file.
C:\Program Files\Microsoft Visual Studio\2022\Preview\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(243,5): error MSB8066: Custom build for 'D:\dev\repos\llvm-project\build_llvm\CMakeFiles\dd1f7b42098
1667d7f617e96802947d3\Opts.inc.rule;D:\dev\repos\llvm-project\build_llvm\CMakeFiles\9fbf2dc5caba7f0c75934f43d12abdf5\RcOptsTableGen.rule;D:\dev\repos\llvm-project\llvm\tools\llvm-rc\CMakeLists.txt' exited wit
h code 9009. [D:\dev\repos\llvm-project\build_llvm\tools\llvm-rc\RcOptsTableGen.vcxproj]
Switch to my Surface Book 2 and LLVM builds just fine!
Interestingly, this fails with the same errors I saw on macOS:
$ make build-hsdis
Building target 'build-hsdis' in configuration 'windows-x86_64-server-release'
Creating support/hsdis/hsdis.dll from 1 file(s)
hsdis-llvm.obj : error LNK2019: unresolved external symbol LLVMCreateDisasm referenced in function "public: __cdecl hsdis_backend::hsdis_backend(unsigned __int64,unsi...,char const *,int)" (??0hsdis_backend@@QEAA@_K0PEAE0P6APEAXPEAXPEBD2@Z2P6AH23ZZ23H@Z)
...
hsdis-llvm.obj : error LNK2019: unresolved external symbol LLVMInitializeX86Disassembler referenced in function LLVMInitializeNativeDisassembler
c:\dev\repos\java\forks\jdk\build\windows-x86_64-server-release\support\hsdis\hsdis.dll : fatal error LNK1120: 9 unresolved externals
make[3]: *** [Hsdis.gmk:142: /cygdrive/c/dev/repos/java/forks/jdk/build/windows-x86_64-server-release/support/hsdis/hsdis.dll] Error 1
The PATH environment variable probably needs to be adjusted to work around this.
Update 2022-02-08: the problem above is that bash configure is invoked with the wrong LLVM_CONFIG option – the actual llvm-config executable name is missing. See Troubleshooting hsdis LLVM backend MSVC Linker Errors for details.
To specify a backend for hsdis, the OpenJDK repo needs to be configured with the --with-hsdis option. As of commit 77757ba9, LLVM is not yet supported as an hsdis disassembly backend. Therefore, this error from make/autoconf/jdk-options.m4 is displayed. Here’s an example on the Windows platform:
$ bash configure --with-hsdis=llvm
...
checking what hsdis backend to use... invalid
configure: error: Incorrect hsdis backend "llvm"
configure exiting with result code 1
To test the LLVM backend for hsdis on macOS, install LLVM using brew (Apple’s LLVM does not have the llvm-c include files):
# install LLVM
brew install llvm
Now build the OpenJDK. This should use Apple’s compiler since we have not made any configuration changes.
cd ~/repos/java/jdk
bash configure
make images
Now add brew’s LLVM bin directory to the PATH and run bash configure again passing the --with-hsdis=llvm option as shown below. The configuration process will detect the clang++ compiler installed by brew and set it up for use when the build-hsdis target is executed.
# Now add brew's LLVM to the PATH before running bash configure
export OLDPATH=$PATH
export PATH="/opt/homebrew/opt/llvm/bin:$PATH"
bash configure --with-hsdis=llvm
make build-hsdis
make install-hsdis
export PATH=$OLDPATH
The install-hsdis target does not appear to be copying the hsdis library to the jdk/bin folder so these commands are required:
cd build/macosx-aarch64-server-release
cp support/hsdis/libhsdis.dylib jdk/bin/hsdis-aarch64.dylib
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build_llvm
cd build_llvm
cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=X86" -D"CMAKE_BUILD_TYPE:STRING=Release" -D"CMAKE_INSTALL_PREFIX=install_local" -A x64 -T host=x64
cmake --build . --config Release --target install
Now we can configure the OpenJDK repo for hsdis, and build both the JDK and hsdis.
bash configure --with-hsdis=llvm \
LLVM_CONFIG=C:/dev/repos/llvm-project/build_llvm/install_local/bin \
--with-llvm=C:/dev/repos/llvm-project/build_llvm/install_local/
make build-hsdis
make images
hsdis LLVM backend on Windows ARM64
Open question: is this supported?
Testing the hsdis LLVM backend
The String.checkIndex method of PR 5920 is a good candidate for testing the hsdis LLVM backend. The -XX:CompileCommand option can be used to print the generated assembler code after compilation of the specified method.
To view the command lines being executed by make as well as the value of the variables in use: make LOG=debug build-hsdis
Autoconf macros are defined using the AC_DEFUN macro. The JDKOPT_SETUP_HSDIS macro (modified by PR 5920) is defined using AC_DEFUN_ONCE, which is for macros that should only be called once.
To address warnings like ld: warning: dylib (/opt/homebrew/opt/llvm/lib/libunwind.dylib) was built for newer macOS version (12.0) than being linked (11.0) update MACOSX_VERSION_MIN in make/autoconf/flags.m4.