Programming – Page 4

2022-08-10 —Categories: Assembly, Visual C++

Building & Disassembling ARM64 Code using Visual C++

The folder C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build contains various scripts that set up a command window for building, as documented at Use the Microsoft C++ toolset from the command line | Microsoft Docs. If vcvarsx86_arm64.bat and vcvarsamd64_arm64.bat are missing from that folder on your Windows x64 machine, install the MSVC v143 – VS 2022 C++ ARM64 build tools (Latest) component using the Visual Studio 2022 installer.

Selecting ARM64 Build Tools in VS Installer

Once it is installed, open a new cmd.exe window and run this command to set up the build environment:

"C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsamd64_arm64.bat"

To verify that the ARM64 compiler will be used when cl or dumpbin is executed:

D:\> where cl
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\arm64\cl.exe
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\x64\cl.exe

D:\> where dumpbin
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\arm64\dumpbin.exe
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\x64\dumpbin.exe
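Since an ARM64 binary cannot simply be run on the x64 build machine, another way to confirm which compiler picked up a source file is to ask the compiler itself. Here is a minimal sketch that inspects predefined macros; target_arch is a hypothetical helper name of my own, and the GCC/Clang macro names are included only so the snippet is portable.

```cpp
// Reports the architecture this translation unit was compiled for, using
// compiler-predefined macros: _M_ARM64 / _M_X64 / _M_IX86 for MSVC and the
// __aarch64__ / __x86_64__ / __i386__ equivalents for GCC and Clang.
// target_arch is a hypothetical helper, not part of any toolchain.
constexpr const char* target_arch() {
#if defined(_M_ARM64) || defined(__aarch64__)
    return "arm64";
#elif defined(_M_X64) || defined(__x86_64__)
    return "x64";
#elif defined(_M_IX86) || defined(__i386__)
    return "x86";
#else
    return "unknown";
#endif
}

// Uncomment to make the build fail unless an ARM64 compiler is in use:
// static_assert(target_arch()[0] == 'a', "expected an ARM64 target");
```

With the static_assert enabled, cl /c on this file should only succeed from the vcvarsamd64_arm64.bat environment.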

To see the command Visual Studio uses to build the project, create a C++ console application and use the Configuration Manager to change the Active solution platform to ARM64. Next, go to Tools > Options, then expand the Projects and Solutions node. Select Build And Run, then change the MSBuild project build output verbosity to Detailed. Building the project should now show the full command line used to invoke the compiler. For example, here are the command lines used in the Debug and Release configurations respectively.

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\HostX86\arm64\CL.exe /c /Zi /JMC /nologo /W3 /WX- /diagnostics:column /sdl /Od /Oy- /D _DEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 /D _UNICODE /D UNICODE /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- /Fo"ARM64\Debug\\" /Fd"ARM64\Debug\vc143.pdb" /external:W3 /Gd /TP /analyze- /FC /errorReport:prompt ConsoleApplication1.cpp

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\HostX86\arm64\CL.exe /c /Zi /nologo /W3 /WX- /diagnostics:column /sdl /O2 /Oi /Oy- /GL /D NDEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 /D _UNICODE /D UNICODE /Gm- /EHsc /MD /GS /Gy /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- /Fo"ARM64\Release\\" /Fd"ARM64\Release\vc143.pdb" /external:W3 /Gd /TP /analyze- /FC /errorReport:prompt ConsoleApplication1.cpp

Notice the /O2 flag (maximize speed) in the Release build in place of the /Od flag (disable optimizations) in the Debug build. The Debug build also uses the Just My Code (/JMC), run-time error checks (/RTC1), and multithreaded debug DLL run-time library (/MDd) flags. For our testing purposes, we can ignore most of these flags.

Calling Printf

Here is a simple program, aarch64-abi-test-printf.cpp, which calls printf with a format string and 4 additional arguments.

#include <stdio.h>

int main()
{
    int result = printf("%.4f,%.4f,%.4f,%s", 1.2345, 1.2345, 1.2345, "str");
}

Compiling a Debug Build

To compile and disassemble this program, run:

cl /c aarch64-abi-test-printf.cpp
dumpbin /disasm /out:printf-abi.asm aarch64-abi-test-printf.obj
dumpbin /all /out:printf-abi.txt aarch64-abi-test-printf.obj

The disassembly is shown below with some links to the documentation for the various instructions. See the Arm Architecture Reference Manual for A-profile architecture PDF for more details about these instructions. The overview of AArch64 state at ARM Compiler armasm User Guide Version 6.6.1 is also a useful resource.

Dump of file aarch64-abi-test-printf.obj

File Type: COFF OBJECT

main:
  0000000000000000: A9BE7BFD  stp         fp,lr,[sp,#-0x20]!
  0000000000000004: 910003FD  mov         fp,sp
  0000000000000008: 90000008  adrp        x8,$SG5571
  000000000000000C: 91000104  add         x4,x8,$SG5571
  0000000000000010: 58000183  ldr         x3,$LN3
  0000000000000014: 58000162  ldr         x2,$LN3
  0000000000000018: 58000141  ldr         x1,$LN3
  000000000000001C: 90000008  adrp        x8,$SG5572
  0000000000000020: 91000100  add         x0,x8,$SG5572
  0000000000000024: 94000000  bl          printf
  0000000000000028: 2A0003E0  mov         w0,w0
  000000000000002C: B90013E0  str         w0,[sp,#0x10]
  0000000000000030: 52800000  mov         w0,#0
  0000000000000034: A8C27BFD  ldp         fp,lr,[sp],#0x20
  0000000000000038: D65F03C0  ret
  000000000000003C: D503201F  nop
$LN3:
  0000000000000040: 126E978D
  0000000000000044: 3FF3C083

__local_stdio_printf_options:
  0000000000000000: 90000008  adrp        x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000004: 91000100  add         x0,x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000008: D65F03C0  ret

_vfprintf_l:
  0000000000000000: A9BD7BFD  stp         fp,lr,[sp,#-0x30]!
  0000000000000004: 910003FD  mov         fp,sp
  0000000000000008: F90017E0  str         x0,[sp,#0x28]
  000000000000000C: F90013E1  str         x1,[sp,#0x20]
  0000000000000010: F9000FE2  str         x2,[sp,#0x18]
  0000000000000014: F9000BE3  str         x3,[sp,#0x10]
  0000000000000018: 94000000  bl          __local_stdio_printf_options
  000000000000001C: F9400BE4  ldr         x4,[sp,#0x10]
  0000000000000020: F9400FE3  ldr         x3,[sp,#0x18]
  0000000000000024: F94013E2  ldr         x2,[sp,#0x20]
  0000000000000028: F94017E1  ldr         x1,[sp,#0x28]
  000000000000002C: F9400000  ldr         x0,[x0]
  0000000000000030: 94000000  bl          __stdio_common_vfprintf
  0000000000000034: 2A0003E0  mov         w0,w0
  0000000000000038: 2A0003E0  mov         w0,w0
  000000000000003C: A8C37BFD  ldp         fp,lr,[sp],#0x30
  0000000000000040: D65F03C0  ret

printf:
  0000000000000000: D10103FF  sub         sp,sp,#0x40
  0000000000000004: A9008BE1  stp         x1,x2,[sp,#8]
  0000000000000008: A90193E3  stp         x3,x4,[sp,#0x18]
  000000000000000C: A9029BE5  stp         x5,x6,[sp,#0x28]
  0000000000000010: F9001FE7  str         x7,[sp,#0x38]
  0000000000000014: A9BD7BFD  stp         fp,lr,[sp,#-0x30]!
  0000000000000018: 910003FD  mov         fp,sp
  000000000000001C: F90013E0  str         x0,[sp,#0x20]
  0000000000000020: 9100E3E8  add         x8,sp,#0x38
  0000000000000024: F9000FE8  str         x8,[sp,#0x18]
  0000000000000028: 52800020  mov         w0,#1
  000000000000002C: 94000000  bl          __acrt_iob_func
  0000000000000030: F9400FE3  ldr         x3,[sp,#0x18]
  0000000000000034: D2800002  mov         x2,#0
  0000000000000038: F94013E1  ldr         x1,[sp,#0x20]
  000000000000003C: 94000000  bl          _vfprintf_l
  0000000000000040: 2A0003E0  mov         w0,w0
  0000000000000044: B90013E0  str         w0,[sp,#0x10]
  0000000000000048: D2800008  mov         x8,#0
  000000000000004C: F9000FE8  str         x8,[sp,#0x18]
  0000000000000050: B94013E0  ldr         w0,[sp,#0x10]
  0000000000000054: A8C37BFD  ldp         fp,lr,[sp],#0x30
  0000000000000058: 910103FF  add         sp,sp,#0x40
  000000000000005C: D65F03C0  ret

  Summary

           8 .bss
          68 .chks64
          9C .debug$S
          62 .drectve
          18 .pdata
          1A .rdata
          F8 .text$mn
          10 .xdata

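In the disassembly generated by dumpbin (printf-abi.asm), notice that all 5 arguments to printf are passed in registers! x0 contains a pointer to the format string. x1-x3 are each loaded by an ldr (literal) instruction with the 64-bit value stored at the $LN3 label, not with the address of the label. Those 64 bits (0x3FF3C083126E978D, stored low word first) are the IEEE 754 double-precision representation of 1.2345; they travel in integer registers rather than floating-point registers because printf is variadic. x4 contains a pointer to the null-terminated string "str".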
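The two words listed at $LN3 can be reproduced without a disassembler. This is a small sketch of my own (double_bits, low_word and high_word are hypothetical helper names) that copies the bytes of a double into a 64-bit integer and splits it into the two 32-bit halves dumpbin prints.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw 64-bit pattern of a double. memcpy is the portable way to
// reinterpret the bytes in C++17 (std::bit_cast would need C++20).
inline std::uint64_t double_bits(double value) {
    static_assert(sizeof(std::uint64_t) == sizeof(double), "expects a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// The low and high 32-bit halves, in the order dumpbin lists them at $LN3
// (offset 0x40 holds the low word, offset 0x44 the high word).
inline std::uint32_t low_word(std::uint64_t bits) {
    return static_cast<std::uint32_t>(bits);
}

inline std::uint32_t high_word(std::uint64_t bits) {
    return static_cast<std::uint32_t>(bits >> 32);
}
```

Calling double_bits(1.2345) gives 0x3FF3C083126E978D, whose halves match the 126E978D and 3FF3C083 words in the listing.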

Which are the printf String Arguments?

To determine what symbols in instructions like adrp x8,$SG5571 refer to, we use the output of dumpbin /all. The RELOCATIONS section shows that $SG5571 has symbol index 8, and the COFF SYMBOL TABLE shows that symbol index 8 is in SECT3. The raw data for section 3 contains both the format string and the single string parameter passed to printf. I was initially unsure how the two strings are told apart: the second column of the symbol table gives each symbol's offset within its section, so $SG5571 (offset 0) is "str" and $SG5572 (offset 8) is the format string.

.
.
.
SECTION HEADER #3
  .rdata name
       0 physical address
       0 virtual address
      1A size of raw data
     31A file pointer to raw data (0000031A to 00000333)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40400040 flags
         Initialized Data
         8 byte align
         Read Only

RAW DATA #3
  00000000: 73 74 72 00 00 00 00 00 25 2E 34 66 2C 25 2E 34  str.....%.4f,%.4
  00000010: 66 2C 25 2E 34 66 2C 25 73 00                    f,%.4f,%s.
.
.
.
RELOCATIONS #4
                                                Symbol    Symbol
 Offset    Type              Applied To         Index     Name
 --------  ----------------  -----------------  --------  ------
 00000008  PAGEBASE_REL21             90000008         8  $SG5571
 0000000C  PAGEOFFSET_12A             91000104         8  $SG5571
 0000001C  PAGEBASE_REL21             90000008         9  $SG5572
 00000020  PAGEOFFSET_12A             91000100         9  $SG5572
 00000024  BRANCH26                   94000000        16  printf
.
.
.
COFF SYMBOL TABLE
000 01057A64 ABS    notype       Static       | @comp.id
001 80010190 ABS    notype       Static       | @feat.00
002 00000000 SECT1  notype       Static       | .drectve
    Section length   62, #relocs    0, #linenums    0, checksum        0
004 00000000 SECT2  notype       Static       | .debug$S
    Section length   9C, #relocs    0, #linenums    0, checksum        0
006 00000000 SECT3  notype       Static       | .rdata
    Section length   1A, #relocs    0, #linenums    0, checksum B99D9667
008 00000000 SECT3  notype       Static       | $SG5571
009 00000008 SECT3  notype       Static       | $SG5572
00A 00000000 SECT4  notype       Static       | .text$mn
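To convince myself of how the symbol values map onto the raw data, here is a sketch that copies the 0x1A bytes of RAW DATA #3 and reads the null-terminated string at a given section offset. The names kRdata and string_at are hypothetical; the offsets 0 and 8 are the values listed for $SG5571 and $SG5572 in the symbol table.

```cpp
#include <cstddef>
#include <string>

// The 0x1A bytes of RAW DATA #3 (.rdata), copied from the dumpbin /all output.
constexpr unsigned char kRdata[] = {
    0x73, 0x74, 0x72, 0x00, 0x00, 0x00, 0x00, 0x00,  // "str", then padding to offset 8
    0x25, 0x2E, 0x34, 0x66, 0x2C, 0x25, 0x2E, 0x34,  // "%.4f,%.4"
    0x66, 0x2C, 0x25, 0x2E, 0x34, 0x66, 0x2C, 0x25,  // "f,%.4f,%"
    0x73, 0x00                                       // "s" and the terminator
};

// Reads the null-terminated string that starts at a symbol's section offset.
// string_at is a hypothetical helper for this illustration only.
inline std::string string_at(std::size_t offset) {
    std::string text;
    for (std::size_t i = offset; i < sizeof kRdata && kRdata[i] != 0; ++i) {
        text.push_back(static_cast<char>(kRdata[i]));
    }
    return text;
}
```

string_at(0) yields the "str" argument ($SG5571, loaded into x4) and string_at(8) yields the format string ($SG5572, loaded into x0).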

Compiling an Optimized Build

Specifying the /O2 flag for speed generates optimized code.

cl /c /O2 /Fo"printf-abi-o2.obj" aarch64-abi-test-printf.cpp
dumpbin /disasm /out:printf-abi-o2.asm printf-abi-o2.obj
dumpbin /all /out:printf-abi-o2.txt printf-abi-o2.obj

In the optimized code below, the IEEE double is loaded into d16 then copied to the x1-x3 registers by the FMOV instruction.

Dump of file printf-abi-o2.obj

File Type: COFF OBJECT

__local_stdio_printf_options:
  0000000000000000: 90000008  adrp        x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000004: 91000100  add         x0,x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000008: D65F03C0  ret

_vfprintf_l:
  0000000000000000: A9BD53F3  stp         x19,x20,[sp,#-0x30]!
  0000000000000004: A9015BF5  stp         x21,x22,[sp,#0x10]
  0000000000000008: F90013FE  str         lr,[sp,#0x20]
  000000000000000C: AA0003F6  mov         x22,x0
  0000000000000010: AA0103F5  mov         x21,x1
  0000000000000014: AA0203F4  mov         x20,x2
  0000000000000018: AA0303F3  mov         x19,x3
  000000000000001C: 94000000  bl          __local_stdio_printf_options
  0000000000000020: F9400000  ldr         x0,[x0]
  0000000000000024: AA1303E4  mov         x4,x19
  0000000000000028: AA1403E3  mov         x3,x20
  000000000000002C: AA1503E2  mov         x2,x21
  0000000000000030: AA1603E1  mov         x1,x22
  0000000000000034: 94000000  bl          __stdio_common_vfprintf
  0000000000000038: F94013FE  ldr         lr,[sp,#0x20]
  000000000000003C: A9415BF5  ldp         x21,x22,[sp,#0x10]
  0000000000000040: A8C353F3  ldp         x19,x20,[sp],#0x30
  0000000000000044: D65F03C0  ret

main:
  0000000000000000: F81F0FFE  str         lr,[sp,#-0x10]!
  0000000000000004: 5C0001B0  ldr         d16,$LN4
  0000000000000008: 90000008  adrp        x8,??_C@_03OJMAPEGJ@str@
  000000000000000C: 91000104  add         x4,x8,??_C@_03OJMAPEGJ@str@
  0000000000000010: 90000008  adrp        x8,??_C@_0BC@OEIAMIIK@?$CF?44f?0?$CF?44f?0?$CF?44f?0?$CFs@
  0000000000000014: 91000100  add         x0,x8,??_C@_0BC@OEIAMIIK@?$CF?44f?0?$CF?44f?0?$CF?44f?0?$CFs@
  0000000000000018: 9E660203  fmov        x3,d16
  000000000000001C: 9E660202  fmov        x2,d16
  0000000000000020: 9E660201  fmov        x1,d16
  0000000000000024: 94000000  bl          printf
  0000000000000028: 52800000  mov         w0,#0
  000000000000002C: F84107FE  ldr         lr,[sp],#0x10
  0000000000000030: D65F03C0  ret
  0000000000000034: D503201F  nop
$LN4:
  0000000000000038: 126E978D
  000000000000003C: 3FF3C083

printf:
  0000000000000000: A9BA53F3  stp         x19,x20,[sp,#-0x60]!
  0000000000000004: A9017BF5  stp         x21,lr,[sp,#0x10]
  0000000000000008: A9028BE1  stp         x1,x2,[sp,#0x28]
  000000000000000C: A90393E3  stp         x3,x4,[sp,#0x38]
  0000000000000010: A9049BE5  stp         x5,x6,[sp,#0x48]
  0000000000000014: F9002FE7  str         x7,[sp,#0x58]
  0000000000000018: AA0003F4  mov         x20,x0
  000000000000001C: 52800020  mov         w0,#1
  0000000000000020: 9100A3F5  add         x21,sp,#0x28
  0000000000000024: 94000000  bl          __acrt_iob_func
  0000000000000028: AA0003F3  mov         x19,x0
  000000000000002C: 94000000  bl          __local_stdio_printf_options
  0000000000000030: F9400000  ldr         x0,[x0]
  0000000000000034: D2800003  mov         x3,#0
  0000000000000038: AA1403E2  mov         x2,x20
  000000000000003C: AA1303E1  mov         x1,x19
  0000000000000040: AA1503E4  mov         x4,x21
  0000000000000044: 94000000  bl          __stdio_common_vfprintf
  0000000000000048: A9417BF5  ldp         x21,lr,[sp,#0x10]
  000000000000004C: A8C653F3  ldp         x19,x20,[sp],#0x60
  0000000000000050: D65F03C0  ret

  Summary

           8 .bss
          70 .chks64
          94 .debug$S
          62 .drectve
          18 .pdata
          16 .rdata
          E8 .text$mn
           8 .xdata

The example we have reviewed in this post passed only 5 parameters to printf. To see how more than 8 parameters are handled, see the example printf call in aarch64-abi-test-printf-manyargs.cpp and printf-abi-many.asm (or for the optimized assembly code, printf-abi-many-o2.asm).
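In both listings, the inlined printf wrapper starts by storing x1-x7 to the stack and then passes a pointer to that save area down the call chain. That pointer plays the role of the va_list. The same forwarding pattern can be sketched in portable C++; format_to_string is a hypothetical name of my own and the 256-byte buffer is an arbitrary choice.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats into a std::string by forwarding the variadic arguments as a
// va_list, much like printf forwards to _vfprintf_l in the listings.
// format_to_string is a hypothetical helper, not a CRT function.
inline std::string format_to_string(const char* format, ...) {
    char buffer[256];
    va_list args;
    va_start(args, format);  // args now refers to the arguments after format
    int written = std::vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    if (written < 0) {
        return std::string();  // encoding error reported by vsnprintf
    }
    // vsnprintf returns the untruncated length, so clamp it to the buffer.
    std::size_t length = static_cast<std::size_t>(written);
    if (length >= sizeof buffer) {
        length = sizeof buffer - 1;
    }
    return std::string(buffer, length);
}
```

Calling format_to_string("%.4f,%.4f,%.4f,%s", 1.2345, 1.2345, 1.2345, "str") uses the same argument list as the test program.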

Additional resources on AArch64:

2022-07-23 —Categories: CUDA, Graphics

Testing NVIDIA CUDA Samples

I have been toying around with the idea of doing a fluid dynamics or crystal growth simulation using NVIDIA CUDA. I decided to try out NVIDIA's CUDA samples to see what their approach looks like, in particular when rendering using OpenGL. I am using Visual Studio 2022, so I simply cloned the cuda-samples repo, opened the fluidsGL_vs2022.sln solution, right-clicked on the fluidsGL project, then selected Build.

Build started...
1>------ Build started: Project: fluidsGL, Configuration: Debug x64 ------
1>D:\dev\...\cuda-samples\Samples\5_Domain_Specific\fluidsGL\fluidsGL_vs2022.vcxproj(37,5): error MSB4019: The imported project "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 11.6.props" was not found. Confirm that the expression in the Import declaration "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\\BuildCustomizations\CUDA 11.6.props" is correct, and that the file exists on disk.
1>Done building project "fluidsGL_vs2022.vcxproj" -- FAILED.
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

The prerequisites section does mention that the CUDA Toolkit 11.6 is required, so I closed VS and installed it. I ended up with version 11.7 though.

When reopening the fluidsGL solution, I still got the same error about CUDA 11.6.props not being found. A quick look at the directory this file is expected to be in reveals that this is a simple version mismatch problem: the version is hard-coded in the fluidsGL_vs2022.vcxproj file. Instead of fixing every sample's .vcxproj file to match CUDA 11.7, we can patch the VS folder by running these commands from an admin command prompt:

cd "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\BuildCustomizations"

copy "CUDA 11.7.props" "CUDA 11.6.props"
copy "CUDA 11.7.targets" "CUDA 11.6.targets"

The code now builds in Visual Studio and I can now oooh, aaaah over the demo. Visual Studio does seem a bit sluggish at opening the entire samples solution though… I get this information about my device in the console window after the demo launches:

GPU Device 0: "Pascal" with compute capability 6.1

CUDA device [Quadro P1000] has 5 Multi-Processors

After clicking and dragging mouse around

2022-07-23 —Categories: Compilers, Fortran, LLVM

Failing to Build Flang with Visual C++

Background

Elmer is the first codebase that I have dug into that has a substantial (or really any) amount of Fortran code. I used GFortran to build it but went digging around for a Clang-based compiler. I found llvm-project/flang and, since I had been building LLVM earlier this year, I figured it should be straightforward to build flang and perhaps explore it in a debugger.

My first attempt to build flang (on Windows, my primary OS) resulted in many build errors. Unfortunately, I was using a preview Visual Studio build, so I didn’t want to compare the errors with those from a different machine because it wasn’t the same compiler version in use. I decided to use an RTM Visual Studio compiler (VS 17.2.5) to avoid possible compiler bugs present only in VS preview builds since most people would not be using preview VS builds anyway.

Without giving it much thought, my suspicion was that any build failures probably arose from not using the correct C++ version. The source code I was trying to build (commit c0702ac0) states that it uses C++17. I set this in CMake by defining the CXX_STANDARD property. Here is the full cmake command line I used to set up the build.

cd llvm-project
mkdir build
cd build

cmake \
  -G Ninja \
  ../llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DFLANG_ENABLE_WERROR=On \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_TARGETS_TO_BUILD=host \
  -DCMAKE_INSTALL_PREFIX=../install \
  -DLLVM_LIT_ARGS=-v \
  -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" \
  -DLLVM_ENABLE_RUNTIMES="compiler-rt" \
  -DCXX_STANDARD=17

# Shown here without \ to be executable in cmd.exe
cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DFLANG_ENABLE_WERROR=On -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD=host -DCMAKE_INSTALL_PREFIX=../install -DLLVM_LIT_ARGS=-v -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" -DLLVM_ENABLE_RUNTIMES="compiler-rt" -DCXX_STANDARD=17

That took about 2 minutes on my machine, after which I ran ninja to start the build:

ninja

Unfortunately, the build failed! The first error I encountered was in fold-real.cpp. Here is the command line used to invoke the compiler (shown with newlines to simplify interpretation, see Compiler options listed alphabetically | Microsoft Docs for the complete list of compiler options).

 C:\PROGRA~1\MIB055~1\2022\ENTERP~1\VC\Tools\MSVC\1432~1.313\bin\Hostx64\x64\cl.exe
 /nologo
 /TP
 -DFLANG_LITTLE_ENDIAN=1
 -DGTEST_HAS_RTTI=0 -DUNICODE
 -D_CRT_NONSTDC_NO_DEPRECATE
 ...
 -D__STDC_LIMIT_MACROS
 -ID:\dev\repos\llvm-project\build-cpp17\tools\flang\lib\Evaluate
 ...
 -ID:\dev\repos\llvm-project\llvm\include
 -external:I D:\dev\repos\llvm-project\llvm\..\mlir\include
 ...
 -external:I D:\dev\repos\llvm-project\llvm\..\clang\include
 -external:W0
 /DWIN32
 /D_WINDOWS
 /Zc:inline
 /Zc:__cplusplus
 /Oi
 /bigobj
 /permissive-
 /W4
 -wd4141
 ...
 -wd4324
 -w14062
 -we4238
 /Gw
 /WX
 /MD
 /O2
 /Ob2
 /EHs-c-
 /GR-
 -UNDEBUG
 -std:c++17
 /showIncludes
 /Fotools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\fold-real.cpp.obj
 /Fdtools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\
 /FS
 -c
 D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-real.cpp

I had to look up the meaning of the C++ syntax on that line to understand why the build could be failing. Turns out to be a lambda, as explained at c++ – What is a lambda expression in C++11? – Stack Overflow.
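For reference, here is a minimal sketch of the construct involved. It is not the flang code; describe and describe_one are hypothetical names. It shows a C++17 generic lambda whose parameter is declared with auto, with the parameter's type recovered inside the body via decltype, which is the general shape of the code the compiler was complaining about.

```cpp
#include <string>
#include <type_traits>

// Applies one generic lambda to two differently typed values.
// describe is a hypothetical function used only for illustration.
inline std::string describe(int i, double d) {
    // The auto parameter makes the lambda's operator() a template, so the
    // body is instantiated separately for each argument type.
    auto describe_one = [](const auto& value) {
        // Recover the argument's type without its const and reference.
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_integral_v<T>) {
            return std::string("integer");
        } else {
            return std::string("real");
        }
    };
    return describe_one(i) + "," + describe_one(d);
}
```

Because it uses if constexpr and the _v trait helpers, this needs /std:c++17 (or later) with cl.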

I tried manually creating a repro for this compiler issue by creating a new Visual C++ project in Visual Studio and recreating the structure of the code failing to build. One of the questions I had was how to set conformance mode in a Visual Studio CMake project. I still haven't figured this out. However, one of the issues I ran into was that my CMake project was building the code without the /permissive- flag! I ended up switching to a regular Visual C++ project (.vcxproj) since I knew how to change the compiler options reliably for such projects. After struggling with recreating the code, I realized that I would make more progress by removing code from flang's fold-real.cpp instead. Here are some of the other searches and concepts I had to look up to understand the code while trying to create a minimal repro of the build failure.

I was eventually able to create a simpler test case showing that the flang code could not build with my RTM compiler.

cl /std:c++17 /permissive- flang-msvc-clang-test.cpp

Microsoft (R) C/C++ Optimizing Compiler Version 19.32.31332 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

flang-msvc-clang-test.cpp
flang-msvc-clang-test.cpp(159): error C2065: 'T': undeclared identifier
flang-msvc-clang-test.cpp(48): note: see reference to function template instantiation 'auto FoldIntrinsicFunction::<lambda_1>::operator ()<_First>(const _T1 &) const' being compiled
        with
        [
            _First=Expr<Type<TypeCategory::Real,1>>,
            _T1=Expr<Type<TypeCategory::Real,1>>
        ]
flang-msvc-clang-test.cpp(171): note: see reference to function template instantiation 'Expr<Type<TypeCategory::Real,2>> FoldIntrinsicFunction<2>(FoldingContext &,FunctionRef<Type<TypeCategory::Real,2>> &&)' being compiled
flang-msvc-clang-test.cpp(159): error C2923: 'Scalar': 'T' is not a valid template type argument for parameter 'T'
flang-msvc-clang-test.cpp(159): note: see declaration of 'T'

So after all that, the RTM LTS Visual C++ compiler turned out to have a bug. Turns out the Visual C++ folks had already fixed this issue so the way to unblock myself was to switch to the preview Visual Studio build :(! The irony…

Suppressing Warnings

Armed with a preview build that correctly compiled the test case, the next obstacle in the build process was a set of warnings that were treated as errors: C4661 and C4101.

FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold.cpp.obj
C:\...\cl.exe ... -c D:\dev\repos\llvm-project\flang\lib\Evaluate\fold.cpp
D:\dev\repos\llvm-project\flang\include\flang\Evaluate\expression.h(101): error C2220: the following warning is treated as an error
D:\dev\repos\llvm-project\flang\include\flang\Evaluate\expression.h(101): warning C4661: 'std::optional<Fortran::evaluate::DynamicType> Fortran::evaluate::ExpressionBase<Fortran::evaluate::SomeDerived>::GetType(void) const': no suitable definition provided for explicit template instantiation request
...
FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-complex.cpp.obj
C:\...\cl.exe ... -c D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-complex.cpp
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-implementation.h(1583): error C2220: the following warning is treated as an error
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-implementation.h(1583): warning C4101: 'buffer': unreferenced local variable

I tried to suppress them to keep marching forward:

cd \dev\repos\llvm-project
mkdir build-nowarn
cd build-nowarn

cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DFLANG_ENABLE_WERROR=On -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD=host -DCMAKE_INSTALL_PREFIX=../install -DLLVM_LIT_ARGS=-v -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" -DLLVM_ENABLE_RUNTIMES="compiler-rt" -DCXX_STANDARD=17 -DCXX_FLAGS="-wd4661 -wd4101"

ninja

Defining CXX_FLAGS like that did not work, so I ended up looking around for how to disable warnings in CMake. This was when I discovered that CMAKE_CXX_STANDARD is not necessary on the command line because flang/CMakeLists.txt already requires C++17. Trying to append the warning disable option /wdXXXX to that file didn't work either. However, the comment on line 329 made me explore HandleLLVMOptions.cmake. There, I discovered support for setting the number of parallel jobs (via /MP for Visual C++). This file also contained the code that sets up most of the compiler options used when building! Closer to the task at hand was the discovery of the LLVM_ENABLE_WARNINGS option and the hard-coded list of MSVC warning flags! I therefore made this change (before running cmake and ninja) to get the warning flags to be respected:

diff --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index 56d05f5b5fce..589281b232f1 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -648,6 +648,8 @@ if (MSVC)
           # v15.8.8. Re-evaluate the usefulness of this diagnostic when the bug
           # is fixed.
       -wd4709 # Suppress comma operator within array index expression
+      -wd4101  # Suppress ...
+      -wd4661  # Suppress ...

       # Ideally, we'd like this warning to be enabled, but even MSVC 2019 doesn't
       # support the 'aligned' attribute in the way that clang sources requires (for

cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DFLANG_ENABLE_WERROR=On -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD=host -DCMAKE_INSTALL_PREFIX=../install -DLLVM_LIT_ARGS=-v -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" -DLLVM_ENABLE_RUNTIMES="compiler-rt"
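For C4101 specifically, a source-level alternative to a global -wd4101 is the C++17 [[maybe_unused]] attribute. This is only a sketch of that idea and not a change I made to flang; label_length is a hypothetical function, and I am assuming the warning arises from a local that is referenced in just one if constexpr branch.

```cpp
#include <cstdio>
#include <cstring>

// buffer is only referenced when kVerbose is true. In the other
// instantiation it is an unreferenced local, which a compiler can report
// (C4101 in Visual C++); [[maybe_unused]] marks that as intentional.
// label_length is a hypothetical function for illustration.
template <bool kVerbose>
int label_length(const char* name) {
    [[maybe_unused]] char buffer[32];
    if constexpr (kVerbose) {
        // snprintf returns the number of characters the full text needs.
        return std::snprintf(buffer, sizeof buffer, "name=%s", name);
    } else {
        return static_cast<int>(std::strlen(name));
    }
}
```

The attribute keeps the warning enabled for the rest of the codebase, at the cost of touching each offending declaration.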

Another Compiler Failure

With the aforementioned change, the build proceeded to a different build failure, this time in fold-integer.cpp.

FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-integer.cpp.obj
C:\PROGRA~1\MIB055~1\2022\Preview\VC\Tools\MSVC\1433~1.316\bin\Hostx64\x64\cl.exe  /nologo /TP -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -ID:\dev\repos\llvm-project\build-vsmain\tools\flang\lib\Evaluate -ID:\dev\repos\llvm-project\flang\lib\Evaluate -ID:\dev\repos\llvm-project\flang\include -ID:\dev\repos\llvm-project\build-vsmain\tools\flang\include -ID:\dev\repos\llvm-project\build-vsmain\include -ID:\dev\repos\llvm-project\llvm\include -external:ID:\dev\repos\llvm-project\llvm\..\mlir\include -external:ID:\dev\repos\llvm-project\build-vsmain\tools\mlir\include -external:ID:\dev\repos\llvm-project\build-vsmain\tools\clang\include -external:ID:\dev\repos\llvm-project\llvm\..\clang\include -external:W0 /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Oi /bigobj /permissive- /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4101 -wd4661 -wd4324 -w14062 -we4238 /Gw /WX /MD /O2 /Ob2  /EHs-c- /GR- -UNDEBUG -std:c++17 /showIncludes /Fotools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\fold-integer.cpp.obj /Fdtools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\ /FS -c D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(771): error C2672: 'invoke': no matching overloaded function found
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.33.31627\include\type_traits(1552): note: could be 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(771): note: Failed to specialize function template 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.33.31627\include\type_traits(1552): note: see declaration of 'std::invoke'
...

By this point, I knew that simplifying the function containing the error was the fastest path to a repro. One of the little problems I ran into was how to figure out the type of fptr since it is declared using the auto keyword. I ended up assigning it to a new temporary variable of a different type, e.g. char, et voilà!

D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(505): error C2440: 'initializing': cannot convert from 'int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const' to 'char'
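The assign-to-char trick works, but there are two slightly more deliberate ways to get at the type of an auto variable. The sketch below uses a hypothetical stand-in class (Integer8) instead of flang's value::Integer; ShowType and fptr_has_expected_type are also names of my own.

```cpp
#include <type_traits>

// Hypothetical stand-in for one instantiation of flang's value::Integer.
struct Integer8 {
    int LEADZ() const { return 0; }
};

// Declared but never defined: instantiating ShowType<X> is an error, and
// the compiler's message spells out X in full.
template <typename T> struct ShowType;

inline bool fptr_has_expected_type() {
    auto fptr = &Integer8::LEADZ;
    // ShowType<decltype(fptr)> reveal;  // uncomment to see the type in an error
    using Expected = int (Integer8::*)() const;
    // Once the type is known, it can be checked without breaking the build.
    return std::is_same_v<decltype(fptr), Expected> && fptr != nullptr;
}
```

The std::is_same_v check compiles in either case, which makes it handy for testing a guess about the type.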

I then removed the temporary assignment and explicitly specified this type as the type of fptr:

using T2 = int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const;

T2 fptr{&Scalar<TI>::LEADZ};

The build then failed because the function pointer types are not the same, which was really confusing given that I had just checked the type of fptr.

D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(504): error C2440: 'initializing': cannot convert from
'int (__cdecl Fortran::evaluate::value::Integer<16,true,16,unsigned short,unsigned int>::* )(void) const' to
'int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const'

I switched the type of fptr and got a different error:

D:\dev\repos\llvm-project\flang\include\flang\Evaluate\integer.h(66): error C2607: static assertion failed
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(490): note: see reference to class template instantiation 'Fortran::evaluate::value::Integer<16,true,16,unsigned char,unsigned short>' being compiled

Here is a different change I tried:

using T2 = int (__cdecl Fortran::evaluate::value::Integer<8>::* )(void) const;

T2 fptr{&Scalar<TI>::LEADZ};

That still failed with the following error:

D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(504): error C2440: 'initializing': cannot convert from
'int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const' to
'int (__cdecl Fortran::evaluate::value::Integer<32,true,32,unsigned int,unsigned __int64>::* )(void) const'

It was at this point that I realized that it was time to learn a bit more about decay. What is decay and array-to-pointer conversion? | C++ FAQ (64.github.io) had a good explanation of why the term decay is used. Perhaps a reexamination of std::decay – cppreference.com might lead to some insight. I wasn’t sure what Result referred to in the statement using TI = typename std::decay_t<decltype(n)>::Result; One idea I got was to append a number to the typename and examine the compiler error. Here’s the new line 752 of llvm-project/fold-integer.cpp and the resulting compiler error showing that this name cannot be arbitrary.

using TI = typename std::decay_t<decltype(n)>::Result3;


C:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(502): error C2039: 'Result3': is not a member of 'Fortran::evaluate::Expr<Fortran::evaluate::Type<Fortran::common::TypeCategory::Integer,1>>'

Aha, so what it was referring to is the using statement in llvm-project/expression.h!

template <int KIND>
class Expr<Type<TypeCategory::Integer, KIND>>
    : public ExpressionBase<Type<TypeCategory::Integer, KIND>> {
public:
  using Result = Type<TypeCategory::Integer, KIND>;

...

The problematic lambda is therefore expecting a Scalar<Type<TypeCategory::Integer, KIND>>. Scalar is defined using decay and Type<TypeCategory::Integer, KIND>::Scalar is defined in llvm-project/type.h as the type value::Integer<8 * KIND>. This is when I saw the reason for the previous build errors about mismatched Integer sizes no matter which size I picked: the fixed type I was using didn't allow for the different template instantiations! Note that the problematic lambda is defined as a ScalarFunc.
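To make that concrete, here is a heavily simplified sketch of the relationship between Expr, Result and Scalar. The Integer, Type and Expr templates below are hypothetical stand-ins of my own, not the flang definitions, and leading_zeros is an illustrative helper. What they show is that the member function pointer has a different type for every KIND, so only auto (or a KIND-dependent alias) can name it.

```cpp
#include <functional>
#include <type_traits>

// Hypothetical, simplified stand-ins for the flang types named above.
template <int BITS> struct Integer {
    static_assert(BITS > 0 && BITS <= 64, "sketch only covers up to 64 bits");
    unsigned long long value;
    // Number of leading zero bits within a BITS-wide integer.
    int LEADZ() const {
        int count = 0;
        for (int bit = BITS - 1; bit >= 0 && ((value >> bit) & 1ULL) == 0; --bit) {
            ++count;
        }
        return count;
    }
};
template <int KIND> struct Type { using Scalar = Integer<8 * KIND>; };
template <int KIND> struct Expr { using Result = Type<KIND>; };

// Each KIND yields a distinct pointer-to-member type.
static_assert(!std::is_same_v<decltype(&Integer<8>::LEADZ),
                              decltype(&Integer<16>::LEADZ)>,
              "one fixed function pointer type cannot cover every KIND");

template <int KIND>
int leading_zeros(const Expr<KIND>& n, unsigned long long raw) {
    using TI = typename std::decay_t<decltype(n)>::Result;  // Type<KIND>
    using Scalar = typename TI::Scalar;                     // Integer<8 * KIND>
    auto fptr = &Scalar::LEADZ;  // type depends on KIND
    return std::invoke(fptr, Scalar{raw});
}
```

leading_zeros(Expr<1>{}, 1) and leading_zeros(Expr<2>{}, 1) go through two different fptr types, which is why hard-coding the Integer<8,...> signature could not satisfy every instantiation.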

By this point, I had a self-contained repro of the compiler bug which, ironically, compiled successfully on the RTM C++ compiler, so I could use neither the preview nor the RTM compiler to build the flang code.

cl /c /TP /std:c++17 /permissive- flang-msvc-clang-test-02.cpp

This compiler invocation gives the same error seen when compiling the flang code:

Microsoft (R) C/C++ Optimizing Compiler Version 19.33.31627.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

flang-msvc-clang-test-02.cpp
flang-msvc-clang-test-02.cpp(193): error C2672: 'invoke': no matching overloaded function found
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.34.31721\include\type_traits(1552): note: could be 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
flang-msvc-clang-test-02.cpp(193): note: Failed to specialize function template 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'

I ended up reporting this compiler bug via the Visual Studio feedback system – see C++17 lambda fails to compile on latest VS preview compiler.

2022-07-19 —Categories: Java, Testing

Investigating a jtreg Failure

Download jtreg from the AdoptOpenJDK dependency pipeline (adoptopenjdk.net). For this investigation, I’ll be using my MacBook M1 running Monterey 12.1.

mkdir investigate
cd investigate
git clone https://github.com/openjdk/jdk11u

# Download jtreg 6
curl -Lo jtreg-6+1.tar.gz https://ci.adoptopenjdk.net/view/Dependencies/job/dependency_pipeline/lastSuccessfulBuild/artifact/jtreg/jtreg-6+1.tar.gz

tar xzfv jtreg-6+1.tar.gz

cd jdk11u

We switch the current directory to the root of the jdk11u repo so that test paths are relative to the repo root. I will assume that we’re in the jdk11u repo root directory and are using the directory structure generated by the commands above. To see a detailed list of all the jtreg options, run this command:

../jtreg/bin/jtreg -help all

Now let us try to run a jtreg test, specifically AmazonCA.java:

../jtreg/bin/jtreg test/jdk/security/infra/java/security/cert/CertPathValidator/certification/AmazonCA.java

There are some failure messages but it looks like a test ran.

failed to get value for vm.hasJFR
java.lang.UnsatisfiedLinkError: 'boolean sun.hotspot.WhiteBox.isJFRIncludedInVmBuild()'
	at sun.hotspot.WhiteBox.isJFRIncludedInVmBuild(Native Method)
	at requires.VMProps.vmHasJFR(VMProps.java:343)
	at requires.VMProps$SafeMap.put(VMProps.java:72)
	at requires.VMProps.call(VMProps.java:107)
	at requires.VMProps.call(VMProps.java:60)
	at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80)
	at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
failed to get value for vm.aot.enabled
java.lang.UnsatisfiedLinkError: 'int sun.hotspot.WhiteBox.aotLibrariesCount()'
	at sun.hotspot.WhiteBox.aotLibrariesCount(Native Method)
	at requires.VMProps.vmAotEnabled(VMProps.java:408)
	at requires.VMProps$SafeMap.put(VMProps.java:72)
	at requires.VMProps.call(VMProps.java:112)
	at requires.VMProps.call(VMProps.java:60)
	at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80)
	at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
.
.
.
Test results: passed: 1
Report written to /Users/saint/repos/java/jdk11u/JTreport/html/report.html
Results written to /Users/saint/repos/java/jdk11u/JTwork

Are these failure messages concerning given that the test passed? Reviewing the test report suggests not. The report keywords mention bug 8233223, which must be Bug ID: JDK-8233223 Add Amazon Root CA certificates (java.com). From the look of things, the java.lang.UnsatisfiedLinkErrors can be safely ignored (for this test anyway). That said, let us dig into these errors to ensure we understand what is happening.

The immediate cause of these errors is the failure to get the values for the SafeMap in VMProps.java. This raises the question of which JDK is being used by jtreg. My MacBook has both JDK11 and JDK17. The default java version is:

java -version
openjdk version "17.0.1" 2021-10-19 LTS
OpenJDK Runtime Environment Microsoft-28056 (build 17.0.1+12-LTS)
OpenJDK 64-Bit Server VM Microsoft-28056 (build 17.0.1+12-LTS, mixed mode)

Let’s ensure jtreg is using JDK11 by setting JTREG_JAVA.

JTREG_JAVA=/Library/Java/JavaVirtualMachines/microsoft-11.jdk/Contents/Home

$JTREG_JAVA/bin/java -version

openjdk version "11.0.14" 2022-01-18 LTS
OpenJDK Runtime Environment Microsoft-30257 (build 11.0.14+9-LTS)
OpenJDK 64-Bit Server VM Microsoft-30257 (build 11.0.14+9-LTS, mixed mode)

../jtreg/bin/jtreg test/jdk/security/infra/java/security/cert/CertPathValidator/certification/AmazonCA.java

We still see the same warnings though so let us explicitly use the -jdk option:

../jtreg/bin/jtreg -jdk:$JTREG_JAVA test/jdk/security/infra/java/security/cert/CertPathValidator/certification/AmazonCA.java

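We now get an interesting error message: the requires.VMProps class was compiled by the newer JDK17 (class file version 61.0), but the JDK11 runtime selected by the -jdk option only recognizes class file versions up to 55.0.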

Exception while calling user-specified class: requires.VMProps
java.lang.UnsupportedClassVersionError: requires/VMProps has been compiled by a more recent version of the Java Runtime (class file version 61.0), this version of the Java Runtime only recognizes class file versions up to 55.0
	at java.base/java.lang.ClassLoader.defineClass1(Native Method)
	...
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Class.java:315)
	at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:78)
	at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
failed to get JDK properties for /Library/Java/JavaVirtualMachines/microsoft-11.jdk/Contents/Home/bin/java ; exit code 1
Error: failed to get JDK properties for /Library/Java/JavaVirtualMachines/microsoft-11.jdk/Contents/Home/bin/java ; exit code 1

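The offending class files sit in the JTwork directory, presumably left over from the earlier runs that used JDK17. On my machine, I can remove these files as follows: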

ls -l /Users/saint/repos/java/jdk11u/JTwork/extraPropDefns/classes/requires

rm -fr /Users/saint/repos/java/jdk11u/JTwork/extraPropDefns/classes/requires
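As a sanity check before deleting anything, the major version of a cached class file can be read directly from its header. The following is a minimal sketch and not part of jtreg: class_major_version is a hypothetical helper name, and the VMProps.class path in the trailing comment is an assumption based on the directory listed above. It relies on the class file format storing the major version as a big-endian 16-bit value at byte offset 6, where 55 corresponds to Java 11 and 61 to Java 17.

```shell
#!/bin/sh
# Print the major version of a Java class file.
# Bytes 0-3 hold the 0xCAFEBABE magic, bytes 4-5 the minor version,
# and bytes 6-7 the big-endian major version.
class_major_version() {
  # od prints the two bytes as unsigned decimals, e.g. " 0 61"
  set -- $(od -An -j6 -N2 -tu1 "$1")
  echo $(( $1 * 256 + $2 ))
}

# Self-contained demo: write a fake 8-byte header with major version 61
# (octal 075) to a temporary file and read it back.
demo_class=$(mktemp)
printf '\312\376\272\276\000\000\000\075' > "$demo_class"
demo_major=$(class_major_version "$demo_class")
echo "major version: $demo_major (Java $(( demo_major - 44 )))"
rm -f "$demo_class"

# Assumed real-world usage (path not verified):
# class_major_version JTwork/extraPropDefns/classes/requires/VMProps.class
```

A result of 61 for a class that is about to be loaded by a JDK11 runtime would match the UnsupportedClassVersionError shown above.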

Rerunning the test now results in a single (different) UnsatisfiedLinkError AND a test failure! However, we now have a properly set up environment since we control the JDK version tested by jtreg.

jdk11u % ../jtreg/bin/jtreg -jdk:$JTREG_JAVA test/jdk/security/infra/java/security/cert/CertPathValidator/certification/AmazonCA.java

failed to get value for vm.musl
java.lang.UnsatisfiedLinkError: 'java.lang.String sun.hotspot.WhiteBox.getLibcName()'
	at sun.hotspot.WhiteBox.getLibcName(Native Method)
	at requires.VMProps.isMusl(VMProps.java:514)
	at requires.VMProps$SafeMap.put(VMProps.java:72)
	at requires.VMProps.call(VMProps.java:122)
	at requires.VMProps.call(VMProps.java:60)
	at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80)
	at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
Test results: failed: 1
Report written to /Users/saint/repos/java/jdk11u/JTreport/html/report.html
Results written to /Users/saint/repos/java/jdk11u/JTwork
Error: Some tests failed or other problems occurred.

Now here’s an interesting question: why doesn’t this approach yield an identical result to setting the -jdk flag to this same JTREG_JAVA path?

JTREG_JAVA=/Library/Java/JavaVirtualMachines/microsoft-11.jdk/Contents/Home

../jtreg/bin/jtreg test/jdk/security/infra/java/security/cert/CertPathValidator/certification/AmazonCA.java

After doing some printf (or rather echo) debugging and observing an empty string for JTREG_JAVA, the culprit turns out to be the difference between a shell variable and an environment variable. See command line – What is the difference in usage between shell variables and environment variables? – Unix & Linux Stack Exchange. For the jtreg script to pull in this value of JTREG_JAVA, it needs to be an environment variable. It should therefore show up when this command is executed:

printenv | grep -i java

The proper way to execute this test then is:

JTREG_JAVA=/Library/Java/JavaVirtualMachines/microsoft-11.jdk/Contents/Home

export JTREG_JAVA

../jtreg/bin/jtreg test/jdk/security/infra/java/security/cert/CertPathValidator/certification/AmazonCA.java
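The difference between the two kinds of variable can be reproduced without jtreg. The sketch below uses a hypothetical variable name, DEMO_JDK_HOME, and a made-up path; a child sh process stands in for the jtreg launcher script.

```shell
#!/bin/sh
# DEMO_JDK_HOME is a hypothetical variable name standing in for JTREG_JAVA.
unset DEMO_JDK_HOME

# A plain assignment creates a shell variable: visible in this shell only.
DEMO_JDK_HOME=/opt/example-jdk
seen_before=$(sh -c 'printf "%s" "${DEMO_JDK_HOME:-}"')

# export promotes it to an environment variable inherited by child processes.
export DEMO_JDK_HOME
seen_after=$(sh -c 'printf "%s" "${DEMO_JDK_HOME:-}"')

echo "child saw before export: '${seen_before}'"
echo "child saw after export:  '${seen_after}'"
```

The child process sees an empty value before the export and the assigned path afterwards. For a one-off run, placing the assignment on the same line as the command (VAR=value command) also puts the variable into that command's environment without exporting it in the current shell.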

The outcome of the experiment so far though is that the AmazonCA test appears to fail when run with JDK11 and pass when run with JDK17 (at the respective versions shown above). To convince ourselves that the infrastructure is fine, we can run a different test with JDK11 (which is our focus) after exporting JTREG_JAVA.

../jtreg/bin/jtreg test/hotspot/jtreg/compiler/aot/cli/IncorrectAOTLibraryTest.java

This test passes, despite the single UnsatisfiedLinkError printed out.

failed to get value for vm.musl
java.lang.UnsatisfiedLinkError: 'java.lang.String sun.hotspot.WhiteBox.getLibcName()'
	at sun.hotspot.WhiteBox.getLibcName(Native Method)
	at requires.VMProps.isMusl(VMProps.java:514)
	at requires.VMProps$SafeMap.put(VMProps.java:72)
	at requires.VMProps.call(VMProps.java:122)
	at requires.VMProps.call(VMProps.java:60)
	at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80)
	at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
Test results: passed: 1
Report written to /Users/saint/repos/java/jdk11u/JTreport/html/report.html
Results written to /Users/saint/repos/java/jdk11u/JTwork

An Interesting Test

The above experimentation was inspired by AotInvokeDynamic2AotTest.java. The first time I tried to run this test, I used this command line.

../jtreg/bin/jtreg test/hotspot/jtreg/compiler/aot/calls/fromAot/AotInvokeDynamic2AotTest.java

The first set of 5 UnsatisfiedLinkError failures from the previous experiment was displayed but no tests were executed.

...
Test results: no tests selected
Report written to /Users/saint/repos/java/jdk11u/JTreport/html/report.html

This was happening while jtreg was using JDK17, and one of the values that could not be retrieved was vm.aot.enabled. Could that be why there were no selected tests? I’m ignoring that rabbit hole for now since jdk11u is our focus. We can now run the test with JTREG_JAVA exported. The test now runs but fails with this message in JTreport/text/summary.txt:

compiler/aot/calls/fromAot/AotInvokeDynamic2AotTest.java  Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Expected to get exit value of [0]

To see more details about the test failure, use the -verbose flag:

../jtreg/bin/jtreg -verbose:fail,error,summary test/hotspot/jtreg/compiler/aot/calls/fromAot/AotInvokeDynamic2AotTest.java

Here is a portion of the output of the test run. Notice the linker error in there!

ACTION: build -- Passed. All files up to date
REASON: Named class compiled on demand
TIME:   0.0 seconds
messages:
command: build compiler.aot.AotCompiler
reason: Named class compiled on demand
elapsed time (seconds): 0.0

ACTION: driver -- Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: Expected to get exit value of [0]
REASON: User specified action: run driver compiler.aot.AotCompiler -libname AotInvokeDynamic2AotTest.so -class compiler.calls.common.InvokeDynamic -extraopt -XX:+UnlockDiagnosticVMOptions -extraopt -XX:+WhiteBoxAPI -extraopt -Xbootclasspath/a:. 
TIME:   4.821 seconds
messages:
command: driver compiler.aot.AotCompiler -libname AotInvokeDynamic2AotTest.so -class compiler.calls.common.InvokeDynamic -extraopt -XX:+UnlockDiagnosticVMOptions -extraopt -XX:+WhiteBoxAPI -extraopt -Xbootclasspath/a:.
reason: User specified action: run driver compiler.aot.AotCompiler -libname AotInvokeDynamic2AotTest.so -class compiler.calls.common.InvokeDynamic -extraopt -XX:+UnlockDiagnosticVMOptions -extraopt -XX:+WhiteBoxAPI -extraopt -Xbootclasspath/a:. 
Mode: othervm
Additional options from @modules: --add-modules java.base --add-exports java.base/jdk.internal.org.objectweb.asm=ALL-UNNAMED --add-exports java.base/jdk.internal.misc=ALL-UNNAMED
elapsed time (seconds): 4.821
configuration:
Boot Layer
  add modules: java.base                                
  add exports: java.base/jdk.internal.misc              ALL-UNNAMED
               java.base/jdk.internal.org.objectweb.asm ALL-UNNAMED

STDOUT:
Command line: [/usr/bin/ld -v]
@(#)PROGRAM:ld  PROJECT:ld64-764
BUILD 11:22:50 Apr 28 2022
configured to support archs: armv6 armv7 armv7s arm64 arm64e arm64_32 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em
LTO support using: LLVM version 13.1.6, (clang-1316.0.21.2.5) (static support for 28, runtime is 28)
TAPI support using: Apple TAPI version 13.1.6 (tapi-1316.0.7.3)

found working linker: /usr/bin/ld
Command line: [/Library/Java/JavaVirtualMachines/microsoft-11.jdk/Contents/Home/bin/jaotc -J-XX:+UnlockDiagnosticVMOptions -J-XX:+WhiteBoxAPI -J-Xbootclasspath/a:. -J-classpath -J/Users/saint/repos/java/jdk11u/JTwork/classes/compiler/aot/calls/fromAot/AotInvokeDynamic2AotTest.d:/Users/saint/repos/java/jdk11u/JTwork/classes/test/lib:/Users/saint/repos/java/jdk11u/JTwork/classes/testlibrary:/Users/saint/repos/java/jdk11u/JTwork/classes:/Users/saint/repos/java/jdk11u/test/hotspot/jtreg/compiler/aot/calls/fromAot --compile-with-assertions --info --output AotInvokeDynamic2AotTest.so --class-name compiler.calls.common.InvokeDynamic -J-ea -J-esa -J-Xmixed]
Compiling AotInvokeDynamic2AotTest.so...
1 classes found (22 ms)
9 methods total, 8 methods to compile (12 ms)
Compiling with 10 threads
.
8 methods compiled, 0 methods failed (2785 ms)
Parsing compiled code (7 ms)
Processing metadata (46 ms)
Preparing stubs binary (3 ms)
Preparing compiled binary (2 ms)
Creating binary: AotInvokeDynamic2AotTest.so.o (18 ms)
Creating shared library: AotInvokeDynamic2AotTest.so (30 ms)
Exception in thread "main" java.lang.InternalError: ld: dynamic main executables must link with libSystem.dylib for architecture x86_64
	at jdk.aot@11.0.14/jdk.tools.jaotc.Linker.link(Linker.java:142)
	at jdk.aot@11.0.14/jdk.tools.jaotc.Main.run(Main.java:262)
	at jdk.aot@11.0.14/jdk.tools.jaotc.Main.run(Main.java:133)
	at jdk.aot@11.0.14/jdk.tools.jaotc.Main.main(Main.java:89)

Why on earth is there an error about x86_64 on my M1? Here is the failing command line, listed separately for easy execution:

/Library/Java/JavaVirtualMachines/microsoft-11.jdk/Contents/Home/bin/jaotc -J-XX:+UnlockDiagnosticVMOptions -J-XX:+WhiteBoxAPI -J-Xbootclasspath/a:. -J-classpath -J/Users/saint/repos/java/jdk11u/JTwork/classes/compiler/aot/calls/fromAot/AotInvokeDynamic2AotTest.d:/Users/saint/repos/java/jdk11u/JTwork/classes/test/lib:/Users/saint/repos/java/jdk11u/JTwork/classes/testlibrary:/Users/saint/repos/java/jdk11u/JTwork/classes:/Users/saint/repos/java/jdk11u/test/hotspot/jtreg/compiler/aot/calls/fromAot --compile-with-assertions --info --output AotInvokeDynamic2AotTest.so --class-name compiler.calls.common.InvokeDynamic -J-ea -J-esa -J-Xmixed

Once this command completes (and fails), a file named AotInvokeDynamic2AotTest.so.o exists on disk. The format of the ld command can be deduced from Linker.java:101. The ld command can then be directly invoked to see the actual failure:

% ld -dylib -o AotInvokeDynamic2AotTest.so AotInvokeDynamic2AotTest.so.o
ld: dynamic main executables must link with libSystem.dylib for architecture x86_64

As per Clang -nostdlib option not working | Apple Developer Forums I tried adding the -lSystem option but that was not sufficient.

% ld -lSystem -dylib -o AotInvokeDynamic2AotTest.so AotInvokeDynamic2AotTest.so.o
ld: library not found for -lSystem

Exploring Mach-O, Part 1 | g.p. anders (gpanders.com) pointed out that the solution is to include the path to the lib folder as well!

ld -L /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/ -lSystem -dylib -o AotInvokeDynamic2AotTest.so AotInvokeDynamic2AotTest.so.o

The proper way to address this test failure then is to fix Linker.java to pass these additional flags to ld.

2022-06-26 —Categories: CMake, Elmer

Building the Elmer Install Folder

My post on How to Build Elmer on Windows has a succinct list of instructions but when I first built the Elmer source code (on Windows), I was unsure how to run the binaries. Running ElmerGUI, for example, failed because qt5.dll could not be found. This post details the process I used to figure out how to create a usable Elmer installation.

I started by going through the generated Makefile. It contained targets like install, install/local, and install/strip. I started with install/local, fearing that perhaps these targets would attempt to install the binaries into Program Files.

Installing only the local directory...
-- Install configuration: "RelWithDebInfo"
-- Installing: C:/dev/repos/fem/install/bin/libopenblas.dll.a
CMake Error at cmake_install.cmake:45 (file):
  file INSTALL cannot find "C:/dev/repos/fem/elmerfem/../bundle_msys2/bin":
  No error.


make: *** [Makefile:142: install/local] Error 1

The error at cmake_install.cmake:45 is from ElmerCPack.cmake. I did not even know about the existence of the CPack tool. Git blame points to commit b9914082 as the source of the failing command. The strange thing about that commit is that there is no mention of what is supposed to create the bundle folders (despite the callout that the change “Relies on selected bundled files selected prior to installation”).

A workaround I tried was to use this define when running cmake: CPACK_BUNDLE_EXTRA_WINDOWS_DLLS:BOOL=FALSE. That avoided the build error but did not result in a usable installation. ElmerGUI’s windows_bundle.cmake needs this to be set to TRUE to package the Qt5 binaries into the install folder. Strangely enough, ElmerGUILogger’s windows_bundle.cmake does not have this logic for Qt5. This is likely because ElmerGUI would already have installed the right files into the bin folder but this looks like a bug.

Another workaround I tried was to manually create the folders expected by the build. The install target then succeeded but the necessary binaries were not copied to the install folder.

mkdir -p bundle_msys2/bin
mkdir -p bundle_qt5/bin
mkdir -p platforms

This is when I dig around and find that only ElmerGUILogger and ElmerClips include windows_bundle.cmake. Hmm… the latter looks promising but doesn’t look up to date either since it requires Qt4. More exploring of the Makefile – the package target looks interesting but fails because NSIS is not installed.

...
[ 97%] Built target ElmerGUI_autogen
[100%] Built target ElmerGUI
Run CPack packaging tool...
CPack Error: Cannot find NSIS compiler makensis: likely it is not installed, or not in your PATH
CPack Error: Could not read NSIS registry value. This is usually caused by NSIS not being installed. Please install NSIS from http://nsis.sourceforge.net
CPack Error: Cannot initialize the generator NSIS
make: *** [Makefile:71: package] Error 1

I wonder if we couldn’t just use NSIS for the MSYS environment:

$ pacman -Ss nsis
...
mingw64/mingw-w64-x86_64-nsis 3.06.1-1
    Windows installer development tool (mingw-w64)
mingw64/mingw-w64-x86_64-nsis-nsisunz 1.0-2
    NSIS plugin which allows you to extract files from ZIP archives (mingw-w64)
...

$ pacman -S mingw64/mingw-w64-x86_64-nsis

Now we can create an Elmer installer by running make package. Unfortunately, that turns out to be insufficient. My next idea is to compare the binaries from the official installation in Program Files with those from my build. This turned out to be easier when using ls -R1 to output only the file names, in a single column. Some obvious differences are that the build in Program Files has a bin folder containing the Qt and vtk binaries (as well as a stripped gfortran).

Qt5Core.dll
Qt5Gui.dll
Qt5OpenGL.dll
Qt5PrintSupport.dll
Qt5Script.dll
Qt5Svg.dll
Qt5Widgets.dll
Qt5Xml.dll
...
libvtkChartsCore-8.2.dll
libvtkChartsCorePython38D-8.2.dll
libvtkCommonColor-8.2.dll
libvtkCommonColorPython38D-8.2.dll
libvtkCommonComputationalGeometry-8.2.dll
libvtkCommonComputationalGeometryPython38D-8.2.dll
libvtkCommonCore-8.2.dll
libvtkCommonCorePython38D-8.2.dll
libvtkCommonDataModel-8.2.dll
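The comparison of the two installations described above can also be automated instead of eyeballing two ls -R1 listings. This is a rough sketch: list_files and only_in_first are hypothetical helper names, and the demo builds two throwaway directories rather than touching a real Elmer installation.

```shell
#!/bin/sh
# List every file under a directory, relative to that directory, sorted.
list_files() {
  (cd "$1" && find . -type f | LC_ALL=C sort)
}

# Print files that exist under the first directory but not under the second.
only_in_first() {
  first_list=$(mktemp)
  second_list=$(mktemp)
  list_files "$1" > "$first_list"
  list_files "$2" > "$second_list"
  LC_ALL=C comm -23 "$first_list" "$second_list"
  rm -f "$first_list" "$second_list"
}

# Self-contained demo with two throwaway directories.
reference=$(mktemp -d)
candidate=$(mktemp -d)
mkdir -p "$reference/bin" "$candidate/bin"
touch "$reference/bin/ElmerGUI.exe" "$reference/bin/Qt5Core.dll"
touch "$candidate/bin/ElmerGUI.exe"
missing=$(only_in_first "$reference" "$candidate")
echo "$missing"
rm -rf "$reference" "$candidate"
```

In an MSYS2 shell, the first argument would be the official installation directory and the second the local install folder; the exact paths depend on the machine.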

The Qt binaries certainly look like the output of windows_bundle.cmake (found this time by a search for “vtk”) but it’s still not clear how this file would be included in the build. I’m using VSCode to search for “windows_bundle” and only 2 of the 3 references in the codebase were showing up (on my desktop). Looking for “ElmerGUILogger” then revealed yet another reference. Such a waste of time! Not cool VSCode, not cool. It’s included in ElmerGUI/CMakeLists.txt. So I probably only need to define WIN32. But why does the WIN32 code run if I add some statements to that IF block?

Some searching leads to indications that there might be an error in CMake where WIN32 is not defined. Seeing signs that MSYS can be in Cygwin? Trying to get to the bottom of why WIN32 is not respected by CMake, I review ElmerGUI/CMakeLists.txt again. It adds the netgen subdirectory. Interestingly, ElmerGUI/netgen/README points out that install\lib\ElmerGUI\ngcore\libng.a is the unix library and that the win32 extension should be .lib. Biggest sign I’ve seen so far that something is really off. At this point, I remember seeing a .a file in the install folder – and that seemed strange for a lib folder since I expected a DLL. This hypothesis fails though because the working Elmer installation also has the file “C:\Program Files\Elmer 9.0-Release\lib\ElmerGUI\ngcore\libng.a”.

The sure way to verify that the WIN32 include is working is to introduce an error into windows_bundle.cmake, e.g. by changing the first IF into the unknown IF2. Even better, notice that the “Qt5 Windows packaging” message is correctly displayed. Perhaps the FIND_FILE command should have a REQUIRED option now that it is supported. Adding that doesn’t fail, so I reexamine the INSTALL command. Since windows_bundle.cmake uses a relative path for the DESTINATION, it is interpreted relative to the value of the CMAKE_INSTALL_PREFIX variable.

At this point, I take a step back and search for cmake install component. My suspicion is that there is something about the elmergui COMPONENT specified in the INSTALL command. I discover from Understanding the CMake `COMPONENT` keyword in the `install` command – Code – CMake Discourse that cmake --install is a thing. It seems to do the same things. Unfortunately, it doesn’t copy the files I need either.

How about zooming into the windows_bundle.cmake file and outputting the list of files installed after the install command? The new message says the files were installed. However, they aren’t in the install folder! I need an explanation for this behavior. So let’s open Process Explorer and see which paths are actually used. These requests show up using the Qt5 path filter:

D:\dev\Software\msys64\mingw64\lib\libQt5OpenGL.dll.a
D:\dev\Software\msys64\mingw64\lib\libQt5Xml.dll.a
…
D:\dev\Software\msys64\mingw64\lib\libQt5Core.dll.a

These files exist on disk (despite the Result column in Process Explorer having the value INVALID DEVICE REQUEST)! For a better understanding of what is happening, I change the bin path used by the install command to dbgcmake since there are multiple bin folders (under install and also in the MinGW installation). No such path shows up in Process Explorer when using a path filter for dbgcmake. This means I shouldn’t be expecting these files to be written by cmake at this point. In fact, running grep -Rin dbgcmake shows that build/ElmerGUI/cmake_install.cmake now contains this snippet (thereby verifying that windows_bundle.cmake is used to generate cmake_install.cmake).

if("x${CMAKE_INSTALL_COMPONENT}x" STREQUAL "xelmerguix" OR NOT CMAKE_INSTALL_COMPONENT)
  file(INSTALL DESTINATION "${CMAKE_INSTALL_PREFIX}/dbgcmake" TYPE FILE FILES
    "D:/dev/Software/msys64/mingw64/bin/Qt5Core.dll"
    "D:/dev/Software/msys64/mingw64/bin/Qt5Gui.dll"
    "D:/dev/Software/msys64/mingw64/bin/Qt5OpenGL.dll"
    "D:/dev/Software/msys64/mingw64/bin/Qt5Script.dll"
    "D:/dev/Software/msys64/mingw64/bin/Qt5Xml.dll"
    "D:/dev/Software/msys64/mingw64/bin/Qt5Svg.dll"
    "D:/dev/Software/msys64/mingw64/bin/Qt5Widgets.dll"
    "D:/dev/Software/msys64/mingw64/bin/Qt5PrintSupport.dll"
    )
endif()

So back to the question of why windows_bundle.cmake is not invoked by default – let us dig into the CMake sources for clues. Renaming it outputs an error for the include statement, meaning the file is being included! Sigh… I end up poking around some CMake sources to see where WIN32 is set anyway.

At this point, I’m down to simply finding out how to print CMake call stacks to confirm that this file is indeed included. The StackOverflow question about how to trace cmakelists has the big reveal: use the cmake --trace option!

date; time cmake -G "MSYS Makefiles" -DWITH_ELMERGUI:BOOL=TRUE  -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install  -DCMAKE_Fortran_COMPILER=d:/dev/Software/msys64/mingw64/bin/gfortran.exe  -DQWT_INCLUDE_DIR=d:/dev/Software/msys64/mingw64/include/qwt-qt5/  -DWIN32:BOOL=TRUE --trace-expand ../elmerfem > build.txt 2>&1; date;

I immediately notice that the install command of interest is skipped because CPACK_BUNDLE_EXTRA_WINDOWS_DLLS is not set to TRUE! Setting it to FALSE earlier then seeing a build failure gave the impression that it was set to TRUE everywhere (especially given that ElmerCPack.cmake sets it to TRUE) but that happens after windows_bundle.cmake has been evaluated! Here’s the final required command line:

date; time \
cmake -G "MSYS Makefiles" \
 -DWITH_ELMERGUI:BOOL=TRUE \
 -DWITH_MPI:BOOL=FALSE \
 -DCMAKE_INSTALL_PREFIX=../install \
 -DCMAKE_Fortran_COMPILER=d:/dev/Software/msys64/mingw64/bin/gfortran.exe \
 -DQWT_INCLUDE_DIR=d:/dev/Software/msys64/mingw64/include/qwt-qt5/ \
 -DWIN32:BOOL=TRUE \
 -DCPACK_BUNDLE_EXTRA_WINDOWS_DLLS:BOOL=TRUE \
 ../elmerfem
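Once a trace has been captured in a file such as build.txt, one way to spot this kind of ordering problem is to list every line that mentions the variable, in the order it appears in the log. This is only a sketch: trace_mentions is a hypothetical helper built on grep, and it assumes the trace output was redirected to a file as in the earlier --trace-expand command.

```shell
#!/bin/sh
# Print every line of a trace log that mentions the given variable name,
# prefixed with its line number in the log. -F treats the name as a fixed
# string rather than a regular expression.
trace_mentions() {
  grep -n -F -- "$1" "$2"
}

# Assumed usage against the trace captured earlier:
# trace_mentions CPACK_BUNDLE_EXTRA_WINDOWS_DLLS build.txt
```

Because the line numbers follow the order of evaluation, a check of the variable that appears before the line that sets it points to the kind of mismatch described above.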

There are lessons there about making assumptions but the biggest takeaway for me is the need for (and existence of) tracing capabilities in CMake. Enabling tracing made it so easy to figure out exactly what was broken – the variable did not have the value I expected and I simply needed to define it! Running make install now results in new errors haha!

“The code execution cannot proceed because libgcc_s_seh-1.dll was not found. Reinstalling the program may fix this problem.” This error dialog is then followed by others for qwt-qt5.dll, libstdc++-6.dll, and libdouble-conversion.dll respectively. To see which other binaries are required, manually copy these 4 binaries from d:\dev\Software\msys64\mingw64\bin to install/bin. These are the other missing binaries that show up in error dialogs:

libwinpthread-1.dll
libicuin69.dll
libicuuc69.dll
libpcre2-16-0.dll
libharfbuzz-0.dll
libmd4c.dll
libpng16-16.dll
zlib1.dll
libzstd.dll
libfreetype-6.dll
libgraphite2.dll
libintl-8.dll
libglib-2.0-0.dll
libicudt69.dll
libiconv-2.dll
libbz2-1.dll
libbrotlidec.dll
libbrotlicommon.dll
libpcre-1.dll

There are so many missing binaries that I wonder if building the other components might be necessary. I first manually copy them from the bin folder to see if ElmerGUI can load but that now fails with the error “This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.”

The working installation from the downloaded public Elmer installer has a qwindows.dll in the bin/platforms directory. This file exists on my system as D:\dev\Software\msys64\mingw64\share\qt5\plugins\platforms\qwindows.dll. At this point, I’m really wondering how on earth they are gathering all these files. Hmm, maybe they build using their docker image? Didn’t think of seeing how their docker image is set up. One final try to set up the qwindows.dll first and see what happens.

cp /mingw64/share/qt5/plugins/platforms/qwindows.dll ../install/bin/platforms/

Lo and behold! ElmerGUI loads successfully! Whether it actually works… well, I will revisit that after I learn how to use Elmer/ElmerGUI :). Actually, turns out ElmerSolver.exe doesn’t start. Here is the fix for the 3 binaries that cause error dialogs:

# binaries required by ElmerSolver
cp /mingw64/bin/libgfortran-5.dll ../install/bin/
cp /mingw64/bin/libopenblas.dll ../install/bin/
cp /mingw64/bin/libquadmath-0.dll ../install/bin/

The final set of commands to deploy these binaries is therefore:

# binaries required by ElmerGUI
cp /mingw64/bin/libgcc_s_seh-1.dll ../install/bin/
cp /mingw64/bin/libdouble-conversion.dll ../install/bin/
cp /mingw64/bin/qwt-qt5.dll ../install/bin/
cp /mingw64/bin/libstdc++-6.dll ../install/bin/
cp /mingw64/bin/libwinpthread-1.dll ../install/bin/
cp /mingw64/bin/libicuin69.dll ../install/bin/
cp /mingw64/bin/libicuuc69.dll ../install/bin/
cp /mingw64/bin/libpcre2-16-0.dll ../install/bin/
cp /mingw64/bin/libharfbuzz-0.dll ../install/bin/
cp /mingw64/bin/libmd4c.dll ../install/bin/
cp /mingw64/bin/libpng16-16.dll ../install/bin/
cp /mingw64/bin/zlib1.dll ../install/bin/
cp /mingw64/bin/libzstd.dll ../install/bin/
cp /mingw64/bin/libfreetype-6.dll ../install/bin/
cp /mingw64/bin/libgraphite2.dll ../install/bin/
cp /mingw64/bin/libintl-8.dll ../install/bin/
cp /mingw64/bin/libglib-2.0-0.dll ../install/bin/
cp /mingw64/bin/libicudt69.dll ../install/bin/
cp /mingw64/bin/libiconv-2.dll ../install/bin/
cp /mingw64/bin/libbz2-1.dll ../install/bin/
cp /mingw64/bin/libbrotlidec.dll ../install/bin/
cp /mingw64/bin/libbrotlicommon.dll ../install/bin/
cp /mingw64/bin/libpcre-1.dll ../install/bin/
cp /mingw64/share/qt5/plugins/platforms/qwindows.dll ../install/bin/platforms/

# binaries required by ElmerSolver
cp /mingw64/bin/libgfortran-5.dll ../install/bin/
cp /mingw64/bin/libopenblas.dll ../install/bin/
cp /mingw64/bin/libquadmath-0.dll ../install/bin/

At this point, ElmerSolver starts up successfully but outputs an error about missing ELMERSOLVER_STARTINFO. This is an error from ElmerSolver.F90 (the Fortran 90 source code). From that code, it looks like the error is because I didn’t specify an input file name and the default file does not exist. This is the same behavior as ElmerSolver.exe from the official Elmer installation.

ELMER SOLVER (v 9.0) STARTED AT: 2022/06/26 21:55:22
ParCommInit:  Initialize #PEs:            1
MAIN:
MAIN: =============================================================
MAIN: ElmerSolver finite element software, Welcome!
MAIN: This program is free software licensed under (L)GPL
MAIN: Copyright 1st April 1995 - , CSC - IT Center for Science Ltd.
MAIN: Webpage http://www.csc.fi/elmer, Email elmeradm@csc.fi
MAIN: Version: 9.0 (Rev: af959fd0, Compiled: 2022-06-26)
MAIN:  Running one task without MPI parallelization.
MAIN:  Running with just one thread per task.
MAIN: =============================================================
ERROR:: ElmerSolver: Unable to find ELMERSOLVER_STARTINFO, can not execute.
STOP 1

Reviewing Docker Files

I was wondering after all this if there was a Windows docker file that had all these steps already baked in. However, the docker directory has only Ubuntu dockerfiles. Perhaps I could create a Windows docker file if I could figure out these dependencies.

Binaries Required by ElmerSolver

The above deployment instructions start with the binaries required by ElmerGUI but ElmerSolver is the crucial component (since Elmer can be used without the GUI). Here are the binaries grouped such that the ElmerGUI binaries can be excluded if desired.

# binaries required by ElmerSolver
cp /mingw64/bin/libgfortran-5.dll ../install/bin/
cp /mingw64/bin/libgcc_s_seh-1.dll ../install/bin/
cp /mingw64/bin/libopenblas.dll ../install/bin/
cp /mingw64/bin/libquadmath-0.dll ../install/bin/
cp /mingw64/bin/libwinpthread-1.dll ../install/bin/

# binaries required by Mesh2D
cp /mingw64/bin/libstdc++-6.dll ../install/bin/

# binaries required by ElmerGUI
cp /mingw64/bin/qwt-qt5.dll ../install/bin/
cp /mingw64/bin/libdouble-conversion.dll ../install/bin/
cp /mingw64/bin/libicuin69.dll ../install/bin/
cp /mingw64/bin/libicuuc69.dll ../install/bin/
cp /mingw64/bin/libpcre2-16-0.dll ../install/bin/
cp /mingw64/bin/libharfbuzz-0.dll ../install/bin/
cp /mingw64/bin/libmd4c.dll ../install/bin/
cp /mingw64/bin/libpng16-16.dll ../install/bin/
cp /mingw64/bin/zlib1.dll ../install/bin/
cp /mingw64/bin/libzstd.dll ../install/bin/
cp /mingw64/bin/libicudt69.dll ../install/bin/
cp /mingw64/bin/libfreetype-6.dll ../install/bin/
cp /mingw64/bin/libglib-2.0-0.dll ../install/bin/
cp /mingw64/bin/libgraphite2.dll ../install/bin/
cp /mingw64/bin/libintl-8.dll ../install/bin/
cp /mingw64/bin/libbz2-1.dll ../install/bin/
cp /mingw64/bin/libbrotlidec.dll ../install/bin/
cp /mingw64/bin/libpcre-1.dll ../install/bin/
cp /mingw64/bin/libiconv-2.dll ../install/bin/
cp /mingw64/bin/libbrotlicommon.dll ../install/bin/

cp /mingw64/share/qt5/plugins/platforms/qwindows.dll ../install/bin/platforms/
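
Rather than discovering missing DLLs one startup error at a time, the import table of an executable can be listed with objdump -p (part of binutils, which ships with the MinGW toolchain). The following is a minimal sketch, not a tested deployment script: parse_dll_names and copy_mingw_dlls are hypothetical helper names, and the ElmerSolver.exe path in the usage comment is an assumption based on the install prefix used above. It only sees direct imports, so it may need to be re-run on the copied DLLs to pick up their own dependencies.

```shell
#!/usr/bin/env bash
# Sketch: list the DLLs an executable imports and copy those found in a
# source directory (e.g. /mingw64/bin). Helper names are hypothetical.

# Read `objdump -p` output on stdin and print one imported DLL name per line.
parse_dll_names() {
  awk '/DLL Name:/ { print $3 }'
}

# Copy each DLL named on stdin from src_dir to dest_dir if it exists there.
# System DLLs (e.g. KERNEL32.dll) are not in src_dir and are skipped.
copy_mingw_dlls() {
  local src_dir="$1" dest_dir="$2" name
  mkdir -p "$dest_dir"
  while IFS= read -r name; do
    if [ -f "$src_dir/$name" ]; then
      cp "$src_dir/$name" "$dest_dir/"
    fi
  done
}

# Example (MinGW x64 shell, from the build directory; path is an assumption):
#   objdump -p ../install/bin/ElmerSolver.exe | parse_dll_names \
#     | copy_mingw_dlls /mingw64/bin ../install/bin
```

Splitting the parsing from the copying keeps each step easy to inspect: running only the first half of the pipeline prints the names without touching any files.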

Outstanding Issues

Deploying the above binaries manually gives a locally runnable build. Unfortunately, it does not fix the build created by CPack when you run make install – that build will not have any of these binaries/dependencies.

2022-06-12 —Categories: CMake, Finite Element Analysis, Programming

Investigating how to Build Elmer on Windows

The instructions for building the Elmer source code are really simple! I decided to try them on Windows. The Developer Command Prompt is necessary for cmake (as far as I can tell). Note that C, C++, and Fortran compilers are required for building Elmer.

cd \dev\repos
mkdir fem
cd fem
git clone https://github.com/ElmerCSC/elmerfem
mkdir build
cd build
cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install ../elmerfem

I discovered that a Fortran compiler is required when I got this error on my first build attempt:

-- Building for: Visual Studio 17 2022
-- The Fortran compiler identification is unknown
-- The C compiler identification is MSVC 19.32.31326.0
-- The CXX compiler identification is MSVC 19.32.31326.0
CMake Error at CMakeLists.txt:34 (PROJECT):
  No CMAKE_Fortran_COMPILER could be found.

Line 34 of CMakeLists.txt – PROJECT(Elmer Fortran C CXX) – uses the PROJECT cmake command to set the project name to “Elmer” and specify the programming languages required, hence the build failure above.

Installing a Fortran Compiler – GFortran?

GFortran looks like the only free Fortran compiler out there so I grabbed the compiler from Fortran, C, C++ for Windows (equation.com) as recommended by Installing GFortran – (fortran-lang.org). The newly installed Fortran compiler was not automatically detected by CMake. Based on the discussion at c++ – Error: No CMAKE_Fortran_COMPILER could be found for Visual Studio 2019 Fortran support – Stack Overflow, I made this change to CMakeLists.txt to pick up the GFortran compiler:

--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -21,6 +21,8 @@ if(APPLE)
   # option(HUNTER_ENABLED "Enable Hunter package manager support" OFF)
   # set (CMAKE_GENERATOR "Unix Makefiles" CACHE INTERNAL "" FORCE)
   # set(CMAKE_TRY_COMPILE_TARGET_TYPE "STATIC_LIBRARY")
+else()
+  set(CMAKE_Fortran_COMPILER "C:/dev/software/gcc/bin/gfortran.exe")
 endif()

Unfortunately, that wasn’t sufficient to address the build failure. Interestingly, someone else ran into this exact same issue at windows – The MinGW gfortran compiler is not able to compile a simple test program – Stack Overflow. Sad times though when StackOverflow does not have an answer! Their solution for specifying a custom compiler is much cleaner – simply define the CMake variable when invoking cmake!

cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=C:/dev/software/gcc/bin/gfortran.exe ../elmerfem

Searching for the error message “The Fortran compiler identification is unknown” (bing.com) reveals an existing GitHub issue: Cannot build using cmake with gfortran on Windows -- the Fortran compiler identification is unknown · Issue #328 · fortran-lang/stdlib. Someone there mentioned that the MinGW compiler worked fine.

Installing a Fortran Compiler – MinGW

Via Cygwin

The MinGW-w64 downloads looked promising. Since I already had Cygwin installed, I installed the GFortran package. The path to the GFortran compiler can be retrieved using the Cygwin command cygpath -w `which gfortran` and passed to CMake. That still didn’t work.

setup-x86_64.exe -q -P gcc-fortran

cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=C:/dev/cygwin64/bin/gfortran.exe ../elmerfem

At least that showed the mingw Fortran compiler package name mingw64-x86_64-gcc-fortran. Interestingly, that package is marked already installed!

Via MSYS2

Since Cygwin didn’t just work, I decided to try installing MSYS2 (before resorting to uninstalling the Cygwin gcc-fortran package). The Fortran compiler is installed through MSYS2. Once setup completes, however, CMake also fails when using the MinGW Fortran compiler!

cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe ../elmerfem

Debugging the Fortran Detection Failure

Since none of the compilers work, let’s take a closer look at the error:

$ cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=C:/dev/software/gcc/bin/gfortran.exe ../elmerfem
-- The Fortran compiler identification is unknown
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - failed
-- Check for working Fortran compiler: C:/dev/software/gcc/bin/gfortran.exe
-- Check for working Fortran compiler: C:/dev/software/gcc/bin/gfortran.exe - broken
CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/CMakeTestFortranCompiler.cmake:61 (message):
  The Fortran compiler

    "C:/dev/software/gcc/bin/gfortran.exe"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: D:/dev/repos/fem/build/CMakeFiles/CMakeTmp

    Run Build Command(s):C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_4528a &&
    Microsoft Visual Studio 2022 Version 17.3.0 Preview 1.0 [...].
    Copyright (C) Microsoft Corp. All rights reserved.

    The operation could not be completed. The parameter is incorrect.

    Use:
    devenv  [solutionfile | projectfile | folder | anyfile.ext]  [switches]
...

To get a sense of what could be going wrong, I opened the folder containing the temporary project CMake is trying to build. Its contents are deleted before CMake terminates. However, the build was slow enough for me to copy all the files into another temp folder to repro this failure. Running the devenv.com command above fails with the same error.
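
Copying the files before they disappear works, but CMake also has a --debug-trycompile option that tells it not to delete the try-compile directory (it is intended for debugging one failing check at a time). The sketch below wraps that flag in a small function; configure_keep_trycompile and the CMAKE_BIN variable are hypothetical names used only for this example, and the flag itself can be added to the cmake command line in any shell.

```shell
# Sketch: re-run the configure step with --debug-trycompile so that CMake
# leaves the try-compile project behind instead of deleting it.
# configure_keep_trycompile and CMAKE_BIN are hypothetical names.
configure_keep_trycompile() {
  local source_dir="$1"
  shift
  # Remaining arguments are passed through unchanged (e.g. -D options).
  "${CMAKE_BIN:-cmake}" --debug-trycompile "$@" "$source_dir"
}

# Example:
#   configure_keep_trycompile ../elmerfem \
#     -DCMAKE_Fortran_COMPILER=C:/dev/software/gcc/bin/gfortran.exe
```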

Interestingly, loading the solution in Visual Studio results in an error because one of the projects cannot be loaded! However, that project file has a .vfproj extension (which seems specific to the Intel Fortran compiler, e.g. as described at Cannot open vfproj file in visual studio 2017 – Intel Communities).

Looks like it’s the CMakeTestFortranCompiler.cmake file that is generating Intel Fortran projects. The first check in that file is:

if(CMAKE_Fortran_COMPILER_FORCED)
  # The compiler configuration was forced by the user.
  # Assume the user has configured all compiler information.
  set(CMAKE_Fortran_COMPILER_WORKS TRUE)
  return()
endif()

Setting the CMAKE_Fortran_COMPILER_FORCED variable makes this module return before it runs the compiler check, so define it when invoking cmake:

cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE ../elmerfem

We now get a new error! Finally making some progress!

cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE ../elmerfem
-- The Fortran compiler identification is unknown
CMake Deprecation Warning at cmake/Modules/FindMKL.cmake:2 (CMAKE_MINIMUM_REQUIRED):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
  CMakeLists.txt:308 (FIND_PACKAGE)


-- ------------------------------------------------
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - not found
-- Looking for pthread.h
-- Looking for pthread.h - not found
-- Found Threads: TRUE
CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find BLAS (missing: BLAS_LIBRARIES)
Call Stack (most recent call first):
  C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules/FindBLAS.cmake:1337 (find_package_handle_standard_args)
  CMakeLists.txt:433 (FIND_PACKAGE)


-- Configuring incomplete, errors occurred!
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeOutput.log".
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeError.log".

This error is from the FindBLAS module (see FindBLAS source code I’ve linked to in the error log above). It should be able to find BLAS as per this question Can CMake FindBLAS find OpenBLAS? – Stack Overflow.

Related search: https://duckduckgo.com/?q=gfortran+blas, which leads to fortran – Error in linking gfortran to LAPACK and BLAS – Stack Overflow.

Installing BLAS

Searching for “pacman blas” leads to fortran – Using BLAS, LAPACK, and ARPACK with MSYS2 – Stack Overflow which points out that you can search for packages using pacman -Ss. The -S flag stands for sync. Use pacman -Sh to see the package sync options. See Package Management – MSYS2 for more details.

# Search for BLAS packages
pacman -Ss blas

# Install mingw BLAS package
pacman -S mingw64/mingw-w64-x86_64-openblas

# Install LAPACK
pacman -S mingw64/mingw-w64-x86_64-lapack

The output should look like this when complete:

$ pacman -S mingw64/mingw-w64-x86_64-openblas
resolving dependencies...
looking for conflicting packages...

Packages (1) mingw-w64-x86_64-openblas-0.3.20-3

Total Download Size:    11.76 MiB
Total Installed Size:  103.67 MiB

:: Proceed with installation? [Y/n] y
:: Retrieving packages...
 mingw-w64-x86_64-openblas-0.3.20-3-any                                                                                                 11.8 MiB  2.26 MiB/s 00:05 [#####...#####] 100%
(1/1) checking keys in keyring             [#####...#####] 100%
(1/1) checking package integrity           [#####...#####] 100%
(1/1) loading package files                [#####...#####] 100%
(1/1) checking for file conflicts          [#####...#####] 100%
(1/1) checking available disk space        [#####...#####] 100%
:: Processing package changes...
(1/1) installing mingw-w64-x86_64-openblas [#####...#####] 100%
Set the environment variable OPENBLAS_NUM_THREADS to the
number of threads to use.

This doesn’t address the errors. A search for the exact error message “Could NOT find BLAS (missing: BLAS_LIBRARIES)” reveals a useful GitHub discussion at find_package(BLAS) failed with CMake · Issue #2440 · mxe/mxe. So BLAS_LIBRARIES can simply be defined at the command line!

cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DBLAS_LIBRARIES=D:/dev/Software/msys64/mingw64/lib ../elmerfem

We now get a new error about LAPACK_LIBRARIES, so define it as well!

cmake -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DBLAS_LIBRARIES=D:/dev/Software/msys64/mingw64/lib -DLAPACK_LIBRARIES=D:/dev/Software/msys64/mingw64/lib ../elmerfem

This finally gets us past the missing package issues and on to more Fortran compiler errors!

-- Found LAPACK: D:/dev/Software/msys64/mingw64/lib
-- Checking whether D:/dev/Software/msys64/mingw64/bin/gfortran.exe supports PROCEDURE POINTER
-- Checking whether D:/dev/Software/msys64/mingw64/bin/gfortran.exe supports PROCEDURE POINTER -- no
CMake Error at CMakeLists.txt:477 (MESSAGE):
  Fortran compiler does not seem to support the PROCEDURE statement.

Support for PROCEDURE Statements

Line 475 of CMakeLists.txt is INCLUDE(testProcedurePointer). The included script tests the Fortran compiler but does not explain why the test fails. To see the details, append the string : ${OUTPUT} to the end of the message “Checking whether ${CMAKE_Fortran_COMPILER} supports PROCEDURE POINTER -- no” (just before the closing quote). The error message now contains additional information: the same error from earlier! Opening the solution in Visual Studio confirms that yet another unsupported .vfproj has been generated.

Change Dir: D:/dev/repos/fem/build/CMakeFiles/CMakeTmp

Run Build Command(s):C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/devenv.com CMAKE_TRY_COMPILE.sln /build Debug /project cmTC_77a33 &&
Microsoft Visual Studio 2022 Version 17.3.0 Preview 1.0 [...].
Copyright (C) Microsoft Corp. All rights reserved.

The operation could not be completed. The parameter is incorrect.

Use:
devenv  [solutionfile | projectfile | folder | anyfile.ext]  [switches]

(I updated Visual Studio at this point, which unfortunately changed the CMake version.) This is the CMakeLists.txt generated for the solution:

cmake_minimum_required(VERSION 3.22.22022201.0)
set(CMAKE_MODULE_PATH "D:/dev/repos/fem/elmerfem/cmake/Modules;C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.22/Modules")
cmake_policy(SET CMP0091 OLD)
cmake_policy(SET CMP0126 OLD)
project(CMAKE_TRY_COMPILE Fortran)
set(CMAKE_VERBOSE_MAKEFILE 1)
set(CMAKE_Fortran_FLAGS "")
set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} ${COMPILE_DEFINITIONS}")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${EXE_LINKER_FLAGS}")
include_directories(${INCLUDE_DIRECTORIES})
set(CMAKE_SUPPRESS_REGENERATION 1)
link_directories(${LINK_DIRECTORIES})
cmake_policy(SET CMP0065 OLD)
cmake_policy(SET CMP0083 OLD)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "D:/dev/repos/fem/build/CMakeFiles/CMakeTmp")
add_executable(cmTC_b909d "D:/dev/repos/fem/build/CMakeFiles/CMakeTmp/testFortranProcedurePointer.f90")
target_link_libraries(cmTC_b909d ${LINK_LIBRARIES})

The cmake project command names the generated project CMAKE_TRY_COMPILE and specifies that the Fortran programming language is needed to build the project. At this point, it looks like a question of how the project is generated. Searching the cmake sources for “.vfproj” leads to the documentation at Help/variable/CMAKE_MAKE_PROGRAM.rst · v3.22.0 · CMake. Turns out this is simply the public documentation at CMAKE_MAKE_PROGRAM — CMake 3.23.2 Documentation. Finally get to the generators docs at cmake-generators(7) — CMake 3.22.5.

If the Visual Studio generator is not appropriate, then which one is? Since I’m using MSYS2, I wonder if the MSYS generator is better suited to this build task. Come to think of it, I saw some discussion of makefile generators, e.g. in Cannot build using cmake with gfortran on Windows — the Fortran compiler identification is unknown · Issue #328 · fortran-lang/stdlib. Sure enough, the cmake options docs say -G is how you choose the generator:

cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DBLAS_LIBRARIES=D:/dev/Software/msys64/mingw64/lib -DLAPACK_LIBRARIES=D:/dev/Software/msys64/mingw64/lib ../elmerfem

That does not work though (in my Developer Command Prompt):

CMake Error: CMake was unable to find a build program corresponding to "MinGW Makefiles".  CMAKE_MAKE_PROGRAM is not set.  You probably need to select a different build tool.
CMake Error: CMake was unable to find a build program corresponding to "MinGW Makefiles".  CMAKE_MAKE_PROGRAM is not set.  You probably need to select a different build tool.
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!

Looks like I need to try this process in MSYS2.

Custom Generator in MSYS

Running which cmake in MSYS did not find cmake so here’s the version I installed.

$ pacman -Ss cmake
...
mingw64/mingw-w64-x86_64-cmake 3.23.2-1
    A cross-platform open-source make system (mingw-w64)
...
$ pacman -S mingw64/mingw-w64-x86_64-cmake

This doesn’t result in being able to run cmake.exe (even though it exists on disk in D:\dev\Software\msys64\mingw64\bin). Time to hit the docs again: msys2 cmake – Search (bing.com) -> Using CMake in MSYS2 – MSYS2. No red flags there… How about a search for the exact error message: msys bash: cmake: command not found – Search (bing.com) -> c++ – CMake is not found when running through make – Stack Overflow. Aha! The answer there about launching MSYS2 using mingw32.exe leads me to inquire about how I’m launching MSYS2. Turns out I’m launching using the last shortcut below (which launches “D:\dev\Software\msys64\msys2_shell.cmd -msys”) instead of MinGW x64.lnk (which launches “D:\dev\Software\msys64\msys2_shell.cmd -mingw64”). Sure enough, in a shell started from MinGW x64.lnk, which cmake now shows /mingw64/bin/cmake.

 Directory of C:\Users\USERNAME\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\MSYS2 64bit

MSYS2 MinGW Clang x64.lnk
MSYS2 MinGW UCRT x64.lnk
MSYS2 MinGW x64.lnk
MSYS2 MinGW x86.lnk
MSYS2 MSYS.lnk
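
A quick way to tell which of these environments a shell belongs to is the MSYSTEM environment variable, which MSYS2 sets to the environment name (MSYS, MINGW64, UCRT64 and so on). The snippet below is a small sketch of that check; is_mingw64_shell is a hypothetical helper name, and the messages are only illustrative.

```shell
# Sketch: report whether the current MSYS2 shell is the MinGW x64 environment.
# MSYS2 sets MSYSTEM to the environment name (MSYS, MINGW64, UCRT64, ...).
# is_mingw64_shell is a hypothetical helper name.
is_mingw64_shell() {
  [ "${MSYSTEM:-}" = "MINGW64" ]
}

if is_mingw64_shell; then
  echo "MinGW x64 shell: tools in /mingw64/bin such as cmake should be on PATH"
else
  echo "MSYSTEM is '${MSYSTEM:-unset}': start the 'MSYS2 MinGW x64' shortcut instead"
fi
```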

Custom Generator in MinGW

Retrying the command line now makes progress! Notice the Fortran compiler is successfully detected (and the GNU C++ compiler is also selected).

$ cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DBLAS_LIBRARIES=D:/dev/Software/msys64/mingw64/lib -DLAPACK_LIBRARIES=D:/dev/Software/msys64/mingw64/lib ../elmerfem
-- The Fortran compiler identification is GNU 12.1.0
-- The C compiler identification is GNU 12.1.0
-- The CXX compiler identification is GNU 12.1.0
...

The build fails but things are very promising now. The error is because Qt is missing:

--   Building ElmerGUI
-- ------------------------------------------------
CMake Deprecation Warning at ElmerGUI/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


CMake Warning at ElmerGUI/CMakeLists.txt:19 (find_package):
  By not providing "FindQt5.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Qt5", but
  CMake did not find one.

  Could not find a package configuration file provided by "Qt5" with any of
  the following names:

    Qt5Config.cmake
    qt5-config.cmake

  Add the installation prefix of "Qt5" to CMAKE_PREFIX_PATH or set "Qt5_DIR"
  to a directory containing one of the above files.  If "Qt5" provides a
  separate development package or SDK, be sure it has been installed.


-- ------------------------------------------------
CMake Error at D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindQt4.cmake:1314 (message):
  Found unsuitable Qt version "" from NOTFOUND, this code requires Qt 4.x
Call Stack (most recent call first):
  ElmerGUI/CMakeLists.txt:42 (FIND_PACKAGE)

Installing Qt5 does not address the build failure. The new error message:

--   Building ElmerGUI
-- ------------------------------------------------
CMake Deprecation Warning at ElmerGUI/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
  Compatibility with CMake < 2.8.12 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- ------------------------------------------------
-- Qt5 Windows packaging
--   [ElmerGUI] Qt5:               1
--   [ElmerGUI] Qt5 Libraries: Qt5::OpenGL Qt5::Xml Qt5::Script Qt5::Gui Qt5::Core
-- ------------------------------------------------
CMake Warning (dev) at D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
  The package name passed to `find_package_handle_standard_args` (OpenGL)
  does not match the name of the calling package (Qwt).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindOpenGL.cmake:443 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
  ElmerGUI/cmake/Modules/FindQwt.cmake:10 (INCLUDE)
  ElmerGUI/CMakeLists.txt:61 (FIND_PACKAGE)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found OpenGL: opengl32
CMake Warning (dev) at D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
  The package name passed to `find_package_handle_standard_args` (Qt3) does
  not match the name of the calling package (Qwt).  This can lead to problems
  in calling code that expects `find_package` result variables (e.g.,
  `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindQt3.cmake:213 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
  D:/dev/Software/msys64/mingw64/share/cmake/Modules/FindQt.cmake:160 (include)
  ElmerGUI/cmake/Modules/FindQwt.cmake:11 (INCLUDE)
  ElmerGUI/CMakeLists.txt:61 (FIND_PACKAGE)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Could NOT find Qt3 (missing: QT_QT_LIBRARY QT_INCLUDE_DIR)
CMake was unable to find desired Qt version: 3. Set advanced values QT_QMAKE_EXECUTABLE and QT3_QGLOBAL_H_FILE.
--   [ElmerGUI] Qwt:             FALSE
--   [ElmerGUI] QWT_LIBRARY:     QWT_LIBRARY-NOTFOUND
--   [ElmerGUI] QWT_INCLUDE_DIR: QWT_INCLUDE_DIR-NOTFOUND
-- ------------------------------------------------
CMake Warning (dev) at D:/dev/Software/msys64/mingw64/lib/cmake/Qt5Core/Qt5CoreMacros.cmake:44 (message):
  qt5_use_modules is not part of the official API, and might be removed in Qt
  6.
Call Stack (most recent call first):
  D:/dev/Software/msys64/mingw64/lib/cmake/Qt5Core/Qt5CoreMacros.cmake:431 (_qt5_warn_deprecated)
  ElmerGUI/Application/CMakeLists.txt:216 (QT5_USE_MODULES)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- ------------------------------------------------
--   BLAS library:   D:/dev/Software/msys64/mingw64/lib
--   LAPACK library: D:/dev/Software/msys64/mingw64/lib
-- ------------------------------------------------
--   Fortran compiler:        D:/dev/Software/msys64/mingw64/bin/gfortran.exe
--   Fortran flags:            -fallow-argument-mismatch -O2 -g -DNDEBUG
-- ------------------------------------------------
--   C compiler:              D:/dev/Software/msys64/mingw64/bin/cc.exe
--   C flags:                  -O2 -g -DNDEBUG
-- ------------------------------------------------
--   CXX compiler:            D:/dev/Software/msys64/mingw64/bin/c++.exe
--   CXX flags:                -O2 -g -DNDEBUG
-- ------------------------------------------------
-- ------------------------------------------------
--   Package filename: elmerfem-9.0--20220612_Windows-AMD64
--   Patch version: 9.0-
CMake Error at cpack/ElmerCPack.cmake:99 (INSTALL):
  INSTALL FILES given directory "D:/dev/Software/msys64/mingw64/lib" to
  install.
Call Stack (most recent call first):
  CMakeLists.txt:660 (INCLUDE)


-- Configuring incomplete, errors occurred!
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeOutput.log".
See also "D:/dev/repos/fem/build/CMakeFiles/CMakeError.log".

Does this need Qt3? The ElmerGUI documentation says Qt4 (4.8 or higher). FindQt.cmake:160 (in the Qt3 call stack above) appears to indicate that only Qt versions 3 and 4 are supported in MinGW. The mix of warnings and “could not find” makes it hard to know exactly what is wrong. The last error, for example, appears to be about the installation files directory. So is there anything wrong with Qt? I’ll assume not.

The cmake docs on installing files don’t point to anything peculiar in this scenario, but this is a hint that my LAPACK_LIBRARIES variable is most likely wrong. Let’s drop it altogether:

# Clean up old make files
# rm -fr *

cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DBLAS_LIBRARIES=D:/dev/Software/msys64/mingw64/lib ../elmerfem
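
The clean-up step matters because CMake stores -D values in CMakeCache.txt: a variable that is merely dropped from the command line keeps its old cached value until the cache is cleared (individual entries can also be removed with cmake’s -U option). Since rm -fr * is unforgiving when run in the wrong directory, here is a more cautious sketch; clean_cmake_cache is a hypothetical helper name, and it only removes the cache file and the CMakeFiles directory rather than everything in the build directory.

```shell
# Sketch: remove CMake's cached state from a build directory.
# clean_cmake_cache is a hypothetical helper name.
clean_cmake_cache() {
  local build_dir="${1:-.}"
  # Refuse to touch a directory that does not look like a CMake build tree.
  if [ ! -f "$build_dir/CMakeCache.txt" ]; then
    echo "no CMakeCache.txt in $build_dir, nothing removed" >&2
    return 1
  fi
  rm -rf "$build_dir/CMakeCache.txt" "$build_dir/CMakeFiles"
}

# Example (from the build directory), followed by re-running cmake:
#   clean_cmake_cache .
```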

The build still fails, but right before the error, notice that the LAPACK library entry now starts with an import library (libopenblas.dll.a) instead of being just a directory (below)!

-- ------------------------------------------------
--   BLAS library:   D:/dev/Software/msys64/mingw64/lib
--   LAPACK library: D:/dev/Software/msys64/mingw64/lib/libopenblas.dll.a;D:/dev/Software/msys64/mingw64/lib
-- ------------------------------------------------
--   Fortran compiler:        D:/dev/Software/msys64/mingw64/bin/gfortran.exe
--   Fortran flags:            -fallow-argument-mismatch -O2 -g -DNDEBUG
-- ------------------------------------------------
--   C compiler:              D:/dev/Software/msys64/mingw64/bin/cc.exe
--   C flags:                  -O2 -g -DNDEBUG
-- ------------------------------------------------
--   CXX compiler:            D:/dev/Software/msys64/mingw64/bin/c++.exe
--   CXX flags:                -O2 -g -DNDEBUG
-- ------------------------------------------------
-- ------------------------------------------------
--   Package filename: elmerfem-9.0--20220612_Windows-AMD64
--   Patch version: 9.0-
CMake Error at cpack/ElmerCPack.cmake:99 (INSTALL):
  INSTALL FILES given directory "D:/dev/Software/msys64/mingw64/lib" to
  install.

So now it makes sense to drop the BLAS_LIBRARIES definition as well!

# Clean up old make files
# rm -fr *

cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE ../elmerfem

The BLAS and LAPACK detection now succeeds, as indicated by the selection of libopenblas.dll.a as both the BLAS and the LAPACK library. Configuration completes, but generation then fails on two Qwt variables.

-- ------------------------------------------------
--   BLAS library:   D:/dev/Software/msys64/mingw64/lib/libopenblas.dll.a
--   LAPACK library: D:/dev/Software/msys64/mingw64/lib/libopenblas.dll.a
-- ------------------------------------------------
--   Fortran compiler:        D:/dev/Software/msys64/mingw64/bin/gfortran.exe
--   Fortran flags:            -fallow-argument-mismatch -O2 -g -DNDEBUG
-- ------------------------------------------------
--   C compiler:              D:/dev/Software/msys64/mingw64/bin/cc.exe
--   C flags:                  -O2 -g -DNDEBUG
-- ------------------------------------------------
--   CXX compiler:            D:/dev/Software/msys64/mingw64/bin/c++.exe
--   CXX flags:                -O2 -g -DNDEBUG
-- ------------------------------------------------
-- ------------------------------------------------
--   Package filename: elmerfem-9.0--20220612_Windows-AMD64
--   Patch version: 9.0-
-- Configuring done
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
QWT_INCLUDE_DIR (ADVANCED)
   used as include directory in directory D:/dev/repos/fem/elmerfem/ElmerGUI/Application
   ...
   used as include directory in directory D:/dev/repos/fem/elmerfem/ElmerGUI/Application
QWT_LIBRARY (ADVANCED)
    linked by target "ElmerGUI" in directory D:/dev/repos/fem/elmerfem/ElmerGUI/Application
...

Looks like I now need to define QWT_INCLUDE_DIR and QWT_LIBRARY. Hmm, I don’t think I even installed QWT.

$ pacman -S mingw64/mingw-w64-x86_64-qwt-qt5
resolving dependencies...
looking for conflicting packages...

Packages (1) mingw-w64-x86_64-qwt-qt5-6.2.0-5

Total Download Size:    29.17 MiB
Total Installed Size:  175.53 MiB

:: Proceed with installation? [Y/n] y
:: Retrieving packages...
 mingw-w64-x86_64-qwt-qt5-6.2.0-5-any                                                                                  29.2 MiB  1136 KiB/s 00:26 [###...###] 100%
(1/1) checking keys in keyring                                                                                                                    [###...###] 100%
(1/1) checking package integrity                                                                                                                  [###...###] 100%
(1/1) loading package files                                                                                                                       [###...###] 100%
(1/1) checking for file conflicts                                                                                                                 [###...###] 100%
(1/1) checking available disk space                                                                                                               [###...###] 100%
:: Processing package changes...
(1/1) installing mingw-w64-x86_64-qwt-qt5                                                                                                         [#########################################################################################] 100%
Optional dependencies for mingw-w64-x86_64-qwt-qt5
    mingw-w64-x86_64-qt5-tools [installed]
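
To find out where the package put its headers, pacman -Ql lists every file a package owns, one “package path” pair per line. The sketch below filters such a list for a header name and prints its directory. header_dir is a hypothetical helper name, and qwt.h is assumed (not verified here) to be the header that ElmerGUI’s FindQwt module searches for.

```shell
# Sketch: find the directory that contains a given header in a package file list.
# Input on stdin is `pacman -Ql <package>` output: "<package> <path>" per line.
# header_dir is a hypothetical helper name; qwt.h as the header to look for
# is an assumption.
header_dir() {
  local header="$1" pkg path
  while read -r pkg path; do
    # Compare the file name part of the path with the requested header.
    if [ "${path##*/}" = "$header" ]; then
      printf '%s\n' "${path%/*}"
    fi
  done
}

# Example (MinGW x64 shell); cygpath -m converts the result to a D:/... style path:
#   cygpath -m "$(pacman -Ql mingw-w64-x86_64-qwt-qt5 | header_dir qwt.h)"
```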

Now that QWT is installed, we can set the include directory as follows:

cmake -G "MinGW Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DQWT_INCLUDE_DIR=D:/dev/Software/msys64/mingw64/include/qwt-qt5/ ../elmerfem

CMake finally succeeds! The output ends with these lines:

-- Generating done
-- Build files have been written to: D:/dev/repos/fem/build

The generated Makefile has targets such as ElmerGUI, elmersolver, AdvectionDiffusion, FluxSolver, etc. The strange thing is that it has a line that sets SHELL = cmd.exe and so a Windows command prompt is launched when you run make.

#==================================================================
# Target rules for targets named ElmerGUI

# Build rule for target.
ElmerGUI: cmake_check_build_system
	$(MAKE) $(MAKESILENT) -f CMakeFiles\Makefile2 ElmerGUI
.PHONY : ElmerGUI

# fast build rule for target.
ElmerGUI/fast:
	$(MAKE) $(MAKESILENT) -f ElmerGUI\Application\CMakeFiles\ElmerGUI.dir\build.make ElmerGUI/Application/CMakeFiles/ElmerGUI.dir/build
.PHONY : ElmerGUI/fast
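
The SHELL setting can be checked without opening the generated Makefile in an editor. This is a minimal sketch; makefile_shell is a hypothetical helper name, and it simply prints whatever follows the first “SHELL =” assignment.

```shell
# Sketch: print the value assigned to SHELL in a generated Makefile.
# makefile_shell is a hypothetical helper name.
makefile_shell() {
  local makefile="${1:-Makefile}"
  # Strip the "SHELL = " prefix from matching lines and keep the first match.
  sed -n 's/^SHELL[[:space:]]*=[[:space:]]*//p' "$makefile" | head -n 1
}

# Example (from the build directory):
#   makefile_shell Makefile
```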

After some digging around (searching for mingw cmake shell at DuckDuckGo), I read that makefiles from the MinGW Makefiles generator are meant for use with mingw32-make under a Windows command prompt. Looks like I need the MSYS Makefiles generator.

cmake -G "MSYS Makefiles" -DWITH_ELMERGUI:BOOL=TRUE -DWITH_MPI:BOOL=FALSE -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_Fortran_COMPILER=D:/dev/Software/msys64/mingw64/bin/gfortran.exe -DCMAKE_Fortran_COMPILER_FORCED:BOOL=TRUE -DQWT_INCLUDE_DIR=D:/dev/Software/msys64/mingw64/include/qwt-qt5/ ../elmerfem

Now we see the expected SHELL = /bin/sh and running make actually causes code to start building! What a journey! I will write another post with simplified instructions for how to build Elmer (on Windows).

$ make
[  0%] Building C object matc/src/CMakeFiles/matc.dir/c3d.c.obj
[  0%] Building C object matc/src/CMakeFiles/matc.dir/clip.c.obj
[  0%] Building C object matc/src/CMakeFiles/matc.dir/dri_ps.c.obj
[  0%] Building C object matc/src/CMakeFiles/matc.dir/eig.c.obj
...

2022-04-18 —Categories: Graphics

OpenGL Programming Gotchas

It has been a while since I wrote graphics/rendering code. The bugs are very different from those I typically write/fix, so I thought I might as well share the types of issues I dealt with. Here are some of the pitfalls I encountered:

Not checking all OpenGL API results. Continuing execution when vertex/fragment shader compilation failed, for example, wastes a ton of time debugging downstream failures.
Assuming successful shader compilation means that you can assign values to every shader uniform you declared! If the uniform is not actually used to generate the shader’s output, the compiler (which I learned lives in the graphics driver) can eliminate the uniform, thereby causing attempts to set it to fail.
Using glVertexAttribPointer instead of glVertexAttribIPointer to pass integer IDs needed by a fragment shader. Wasted so much time on this because I was feeling schedule pressure and didn’t carefully read the documentation. Since I set the normalized parameter to GL_FALSE, the IDs were being converted into floats directly without normalization. TODO: study this behavior now to see exactly which floats ended up in the shader.
Passing a count of 0 to glDrawArrays. The count argument specifies the number of indices to be rendered. Took me a while to figure out why nothing was showing up after some refactoring that I did. Turns out the number of vertices in the class I created was 0. An assertion here would have saved a ton of time.
Mismatched vertex attribute formats. Spiky rendered output instead of a shape like a cube/sphere makes this one rather easy to detect. In my case, I was using 2 structs and one had an extra ID field that ended up being vertex data when the other type of struct was passed to the rendering code.
Passing GL_TEXTURE0 to a sampler2D uniform instead of 0! This was a typo that I didn’t catch in course slides.
Mixing up sizedInternalFormat and internalFormat values in calls to glTextureStorage2D and glTextureSubImage2D.
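On the TODO in the glVertexAttribPointer item: here is a CPU-side sketch of one plausible explanation. It assumes that with normalized set to GL_FALSE each integer is converted to a float by value (as that item already says), and it further assumes, as a guess, that the shader then saw the raw bits of that float where it expected an integer. attribAsFloat and floatBits are hypothetical helpers, not OpenGL calls.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value conversion assumed for normalized == GL_FALSE: 7 becomes 7.0f.
// IDs at or beyond 2^24 would no longer be exactly representable as a float.
inline float attribAsFloat(std::int32_t id) {
    assert(id > -(1 << 24) && id < (1 << 24));
    return static_cast<float>(id);
}

// Raw IEEE-754 bit pattern of a float, i.e. what an integer view of it would show.
inline std::uint32_t floatBits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Under those assumptions an ID of 7 arrives as 7.0f, whose bit pattern is 0x40E00000 (1088421888 when read as an integer), which would explain IDs that look like huge unrelated numbers.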

Buffer Texture Issues

Shader Compilation Errors

When storing vertex shader data into a buffer texture, the OpenGL APIs were used to create and bind the buffer: glCreateBuffers, glNamedBufferStorage, glCreateTextures, glTextureBuffer, glBindImageTexture. A snippet of the vertex shader code to write into the buffer is shown below. Unfortunately, the shader compilation failed with this error: 0(69) : error C1115: unable to find compatible overloaded function “imageStore(struct image1D4x32_bindless, int, struct DebugData)”.

layout(binding=0, rgba32f) uniform image1D bufferTexture;
...
struct DebugData
{
    int vertexId;
    vec4 vPosition;
    vec4 modelViewVertexPosition;
    vec4 glPosition;
    vec3 vNormal;
    vec3 transformedNormal;
};
...
DebugData debugData;
debugData.vertexId = faceId;
debugData.vPosition = position;
debugData.modelViewVertexPosition = vEyeSpacePosition;
debugData.glPosition = gl_Position;
debugData.vNormal = vNormal;
debugData.transformedNormal = vEyeSpaceNormal;

imageStore(bufferTexture, gl_VertexID, debugData);

I ended up having to define a const int numVec4sPerStruct = 6; and call imageStore for each member of the struct, e.g. imageStore(bufferTexture, gl_VertexID * numVec4sPerStruct + 1, debugData.vPosition);. See related discussions here and here.
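The texel index arithmetic from that workaround can be sketched in C++ for the host side that reads the buffer back. The slot order below is an assumption that simply mirrors the member order of DebugData (only slot 1 for vPosition is confirmed by the imageStore call above); DebugSlot and texelIndex are hypothetical names.

```cpp
#include <cassert>

// Number of vec4 texels reserved per vertex (numVec4sPerStruct in the shader).
const int kNumVec4sPerStruct = 6;

// Assumed slot order, one vec4 texel per DebugData member.
enum DebugSlot {
    VertexId = 0,
    VPosition = 1,
    ModelViewVertexPosition = 2,
    GlPosition = 3,
    VNormal = 4,
    TransformedNormal = 5
};

// Index of the texel holding 'slot' for the given vertex.
inline int texelIndex(int vertexId, DebugSlot slot) {
    assert(vertexId >= 0);
    return vertexId * kNumVec4sPerStruct + static_cast<int>(slot);
}
```

With this layout the int and vec3 members each still occupy a full vec4 texel, so they have to be widened before being stored.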

Invalid Shader Data in Buffers

The output written into the texture/shader storage buffers was all 0s. I tried using imageBuffer instead of image1D to avoid getting all zeros when reading the texture image buffer. The code looked correct but I couldn’t explain why zeros were being read back despite the rendered output looking correct. To figure out why this could be happening, I initialized the texture memory to a known value (integer value -1, which turns out to be a NaN when interpreted as a float). This made it easier to explain the random crap that was being displayed (hint from Stack Overflow) since it made it obvious that many memory locations were not being written to. Here is a snippet of the fragment shader:

struct FragmentData
{
    uint uniqueFragmentId;
    vec2 texCoords;
    vec4 glFragCoord;
    uint faceId;
};

layout(std430, binding=1) buffer DebugFragmentData
{
    FragmentData fragmentData[];
} outputFragmentData;

The fragment shader’s main method wrote to the shader storage like this:

outputFragmentData.fragmentData[uniqueFragmentId].glFragCoord = gl_FragCoord;
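The sentinel trick mentioned earlier (filling the memory with integer -1, i.e. all 0xFF bytes) can be sketched on the host as follows. This is a minimal illustration using a plain std::vector rather than a mapped OpenGL buffer; makeSentinelBuffer and countUnwritten are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Create 'count' floats whose bytes are all 0xFF. Each one reads back as a NaN.
inline std::vector<float> makeSentinelBuffer(std::size_t count) {
    std::vector<float> buffer(count);
    std::memset(buffer.data(), 0xFF, count * sizeof(float));
    assert(count == 0 || std::isnan(buffer[0]));
    return buffer;
}

// Count the floats that still hold the sentinel, i.e. were never written.
inline std::size_t countUnwritten(const std::vector<float>& buffer) {
    std::size_t unwritten = 0;
    for (float value : buffer) {
        std::uint32_t bits = 0;
        std::memcpy(&bits, &value, sizeof bits);
        if (bits == 0xFFFFFFFFu) ++unwritten;
    }
    return unwritten;
}
```

Comparing bit patterns rather than using == matters here because a NaN never compares equal to anything, including itself.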

Initializing the storage to all 0xFF bytes made it possible to conclude that the host was reading the wrong data, more specifically that the shader was writing to different locations than the host expected! Who knew structs and alignment were a thing (TODO: link to alignment discussion in GLSL spec)! The host needed to interpret the shader storage using this struct:

struct FragmentData
{
    unsigned int uniqueFragmentId;
    unsigned int fourBytesForAlignment1;
    TextureCoord2f texCoords;
    VertexCoord4f glFragCoord;
    unsigned int faceId;
    unsigned int fourBytesForAlignment2;
    unsigned int fourBytesForAlignment3;
    unsigned int fourBytesForAlignment4;
};

Also see the discussion in the GLSL spec about std430 vs std140 for shader storage blocks!
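A way to catch this kind of mismatch at compile time is to mirror the block in a host struct and assert the offsets. The sketch below assumes the std430 rules that a vec2 aligns to 8 bytes, a vec4 aligns to 16 bytes, and the struct's array stride is rounded up to its largest member alignment (16 here). Vec2f, Vec4f and FragmentDataHost are hypothetical stand-ins for the types used above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Vec2f { float x, y; };
struct Vec4f { float x, y, z, w; };

// Host-side mirror of the GLSL FragmentData struct under std430.
struct FragmentDataHost {
    std::uint32_t uniqueFragmentId;  // offset 0
    std::uint32_t pad0;              // vec2 must start on an 8-byte boundary
    Vec2f texCoords;                 // offset 8
    Vec4f glFragCoord;               // offset 16 (vec4 aligns to 16)
    std::uint32_t faceId;            // offset 32
    std::uint32_t pad1[3];           // round the array stride up to 48
};

static_assert(offsetof(FragmentDataHost, texCoords) == 8, "vec2 offset");
static_assert(offsetof(FragmentDataHost, glFragCoord) == 16, "vec4 offset");
static_assert(offsetof(FragmentDataHost, faceId) == 32, "uint offset");
static_assert(sizeof(FragmentDataHost) == 48, "array stride");

// Byte offset of glFragCoord for the fragment at 'fragmentIndex'.
inline std::size_t byteOffsetOfFragCoord(std::size_t fragmentIndex) {
    return fragmentIndex * sizeof(FragmentDataHost) +
           offsetof(FragmentDataHost, glFragCoord);
}
```

If a member is later added to the shader struct without updating the host struct, the static_asserts are the place to encode the new expectations so the mismatch fails the build instead of producing garbage at runtime.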

C++ Bugs

Some of the bugs I introduced were also plain old C++ bugs (not OpenGL specific), e.g.

Uninitialized variables (the fovy float had a NaN value).
Copy/pasting code and missing a key fact that both Gouraud and Phong shading code paths were calling the Gouraud shader (even though the scene state output in the console showed the state had been correctly updated to Phong). That’s what you get for copy pasting and not having tests…
Wrong array indexing logic. In the bug below (commit 3cfe07aea6914a91), I was multiplying the indices by elementSize but that is wrong because lines 2-5 from the bottom already have a built in multiplication by the element size. Noticed this from the disassembly.

int VolumeDataset3D::GetIndexFromSliceAndCol(uint32_t slice, uint32_t column)
{
    const int sizeOf1Row = width * sizeof(uint8_t);

    int index = slice * sizeOf1Row + column;

    return index;
}

const int elementSize = sizeof(VertexDataPositionedByte);

    for (uint32_t slice = 0; slice < depth - 1; slice++)
    {
        for (uint32_t col = 0; col < width - 1; col++)
        {
            int index = elementSize * GetIndexFromSliceAndCol(slice, col);
            int rightIndex = elementSize * GetIndexFromSliceAndCol(slice, col + 1);
            int diagIndex = elementSize * GetIndexFromSliceAndCol(slice + 1, col + 1);
            int bottomIndex = elementSize * GetIndexFromSliceAndCol(slice + 1, col);

            VertexDataPositionedByte cellData[4] =
            {
                ((VertexDataPositionedByte*)vertexData2D.data)[bottomIndex],
                ((VertexDataPositionedByte*)vertexData2D.data)[diagIndex],
                ((VertexDataPositionedByte*)vertexData2D.data)[rightIndex],
                ((VertexDataPositionedByte*)vertexData2D.data)[index],
            };

Here is the corresponding disassembly:

            int bottomIndex = elementSize * GetIndexFromSliceAndCol(slice + 1, col);
00007FF7A79AE5FC  mov         eax,dword ptr [rbp+0C4h]  
00007FF7A79AE602  inc         eax  
00007FF7A79AE604  mov         r8d,dword ptr [rbp+0E4h]  
00007FF7A79AE60B  mov         edx,eax  
00007FF7A79AE60D  mov         rcx,qword ptr [this]  
00007FF7A79AE614  call        VolumeDataset3D::GetIndexFromSliceAndCol (07FF7A7990046h)  
00007FF7A79AE619  imul        eax,eax,10h  
00007FF7A79AE61C  mov         dword ptr [rbp+164h],eax  

            VertexDataPositionedByte cellData[4] =
            {
                ((VertexDataPositionedByte*)vertexData2D.data)[bottomIndex],
00007FF7A79AE622  movsxd      rax,dword ptr [rbp+164h]  
00007FF7A79AE629  imul        rax,rax,10h  
00007FF7A79AE62D  mov         rcx,qword ptr [this]  
00007FF7A79AE634  mov         rcx,qword ptr [rcx+0E8h]  
00007FF7A79AE63B  lea         rdx,[rbp+190h]  
00007FF7A79AE642  mov         rdi,rdx  
00007FF7A79AE645  lea         rsi,[rcx+rax]  
00007FF7A79AE649  mov         ecx,10h  
00007FF7A79AE64E  rep movs    byte ptr [rdi],byte ptr [rsi]  
                ((VertexDataPositionedByte*)vertexData2D.data)[diagIndex],
00007FF7A79AE650  movsxd      rax,dword ptr [rbp+144h]  
00007FF7A79AE657  imul        rax,rax,10h  
00007FF7A79AE65B  mov         rcx,qword ptr [this]  
00007FF7A79AE662  mov         rcx,qword ptr [rcx+0E8h]  
00007FF7A79AE669  lea         rdx,[rbp+1A0h]  
00007FF7A79AE670  mov         rdi,rdx  
00007FF7A79AE673  lea         rsi,[rcx+rax]  
00007FF7A79AE677  mov         ecx,10h  
00007FF7A79AE67C  rep movs    byte ptr [rdi],byte ptr [rsi]
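The two imul instructions by 10h are the whole bug: one comes from my explicit elementSize multiplication and the other from the array subscript itself. A minimal sketch of the corrected indexing, assuming a 16-byte element (Vertex16 is a hypothetical stand-in for VertexDataPositionedByte, whose size of 10h is taken from the disassembly):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vertex16 { float x, y, z; std::uint32_t value; };
static_assert(sizeof(Vertex16) == 16, "matches the 10h multiplier in the disassembly");

// Element index in a grid that is 'width' elements wide. No byte scaling here.
inline std::size_t elementIndex(std::size_t slice, std::size_t column, std::size_t width) {
    assert(column < width);
    return slice * width + column;
}

// Byte distance from the start of the array to element 'index'.
// The subscript already multiplies by sizeof(Vertex16).
inline std::ptrdiff_t byteOffsetOf(const Vertex16* base, std::size_t index) {
    return reinterpret_cast<const char*>(&base[index]) -
           reinterpret_cast<const char*>(base);
}
```

For a 4-wide grid, element (1, 2) has index 6 and sits 96 bytes into the array. Pre-multiplying the index by elementSize would ask for element 96 instead, which is 1536 bytes in and out of bounds for a small grid.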

Crashes

I ran into (currently unexplained) crashes in both the Intel and nVidia OpenGL drivers (I used Windows only). There were also crashes (on a specific commit) from a nullref/access violation on my HP desktop but not on my Surface Pro. Found out later that the desktop was actually right to crash but the difference in behavior was certainly troubling.

Debugging Resources

2022-04-10 —Categories: OpenJDK

Backporting Async Logging to JDK11

Background

Longer than expected pauses were observed during GC in JDK 7 as explained on the Buffered Logging hotspot-dev mailing list:

Some folks noticed much longer than expected
pauses that seemed to coincide with GC logging in the midst of a GC
safepoint. In that setup, the GC logs were going to a disk file (these were
often useful for post-mortem analyses) rather than to a RAM-based tmpfs
which had been the original design center assumption. The vicissitudes of
the dirty page flushing policy in Linux when
IO load on the machine (not necessarily the JVM process doing the logging)
could affect the length and duration of these inline logging stalls.

A buffered logging scheme was then implemented by us (and independently by
others) which we have used successfully to date to avoid these pauses in
high i/o
multi-tenant environments.

[JDK-8229517] Support for optional asynchronous/buffered logging was filed for introducing that implementation to the public upstream OpenJDK. The release notes for the asynchronous logging feature describe it as a way to avoid undesirable delays in a thread using unified logging.

Note that Unified JVM Logging was introduced in JDK 9 whereas asynchronous logging was introduced in JDK17 in PR 3135. As per the Java docs, “logging messages are output synchronously” by default whereas in “asynchronous logging mode, log sites enqueue all logging messages to an intermediate buffer and a standalone thread is responsible for flushing them to the corresponding outputs.” The AWS Developer Tools Blog has an excellent writeup about how and why they implemented this feature as well as an overview of unified logging (e.g. run java -Xlog:'gc*=info:stdout' to see logging output from log_info_p, which in my case includes output from the G1InitLogger).
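To make the "enqueue to an intermediate buffer, flush elsewhere" idea concrete, here is a deliberately simplified sketch. It is not the HotSpot implementation: there is no thread and no locking, and the policy of dropping messages when the buffer is full is an assumption based on the "statistics for dropped messages" comment that shows up later in this post. BoundedLogBuffer and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

class BoundedLogBuffer {
public:
    explicit BoundedLogBuffer(std::size_t maxEntries)
        : maxEntries_(maxEntries), dropped_(0) {
        assert(maxEntries > 0);
    }

    // Called at a log site: never waits on the output. Drops the message when full.
    bool enqueue(const std::string& message) {
        if (pending_.size() >= maxEntries_) {
            ++dropped_;
            return false;
        }
        pending_.push_back(message);
        return true;
    }

    // Called by the flusher: moves all queued messages to 'sink' in order.
    std::size_t flush(std::vector<std::string>& sink) {
        const std::size_t count = pending_.size();
        for (std::string& message : pending_) {
            sink.push_back(std::move(message));
        }
        pending_.clear();
        return count;
    }

    std::size_t dropped() const { return dropped_; }

private:
    std::size_t maxEntries_;
    std::size_t dropped_;
    std::deque<std::string> pending_;
};
```

The point of the design is that the log site only pays for a push into memory, while the slow file write happens in flush, which a real implementation would run on its own thread under a lock.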

Starting the Backport

This is a relatively straightforward backport. Clone the jdk11u-dev repo (or your fork as appropriate). The repo was at commit 86d39a69 when I started the backport.

git clone https://github.com/openjdk/jdk11u-dev
cd jdk11u-dev/

To see the exact same outcomes, switch to that commit (if desired).

git checkout 86d39a69

To backport this feature to JDK11, cherry-pick the commit from PR 3135 onto a new branch. We need to add the upstream as a remote to enable cherry-picking PR commits.

git checkout -b AsyncLogging
git remote add upstream-jdk https://github.com/openjdk/jdk
git fetch upstream-jdk
git cherry-pick 41185d38f21e448370433f7e4f1633777cab6170

Conflict Resolution

I used Visual Studio for conflict resolution with this strategy:

Take Incoming (Source)
Inspect the diff using Compare with Unmodified… to ensure that the changes being pulled are sensible.

The rest of this section can be skipped. I am including the details of the validation of the conflict resolution strategy (i.e. ensuring nothing undesirable is getting pulled in). The advantage of the strategy outlined above is that changes that are required by the code we want to backport are most likely going to be present after conflict resolution.

Conflict Resolution: logTagSet.cpp

As an example, the upstream PR introduced 1 new method and 1 extern size_t to logTagSet.hpp. After conflict resolution, the updated logTagSet.hpp contains improvements to the logging code such as

None of these changes would be present if only the changes from the PR 3135 commit were used. These lists are generated from the blame view and are therefore likely to omit any delete-only diffs.

Conflict Resolution: logConfiguration.cpp

The list of unrelated changes (i.e. changes not in the commit from PR 3135) after taking the incoming changes to logConfiguration.cpp includes (potentially partial) changes from:

8265101: Remove unnecessary functions in os*.inline.hpp. The other changes from that commit are not brought in but are not relevant for the backport and can therefore be safely ignored.
8255756: Disabling logging does unnecessary work
8257872: UL: -Xlog does not check number of options
8265047: Inconsistent warning message in jcmd VM.log

Conflict Resolution: logDecorators.hpp

Conflict Resolution: logFileOutput.hpp

Only the Copyright year conflicts. Other changes brought in include:

Conflict Resolution: logOutputList.hpp

Conflict Resolution: globals.hpp

Comparing the current and incoming globals.hpp reveals a significant rewriting of this file between the jdk and jdk11u-dev repos. To resolve the conflict, copy only the change from the PR 3135 commit to the target (local) globals.hpp by selecting the checkmark next to the conflict in the Visual Studio merge editor then manually fix up the last line.

Conflict Resolution: init.hpp

jdk and jdk11u-dev also have non-trivial changes to init.hpp so the Merge… command is necessary here.

Pick all the #includes from the source (conflict 1)
Pick all the changes from the target (conflict 2)
Add the new line to the merged file: AsyncLogWriter::initialize();

Conflict Resolution: thread.cpp

The Merge… command is again necessary here due to the significant number of changes between the source and target versions. Take the single line from the source and accept the merge:

cl.do_thread(AsyncLogWriter::instance());

Conflict Resolution: hashtable.hpp

Use the Merge… command once more to resolve the changes between the source and target versions. Take the single line from the source and accept the merge:

template class BasicHashtable<mtLogging>;

Addressing Build Errors

Now that all conflicts have been resolved, build the code before committing anything. Here are additional issues that need to be resolved.

Missing ‘runtime/nonJavaThread.hpp’

D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(31): fatal error C1083: Cannot open include file: 'runtime/nonJavaThread.hpp': No such file or directory

nonJavaThread.hpp is a file that now exists in the upstream JDK repo but not in jdk11u-dev. Blame shows that PR 2390 moved its contents out of thread.hpp. Fix:

-#include "runtime/nonJavaThread.hpp"
+#include "runtime/thread.hpp"

Missing ‘;’ before ‘<‘

D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(111): error C2143: syntax error: missing ';' before '<'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(111): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(144): error C3646: '_stats': unknown override specifier
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(144): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int

Line 111 contains:

typedef KVHashtable<LogFileOutput*, uint32_t, mtLogging> AsyncLogMap;

Line 144 contains:

AsyncLogMap _stats; // statistics for dropped messages

Turns out KVHashtable was removed after async logging support was added so the latest sources aren’t the place to go for details about this class. Instead, see the KVHashtable implementation in the parent commit before it was removed. KVHashtable “is a subclass of BasicHashtable that allows you to do a simple K -> V mapping without using tons of boilerplate code.” The blame view of hashtable.hpp in the async logging support commit reveals that KVHashtable was added in commit 6d269930fdd3. For our purposes, we need to use the KVHashtable implementation that was in use when async logging was added.

Fix: insert lines 223-310 of hashtable.cpp into the local jdk11u-dev hashtable.hpp.

Missing pre_run Method

D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(155): error C3668: 'AsyncLogWriter::pre_run': method with override specifier 'override' did not override any base class methods
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logAsyncWriter.hpp(156): error C2039: 'pre_run': is not a member of 'NonJavaThread'

Notice that nonJavaThread.hpp in the upstream JDK repo has a pre_run method, unlike the NonJavaThread class in jdk11u-dev. The blame view of PR 2390’s parent commit reveals that these methods were added in commit 526f854c.

Fix: Remove the pre_run method from logAsyncWriter.hpp.

Stream Errors

./src/hotspot/share/logging/logAsyncWriter.cpp(108): error C2660: 'stringStream::as_string': function does not take 1 arguments
D:\dev\repos\java\jdk11u-dev\src\hotspot\share\utilities/ostream.hpp(220): note: see declaration of 'stringStream::as_string'
./src/hotspot/share/logging/logAsyncWriter.cpp(108): error C2661: 'AsyncLogMessage::AsyncLogMessage': no overloaded function takes 2 arguments

The as_string method has a boolean parameter only in the jdk repo (added in JDK15).

Fix: Remove the parameter to as_string.

Conversion loses qualifiers

./src/hotspot/share/logging/logAsyncWriter.cpp(143): error C2440: 'initializing': cannot convert from 'const E *' to 'AsyncLogMessage *'
        with
        [
            E=AsyncLogMessage
        ]
./src/hotspot/share/logging/logAsyncWriter.cpp(143): note: Conversion loses qualifiers

Line 143 contains:

AsyncLogMessage* e = it.next();

This works in the original async logging implementation because jdk/src/hotspot/share/utilities/linkedlist.hpp was updated by 8239066: make LinkedList<T> more generic (a next() method that returns an E* was added).

Fix: git cherry-pick b08595d8443bbfb141685dc5eda7c58a34738048 and resolve the conflict (year on copyright line) using Take Incoming (Source).

Unknown class AutoModifyRestore

./test/hotspot/gtest/logging/test_asynclog.cpp(205): error C2065: 'AutoModifyRestore': undeclared identifier
./test/hotspot/gtest/logging/test_asynclog.cpp(205): error C2275: 'size_t': illegal use of this type as an expression
./build/windows-x86_64-normal-server-release/hotspot/variant-server/libjvm/gtest/objs/BUILD_GTEST_LIBJVM_pch.cpp: note: see declaration of 'size_t'
./test/hotspot/gtest/logging/test_asynclog.cpp(205): error C3861: 'saver': identifier not found

Line 205 contains:

AutoModifyRestore<size_t> saver(AsyncLogBufferSize, sz * 1024 /*in byte*/);

AutoModifyRestore was introduced to fix JDK-8245226.

Fix:

cd src/hotspot/share/utilities/
curl -Lo autoRestore.hpp https://raw.githubusercontent.com/openjdk/jdk/195c45a0e11207e15c277e7671b2a82b8077c5fb/src/hotspot/share/utilities/autoRestore.hpp
# Now include autoRestore.hpp in test_asynclog.cpp
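For context, the idea behind a helper like AutoModifyRestore is plain RAII: set a variable to a temporary value and put the old value back when the scope ends. The sketch below is not the HotSpot class. ScopedModify is a hypothetical name, and it uses C++11 features that the jdk11u-dev build would reject on some platforms.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class ScopedModify {
public:
    // Remember the current value, then install the temporary one.
    ScopedModify(T& target, const T& temporary)
        : target_(target), saved_(target) {
        target_ = temporary;
    }

    // Put the original value back when the scope ends.
    ~ScopedModify() { target_ = saved_; }

    ScopedModify(const ScopedModify&) = delete;
    ScopedModify& operator=(const ScopedModify&) = delete;

private:
    T& target_;
    T saved_;
};

// Example: the override is visible inside the scope and undone afterwards.
inline void scopedModifyExample() {
    std::size_t bufferSize = 2048;
    {
        ScopedModify<std::size_t> saver(bufferSize, 1024);
        assert(bufferSize == 1024);
    }
    assert(bufferSize == 2048);
}
```

This is why the test can shrink AsyncLogBufferSize for one test case without leaking the change into later tests.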

Atomic Errors

./src/hotspot/share/logging/logAsyncWriter.cpp(172): error C2039: 'release_store_fence': is not a member of 'Atomic'
D:\dev\repos\java\jdk11u-dev\src\hotspot\share\runtime/atomic.hpp(51): note: see declaration of 'Atomic'
./src/hotspot/share/logging/logAsyncWriter.cpp(172): error C3861: 'release_store_fence': identifier not found

This method was added to atomic.hpp by OrderAccess. Notice that it appears to have been moved from orderAccess.hpp.

Fix:

-Atomic::release_store_fence(&AsyncLogWriter::_instance, self);
+OrderAccess::release_store_fence(&AsyncLogWriter::_instance, self);

‘disable_outputs’ Identifier not Found

./src/hotspot/share/logging/logConfiguration.cpp(114): error C3861: 'disable_outputs': identifier not found
./src/hotspot/share/logging/logConfiguration.cpp(278): error C2039: 'disable_outputs': is not a member of 'LogConfiguration'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logConfiguration.hpp(39): note: see declaration of 'LogConfiguration'
./src/hotspot/share/logging/logConfiguration.cpp(279): error C2065: '_n_outputs': undeclared identifier
./src/hotspot/share/logging/logConfiguration.cpp(293): error C2065: '_outputs': undeclared identifier
./src/hotspot/share/logging/logConfiguration.cpp(296): error C3861: 'delete_output': identifier not found
./src/hotspot/share/logging/logConfiguration.cpp(298): error C2248: 'LogOutput::set_config_string': cannot access protected member declared in class 'LogOutput'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logOutput.hpp(63): note: see declaration of 'LogOutput::set_config_string'
D:\dev\repos\java\forks\jdk11u-dev\src\hotspot\share\logging/logConfiguration.hpp(31): note: see declaration of 'LogOutput'

Line 114 is simply the method call disable_outputs(); Since that method body is present in the file, the declaration must be missing from the header file. The correct logConfiguration.hpp shows that 8255756: Disabling logging does unnecessary work is necessary. (This error might have been visible earlier in the process!)

Fix:

git cherry-pick e66fd6f0aa43356ab4b4361d6d332e5e3bcabeb6

# Resolve straightforward conflicts.

git cherry-pick --continue

Undeclared Identifier

./src/hotspot/share/runtime/thread.cpp(4694): error C2065: 'cl': undeclared identifier

Line 4706 contains:

cl.do_thread(AsyncLogWriter::instance());

The declaration of cl is missing. Blame says it was introduced by commit 06e47d05 of [JDK-8246622] Remove CollectedHeap::print_gc_threads_on() – Java Bug System. Simply paste that PrintOnClosure class definition into thread.cpp (after line 4654) and add the cl declaration PrintOnClosure cl(st); on (now) line 4714.

Building on macOS

Once the build succeeds on Windows, validate the changes by building on macOS.

Undeclared identifier ‘primitive_hash’

/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/utilities/hashtable.hpp:326:36: error: use of undeclared identifier 'primitive_hash'
    unsigned (*HASH)  (K const&) = primitive_hash<K>,
                                   ^
/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/utilities/hashtable.hpp:327:46: error: use of undeclared identifier 'primitive_equals'
    bool     (*EQUALS)(K const&, K const&) = primitive_equals<K>

Fix:

diff --git a/src/hotspot/share/utilities/hashtable.hpp b/src/hotspot/share/utilities/hashtable.hpp
index 30483b2f36..5e4c414490 100644
--- a/src/hotspot/share/utilities/hashtable.hpp
+++ b/src/hotspot/share/utilities/hashtable.hpp
@@ -30,6 +30,7 @@
 #include "oops/oop.hpp"
 #include "oops/symbol.hpp"
 #include "runtime/handles.hpp"
+#include "utilities/resourceHash.hpp"
 
 // This is a generic hashtable, designed to be used for the symbol
 // and string tables.

Default Member Initializer is a C++11 Extension

/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/logging/logAsyncWriter.hpp:149:33: error: default member initializer for non-static data member is a C++11 extension [-Werror,-Wc++11-extensions]
  const size_t _buffer_max_size = {AsyncLogBufferSize / (sizeof(AsyncLogMessage) + vwrite_buffer_size)};
                                ^

Fix:

diff --git a/src/hotspot/share/logging/logAsyncWriter.cpp b/src/hotspot/share/logging/logAsyncWriter.cpp
index 0231be78a9..d9f9ddda5b 100644
--- a/src/hotspot/share/logging/logAsyncWriter.cpp
+++ b/src/hotspot/share/logging/logAsyncWriter.cpp
@@ -82,7 +82,8 @@ void AsyncLogWriter::enqueue(LogFileOutput& output, LogMessageBuffer::Iterator m
 
 AsyncLogWriter::AsyncLogWriter()
   : _initialized(false),
-    _stats(17 /*table_size*/) {
+    _stats(17 /*table_size*/),
+    _buffer_max_size(AsyncLogBufferSize / (sizeof(AsyncLogMessage) + vwrite_buffer_size)) {
   if (os::create_thread(this, os::asynclog_thread)) {
     _initialized = true;
   } else {
diff --git a/src/hotspot/share/logging/logAsyncWriter.hpp b/src/hotspot/share/logging/logAsyncWriter.hpp
index 313dd6de06..c4e28e5676 100644
--- a/src/hotspot/share/logging/logAsyncWriter.hpp
+++ b/src/hotspot/share/logging/logAsyncWriter.hpp
@@ -146,7 +146,7 @@ class AsyncLogWriter : public NonJavaThread {
 
   // The memory use of each AsyncLogMessage (payload) consists of itself and a variable-length c-str message.
   // A regular logging message is smaller than vwrite_buffer_size, which is defined in logtagset.cpp
-  const size_t _buffer_max_size = {AsyncLogBufferSize / (sizeof(AsyncLogMessage) + vwrite_buffer_size)};
+  const size_t _buffer_max_size;
 
   AsyncLogWriter();
   void enqueue_locked(const AsyncLogMessage& msg);

‘override’ keyword is a C++11 extension

/Users/saint/repos/java/forks/jdk11u-dev/src/hotspot/share/logging/logAsyncWriter.hpp:154:14: error: 'override' keyword is a C++11 extension [-Werror,-Wc++11-extensions]
  void run() override;
             ^
...

Fix: Remove the override keywords

diff --git a/src/hotspot/share/logging/logAsyncWriter.hpp b/src/hotspot/share/logging/logAsyncWriter.hpp
index 313dd6de06..e6ac8aab4a 100644
--- a/src/hotspot/share/logging/logAsyncWriter.hpp
+++ b/src/hotspot/share/logging/logAsyncWriter.hpp
@@ -151,10 +151,10 @@ class AsyncLogWriter : public NonJavaThread {
   AsyncLogWriter();
   void enqueue_locked(const AsyncLogMessage& msg);
   void write();
-  void run() override;
-  char* name() const override { return (char*)"AsyncLog Thread"; }
-  bool is_Named_thread() const override { return true; }
-  void print_on(outputStream* st) const override {
+  void run();
+  char* name() const { return (char*)"AsyncLog Thread"; }
+  bool is_Named_thread() const { return true; }
+  void print_on(outputStream* st) const {
     st->print("\"%s\" ", name());
     Thread::print_on(st);
     st->cr();

Building on Linux

Depending on the GCC version, logAsyncWriter.cpp, logFileOutput.cpp, and test_asynclog.cpp might need to define nullptr to successfully compile:

#ifdef __linux__
#define nullptr 0
#endif

Testing the Build

Windows

To test the async logging code, run this command (HelloWorld doesn’t even need to exist for a really basic test):

./build/windows-x86_64-normal-server-release/jdk/bin/java.exe -Xlog:async -Xlog:all=trace:file=all.log::filecount=0 HelloWorld

Fixing Runtime Bugs

Corrupted Output

After running the simple test above, it becomes evident from the output logs that something is wrong:

[0.039s][info ][logging          ] The maximum entries of AsyncLogBuffer: 2319, estimated memory use: 2097152 bytes
[@ùŸôÊ ][debug][@ùŸôÊ            ] Async logging thread started.
[      ][info ][ôŸôÊ            ] TemplateTable initialization, 0.0000106 secs

Search for %.*.3.+ to find where the log decorations are done based on this output in the log file. Looks like the big difference is from 8266503: [UL] Make Decorations safely copy-able and reduce their size.

Fix:

git cherry-pick 94c6177f246fc569b416f85f1411f7fe031f7aaf
git cherry-pick 74fecc070a6462e6a2d061525b53a63de15339f9

Wrong Parameter Order

Notice that the order of the parameters passed to Atomic::cmpxchg was also changed upstream, so we need to ensure that the arguments are reordered (since the calls were written when the new Atomic::cmpxchg was already in place). In jdk11u-dev the exchange value comes first, so move the last argument into the first spot.
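The reordering is easy to get wrong, so here is a sketch built on std::atomic rather than HotSpot's Atomic class. It assumes the newer signature is (dest, compare_value, exchange_value) and the JDK 11 one is (exchange_value, dest, compare_value); both wrappers are hypothetical and exist only to show that the same three values are passed in a different order.

```cpp
#include <atomic>
#include <cassert>

// Newer order: destination, expected value, new value.
// Returns the value that was in 'dest' before the call.
template <typename T>
T cmpxchgNewOrder(std::atomic<T>& dest, T compareValue, T exchangeValue) {
    // On failure compare_exchange_strong writes the current value into
    // compareValue; on success compareValue already equals the old value.
    dest.compare_exchange_strong(compareValue, exchangeValue);
    return compareValue;
}

// Older order: new value first, then destination, then expected value.
template <typename T>
T cmpxchgOldOrder(T exchangeValue, std::atomic<T>& dest, T compareValue) {
    return cmpxchgNewOrder(dest, compareValue, exchangeValue);
}
```

When the exchange and compare values have the same type, a call in the wrong order still compiles, which is why this shows up as a runtime bug rather than a build error.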

Resources

2022-03-11 —Categories: hsdis, LLVM

hsdis LLVM backend for Windows ARM64

8253757: Add LLVM-based backend for hsdis by magicus · Pull Request #7531 makes it possible to easily use LLVM as the hsdis backend. An LLVM installation is required for this. The official LLVM builds for the Windows platform do not work for building hsdis because they do not have all the prerequisite LLVM include files. See Building LLVM for Windows ARM64 – Saint’s Log (swesonga.org) for instructions on how to build LLVM for ARM64 Windows (on an x64 Windows host). To configure OpenJDK for LLVM as an hsdis backend on Windows ARM64, use this command:

bash configure --openjdk-target=aarch64-unknown-cygwin \
 --with-hsdis=llvm \
 --with-llvm=/cygdrive/d/dev/software/llvm-aarch64/

The JDK and hsdis can then be built as usual with these commands:

make images
make build-hsdis
make install-hsdis
cp /cygdrive/d/dev/software/llvm-aarch64/bin/LLVM-C.dll build/windows-aarch64-server-slowdebug/jdk/bin/

The generated JDK can then be deployed to an ARM64 machine like the Surface Pro X. To test LLVM’s disassembly, use the -XX:CompileCommand flag on the ARM64 machine:

/java -XX:CompileCommand="print java.lang.String::checkIndex" -version

Behind the Scenes

Missing Include File that Exists?

The path given to --with-llvm needs to be a Cygwin path if building in Cygwin. Otherwise, the build-hsdis target will fail with this error: c:\...\jdk\src\utils\hsdis\llvm\hsdis-llvm.cpp(58): fatal error C1083: Cannot open include file: 'llvm-c/Disassembler.h': No such file or directory. I caught this by inspecting build\windows-aarch64-server-release\make-support\failure-logs\support_hsdis_hsdis-llvm.obj.cmdline after the build failed. This was the only include that didn’t have Cygwin paths: -IC:/dev/repos/llvm-project/build_llvm_AArch64/install_local/include

Investigating Missing Disassembly

My first disassembly attempt did not work – only abstract disassembly was displayed:

...
  # {method} {0x000002ca9940f2e8} 'checkIndex' '(II)V' in 'java/lang/String'
  # parm0:    c_rarg1   = int
  # parm1:    c_rarg2   = int
  #           [sp+0x30]  (sp of caller)
  0x000002ca87ad3940: 1f20 03d5 | e953 40d1 | 3f01 00f9 | ffc3 00d1 | fd7b 02a9 | a201 f837 | 3f00 026b | e200 0054
  0x000002ca87ad3960: fd7b 42a9 | ffc3 0091
...
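Even without hsdis, the raw bytes can be decoded by hand. AArch64 instructions are 32 bits wide and stored little-endian, so (assuming the dump lists bytes in memory order) each pair of 4-digit groups is one instruction. The sketch below rebuilds the instruction word. instructionWord is a hypothetical helper, and 0xD503201F is the AArch64 NOP encoding.

```cpp
#include <cassert>
#include <cstdint>

// Combine four bytes, given in memory order, into a little-endian 32-bit word.
inline std::uint32_t instructionWord(std::uint8_t b0, std::uint8_t b1,
                                     std::uint8_t b2, std::uint8_t b3) {
    return static_cast<std::uint32_t>(b0) |
           (static_cast<std::uint32_t>(b1) << 8) |
           (static_cast<std::uint32_t>(b2) << 16) |
           (static_cast<std::uint32_t>(b3) << 24);
}

const std::uint32_t kAArch64Nop = 0xD503201Fu;

// Example: the first group in the dump above, "1f20 03d5", is a NOP.
inline void decodeExample() {
    assert(instructionWord(0x1f, 0x20, 0x03, 0xd5) == kAArch64Nop);
}
```

That is fine for spotting a NOP, but nobody wants to decode a whole method this way, hence the effort to get hsdis loading.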

I verified that hsdis-aarch64.dll was present in the JDK’s bin folder. That was the only issue I had seen before that caused this behavior so I dug around to find the code that loads the hsdis DLL. A search for the “hsdis-” DLL prefix in the sources reveals the hsdis_library_name string used in the Disassembler::dll_load method. Notice that there is a Verbose flag that can display what is happening when loading the hsdis DLL!

void* Disassembler::dll_load(char* buf, int buflen, int offset, char* ebuf, int ebuflen, outputStream* st) {
  int sz = buflen - offset;
  int written = jio_snprintf(&buf[offset], sz, "%s%s", hsdis_library_name, os::dll_file_extension());
  if (written < sz) { // written successfully, not truncated.
    if (Verbose) st->print_cr("Trying to load: %s", buf);
    return os::dll_load(buf, ebuf, ebuflen);
  } else if (Verbose) {
    st->print_cr("Try to load hsdis library failed: the length of path is beyond the OS limit");
  }
  return NULL;
}

Verbose turns out to be a JVM flag! I tried passing it to java.exe but -Verbose doesn’t do anything. I learned from HotSpot Command-Line Flags Overhaul – Design Doc – OpenJDK Wiki (java.net) that it’s a -XX: flag. Trying to use it causes the JVM to complain that it is a develop-only flag.

Error: VM option 'Verbose' is develop and is available only in debug version of VM.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

The --enable-debug flag documented at Building OpenJDK (java.net) is required to generate a debug VM.

bash configure --openjdk-target=aarch64-unknown-cygwin \
 --with-hsdis=llvm \
 --with-llvm=/cygdrive/d/dev/software/llvm-aarch64/ \
 --enable-debug

Running the debug JVM with the verbose flag now displays some diagnostic information:

.\java.exe -XX:CompileCommand="print java.lang.String::checkIndex" -XX:+Verbose -XX:+PrintMiscellaneous
CompileCommand: print java/lang/String.checkIndex bool print = true


============================= C1-compiled nmethod ==============================
----------------------------------- Assembly -----------------------------------
Trying to load: C:\dev\software\jdk-aarch64\jdk\bin\server\hsdis-aarch64.dll
Trying to load: C:\dev\software\jdk-aarch64\jdk\bin\server\hsdis-aarch64.dll
Trying to load: C:\dev\software\jdk-aarch64\jdk\bin\hsdis-aarch64.dll
Trying to load: hsdis-aarch64.dll
Could not load hsdis-aarch64.dll; Can't find dependent libraries; PrintAssembly defaults to abstract disassembly.

...

The error message substring “find dependent libraries” appears only once in the hotspot source code – in os::dll_load (which is called by Disassembler::dll_load). This error is displayed because LoadLibrary fails with the ERROR_MOD_NOT_FOUND error code.

The case of the DLL that refuses to load – The Old New Thing (microsoft.com) mentions loader snaps. Loader snaps are an option of the gflags tool found in the Windows Kits folder. The docs explain that GFlags is included in the Debugging Tools for Windows 10 (WinDbg), so a search for “Debugging Tools for Windows arm64” leads to Debugging ARM64 – Windows drivers. This says to install the Windows SDK, after which I now have the gflags binary (under the Program Files (x86) folder)!

C:\Program Files (x86)\Windows Kits\10\Debuggers\arm64\gflags.exe

I still wasn’t sure how to see the snaps output. Show Loader Snaps in GFlags.exe, fails to capture any output in WinDbg – Stack Overflow implies that I should be able to use WinDbg to see what is failing to load.

Turns out the loader snaps aren’t really necessary. There is some critical info in the WinDbg diagnostic output:

2698:2e5c @ 03908953 - LdrpResolveDllName - ENTER: DLL name: .\LLVM-C.dll
2698:2e5c @ 03908953 - LdrpResolveDllName - RETURN: Status: 0xc0000135
...
2698:2e5c @ 03908968 - LdrpResolveDllName - ENTER: DLL name: C:\WINDOWS\LLVM-C.dll
2698:2e5c @ 03908968 - LdrpResolveDllName - RETURN: Status: 0xc0000135
...
2698:2e5c @ 03908968 - LdrpSearchPath - RETURN: Status: 0xc0000135
2698:2e5c @ 03908968 - LdrpProcessWork - ERROR: Unable to load DLL: "LLVM-C.dll", Parent Module: "C:\dev\software\jdk-aarch64\jdk\bin\hsdis-aarch64.dll", Status: 0xc0000135

hsdis-aarch64.dll is not being loaded because LLVM-C.dll cannot be found! Still learning the need for reading the full instructions to avoid unnecessary pain.
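To pull the failing dependency out of a long loader log without scrolling, something like the following sketch could help. It is only an illustration: the regular expression is inferred from the single ERROR line shown above, and the failed_loads helper is hypothetical. The one hard fact it relies on is that NTSTATUS 0xC0000135 is STATUS_DLL_NOT_FOUND.

```python
import re

# NTSTATUS values of interest; 0xC0000135 is STATUS_DLL_NOT_FOUND.
NTSTATUS_NAMES = {0xC0000135: "STATUS_DLL_NOT_FOUND"}

# Pattern inferred from the LdrpProcessWork ERROR line in the log above.
ERROR_LINE = re.compile(
    r'Unable to load DLL: "(?P<dll>[^"]+)", '
    r'Parent Module: "(?P<parent>[^"]+)", '
    r'Status: (?P<status>0x[0-9a-fA-F]+)'
)

def failed_loads(log_text: str) -> list[tuple[str, str, str]]:
    """Return (dll, parent module, status name) for each failed load."""
    results = []
    for m in ERROR_LINE.finditer(log_text):
        status = int(m["status"], 16)
        name = NTSTATUS_NAMES.get(status, hex(status))
        results.append((m["dll"], m["parent"], name))
    return results

sample = (
    '2698:2e5c @ 03908968 - LdrpProcessWork - ERROR: Unable to load DLL: '
    '"LLVM-C.dll", Parent Module: '
    r'"C:\dev\software\jdk-aarch64\jdk\bin\hsdis-aarch64.dll", Status: 0xc0000135'
)
failures = failed_loads(sample)
print(failures)
```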

2022-02-21 —Categories: Compilers, LLVM

Building LLVM for Windows ARM64

I was trying to test using LLVM as a backend for hsdis on the Windows ARM64 platform as implemented in PR 5920. I downloaded LLVM 13 and tried to use it in the build. Unfortunately, it didn’t have all the prerequisite include files and so building your own LLVM installation was the approach suggested for Windows. This post explicitly outlines the instructions needed to build LLVM for the Windows ARM64 platform on a Windows x64 host machine.

The first requirement is an LLVM build with native llvm-nm.exe and llvm-tblgen.exe binaries. These can be downloaded (I think) or generated by building LLVM for the native x64 platform as specified in the instructions from Jorn.

git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build_llvm
cd build_llvm
cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=X86" -D"CMAKE_BUILD_TYPE:STRING=Release" -D"CMAKE_INSTALL_PREFIX=install_local" -A x64 -T host=x64
cmake --build . --config Release --target install

Once that build successfully completes, we can then build LLVM for the Windows ARM64 platform with the commands below. Notice that we specify paths to the native llvm-nm and llvm-tblgen binaries to prevent the build from trying to use their ARM64 equivalents (which won’t run on the host).

cd llvm-project
mkdir build_llvm_AArch64
cd build_llvm_AArch64

cmake ../llvm -DLLVM_TARGETS_TO_BUILD:STRING=AArch64 \
 -DCMAKE_BUILD_TYPE:STRING=Release \
 -DCMAKE_INSTALL_PREFIX=install_local \
 -DCMAKE_CROSSCOMPILING=True \
 -DLLVM_TARGET_ARCH=AArch64 \
 -DLLVM_NM=C:/repos/llvm-project/build_llvm/install_local/bin/llvm-nm.exe \
 -DLLVM_TABLEGEN=C:/repos/llvm-project/build_llvm/install_local/bin/llvm-tblgen.exe \
 -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-win32-msvc \
 -A ARM64 \
 -T host=x64

date; time \
 cmake --build . --config Release --target install ; \
 date

Once the build completes, the LLVM ARM64 files will be in the build_llvm_AArch64/install_local folder in the llvm-project repo. That build should have all the necessary header files and static libraries required for LLVM projects targeting Windows on ARM64. See the general cmake options and the LLVM-specific cmake options for details on the various flags and variables.
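If the configure step is driven from a script instead of typed into a shell, the same flags can be assembled as an argument list. This is a minimal sketch under that assumption: the cross_configure_args helper is hypothetical, and the install path passed to it is the one used in the command above.

```python
from pathlib import PureWindowsPath

def cross_configure_args(native_install: str) -> list[str]:
    """Build the ARM64 configure command as an argument list."""
    # as_posix() yields forward slashes, matching the paths in the command above.
    bin_dir = PureWindowsPath(native_install) / "bin"
    return [
        "cmake", "../llvm",
        "-DLLVM_TARGETS_TO_BUILD:STRING=AArch64",
        "-DCMAKE_BUILD_TYPE:STRING=Release",
        "-DCMAKE_INSTALL_PREFIX=install_local",
        "-DCMAKE_CROSSCOMPILING=True",
        "-DLLVM_TARGET_ARCH=AArch64",
        f"-DLLVM_NM={(bin_dir / 'llvm-nm.exe').as_posix()}",
        f"-DLLVM_TABLEGEN={(bin_dir / 'llvm-tblgen.exe').as_posix()}",
        "-DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-win32-msvc",
        "-A", "ARM64",
        "-T", "host=x64",
    ]

args = cross_configure_args(r"C:\repos\llvm-project\build_llvm\install_local")
print(" ".join(args))
```

Passing the list to subprocess.run(args, check=True) from the build directory sidesteps shell quoting, since each flag reaches cmake as its own argument.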

Behind the Scenes: Cross-Compiling LLVM

I naively started out by adding AArch64 to the list of LLVM_TARGETS_TO_BUILD, then using only AArch64 in the list. Trying to use the generated build would still fail with errors about mismatched platforms, so I knew some cross-compilation-specific flags would be needed. How do I cross-compile LLVM/Clang for AArch64 on x64 host? – Stack Overflow and How To Cross-Compile Clang/LLVM using Clang/LLVM — LLVM 15.0.0git documentation were handy references. They didn’t have anything Windows-specific but got me walking down the right path (e.g. the importance of the native LLVM_TABLEGEN). I tried something along these lines:

cd llvm-project
mkdir build_llvm_AArch64
cd build_llvm_AArch64

cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=AArch64" \
 -D"CMAKE_BUILD_TYPE:STRING=Release" \
 -D"CMAKE_INSTALL_PREFIX=install_local_AArch64" \
 -D"CMAKE_CROSSCOMPILING=True" \
 -D"LLVM_TARGET_ARCH=AArch64" \
 -A x64 \
 -T host=x64

cmake --build . --config Release --target install

This still results in errors about conflicting machine types:

c:\...\llvm-project\build_llvm_aarch64\install_local_aarch64\\lib\llvmaarch64disassembler.lib : warning LNK4272: library machine type 'x64' conflicts with target machine type 'ARM64'

That’s when I tried adding LLVM_TABLEGEN, pointing at a Windows x64 LLVM build I had generated earlier. I accidentally omitted the options prefixed with a # below because I didn’t include the trailing backslash (the line continuation) after adding the llvm-tblgen.exe option.

cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=AArch64" \
 -D"CMAKE_BUILD_TYPE:STRING=Release" \
 -D"CMAKE_INSTALL_PREFIX=install_local_AArch64_2" \
 -D"CMAKE_CROSSCOMPILING=True" \
 -D"LLVM_TARGET_ARCH=AArch64" \
 -D"LLVM_TABLEGEN=C:\dev\repos\llvm-project\build_llvm\install_local\bin\llvm-tblgen.exe" \
 #-D"LLVM_DEFAULT_TARGET_TRIPLE=aarch64-win32-msvc" \
 #-A x64 \
 #-T host=x64

date; time \
 cmake --build . --config Release --target install; \
 date

The build still succeeded and generated AArch64 .lib files in the LLVM installation! Interestingly, they still had the x64 machine type in the header.

$ dumpbin /headers build_llvm_AArch64_2\install_local_AArch64_2\lib\LLVMAArch64AsmParser.lib
Microsoft (R) COFF/PE Dumper Version 14.29.30133.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file build_llvm_AArch64_2\install_local_AArch64_2\lib\LLVMAArch64AsmParser.lib

File Type: LIBRARY

FILE HEADER VALUES
            8664 machine (x64)
...

I had no choice but to reexamine my understanding of what the -A flag does. It is used to specify the platform name but it’s only after digging into the CMAKE_GENERATOR_PLATFORM docs that I noticed that this was the target platform! This also made me realize that I hadn’t noticed that the x64 C++ compiler was being used all along!

-- The C compiler identification is MSVC 19.29.30133.0
-- The CXX compiler identification is MSVC 19.29.30133.0
-- The ASM compiler identification is MSVC
-- Found assembler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - works
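Since the selected compiler is easy to miss in the configure output, a small check can surface it. The sketch below assumes the MSVC layout seen in the log above, where cl.exe sits under bin/Host&lt;host&gt;/&lt;target&gt;/; the msvc_host_and_target helper is hypothetical.

```python
import re

# Matches the MSVC toolchain layout .../bin/Host<host>/<target>/cl.exe.
COMPILER_LINE = re.compile(
    r"Check for working CXX compiler: .*/bin/Host(?P<host>\w+)/(?P<target>\w+)/cl\.exe",
    re.IGNORECASE,
)

def msvc_host_and_target(configure_output: str):
    """Return (host, target) architecture of the detected cl.exe, or None."""
    m = COMPILER_LINE.search(configure_output)
    return (m["host"].lower(), m["target"].lower()) if m else None

sample = (
    "-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual "
    "Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe"
)
detected = msvc_host_and_target(sample)
print(detected)
```

For a cross build to ARM64, the second element should read arm64 rather than x64.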

Some references to LLVM triples led back to the clang cross-compilation docs and the llvm::Triple source code so I tried again with the triple set and with -A now set to AArch64.

cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=AArch64" \
 -D"CMAKE_BUILD_TYPE:STRING=Release" \
 -D"CMAKE_INSTALL_PREFIX=install_local_AArch64_3" \
 -D"CMAKE_CROSSCOMPILING=True" \
 -D"LLVM_TARGET_ARCH=AArch64" \
 -D"LLVM_TABLEGEN=C:\dev\repos\llvm-project\build_llvm\install_local\bin\llvm-tblgen.exe" \
 -D"LLVM_DEFAULT_TARGET_TRIPLE=aarch64-win32-msvc" \
 -A AArch64 \
 -T host=x64

Setting -A to AArch64 causes MSBuild to fail with an error about an unknown platform. So -A just might be the argument I need to get ARM64 libraries built.

"C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\3.17.3\VCTargetsPath.vcxproj" (default target) (1) ->
    (_CheckForInvalidConfigurationAndPlatform target) ->
      C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(820,5): error : The BaseOutputPath/OutputPath property is not set for project 'VCTargetsPath.vcxproj'.  Please check to make sure that you have specified a valid combination of Configuration and Platform for this project.  Configuration='Debug'  Platform='AArch64'.  You may be seeing this message because you are trying to build a project without a solution file, and have specified a non-default Configuration or Platform that doesn't exist for this project. [C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\3.17.3\VCTargetsPath.vcxproj]

So I tried using -A ARM64 instead. I noticed that we now have the ARM64 C++ compiler selected! This is something I should have been paying attention to from the beginning, since it is crucial for cross-compilation.

-- The C compiler identification is MSVC 19.29.30133.0
-- The CXX compiler identification is MSVC 19.29.30133.0
-- The ASM compiler identification is MSVC
-- Found assembler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/arm64/cl.exe - works

Unfortunately, the build still failed with an error from gen-msvc-exports.py. Taking a look at gen-msvc-exports.py, it looks like it is trying to run llvm-nm.exe (for the target platform).

  Generating export list for LLVM-C
  Traceback (most recent call last):
    File "C:/dev/repos/llvm-project/llvm/tools/llvm-shlib/gen-msvc-exports.py", line 116, in <module>
      main()
    File "C:/dev/repos/llvm-project/llvm/tools/llvm-shlib/gen-msvc-exports.py", line 112, in main
      gen_llvm_c_export(ns.output, ns.underscore, libs, ns.nm)
    File "C:/dev/repos/llvm-project/llvm/tools/llvm-shlib/gen-msvc-exports.py", line 72, in gen_llvm_c_export
      check_call([nm, '-g', lib], stdout=dumpout_f)
    File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 359, in check_call
      retcode = call(*popenargs, **kwargs)
    File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 340, in call
      with Popen(*popenargs, **kwargs) as p:
    File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 854, in __init__
      self._execute_child(args, executable, preexec_fn, close_fds,
    File "C:\dev\tools\Anaconda3\lib\subprocess.py", line 1307, in _execute_child
      hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
  OSError: [WinError 216] This version of %1 is not compatible with the version of Windows you're running. Check your computer's system information and then contact the software publisher
C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(241,5): error MSB8066: Custom build for 'C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\02a88fa656bb9cf8b9ffd0e0debe57ae\libllvm-c.exports.rule;C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\8ebc0efbf04134b25d0f37561fba0d55\LLVM-C.def.rule;C:\dev\repos\llvm-project\build_llvm_AArch64_3\CMakeFiles\509fcb3f8bb132e9c560e15e8d25cb45\LLVM-C_exports.rule;C:\dev\repos\llvm-project\llvm\tools\llvm-shlib\CMakeLists.txt' exited with code 1. [C:\dev\repos\llvm-project\build_llvm_AArch64_3\tools\llvm-shlib\LLVM-C_exports.vcxproj]
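WinError 216 in the traceback is ERROR_EXE_MACHINE_TYPE_MISMATCH: the host was asked to run an executable built for a different architecture. As a rough sketch (not part of the LLVM build; the pe_machine helper is hypothetical), the machine field of a tool's PE header can be read to confirm whether it is a host binary. The demonstration uses a synthetic header so it runs anywhere.

```python
import struct

def pe_machine(data: bytes) -> int:
    """Return the COFF machine field of a PE image (.exe or .dll)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The machine field is the first member of the COFF file header.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return machine

# Synthetic ARM64 image header, just large enough for the fields read above.
fake = bytearray(0x48)
fake[:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x40)
fake[0x40:0x44] = b"PE\0\0"
struct.pack_into("<H", fake, 0x44, 0xAA64)
print(hex(pe_machine(bytes(fake))))
```

On a real file, pass the result of open(path, "rb").read(); 0x8664 indicates x64 and 0xAA64 indicates ARM64.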

A quick search for the general message (Generating export list for LLVM-C) reveals that it is from llvm-shlib/CMakeLists.txt. Looks like we just need to set LLVM_NM as per llvm-shlib/CMakeLists.txt.

date; time cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=AArch64" \
 -D"CMAKE_BUILD_TYPE:STRING=Release" \
 -D"CMAKE_INSTALL_PREFIX=install_local" \
 -D"CMAKE_CROSSCOMPILING=True" \
 -D"LLVM_TARGET_ARCH=AArch64" \
 -D"LLVM_NM=C:\dev\repos\llvm-project\build_llvm\install_local\bin\llvm-nm.exe" \
 -D"LLVM_TABLEGEN=C:\dev\repos\llvm-project\build_llvm\install_local\bin\llvm-tblgen.exe" \
 -D"LLVM_DEFAULT_TARGET_TRIPLE=aarch64-win32-msvc" \
 -A ARM64 \
 -T host=x64

date; time \
 cmake --build . --config Release --target install; \
 date

These build commands work! Dumpbin shows that the generated .lib files have ARM64 headers!

$ dumpbin /headers C:\dev\repos\llvm-project\build_llvm_AArch64_3\Release\lib\LLVMAArch64Disassembler.lib
Microsoft (R) COFF/PE Dumper Version 14.29.30133.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file C:\dev\repos\llvm-project\build_llvm_AArch64_3\Release\lib\LLVMAArch64Disassembler.lib

File Type: LIBRARY

FILE HEADER VALUES
            AA64 machine (ARM64)
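The same machine-type check can be scripted when dumpbin is not at hand. The sketch below is a simplified illustration, not a replacement for dumpbin: it walks the members of a .lib archive and reads the first two bytes of each object as the COFF machine field. The lib_machine_types and member helpers are hypothetical, and import-library or /GL members (which start with a different header) would show up as 0x0 here.

```python
import struct

MACHINES = {0x8664: "x64", 0xAA64: "ARM64", 0x014C: "x86"}

def lib_machine_types(data: bytes) -> set[str]:
    """Collect COFF machine types of the object members in a .lib archive."""
    if data[:8] != b"!<arch>\n":
        raise ValueError("not an archive")
    found, pos = set(), 8
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        name = header[:16].rstrip()
        size = int(header[48:58].decode("ascii").strip())
        body = pos + 60
        # "/" members are linker symbol tables and "//" is the long-name table.
        if name not in (b"/", b"//") and size >= 2:
            (machine,) = struct.unpack_from("<H", data, body)
            found.add(MACHINES.get(machine, hex(machine)))
        pos = body + size + (size & 1)  # members are 2-byte aligned
    return found

def member(name: bytes, payload: bytes) -> bytes:
    """Build one archive member (60-byte header plus payload) for the demo."""
    header = (name.ljust(16) + b"0".ljust(12) + b"".ljust(6) + b"".ljust(6)
              + b"0".ljust(8) + str(len(payload)).encode().ljust(10) + b"`\n")
    return header + payload + (b"\n" if len(payload) % 2 else b"")

# Synthetic archive: one symbol-table member and one ARM64 object stub.
fake_lib = (b"!<arch>\n" + member(b"/", bytes(4))
            + member(b"a.obj/", struct.pack("<H", 0xAA64) + bytes(18)))
print(lib_machine_types(fake_lib))
```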

Unfortunately, the JDK project that got me started down this path still doesn’t build. Cygwin shows defines like -DLLVM_DEFAULT_TRIPLET='"aarch64-pc-windows-msvc"' being passed to the compiler, which then complains:

C:/.../src/utils/hsdis/llvm/hsdis-llvm.cpp(217): error C2015: too many characters in constant

The quotes in the commands therefore needed to be dropped. That in turn caused build failures: the paths used backslashes, and without the quotes the backslashes were stripped!

Building Opts.inc...
  'C:devreposllvm-projectbuild_llvminstall_localbinllvm-tblgen.exe' is not recognized as an internal or external command,
  operable program or batch file
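The mangled path is what POSIX-style word splitting produces for an unquoted argument, where a backslash simply escapes the next character. Python's shlex module follows similar rules, so it serves here as a stand-in for the Cygwin shell (an approximation, not the shell itself) to show why forward slashes survive and backslashes do not.

```python
import shlex

# The same argument written with backslashes and with forward slashes.
backslashes = r"-DLLVM_TABLEGEN=C:\dev\repos\llvm-project\build_llvm\install_local\bin\llvm-tblgen.exe"
forward = "-DLLVM_TABLEGEN=C:/dev/repos/llvm-project/build_llvm/install_local/bin/llvm-tblgen.exe"

# Unquoted, each backslash is consumed as an escape character.
mangled = shlex.split(backslashes)[0]
intact = shlex.split(forward)[0]

print(mangled)
print(intact)
```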

This is now the part where I find a nice document on the LLVM site with the 3-liner for this task 😀