Categories: Assembly, Java

Inspecting Code in JITWatch

Developers disassemble! Use Java and hsdis to see it all. (oracle.com) is an excellent introduction to using the hotspot disassembler to view the instructions generated by HotSpot for a Java program. It also introduces JITWatch.

JITWatch processes the JIT compilation logs that are output by the JVM and explains the optimization decisions made by the JIT compilers.


Let us try using JITWatch on the sample Factorization program I have been using to learn about systems performance. Use these instructions from that blog post to get JITWatch:

git clone https://github.com/AdoptOpenJDK/jitwatch.git
cd jitwatch
mvn clean package
# Produces an executable jar in ui/target/jitwatch-ui-shaded.jar

java -jar ui/target/jitwatch-ui-shaded.jar

Start the factorization sample application such that a HotSpot log file is generated. To do so, use the flags listed in the JITWatch Instructions · AdoptOpenJDK/jitwatch Wiki (github.com). I redirect the output to a file to avoid flooding the terminal with the additional logging output.

$JAVA_HOME/bin/java -XX:+UnlockDiagnosticVMOptions -Xlog:class+load=info -XX:+LogCompilation -XX:+PrintAssembly Factorize 897151542039582592342572091 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 6 > logfile.txt

Loading the HotSpot Log

Click on the “Open Log” button in JITWatch then select the hotspot*.log file. Next, click on the Start button to process the JIT log.

Opening a HotSpot Log File
Processed HotSpot Log
Viewing JIT-compiled Class Members

Clicking on a class member opens another window with the corresponding assembly instructions generated by the JIT. I haven’t set up any source code locations but the assembly instructions are still displayed.

Setting up MVN on Windows

To run JITWatch on Windows, download the Maven binaries from Maven – Download Apache Maven and verify the hashes using certutil. Extract the downloaded .zip file using tar. Here are the instructions I used in Git Bash.

mkdir -p /c/java/binaries/apache
cd /c/java/binaries/apache

curl -Lo apache-maven-3.9.3-bin.zip https://dlcdn.apache.org/maven/maven-3/3.9.3/binaries/apache-maven-3.9.3-bin.zip

certutil -hashfile apache-maven-3.9.3-bin.zip SHA512
# shasum -a 512 apache-maven-3.9.3-bin.zip

tar xf apache-maven-3.9.3-bin.zip
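The hash check above can also be done from Java when neither certutil nor shasum is at hand. This is only a minimal sketch: the class name Sha512Check and its two command-line arguments (the archive path and the expected hash copied from the download page) are my own assumptions, not part of any Maven tooling.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: prints the SHA-512 of a file and compares it to an expected value.
public class Sha512Check {
    // Returns the lowercase hex SHA-512 digest of the given bytes.
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a standard algorithm name, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the downloaded archive
        // args[1]: expected hash copied from the download page
        byte[] contents = Files.readAllBytes(Paths.get(args[0]));
        String actual = sha512Hex(contents);
        System.out.println(actual);
        System.out.println(actual.equalsIgnoreCase(args[1].trim()) ? "MATCH" : "MISMATCH");
    }
}
```

It reads the whole archive into memory, which is fine for a download of this size but would need a streaming loop for much larger files.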

Add MAVEN_HOME to the system PATH environment variable as described at How to Install Maven on Windows {Step-by-Step Guide} (phoenixnap.com), or run these commands in an admin command prompt. Note that I echo the path first because if the new PATH is too long, setx prints this warning: WARNING: The data being saved is truncated to 1024 characters. Having echoed it, the previous value will still be onscreen if needed. See the pitfalls of setx at setx | Microsoft Learn. The quotes around the new path prevent issues like cmd – Invalid syntax. Default option is not allowed more than ‘2’ time(s) – Stack Overflow.

set MAVEN_HOME=C:\java\binaries\apache\apache-maven-3.9.3
setx /M MAVEN_HOME %MAVEN_HOME%

echo %PATH%
setx /M PATH "%PATH%;%MAVEN_HOME%\bin"

Now build the JITWatch sources in a command prompt:

cd \java\repos\AdoptOpenJDK\jitwatch
C:\java\binaries\apache\apache-maven-3.9.3\bin\mvn clean package

Categories: Assembly

Trial Division Factorization Disassembly

When Experimenting with Async Profiler, I created a basic trial division factorization Java application. To run it, download the OpenJDK build if it isn’t already installed:

mkdir -p ~/java/binaries/jdk/x64
cd ~/java/binaries/jdk/x64
wget https://aka.ms/download-jdk/microsoft-jdk-17.0.7-linux-x64.tar.gz
tar xzf microsoft-jdk-17.0.7-linux-x64.tar.gz

Test the factorization application to verify that the Java build works.

export JAVA_HOME=~/java/binaries/jdk/x64/jdk-17.0.7+7

cd ~/repos/scratchpad/demos/java/FindPrimes
$JAVA_HOME/bin/javac Factorize.java
$JAVA_HOME/bin/java Factorize 123890571352112309857

# Use 4 threads to speed things up
$JAVA_HOME/bin/java Factorize 123890571352112309857 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 4
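For readers without the Factorize.java source at hand, here is a rough single-threaded sketch of trial division. It is not the actual Factorize program: the class name TrialDivisionSketch is made up, it uses BigInteger only because the inputs above do not fit in a long, and it ignores the thread-count arguments entirely.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a trial division factorizer (single-threaded).
public class TrialDivisionSketch {
    // Returns the prime factors of n (n >= 1) in non-decreasing order.
    static List<BigInteger> factorize(BigInteger n) {
        List<BigInteger> factors = new ArrayList<>();
        BigInteger two = BigInteger.valueOf(2);
        BigInteger d = two;
        // Only candidate divisors up to sqrt(n) need to be tried.
        while (d.multiply(d).compareTo(n) <= 0) {
            // Divide out every copy of d before moving on.
            while (n.mod(d).signum() == 0) {
                factors.add(d);
                n = n.divide(d);
            }
            // After 2, only odd candidates are worth testing.
            d = d.equals(two) ? BigInteger.valueOf(3) : d.add(two);
        }
        // Whatever remains above 1 is itself prime.
        if (n.compareTo(BigInteger.ONE) > 0) {
            factors.add(n);
        }
        return factors;
    }

    public static void main(String[] args) {
        System.out.println(factorize(new BigInteger(args[0])));
    }
}
```

Running it with a small input such as 360 is a quick sanity check before trying the large numbers used above, which can take a very long time on a single thread.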

Using hsdis

hsdis is a HotSpot plugin for disassembling dynamically generated code. Chriswhocodes was kind enough to build hsdis for various platforms and share the binaries on his website – hsdis HotSpot Disassembly Plugin Downloads (chriswhocodes.com). Download the appropriate hsdis binary and move it to the OpenJDK build’s lib directory, e.g.

wget https://chriswhocodes.com/hsdis/hsdis-amd64.so
export JAVA_HOME=~/java/binaries/jdk/x64/jdk-17.0.7+7
mv hsdis-amd64.so $JAVA_HOME/lib/

ls -l $JAVA_HOME/lib/hsdis*

We will need the PrintAssembly option to disassemble the code generated by the compiler when running a Java program. This option requires diagnostic VM options to be unlocked. This is the full command line for generating the disassembly from the application’s execution. The output is redirected to a code.asm file since it can be voluminous.

$JAVA_HOME/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly Factorize 123890571352112309857 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 4 > code.asm

Here is a snippet of the disassembly in code.asm:

============================= C1-compiled nmethod ==============================
----------------------------------- Assembly -----------------------------------

Compiled method (c1)    2052  266       2       java.math.BigInteger::implMulAdd (81 bytes)
 total in heap  [0x00007f2e5943ca90,0x00007f2e5943d038] = 1448
 relocation     [0x00007f2e5943cbf0,0x00007f2e5943cc28] = 56
 main code      [0x00007f2e5943cc40,0x00007f2e5943ce00] = 448
 stub code      [0x00007f2e5943ce00,0x00007f2e5943ce30] = 48
 metadata       [0x00007f2e5943ce30,0x00007f2e5943ce38] = 8
 scopes data    [0x00007f2e5943ce38,0x00007f2e5943cee0] = 168
 scopes pcs     [0x00007f2e5943cee0,0x00007f2e5943d010] = 304
 dependencies   [0x00007f2e5943d010,0x00007f2e5943d018] = 8
 nul chk table  [0x00007f2e5943d018,0x00007f2e5943d038] = 32

--------------------------------------------------------------------------------
[Constant Pool (empty)]

--------------------------------------------------------------------------------

[Verified Entry Point]
  # {method} {0x00000008000a47c0} 'implMulAdd' '([I[IIII)I' in 'java/math/BigInteger'
  # parm0:    rsi:rsi   = '[I'
  # parm1:    rdx:rdx   = '[I'
  # parm2:    rcx       = int
  # parm3:    r8        = int
  # parm4:    r9        = int
  #           [sp+0x50]  (sp of caller)
  0x00007f2e5943cc40:   mov    %eax,-0x14000(%rsp)
  0x00007f2e5943cc47:   push   %rbp
  0x00007f2e5943cc48:   sub    $0x40,%rsp
  0x00007f2e5943cc4c:   movabs $0x7f2e38075370,%rax
  0x00007f2e5943cc56:   mov    0x8(%rax),%edi
  0x00007f2e5943cc59:   add    $0x2,%edi
  0x00007f2e5943cc5c:   mov    %edi,0x8(%rax)
  0x00007f2e5943cc5f:   and    $0xffe,%edi
  0x00007f2e5943cc65:   cmp    $0x0,%edi
  0x00007f2e5943cc68:   je     0x00007f2e5943cd52           ;*iload {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - java.math.BigInteger::implMulAdd@0 (line 3197)
  0x00007f2e5943cc6e:   movslq %r9d,%r9
  0x00007f2e5943cc71:   movabs $0xffffffff,%rax
  0x00007f2e5943cc7b:   and    %rax,%r9
...

Finding the Java Installation Path

In the above example, I have used a Java build in a custom path. If you are using a Java build that is already installed, then a few extra steps might be needed to determine the JAVA_HOME path, e.g.

saint@ubuntuvm:~$ which java
/usr/bin/java
saint@ubuntuvm:~$ ls -l `which java`
saint@ubuntuvm:~$ ls -l /etc/alternatives/java
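Another way to get the same answer is to ask the JVM itself. The sketch below prints the java.home system property of whichever JVM runs it and, optionally, resolves a launcher path such as /usr/bin/java through its symlinks. The class name FindJavaHome is made up, and the assumption that JAVA_HOME sits two directories above the resolved launcher (the parent of bin) is a common layout rather than a guarantee.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for locating the home directory of a Java installation.
public class FindJavaHome {
    // The installation directory of the JVM that is running this program.
    static String runningJavaHome() {
        return System.getProperty("java.home");
    }

    // Follows the symlink chain of a launcher (e.g. /usr/bin/java) and returns
    // the directory two levels up, assuming the usual <home>/bin/java layout.
    static Path homeFromLauncher(Path launcher) throws IOException {
        Path real = launcher.toRealPath(); // resolves every symlink in the chain
        Path bin = real.getParent();
        return bin == null ? null : bin.getParent();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.home = " + runningJavaHome());
        if (args.length > 0) {
            System.out.println(args[0] + " -> " + homeFromLauncher(Paths.get(args[0])));
        }
    }
}
```

The single-file launcher (java FindJavaHome.java /usr/bin/java) is enough to run it, so no separate compile step is needed on JDK 11 or later.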

Categories: Assembly, Visual C++

Building & Disassembling ARM64 Code using Visual C++

This path C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build has various scripts to set up a command window as documented at Use the Microsoft C++ toolset from the command line | Microsoft Docs. If vcvarsx86_arm64.bat and vcvarsamd64_arm64.bat are missing in that folder on your Windows x64 machine, install the MSVC v143 – VS 2022 C++ ARM64 build tools (Latest) component in the Visual Studio 2022 installer.

Selecting ARM64 Build Tools in VS Installer

Once it is installed, open a new cmd.exe window and run this command to set up the build environment:

"C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsamd64_arm64.bat"

To verify that the ARM64 compiler will be used when cl or dumpbin is executed:

D:\> where cl
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\arm64\cl.exe
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\x64\cl.exe

D:\> where dumpbin
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\arm64\dumpbin.exe
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\x64\dumpbin.exe

To see the command Visual Studio uses to build the project, create a C++ console application and use the Configuration Manager to change the Active solution platform to ARM64. Next, go to Tools > Options then expand the Projects and Solutions node. Select Build And Run then change the MSBuild project build output verbosity to Detailed. Building the project should now show the full command line used to invoke the compiler, for example here are the command lines used in the Debug and Release configurations respectively.

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\HostX86\arm64\CL.exe /c /Zi /JMC /nologo /W3 /WX- /diagnostics:column /sdl /Od /Oy- /D _DEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 /D _UNICODE /D UNICODE /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- /Fo"ARM64\Debug\\" /Fd"ARM64\Debug\vc143.pdb" /external:W3 /Gd /TP /analyze- /FC /errorReport:prompt ConsoleApplication1.cpp

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\HostX86\arm64\CL.exe /c /Zi /nologo /W3 /WX- /diagnostics:column /sdl /O2 /Oi /Oy- /GL /D NDEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 /D _UNICODE /D UNICODE /Gm- /EHsc /MD /GS /Gy /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- /Fo"ARM64\Release\\" /Fd"ARM64\Release\vc143.pdb" /external:W3 /Gd /TP /analyze- /FC /errorReport:prompt ConsoleApplication1.cpp

Notice the /O2 flag (maximize speed) in the release build instead of the /Od flag (no optimizations) above. The debug build also uses the just my code /JMC, runtime error checks /RTC1, and debug multithread-specific version of the run-time library /MDd flags. For our testing purposes, we can ignore most of these flags.

Calling Printf

Here is a simple program, aarch64-abi-test-printf.cpp, which calls printf with a format string and 4 additional arguments.

#include <stdio.h>

int main()
{
    int result = printf("%.4f,%.4f,%.4f,%s", 1.2345, 1.2345, 1.2345, "str");
}

Compiling a Debug Build

To compile and disassemble this program, run:

cl /c aarch64-abi-test-printf.cpp
dumpbin /disasm /out:printf-abi.asm aarch64-abi-test-printf.obj
dumpbin /all /out:printf-abi.txt aarch64-abi-test-printf.obj

The disassembly is shown below with some links to the documentation for the various instructions. See the Arm Architecture Reference Manual for A-profile architecture PDF for more details about these instructions. The overview of AArch64 state at ARM Compiler armasm User Guide Version 6.6.1 is also a useful resource.

Dump of file aarch64-abi-test-printf.obj

File Type: COFF OBJECT

main:
  0000000000000000: A9BE7BFD  stp         fp,lr,[sp,#-0x20]!
  0000000000000004: 910003FD  mov         fp,sp
  0000000000000008: 90000008  adrp        x8,$SG5571
  000000000000000C: 91000104  add         x4,x8,$SG5571
  0000000000000010: 58000183  ldr         x3,$LN3
  0000000000000014: 58000162  ldr         x2,$LN3
  0000000000000018: 58000141  ldr         x1,$LN3
  000000000000001C: 90000008  adrp        x8,$SG5572
  0000000000000020: 91000100  add         x0,x8,$SG5572
  0000000000000024: 94000000  bl          printf
  0000000000000028: 2A0003E0  mov         w0,w0
  000000000000002C: B90013E0  str         w0,[sp,#0x10]
  0000000000000030: 52800000  mov         w0,#0
  0000000000000034: A8C27BFD  ldp         fp,lr,[sp],#0x20
  0000000000000038: D65F03C0  ret
  000000000000003C: D503201F  nop
$LN3:
  0000000000000040: 126E978D
  0000000000000044: 3FF3C083

__local_stdio_printf_options:
  0000000000000000: 90000008  adrp        x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000004: 91000100  add         x0,x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000008: D65F03C0  ret

_vfprintf_l:
  0000000000000000: A9BD7BFD  stp         fp,lr,[sp,#-0x30]!
  0000000000000004: 910003FD  mov         fp,sp
  0000000000000008: F90017E0  str         x0,[sp,#0x28]
  000000000000000C: F90013E1  str         x1,[sp,#0x20]
  0000000000000010: F9000FE2  str         x2,[sp,#0x18]
  0000000000000014: F9000BE3  str         x3,[sp,#0x10]
  0000000000000018: 94000000  bl          __local_stdio_printf_options
  000000000000001C: F9400BE4  ldr         x4,[sp,#0x10]
  0000000000000020: F9400FE3  ldr         x3,[sp,#0x18]
  0000000000000024: F94013E2  ldr         x2,[sp,#0x20]
  0000000000000028: F94017E1  ldr         x1,[sp,#0x28]
  000000000000002C: F9400000  ldr         x0,[x0]
  0000000000000030: 94000000  bl          __stdio_common_vfprintf
  0000000000000034: 2A0003E0  mov         w0,w0
  0000000000000038: 2A0003E0  mov         w0,w0
  000000000000003C: A8C37BFD  ldp         fp,lr,[sp],#0x30
  0000000000000040: D65F03C0  ret

printf:
  0000000000000000: D10103FF  sub         sp,sp,#0x40
  0000000000000004: A9008BE1  stp         x1,x2,[sp,#8]
  0000000000000008: A90193E3  stp         x3,x4,[sp,#0x18]
  000000000000000C: A9029BE5  stp         x5,x6,[sp,#0x28]
  0000000000000010: F9001FE7  str         x7,[sp,#0x38]
  0000000000000014: A9BD7BFD  stp         fp,lr,[sp,#-0x30]!
  0000000000000018: 910003FD  mov         fp,sp
  000000000000001C: F90013E0  str         x0,[sp,#0x20]
  0000000000000020: 9100E3E8  add         x8,sp,#0x38
  0000000000000024: F9000FE8  str         x8,[sp,#0x18]
  0000000000000028: 52800020  mov         w0,#1
  000000000000002C: 94000000  bl          __acrt_iob_func
  0000000000000030: F9400FE3  ldr         x3,[sp,#0x18]
  0000000000000034: D2800002  mov         x2,#0
  0000000000000038: F94013E1  ldr         x1,[sp,#0x20]
  000000000000003C: 94000000  bl          _vfprintf_l
  0000000000000040: 2A0003E0  mov         w0,w0
  0000000000000044: B90013E0  str         w0,[sp,#0x10]
  0000000000000048: D2800008  mov         x8,#0
  000000000000004C: F9000FE8  str         x8,[sp,#0x18]
  0000000000000050: B94013E0  ldr         w0,[sp,#0x10]
  0000000000000054: A8C37BFD  ldp         fp,lr,[sp],#0x30
  0000000000000058: 910103FF  add         sp,sp,#0x40
  000000000000005C: D65F03C0  ret

  Summary

           8 .bss
          68 .chks64
          9C .debug$S
          62 .drectve
          18 .pdata
          1A .rdata
          F8 .text$mn
          10 .xdata

In the disassembly generated by dumpbin (printf-abi.asm), notice that all 5 arguments to printf are passed in registers! x0 contains a pointer to the format string. x1-x3 each contain the 64-bit value loaded from the $LN3 label (ldr with a label operand loads the data stored at that label, not its address). The 64 bits at that label are the IEEE 754 double-precision representation of 1.2345. x4 contains a pointer to the null-terminated string “str”.
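The two words stored at $LN3 can be cross-checked against the bit pattern of 1.2345. The sketch below does this in Java rather than C++ purely for convenience; the class name DoubleBits is made up.

```java
// Hypothetical helper: prints the IEEE 754 bit pattern of 1.2345 split into 32-bit words.
public class DoubleBits {
    // Returns the 64-bit pattern of a double as 16 uppercase hex digits.
    static String hexBits(double value) {
        return String.format("%016X", Double.doubleToLongBits(value));
    }

    public static void main(String[] args) {
        long bits = Double.doubleToLongBits(1.2345);
        System.out.println("bits      = " + hexBits(1.2345));                    // 3FF3C083126E978D
        System.out.println("low word  = " + String.format("%08X", (int) bits));          // 126E978D
        System.out.println("high word = " + String.format("%08X", (int) (bits >>> 32))); // 3FF3C083
    }
}
```

The low word appears first in the listing (at offset 0x40) and the high word second (at 0x44), which is what a little-endian layout of the 64-bit value looks like when dumped as two 32-bit words.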

Which are the printf String Arguments?

To determine what symbols in instructions like adrp x8,$SG5571 mean, we use the output of dumpbin /all. The RELOCATIONS section shows $SG5571 to have symbol index 8. The COFF SYMBOL TABLE shows this symbol index 8 to be in SECT3. The raw data for section 3 contains the format string and the single string parameter passed to printf. I initially wasn’t sure how the offsets of these 2 strings are told apart, but the second column of the symbol table appears to answer that: $SG5571 has the value 00000000 and $SG5572 has the value 00000008, which match the positions of “str” and the format string in the raw data.

.
.
.
SECTION HEADER #3
  .rdata name
       0 physical address
       0 virtual address
      1A size of raw data
     31A file pointer to raw data (0000031A to 00000333)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40400040 flags
         Initialized Data
         8 byte align
         Read Only

RAW DATA #3
  00000000: 73 74 72 00 00 00 00 00 25 2E 34 66 2C 25 2E 34  str.....%.4f,%.4
  00000010: 66 2C 25 2E 34 66 2C 25 73 00                    f,%.4f,%s.
.
.
.
RELOCATIONS #4
                                                Symbol    Symbol
 Offset    Type              Applied To         Index     Name
 --------  ----------------  -----------------  --------  ------
 00000008  PAGEBASE_REL21             90000008         8  $SG5571
 0000000C  PAGEOFFSET_12A             91000104         8  $SG5571
 0000001C  PAGEBASE_REL21             90000008         9  $SG5572
 00000020  PAGEOFFSET_12A             91000100         9  $SG5572
 00000024  BRANCH26                   94000000        16  printf
.
.
.
COFF SYMBOL TABLE
000 01057A64 ABS    notype       Static       | @comp.id
001 80010190 ABS    notype       Static       | @feat.00
002 00000000 SECT1  notype       Static       | .drectve
    Section length   62, #relocs    0, #linenums    0, checksum        0
004 00000000 SECT2  notype       Static       | .debug$S
    Section length   9C, #relocs    0, #linenums    0, checksum        0
006 00000000 SECT3  notype       Static       | .rdata
    Section length   1A, #relocs    0, #linenums    0, checksum B99D9667
008 00000000 SECT3  notype       Static       | $SG5571
009 00000008 SECT3  notype       Static       | $SG5572
00A 00000000 SECT4  notype       Static       | .text$mn
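To convince myself that the values 0 and 8 in the symbol table really are offsets into section 3, the raw bytes can be decoded by hand. This sketch copies the 0x1A bytes of RAW DATA #3 from the listing above and reads a null-terminated string at each offset; the class and method names are made up for illustration.

```java
// Hypothetical helper: decodes C strings from the .rdata bytes shown by dumpbin /all.
public class RdataStrings {
    // RAW DATA #3 from the dumpbin output, copied by hand (0x1A = 26 bytes).
    static final int[] RAW = {
        0x73, 0x74, 0x72, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x25, 0x2E, 0x34, 0x66, 0x2C, 0x25, 0x2E, 0x34,
        0x66, 0x2C, 0x25, 0x2E, 0x34, 0x66, 0x2C, 0x25,
        0x73, 0x00
    };

    // Reads bytes starting at the given offset until a 0 terminator is found.
    static String cStringAt(int offset) {
        StringBuilder sb = new StringBuilder();
        for (int i = offset; i < RAW.length && RAW[i] != 0; i++) {
            sb.append((char) RAW[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("$SG5571 (value 0): " + cStringAt(0)); // str
        System.out.println("$SG5572 (value 8): " + cStringAt(8)); // %.4f,%.4f,%.4f,%s
    }
}
```

The 4 zero bytes after “str” and its terminator look like padding that places the format string on an 8-byte boundary, consistent with the 8 byte align flag in the section header.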

Compiling an Optimized Build

Specifying the /O2 flag for speed generates optimized code.

cl /c /O2 /Fo"printf-abi-o2.obj" aarch64-abi-test-printf.cpp
dumpbin /disasm /out:printf-abi-o2.asm printf-abi-o2.obj
dumpbin /all /out:printf-abi-o2.txt printf-abi-o2.obj

In the optimized code below, the IEEE double is loaded into d16 then copied to the x1-x3 registers by the FMOV instruction.

Dump of file printf-abi-o2.obj

File Type: COFF OBJECT

__local_stdio_printf_options:
  0000000000000000: 90000008  adrp        x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000004: 91000100  add         x0,x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000008: D65F03C0  ret

_vfprintf_l:
  0000000000000000: A9BD53F3  stp         x19,x20,[sp,#-0x30]!
  0000000000000004: A9015BF5  stp         x21,x22,[sp,#0x10]
  0000000000000008: F90013FE  str         lr,[sp,#0x20]
  000000000000000C: AA0003F6  mov         x22,x0
  0000000000000010: AA0103F5  mov         x21,x1
  0000000000000014: AA0203F4  mov         x20,x2
  0000000000000018: AA0303F3  mov         x19,x3
  000000000000001C: 94000000  bl          __local_stdio_printf_options
  0000000000000020: F9400000  ldr         x0,[x0]
  0000000000000024: AA1303E4  mov         x4,x19
  0000000000000028: AA1403E3  mov         x3,x20
  000000000000002C: AA1503E2  mov         x2,x21
  0000000000000030: AA1603E1  mov         x1,x22
  0000000000000034: 94000000  bl          __stdio_common_vfprintf
  0000000000000038: F94013FE  ldr         lr,[sp,#0x20]
  000000000000003C: A9415BF5  ldp         x21,x22,[sp,#0x10]
  0000000000000040: A8C353F3  ldp         x19,x20,[sp],#0x30
  0000000000000044: D65F03C0  ret

main:
  0000000000000000: F81F0FFE  str         lr,[sp,#-0x10]!
  0000000000000004: 5C0001B0  ldr         d16,$LN4
  0000000000000008: 90000008  adrp        x8,??_C@_03OJMAPEGJ@str@
  000000000000000C: 91000104  add         x4,x8,??_C@_03OJMAPEGJ@str@
  0000000000000010: 90000008  adrp        x8,??_C@_0BC@OEIAMIIK@?$CF?44f?0?$CF?44f?0?$CF?44f?0?$CFs@
  0000000000000014: 91000100  add         x0,x8,??_C@_0BC@OEIAMIIK@?$CF?44f?0?$CF?44f?0?$CF?44f?0?$CFs@
  0000000000000018: 9E660203  fmov        x3,d16
  000000000000001C: 9E660202  fmov        x2,d16
  0000000000000020: 9E660201  fmov        x1,d16
  0000000000000024: 94000000  bl          printf
  0000000000000028: 52800000  mov         w0,#0
  000000000000002C: F84107FE  ldr         lr,[sp],#0x10
  0000000000000030: D65F03C0  ret
  0000000000000034: D503201F  nop
$LN4:
  0000000000000038: 126E978D
  000000000000003C: 3FF3C083

printf:
  0000000000000000: A9BA53F3  stp         x19,x20,[sp,#-0x60]!
  0000000000000004: A9017BF5  stp         x21,lr,[sp,#0x10]
  0000000000000008: A9028BE1  stp         x1,x2,[sp,#0x28]
  000000000000000C: A90393E3  stp         x3,x4,[sp,#0x38]
  0000000000000010: A9049BE5  stp         x5,x6,[sp,#0x48]
  0000000000000014: F9002FE7  str         x7,[sp,#0x58]
  0000000000000018: AA0003F4  mov         x20,x0
  000000000000001C: 52800020  mov         w0,#1
  0000000000000020: 9100A3F5  add         x21,sp,#0x28
  0000000000000024: 94000000  bl          __acrt_iob_func
  0000000000000028: AA0003F3  mov         x19,x0
  000000000000002C: 94000000  bl          __local_stdio_printf_options
  0000000000000030: F9400000  ldr         x0,[x0]
  0000000000000034: D2800003  mov         x3,#0
  0000000000000038: AA1403E2  mov         x2,x20
  000000000000003C: AA1303E1  mov         x1,x19
  0000000000000040: AA1503E4  mov         x4,x21
  0000000000000044: 94000000  bl          __stdio_common_vfprintf
  0000000000000048: A9417BF5  ldp         x21,lr,[sp,#0x10]
  000000000000004C: A8C653F3  ldp         x19,x20,[sp],#0x60
  0000000000000050: D65F03C0  ret

  Summary

           8 .bss
          70 .chks64
          94 .debug$S
          62 .drectve
          18 .pdata
          16 .rdata
          E8 .text$mn
           8 .xdata

The example we have reviewed in this post passed only 5 arguments to printf. To see how more than 8 arguments are handled, see the example printf call in aarch64-abi-test-printf-manyargs.cpp and printf-abi-many.asm (or for the optimized assembly code, printf-abi-many-o2.asm).

Additional resources on AArch64:


Categories: Assembly, hsdis, OpenJDK

hsdis+binutils on macOS/Linux

A previous post explored how to use LLVM as the backend disassembler for hsdis. The instructions for how to use GNU binutils (the currently supported option) are straightforward. Listing them here for completeness (assuming you have cloned the OpenJDK repo into your ~/repos/java/jdk folder). Note that they depend on more recent changes. See the docs on the Java command for more info about the -XX:CompileCommand option.

# Download and extract GNU binutils 2.37
cd ~
curl -Lo binutils-2.37.tar.gz https://ftp.gnu.org/gnu/binutils/binutils-2.37.tar.gz
tar xvf binutils-2.37.tar.gz

# Configure the OpenJDK repo for hsdis
cd ~/repos/java/jdk
bash configure --with-hsdis=binutils --with-binutils-src=~/binutils-2.37

# Build hsdis
make build-hsdis

To deploy the built hsdis library on macOS:

cd build/macosx-aarch64-server-release

# Copy the hsdis library into the JDK bin folder
cp support/hsdis/libhsdis.dylib jdk/bin/hsdis-aarch64.dylib

To deploy the built hsdis library on Ubuntu Linux (open question: is this step even necessary?):

cd build/linux-x86_64-server-release

# Copy the hsdis library into the JDK bin folder
cp support/hsdis/libhsdis.so jdk/bin/

Update 2024-03-13: use the make install-hsdis command to copy the hsdis binaries into the new OpenJDK build. This will ensure that the hsdis binary is copied to lib/hsdis-amd64.so (this file name should be used in place of any others that are listed by find . -name *hsdis*).

Now we can disassemble some code, e.g. the String.checkIndex method mentioned in PR 5920.

# Disassemble some code
jdk/bin/java -XX:CompileCommand="print java.lang.String::checkIndex" -version
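The same option can be pointed at a method of your own. Below is a small sketch of a class with one hot method that is called often enough that the JIT is likely to compile it. The class and method names are made up, and whether (and at which tier) the method is compiled depends on the JVM's thresholds, so treat the iteration count as a guess.

```java
// Hypothetical test subject for -XX:CompileCommand="print ...".
public class HotLoopSketch {
    // Sums i*i for 0 <= i < n; simple enough that the generated code is easy to read.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long checksum = 0;
        // Call the method many times so that it becomes hot.
        for (int round = 0; round < 20_000; round++) {
            checksum += sumOfSquares(1_000);
        }
        // Print the result so the work cannot be discarded as unused.
        System.out.println(checksum);
    }
}
```

Following the syntax used above, something like jdk/bin/java -XX:CompileCommand="print HotLoopSketch::sumOfSquares" HotLoopSketch should then print the assembly for just that method once it has been compiled.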

To see how to disassemble the code for a class, we can use the basic substitution cipher class from the post on Building HSDIS in Cygwin as an example. Download, compile and disassemble it using the commands below. Note that these commands save the .java file to a temp folder to make cleanup much easier. Also note the redirection to a file since the output can be voluminous.

cd jdk/bin
mkdir -p temp
cd temp

curl -Lo BasicSubstitutionCipher.java https://raw.githubusercontent.com/swesonga/scratchpad/main/apps/crypto/substitution-cipher/BasicSubstitutionCipher.java

../javac BasicSubstitutionCipher.java

../java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:+LogCompilation BasicSubstitutionCipher > disassembled.txt

open disassembled.txt


Categories: Assembly, OpenJDK

LLVM as an hsdis Backend

To specify a backend for hsdis, the OpenJDK repo needs to be configured with the --with-hsdis option. As of commit 77757ba9, LLVM is not yet supported as an hsdis disassembly backend. Therefore, this error from make/autoconf/jdk-options.m4 is displayed. Here’s an example on the Windows platform:

$ bash configure --with-hsdis=llvm
...
checking what hsdis backend to use... invalid
configure: error: Incorrect hsdis backend "llvm"
configure exiting with result code 1

There has been an effort to enable using LLVM as the hsdis disassembler’s backend. To use this change, check out this branch with those changes (and some conflict resolution to incorporate more recent changes).

hsdis LLVM backend on macOS ARM64

To test the LLVM backend for hsdis on macOS, install LLVM using brew (Apple’s LLVM does not have the llvm-c include files):

# install LLVM
brew install llvm

Now build the OpenJDK. This should use Apple’s compiler since we have not made any configuration changes.

cd ~/repos/java/jdk
bash configure
make images

Now add brew’s LLVM bin directory to the PATH and run bash configure again passing the --with-hsdis=llvm option as shown below. The configuration process will detect the clang++ compiler installed by brew and set it up for use when the build-hsdis target is executed.

# Now add brew's LLVM to the PATH before running bash configure
export OLDPATH=$PATH
export PATH="/opt/homebrew/opt/llvm/bin:$PATH"

bash configure --with-hsdis=llvm
make build-hsdis
make install-hsdis
export PATH=$OLDPATH

The install-hsdis target does not appear to be copying the hsdis library to the jdk/bin folder so these commands are required:

cd build/macosx-aarch64-server-release
cp support/hsdis/libhsdis.dylib jdk/bin/hsdis-aarch64.dylib

We can now test hsdis as described in the post about Building hsdis in Cygwin.

hsdis LLVM backend on Windows x86-64

To test the LLVM backend for hsdis, we need to first clone and build LLVM because the LLVM installer does not come with the include files needed to build the changes in PR 5920. These instructions are from Jorn.

git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build_llvm
cd build_llvm
cmake ../llvm -D"LLVM_TARGETS_TO_BUILD:STRING=X86" -D"CMAKE_BUILD_TYPE:STRING=Release" -D"CMAKE_INSTALL_PREFIX=install_local" -A x64 -T host=x64
cmake --build . --config Release --target install

Now we can configure the OpenJDK repo for hsdis, and build both the JDK and hsdis.

bash configure --with-hsdis=llvm \
     LLVM_CONFIG=C:/dev/repos/llvm-project/build_llvm/install_local/bin \
     --with-llvm=C:/dev/repos/llvm-project/build_llvm/install_local/
make build-hsdis
make images

hsdis LLVM backend on Windows ARM64

Open question: is this supported?

Testing the hsdis LLVM backend

The String.checkIndex method of PR 5920 is a good candidate for testing the hsdis LLVM backend. The -XX:CompileCommand option can be used to print the generated assembler code after compilation of the specified method.

java -XX:CompileCommand="print java.lang.String::checkIndex" -version

Tips


Categories: Assembly, Compilers

Fixing Hsdis Compile Failure in GNU binutils

The previous post on Building HSDIS in Cygwin required running this command to actually build the hsdis DLL.

make OS=Linux MINGW=x86_64-w64-mingw32 BINUTILS=~/binutils-2.37

As it turns out, this make command fails because of a bug in the GNU binutils source code. This is the error I got:

...
x86_64-w64-mingw32-gcc -c -DHAVE_CONFIG_H -O    -I. -I/home/User/binutils-2.37/libiberty/../include  -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -Wshadow=local -pedantic  -D_GNU_SOURCE  /home/User/binutils-2.37/libiberty/rust-demangle.c -o rust-demangle.o
/home/User/binutils-2.37/libiberty/rust-demangle.c:78:3: error: unknown type name ‘uint’
   78 |   uint recursion;
      |   ^~~~
/home/User/binutils-2.37/libiberty/rust-demangle.c: In function ‘demangle_path’:
/home/User/binutils-2.37/libiberty/rust-demangle.c:81:37: error: ‘uint’ undeclared (first use in this function); did you mean ‘int’?
   81 | #define RUST_NO_RECURSION_LIMIT   ((uint) -1)
      |                                     ^~~~
...
make[2]: *** [Makefile:1230: rust-demangle.o] Error 1
...

At this point, I wasn’t sure which version I had used to build successfully. Searching for that error (and binutils to narrow things down) led to this bug in the sourceware.org Bugzilla that appears to be the exact bug I ran into: 28207 – error: unknown type name ‘uint’ (78 | uint recursion;) avr-gcc mingw32 Windows Build (sourceware.org). Fortunately, one Alan helpfully points out that this bug was fixed on the binutils-2.37 branch with commit 999566402e3.

To figure out where the binutils git repo is, I click on the Browse button in Bugzilla then navigate to the binutils product category, which has a link to the list of bugs for the binutils component. A re-opened bug seems likely to have a link to some commits. I select 26865 – windres: –preprocessor option won’t respect space in file path (sourceware.org) and sure enough, there is a link to a commit on the binutils repo. We can now view the history of rust-demangle.c. To find the commit in question, click on any commitdiff to get the URL format then replace the hash in the URL with 999566402e3 to reveal the aforementioned fix for the unknown type name uint error.

Cloning binutils Repo

I’m used to GitHub where looking at the repo structure implies that you’re at a URL you can copy and trim to clone. In this other web view, the URL to clone is listed above the shortlog:

git clone https://sourceware.org/git/binutils-gdb.git

Tracing the Bug

At this point, it makes sense to verify that the 2.37 sources I downloaded actually contain the bug. Observe that:

  1. the tags section contains a binutils-2_37 tag described as “Official GNU Binutils 2.37 Release” and committed on Sun, 18 Jul 2021 16:46:54 +0000 (17:46 +0100).
  2. the fix for the build error was committed by Alan on Mon, 19 Jul 2021 11:32:21 +0000 (21:02 +0930)
  3. the bug fix that introduced the error was committed on Thu, 15 Jul 2021 15:51:56 +0000 (16:51 +0100)

Therefore, using binutils older than 2.37 should work just fine. However, it may still be necessary to run “rm -fr build” in the hsdis folder to enable 2.36 to be picked up when you run make (otherwise 2.37 is still baked into some of configure’s output).
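The ordering argument above can be written down explicitly. This sketch hard-codes the three UTC timestamps from the list and checks that the release tag falls between the commit that introduced the error and the commit that fixed it. The class name is made up, and commit timestamps alone do not prove which commits a tag contains; something like git tag --contains in the cloned repo would be the more rigorous check.

```java
import java.time.Instant;

// Hypothetical helper: orders the three commit timestamps listed above (all in UTC).
public class BinutilsTimeline {
    static final Instant BUG_INTRODUCED = Instant.parse("2021-07-15T15:51:56Z");
    static final Instant TAG_2_37 = Instant.parse("2021-07-18T16:46:54Z");
    static final Instant FIX = Instant.parse("2021-07-19T11:32:21Z");

    // True when the tag was created after the bug went in but before the fix.
    static boolean releaseLikelyContainsBug() {
        return BUG_INTRODUCED.isBefore(TAG_2_37) && TAG_2_37.isBefore(FIX);
    }

    public static void main(String[] args) {
        System.out.println("2.37 tagged between bug and fix: " + releaseLikelyContainsBug());
    }
}
```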


Categories: Assembly, Cygwin

Building HSDIS in Cygwin

Hsdis is an externally loadable disassembler plugin. It lets you see which assembly instructions the JVM generates for your Java code. On 64-bit Windows, it is a binary called hsdis-amd64.dll (and hsdis-i386.dll on 32-bit platforms). This binary needs to be in the same directory as jvm.dll. Some good resources out there on building the hsdis binary for the OpenJDK include:

For Cygwin, the latter resource (from 2012?) is all we need. I like that Gunnar’s blog post covered how to use hsdis after building it so this writeup aims to combine both blogs into a simple Cygwin install-build-disassemble set of instructions.

Building hsdis for 64-bit JVMs

  1. Install Cygwin with the gcc-core, make, and mingw64-x86_64-gcc-core packages by launching the setup executable using this command (no need to bother selecting packages in the UI since you have already specified them on the command line)
setup-x86_64.exe -P gcc-core -P mingw64-x86_64-gcc-core -P make
  2. Launch the Cygwin64 terminal
  3. Clone the OpenJDK repo to get the hsdis sources (if you have not yet set up a Windows OpenJDK Development Environment).
mkdir ~/repos
cd ~/repos
git clone https://github.com/openjdk/jdk
  4. Run these commands to download GNU binutils and build hsdis (Update 2022-01-07: version downgraded to 2.36 to avoid build failures investigated in Fixing Hsdis Compile Failure in GNU binutils).
cd ~
curl -Lo binutils-2.36.tar.gz https://ftp.gnu.org/gnu/binutils/binutils-2.36.tar.gz
tar xvf binutils-2.36.tar.gz

cd ~/repos/jdk/src/utils/hsdis
make OS=Linux MINGW=x86_64-w64-mingw32 BINUTILS=~/binutils-2.36
  5. Copy the hsdis binary to the locally built java bin folder (the paths below are relative to the root of the repo)
cd ~/repos/jdk
cp src/utils/hsdis/build/Linux-amd64/hsdis-amd64.dll build/windows-x86_64-server-release/jdk/bin/

Testing hsdis

I have created a basic substitution cipher, which we can compile and disassemble using the commands below. Note that these commands save the .java file to a temp folder to make cleanup much easier. Also note the redirection to a file since the output can be voluminous.

cd build/windows-x86_64-server-release/jdk/bin
mkdir -p temp
cd temp

curl -Lo BasicSubstitutionCipher.java https://raw.githubusercontent.com/swesonga/scratchpad/main/apps/crypto/substitution-cipher/BasicSubstitutionCipher.java

../javac BasicSubstitutionCipher.java

../java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:+LogCompilation BasicSubstitutionCipher > BasicSubstitutionCipher.disassembled.txt

Once the disassembly completes, we can view the instructions generated in the BasicSubstitutionCipher.disassembled.txt file.
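If the download link is unavailable, any small program with a hot loop will do as a test subject. The sketch below is not the BasicSubstitutionCipher class from the link above; it is a made-up stand-in with a hypothetical keyboard-order key, shown only to give an idea of the kind of code being disassembled.

```java
// Hypothetical stand-in for a basic substitution cipher (not the linked class).
public class SubstitutionCipherSketch {
    static final String PLAIN = "abcdefghijklmnopqrstuvwxyz";
    static final String KEY = "qwertyuiopasdfghjklzxcvbnm"; // made-up key for illustration

    // Replaces each character found in 'from' with the character at the same index in 'to'.
    static String translate(String text, String from, String to) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            int i = from.indexOf(c);
            out.append(i >= 0 ? to.charAt(i) : c); // characters outside the alphabet pass through
        }
        return out.toString();
    }

    static String encrypt(String text) {
        return translate(text, PLAIN, KEY);
    }

    static String decrypt(String text) {
        return translate(text, KEY, PLAIN);
    }

    public static void main(String[] args) {
        String message = "hello, world";
        String roundTrip = message;
        // Repeat the work so that translate becomes hot enough to be JIT-compiled.
        for (int i = 0; i < 50_000; i++) {
            roundTrip = decrypt(encrypt(roundTrip));
        }
        System.out.println(encrypt(message) + " -> " + roundTrip);
    }
}
```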

One open question in this setup is why the installed GNU binutils cannot be used to build hsdis. Seems strange to have to build them from source when the binutils Cygwin package was also installed in step 1 above.