Monitoring Context Switching in a Linux Process

As I learn about systems performance, one question that often arises is which processes are responsible for the context switching on a system. In Linux, the number of context switches per second is displayed by vmstat. To see this information every second, for example, run vmstat 1. Here is sample output from my Ubuntu 22.04 VM showing about 50,000 context switches per second. I used the -SM option to display memory in megabytes (which reduces the amount of output per line). The last (optional) argument is the number of updates (lines) to display.

saint@ubuntuvm:~$ vmstat -SM 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0   4603    386   5176    0    0     1     5   16    7 35  6 59  0  0
 4  0      0   4603    386   5176    0    0     0    88 20031 45649 51  8 41  0  0
 4  0      0   4603    386   5176    0    0     0     0 23768 52532 49  8 43  0  0
 4  0      0   4603    386   5176    0    0     0     0 23826 52931 49  7 43  0  0
 5  0      0   4603    386   5176    0    0     0     0 23328 51731 49  9 43  0  0
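
To watch just the context switch column over time, the vmstat output can be piped through awk. This is a minimal sketch of my own (not from the original workflow); in this output format, cs is the 12th column and the first two lines are headers.

# Print only the context switches per second (cs, column 12)
vmstat 1 | awk 'NR > 2 { print $12 }'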

Use pidstat to get a per-process breakdown of context switches per second. Like vmstat, a report interval and a report count can be specified. Without any flags, the default output is a breakdown of CPU usage.

saint@ubuntuvm:~$ pidstat 1 5
...
Average:      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:      129       691    0.40    0.00    0.00    0.00    0.40     -  mysqld
Average:     1000       896    0.60    0.40    0.00    0.00    1.00     -  Xorg
Average:     1000      1161    2.20    0.40    0.00    0.20    2.59     -  gnome-shell
Average:     1000     23655    0.40    0.00    0.00    0.20    0.40     -  gnome-terminal-
Average:     1000     23822    0.60    0.20    0.00    0.00    0.80     -  code
Average:     1000     23880    0.20    0.00    0.00    0.00    0.20     -  code
Average:     1000     23907    0.20    0.00    0.00    0.00    0.20     -  code
Average:     1000     33819    0.80    0.20    0.00    0.00    1.00     -  firefox
Average:     1000   1149158    1.40    0.20    0.00    0.00    1.60     -  Isolated Web Co
Average:     1000   1218546  294.81   37.52    0.00    0.00  332.34     -  java
Average:     1000   1227240    0.20    0.80    0.00    0.00    1.00     -  pidstat

For our purposes, we need the -w flag to report task switching activity. After the specified number of reports has been displayed, the average number of voluntary and involuntary context switches per second is displayed for each task.

saint@ubuntuvm:~$ pidstat -w 1 5
Linux 5.19.0-45-generic (ubuntuvm) 	07/03/2023 	_x86_64_	(6 CPU)

<5 reports truncated>

Average:      UID       PID   cswch/s nvcswch/s  Command
Average:        0         1      0.20      0.00  systemd
Average:        0        14      0.60      0.00  ksoftirqd/0
Average:        0        15     28.94      0.00  rcu_preempt
Average:        0        16      0.20      0.00  migration/0
Average:        0        22      0.20      0.00  migration/1
Average:        0        23      0.20      0.00  ksoftirqd/1
Average:        0        28      0.20      0.00  migration/2
Average:        0        29      0.60      0.00  ksoftirqd/2
Average:        0        34      0.20      0.00  migration/3
Average:        0        35      1.80      0.00  ksoftirqd/3
Average:        0        40      0.60      0.00  migration/4
Average:        0        41      1.00      0.00  ksoftirqd/4
Average:        0        46      0.20      0.00  migration/5
Average:        0        47      0.60      0.00  ksoftirqd/5
Average:        0        58      1.80      0.00  kcompactd0
Average:        0       195      0.40      0.00  kworker/1:1H-kblockd
Average:        0       210      0.60      0.00  kworker/4:1H-kblockd
Average:        0       238      0.60      0.20  jbd2/sda2-8
Average:        0       369      1.00      0.00  hv_balloon
Average:      108       484      3.99      0.00  systemd-oomd
Average:      101       485      0.60      0.00  systemd-resolve
Average:        0       523      0.20      0.00  acpid
Average:        0       576      0.20      0.00  wpa_supplicant
Average:     1000       896     81.44      0.00  Xorg
Average:     1000      1161     63.67     15.17  gnome-shell
Average:     1000     23655     97.60      0.80  gnome-terminal-
Average:     1000     23719      0.40      0.00  code
Average:     1000     23782      0.20      0.00  code
Average:     1000     23822      9.78      0.60  code
Average:     1000     23880      0.40      0.00  code
Average:     1000     23906      1.40      0.00  code
Average:     1000     23907      3.39      0.40  code
Average:     1000     23936      1.60      0.00  code
Average:     1000     23954      0.20      0.00  code
Average:     1000     24009      0.20      0.00  code
Average:     1000     24019      0.20      0.00  code
Average:     1000     33819      6.99      0.00  firefox
Average:     1000     34122      0.20      0.00  Privileged Cont
Average:        0    988101      4.99      0.00  kworker/0:1-mm_percpu_wq
Average:        0    993536      6.19      0.00  kworker/5:1-mm_percpu_wq
Average:        0   1124417      4.39      0.00  kworker/4:1-events
Average:        0   1148363      4.59      0.00  kworker/2:0-mm_percpu_wq
Average:        0   1148688      3.79      0.00  kworker/1:2-mm_percpu_wq
Average:        0   1149149      2.79      0.00  kworker/3:2-events
Average:     1000   1149158     61.88      2.00  Isolated Web Co
Average:     1000   1149675     10.38      6.19  Isolated Web Co
Average:     1000   1152215      0.40      0.00  code
Average:        0   1223207     71.26      0.00  kworker/u12:1-events_unbound
Average:        0   1226506     37.13      0.00  kworker/u12:3-writeback
Average:        0   1227384     27.94      0.00  kworker/u12:4-events_unbound
Average:     1000   1228275      1.00     17.96  pidstat

I happened to have my factorization Java application running on this VM. Interestingly, it does not appear in this output, even though the time command reports a large number of context switches for this application. To see this, set up the application as described in the next section.

Running a Sample Multithreaded Application

Clone the scratchpad repo, then compile and launch the factorization application using these instructions (with any necessary changes to the JAVA_HOME path):

# Download a Java build if necessary
mkdir -p ~/java/binaries/jdk/x64
cd ~/java/binaries/jdk/x64
wget https://aka.ms/download-jdk/microsoft-jdk-17.0.7-linux-x64.tar.gz
tar xzf microsoft-jdk-17.0.7-linux-x64.tar.gz

# Set the JAVA_HOME environment variable
export JAVA_HOME=~/java/binaries/jdk/x64/jdk-17.0.7+7

# Get the Factorization source code
mkdir ~/repos
cd ~/repos
git clone https://github.com/swesonga/scratchpad
cd scratchpad/demos/java/FindPrimes

# Compile the factorization source code
$JAVA_HOME/bin/javac Factorize.java

# Factorize a number and display task statistics
/usr/bin/time -v $JAVA_HOME/bin/java Factorize 897151542039582592342572091 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 6

Notice the context switching statistics when the command completes:

	Command being timed: "/home/saint/java/binaries/jdk/x64/jdk-20+36/bin/java Factorize 897151542039582592342572091 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 6"
	User time (seconds): 37.59
	System time (seconds): 6.47
	Percent of CPU this job got: 363%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:12.12
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 298576
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 8
	Minor (reclaiming a frame) page faults: 70247
	Voluntary context switches: 367393
	Involuntary context switches: 1337
	Swaps: 0
	File system inputs: 0
	File system outputs: 64
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
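
The kernel also exposes these counters directly in /proc, which is handy for a live snapshot without waiting for the process to exit. A quick sketch of my own (it assumes a single java process; otherwise substitute the PID directly):

# Show the voluntary_ctxt_switches and nonvoluntary_ctxt_switches counters
grep ctxt_switches /proc/$(pgrep -o java)/status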

Per-Thread Task Switching Information

How can we get some insight into the context switching in the Java process? We can display statistics for the threads associated with selected tasks using the -t option. This gives insight into Java's contribution to context switching:

saint@ubuntuvm:~$ pidstat -w -t 1 5
Average:      UID      TGID       TID   cswch/s nvcswch/s  Command
Average:        0        14         -      1.58      0.00  ksoftirqd/0
Average:        0         -        14      1.58      0.00  |__ksoftirqd/0
Average:        0        15         -     32.94      0.00  rcu_preempt
Average:        0         -        15     32.94      0.00  |__rcu_preempt
Average:        0        16         -      0.39      0.00  migration/0
Average:        0         -        16      0.39      0.00  |__migration/0
Average:        0        22         -      0.20      0.00  migration/1
Average:        0         -        22      0.20      0.00  |__migration/1
...
Average:     1000     24546     24549      0.99      0.00  (java)__VM Thread
Average:     1000         -     24554      0.99      0.00  |__Monitor Deflati
Average:     1000         -     24555      0.20      0.00  |__C2 CompilerThre
Average:     1000         -     24556      0.20      0.00  |__C1 CompilerThre
Average:     1000         -     24562     20.12      0.00  |__VM Periodic Tas
Average:     1000         -     24610      0.20      0.00  |__pool-2-thread-1
Average:     1000         -     24704      2.17      0.00  |__Attach Listener
...
Average:        0   1232414         -    641.22      0.00  kworker/u12:0-events_unbound
Average:        0         -   1232414    641.22      0.00  |__kworker/u12:0-events_unbound
Average:        0   1234513         -    174.16      0.00  kworker/u12:2-events_unbound
Average:        0         -   1234513    174.16      0.00  |__kworker/u12:2-events_unbound
Average:        0   1236229         -    703.94      0.00  kworker/u12:3-ext4-rsv-conversion
Average:        0         -   1236229    703.94      0.00  |__kworker/u12:3-ext4-rsv-conversion
Average:     1000   1236678   1236680     66.27      0.20  (java)__GC Thread#0
Average:     1000         -   1236684     16.96      0.00  |__G1 Service
Average:     1000         -   1236685     56.21      1.58  |__VM Thread
Average:     1000         -   1236690      0.99      0.00  |__Monitor Deflati
Average:     1000         -   1236691      0.20      0.00  |__C2 CompilerThre
Average:     1000         -   1236692      0.20      0.00  |__C1 CompilerThre
Average:     1000         -   1236694     20.12      0.00  |__VM Periodic Tas
Average:     1000         -   1236697     22.68     20.71  |__Thread-0
Average:     1000         -   1236698     24.26    172.98  |__Thread-1
Average:     1000         -   1236699     24.65    170.22  |__Thread-2
Average:     1000         -   1236700     22.09    219.92  |__Thread-3
Average:     1000         -   1236701     65.48      1.18  |__GC Thread#1
Average:     1000         -   1236702     64.69      0.59  |__GC Thread#2
Average:     1000         -   1236703     62.13      0.39  |__GC Thread#3
Average:     1000         -   1236704     64.50      0.20  |__GC Thread#4
Average:     1000         -   1236705     64.50      0.20  |__GC Thread#5
Average:     1000   1236750         -      0.99   1119.72  pidstat
Average:     1000         -   1236750      0.99   1119.72  |__pidstat

Per-Process Task Switching Information

Given the large number of tasks on the system, it may be helpful to focus on Java alone. Running jps shows the PIDs of the running Java processes. We can pass the PID of interest to pidstat using the -p flag. Only context switches for that process will then be displayed, e.g.

saint@ubuntuvm:~$ pidstat -w -t -p 1236678 1 5
...
Average:     1000   1236678         -      0.00      0.00  java
Average:     1000         -   1236678      0.00      0.00  |__java
Average:     1000         -   1236679      0.00      0.00  |__java
Average:     1000         -   1236680     64.00      0.00  |__GC Thread#0
Average:     1000         -   1236681      0.00      0.00  |__G1 Main Marker
Average:     1000         -   1236682      0.00      0.00  |__G1 Conc#0
Average:     1000         -   1236683      0.00      0.00  |__G1 Refine#0
Average:     1000         -   1236684     18.00      0.00  |__G1 Service
Average:     1000         -   1236685     56.00      5.40  |__VM Thread
Average:     1000         -   1236686      0.00      0.00  |__Reference Handl
Average:     1000         -   1236687      0.00      0.00  |__Finalizer
Average:     1000         -   1236688      0.00      0.00  |__Signal Dispatch
Average:     1000         -   1236689      0.00      0.00  |__Service Thread
Average:     1000         -   1236690      1.00      0.00  |__Monitor Deflati
Average:     1000         -   1236691      0.20      0.00  |__C2 CompilerThre
Average:     1000         -   1236692      0.20      0.00  |__C1 CompilerThre
Average:     1000         -   1236693      0.00      0.00  |__Notification Th
Average:     1000         -   1236694     20.20      0.00  |__VM Periodic Tas
Average:     1000         -   1236695      0.00      0.00  |__Common-Cleaner
Average:     1000         -   1236697     17.80      7.60  |__Thread-0
Average:     1000         -   1236698     20.00      8.60  |__Thread-1
Average:     1000         -   1236699     21.40     12.00  |__Thread-2
Average:     1000         -   1236700     17.80      9.40  |__Thread-3
Average:     1000         -   1236701     64.40      0.20  |__GC Thread#1
Average:     1000         -   1236702     63.20      0.20  |__GC Thread#2
Average:     1000         -   1236703     66.20      0.00  |__GC Thread#3
Average:     1000         -   1236704     63.80      0.20  |__GC Thread#4
Average:     1000         -   1236705     63.60      0.20  |__GC Thread#5
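
If looking up the PID manually gets tedious, the jps output can feed pidstat directly. This is a sketch of my own, assuming Factorize is the only matching main class:

# Let jps find the Factorize process and hand its PID to pidstat
pidstat -w -t -p $(jps | awk '/Factorize/ { print $1 }') 1 5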

The pidstat man page describes lots of other options that can be used to customize the output.


Categories: git, Security

Storing Git Credentials on Ubuntu

To store encrypted git credentials on disk in Ubuntu, install pass and the git-credential-manager. We will use gpg to generate a key that pass will use for secure storage and retrieval of credentials. Use these commands to get everything set up for git:

cd ~/Downloads
wget https://github.com/git-ecosystem/git-credential-manager/releases/download/v2.1.2/gcm-linux_amd64.2.1.2.deb

sudo dpkg -i gcm-linux_amd64.2.1.2.deb
git-credential-manager configure
git config --global credential.credentialStore gpg

gpg --gen-key

sudo apt install pass
pass init <generated-key>
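
To confirm the setup, the commands below should show the selected store and (after the next successful git push) the stored entries. This is my own sanity check, not part of the official instructions:

git config --global credential.credentialStore
pass ls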

Background

The GitHub PAT I have been using on my Ubuntu VM recently expired. Authentication failed when I tried to push to my repo. I generated a new PAT as outlined at https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token and entered it on the command line when running git push. Of course, I got something wrong entering the PAT manually (I had assumed it would get saved).

saint@ubuntuvm:~/repos/scratchpad$ git push
Username for 'https://github.com': swesonga
Password for 'https://swesonga@github.com': 
remote: Permission to swesonga/scratchpad.git denied to swesonga.
fatal: unable to access 'https://github.com/swesonga/scratchpad/': The requested URL returned error: 403

Instead of fighting with this command line, I decided to educate myself on the proper way to do this. A search for ubuntu git keychain led me to this post: git – How to store your github https password on Linux in a terminal keychain? – Stack Overflow, which states that the 2022 answer would be to use the Microsoft cross-platform GCM (Git Credential Manager). The git-credential-manager/docs/install.md page links to the instructions at git-credential-manager/docs/credstores.md. I downloaded the .deb file from Release GCM 2.1.2 · git-ecosystem/git-credential-manager (github.com).

saint@ubuntuvm:~/repos/scratchpad$ sudo dpkg -i ~/Downloads/gcm-linux_amd64.2.1.2.deb 
[sudo] password for saint: 
Selecting previously unselected package gcm.
(Reading database ... 272980 files and directories currently installed.)
Preparing to unpack .../gcm-linux_amd64.2.1.2.deb ...
Unpacking gcm (2.1.2) ...
Setting up gcm (2.1.2) ...
saint@ubuntuvm:~/repos/scratchpad$ which git-credential-manager
/usr/local/bin/git-credential-manager
saint@ubuntuvm:~/repos/scratchpad$ git-credential-manager configure
Configuring component 'Git Credential Manager'...
Configuring component 'Azure Repos provider'...

The git push experience is now different:

saint@ubuntuvm:~/repos/scratchpad$ git push
fatal: No credential store has been selected.

Set the GCM_CREDENTIAL_STORE environment variable or the credential.credentialStore Git configuration setting to one of the following options:

  secretservice : freedesktop.org Secret Service (requires graphical interface)
  gpg           : GNU `pass` compatible credential storage (requires GPG and `pass`)
  cache         : Git's in-memory credential cache
  plaintext     : store credentials in plain-text files (UNSECURE)

See https://aka.ms/gcm/credstores for more information.

Username for 'https://github.com':
saint@ubuntuvm:~/repos/scratchpad$ git config --global credential.credentialStore
saint@ubuntuvm:~/repos/scratchpad$ git push
fatal: Password store has not been initialized at '/home/saint/.password-store'; run `pass init <gpg-id>` to initialize the store.
See https://aka.ms/gcm/credstores for more information.
Username for 'https://github.com': 

Since I own the VM, I don’t mind credentials being stored on disk (but not in plain text), so I set up gpg and pass as instructed.

saint@ubuntuvm:~$ gpg --gen-key
gpg (GnuPG) 2.2.27; Copyright (C) 2021 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Note: Use "gpg --full-generate-key" for a full featured key generation dialog.

GnuPG needs to construct a user ID to identify your key.

Real name: Saint Wesonga
Email address: saint@swesonga.org
You selected this USER-ID:
    "Saint Wesonga <saint@swesonga.org>"
...

saint@ubuntuvm:~$ sudo apt install pass
[sudo] password for saint: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libqrencode4 qrencode tree xclip
Suggested packages:
  libxml-simple-perl python ruby
The following NEW packages will be installed:
  libqrencode4 pass qrencode tree xclip
0 upgraded, 5 newly installed, 0 to remove and 92 not upgraded.
Need to get 151 kB of archives.
After this operation, 442 kB of additional disk space will be used.
Do you want to continue? [Y/n]
...

saint@ubuntuvm:~$ pass init ABCDEF0123456789
mkdir: created directory '/home/saint/.password-store/'
Password store initialized for ABCDEF0123456789

Apparently I used the wrong value for the key, but git push is unfazed: it pushes successfully after the browser authentication completes. I'm not sure what is happening now since browser authentication is in use, but as long as I can push, I can forge ahead with other tasks.

saint@ubuntuvm:~/repos/scratchpad$ git push
info: please complete authentication in your browser...
fatal: Failed to encrypt file '/home/saint/.password-store/git/https/github.com/swesonga.gpg' with gpg. exit=2, out=, err=gpg: <WRONG HEX VALUE>: skipped: No public key
gpg: [stdin]: encryption failed: No public key

Enumerating objects: 11, done.
Counting objects: 100% (11/11), done.
Delta compression using up to 6 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 745 bytes | 745.00 KiB/s, done.
Total 6 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects

Update (2023-09-20): if the password store secret is lost, run pass rm -r git to force browser authentication the next time git push is executed.


Categories: Profiling

Using perf in WSL Ubuntu Terminal

For Experimenting with perf on Linux, I used an Ubuntu VM. A full VM can be a bit cumbersome when simply trying to understand what various Linux commands can do, so I decided to try using WSL to experiment with perf. Running wsl from the command line was sufficient to determine how to install an Ubuntu distribution.

C:\dev> wsl
Windows Subsystem for Linux has no installed distributions.
Distributions can be installed by visiting the Microsoft Store:
https://aka.ms/wslstore

C:\dev> wsl --install
Windows Subsystem for Linux is already installed.
The following is a list of valid distributions that can be installed.
Install using 'wsl --install -d <Distro>'.

NAME                                   FRIENDLY NAME
Ubuntu                                 Ubuntu
Debian                                 Debian GNU/Linux
kali-linux                             Kali Linux Rolling
Ubuntu-18.04                           Ubuntu 18.04 LTS
Ubuntu-20.04                           Ubuntu 20.04 LTS
Ubuntu-22.04                           Ubuntu 22.04 LTS
OracleLinux_7_9                        Oracle Linux 7.9
OracleLinux_8_7                        Oracle Linux 8.7
OracleLinux_9_1                        Oracle Linux 9.1
SUSE-Linux-Enterprise-Server-15-SP4    SUSE Linux Enterprise Server 15 SP4
openSUSE-Leap-15.4                     openSUSE Leap 15.4
openSUSE-Tumbleweed                    openSUSE Tumbleweed

C:\dev> wsl --install -d Ubuntu-22.04
Installing: Ubuntu 22.04 LTS
Ubuntu 22.04 LTS has been installed.
Launching Ubuntu 22.04 LTS...
Creating a UNIX user account

Installing perf

Install the linux-tools-generic package then check the perf version as follows:

sudo apt install linux-tools-generic
/usr/lib/linux-tools/5.15.0-73-generic/perf --version

Background Investigation

Once the WSL Ubuntu distro installation completed and I had created a user account, I started by checking the perf version. Running perf --version shows how perf can be installed:

saint@machine:~$ perf --version
Command 'perf' not found, but can be installed with:
sudo apt install linux-intel-iotg-tools-common    # version 5.15.0-1027.32, or
sudo apt install linux-nvidia-tools-common        # version 5.15.0-1023.23
sudo apt install linux-tools-common               # version 5.15.0-71.78
sudo apt install linux-nvidia-5.19-tools-common   # version 5.19.0-1009.9
sudo apt install linux-nvidia-tegra-tools-common  # version 5.15.0-1012.12

Since I'm not looking for anything vendor-specific, I try to install the linux-tools-common package.

saint@machine:~$ sudo apt install linux-tools-common
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  linux-tools-common
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 290 kB of archives.
After this operation, 823 kB of additional disk space will be used.
Ign:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-tools-common all 5.15.0-71.78
Err:1 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 linux-tools-common all 5.15.0-71.78
  404  Not Found [IP: ... 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-tools-common_5.15.0-71.78_all.deb  404  Not Found [IP: ... 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

Now try the command suggested in the last error:

saint@machine:~$ sudo apt-get update
Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Get:2 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages [455 kB]
Hit:3 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
Get:5 http://security.ubuntu.com/ubuntu jammy-security/main Translation-en [122 kB]
Get:6 http://security.ubuntu.com/ubuntu jammy-security/main amd64 c-n-f Metadata [10.1 kB]
Get:7 http://security.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages [349 kB]
Get:8 http://security.ubuntu.com/ubuntu jammy-security/restricted Translation-en [52.6 kB]
Get:9 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [108 kB]
...
Get:39 http://archive.ubuntu.com/ubuntu jammy-backports/universe amd64 c-n-f Metadata [548 B]
Get:40 http://archive.ubuntu.com/ubuntu jammy-backports/multiverse amd64 c-n-f Metadata [116 B]
Fetched 25.1 MB in 5s (4725 kB/s)
Reading package lists... Done

That seems to do the trick:

saint@machine:~$ sudo apt install linux-tools-common
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  linux-tools-common
0 upgraded, 1 newly installed, 0 to remove and 41 not upgraded.
Need to get 277 kB of archives.
After this operation, 833 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-tools-common all 5.15.0-73.80 [277 kB]
Fetched 277 kB in 0s (793 kB/s)
Selecting previously unselected package linux-tools-common.
(Reading database ... 24137 files and directories currently installed.)
Preparing to unpack .../linux-tools-common_5.15.0-73.80_all.deb ...
Unpacking linux-tools-common (5.15.0-73.80) ...
Setting up linux-tools-common (5.15.0-73.80) ...
Processing triggers for man-db (2.10.2-1) ...

Can we run a perf command now? No, perf is still not found for my kernel.

saint@machine:~$ perf --version
WARNING: perf not found for kernel 5.10.102.1-microsoft

  You may need to install the following packages for this specific kernel:
    linux-tools-5.10.102.1-microsoft-standard-WSL2
    linux-cloud-tools-5.10.102.1-microsoft-standard-WSL2

  You may also want to install one of the following packages to keep up to date:
    linux-tools-standard-WSL2
    linux-cloud-tools-standard-WSL2

Is that really my kernel version? Yes it is.

saint@mymachine:~$ uname -a
Linux mymachine 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Unfortunately, the suggested packages cannot be found:

saint@machine:~$ sudo apt install linux-tools-standard-WSL2
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package linux-tools-standard-WSL2
saint@machine:~$ sudo apt install linux-tools-5.10.102.1-microsoft-standard-WSL2
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package linux-tools-5.10.102.1-microsoft-standard-WSL2
E: Couldn't find any package by glob 'linux-tools-5.10.102.1-microsoft-standard-WSL2'
saint@machine:~$ sudo apt-get install linux-tools-5.10.102.1-microsoft-standard-WSL2
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package linux-tools-5.10.102.1-microsoft-standard-WSL2
E: Couldn't find any package by glob 'linux-tools-5.10.102.1-microsoft-standard-WSL2'
E: Couldn't find any package by regex 'linux-tools-5.10.102.1-microsoft-standard-WSL2'

Searching for the error message (Unable to locate package linux-tools-5.10.102.1-microsoft-standard-WSL2) reveals that this is a fairly common issue.

The interesting thing about this is that the version numbers shown in the list of packages to be installed do not match my kernel version. However, the installation succeeds.

saint@machine:~$ sudo apt install linux-tools-generic
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  linux-tools-5.15.0-73 linux-tools-5.15.0-73-generic
The following NEW packages will be installed:
  linux-tools-5.15.0-73 linux-tools-5.15.0-73-generic linux-tools-generic
0 upgraded, 3 newly installed, 0 to remove and 41 not upgraded.
Need to get 7931 kB of archives.
After this operation, 27.3 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-tools-5.15.0-73 amd64 5.15.0-73.80 [7926 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-tools-5.15.0-73-generic amd64 5.15.0-73.80 [1786 B]
Get:3 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-tools-generic amd64 5.15.0.73.71 [2308 B]
Fetched 7931 kB in 2s (5163 kB/s)
Selecting previously unselected package linux-tools-5.15.0-73.
(Reading database ... 24210 files and directories currently installed.)
Preparing to unpack .../linux-tools-5.15.0-73_5.15.0-73.80_amd64.deb ...
Unpacking linux-tools-5.15.0-73 (5.15.0-73.80) ...
Selecting previously unselected package linux-tools-5.15.0-73-generic.
Preparing to unpack .../linux-tools-5.15.0-73-generic_5.15.0-73.80_amd64.deb ...
Unpacking linux-tools-5.15.0-73-generic (5.15.0-73.80) ...
Selecting previously unselected package linux-tools-generic.
Preparing to unpack .../linux-tools-generic_5.15.0.73.71_amd64.deb ...
Unpacking linux-tools-generic (5.15.0.73.71) ...
Setting up linux-tools-5.15.0-73 (5.15.0-73.80) ...
Setting up linux-tools-5.15.0-73-generic (5.15.0-73.80) ...
Setting up linux-tools-generic (5.15.0.73.71) ...

perf --version still fails with the same warning though. The /usr/bin/perf on my PATH is not a symlink to anything else:

saint@machine:~$ ls -l `which perf`
-rwxr-xr-x 1 root root 1622 May 15 07:10 /usr/bin/perf

However, there is a user who was able to use perf by running the binary in the /usr/lib/linux-tools/… directory directly. Sure enough, this does the trick!

saint@machine:~$ /usr/lib/linux-tools/5.15.0-73-generic/perf --version
perf version 5.15.98
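
Rather than typing the full path every time, a symlink in /usr/local/bin (which precedes /usr/bin on the default PATH) is one possible workaround. Note that this is my own assumption, and the versioned path will change whenever the linux-tools package is upgraded:

sudo ln -sf /usr/lib/linux-tools/5.15.0-73-generic/perf /usr/local/bin/perf
perf --version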

Sharing Files Between Windows and WSL Ubuntu

I was curious about whether I could generate a report from a perf.data file generated on another machine. The docs on Working across file systems show how easy it is to use a file on the Windows file system:

cd /mnt/c/dev/reports
/usr/lib/linux-tools/5.15.0-73-generic/perf report -n --stdio > report.txt

This doesn’t work though. The command fails after about 40 seconds with the error No kallsyms or vmlinux with build-id 5c3d8... was found.
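
The error suggests perf cannot resolve kernel symbols for the machine where perf.data was recorded. perf report does have --kallsyms and --symfs options for pointing it at symbol files copied over from the recording machine, so something like the sketch below might work, though I have not verified it:

/usr/lib/linux-tools/5.15.0-73-generic/perf report -n --stdio --kallsyms=./kallsyms --symfs=./symbols > report.txt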


Categories: Assembly

Trial Division Factorization Disassembly

When Experimenting with Async Profiler, I created a basic trial division factorization Java application. To run it, download the OpenJDK build if it isn’t already installed:

mkdir -p ~/java/binaries/jdk/x64
cd ~/java/binaries/jdk/x64
wget https://aka.ms/download-jdk/microsoft-jdk-17.0.7-linux-x64.tar.gz
tar xzf microsoft-jdk-17.0.7-linux-x64.tar.gz

Test the factorization application to verify that the Java build works.

export JAVA_HOME=~/java/binaries/jdk/x64/jdk-17.0.7+7

cd ~/repos/scratchpad/demos/java/FindPrimes
$JAVA_HOME/bin/javac Factorize.java
$JAVA_HOME/bin/java Factorize 123890571352112309857

# Use 4 threads to speed things up
$JAVA_HOME/bin/java Factorize 123890571352112309857 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 4

Using hsdis

hsdis is a HotSpot plugin for disassembling dynamically generated code. Chriswhocodes was kind enough to build hsdis for various platforms and share the binaries on his website – hsdis HotSpot Disassembly Plugin Downloads (chriswhocodes.com). Download the appropriate hsdis binary and move it to the OpenJDK build’s lib directory, e.g.

wget https://chriswhocodes.com/hsdis/hsdis-amd64.so
export JAVA_HOME=~/java/binaries/jdk/x64/jdk-17.0.7+7
mv hsdis-amd64.so $JAVA_HOME/lib/

ls -l $JAVA_HOME/lib/hsdis*

We will need the PrintAssembly option to disassemble the code generated by the compiler when running a Java program. This option requires diagnostic VM options to be unlocked. This is the full command line for generating the disassembly from the application’s execution. The output is redirected to a code.asm file since it can be voluminous.

$JAVA_HOME/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly Factorize 123890571352112309857 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 4 > code.asm
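
If only one method is of interest, the output can reportedly be scoped with the CompileCommand option instead of dumping everything. This is a sketch I have not run (implMulAdd is the method from the snippet below):

$JAVA_HOME/bin/java -XX:CompileCommand=print,java.math.BigInteger::implMulAdd Factorize 123890571352112309857 CUSTOM_THREAD_COUNT_VIA_THREAD_CLASS 4 > implMulAdd.asm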

Here is a snippet of the disassembly in code.asm:

============================= C1-compiled nmethod ==============================
----------------------------------- Assembly -----------------------------------

Compiled method (c1)    2052  266       2       java.math.BigInteger::implMulAdd (81 bytes)
 total in heap  [0x00007f2e5943ca90,0x00007f2e5943d038] = 1448
 relocation     [0x00007f2e5943cbf0,0x00007f2e5943cc28] = 56
 main code      [0x00007f2e5943cc40,0x00007f2e5943ce00] = 448
 stub code      [0x00007f2e5943ce00,0x00007f2e5943ce30] = 48
 metadata       [0x00007f2e5943ce30,0x00007f2e5943ce38] = 8
 scopes data    [0x00007f2e5943ce38,0x00007f2e5943cee0] = 168
 scopes pcs     [0x00007f2e5943cee0,0x00007f2e5943d010] = 304
 dependencies   [0x00007f2e5943d010,0x00007f2e5943d018] = 8
 nul chk table  [0x00007f2e5943d018,0x00007f2e5943d038] = 32

--------------------------------------------------------------------------------
[Constant Pool (empty)]

--------------------------------------------------------------------------------

[Verified Entry Point]
  # {method} {0x00000008000a47c0} 'implMulAdd' '([I[IIII)I' in 'java/math/BigInteger'
  # parm0:    rsi:rsi   = '[I'
  # parm1:    rdx:rdx   = '[I'
  # parm2:    rcx       = int
  # parm3:    r8        = int
  # parm4:    r9        = int
  #           [sp+0x50]  (sp of caller)
  0x00007f2e5943cc40:   mov    %eax,-0x14000(%rsp)
  0x00007f2e5943cc47:   push   %rbp
  0x00007f2e5943cc48:   sub    $0x40,%rsp
  0x00007f2e5943cc4c:   movabs $0x7f2e38075370,%rax
  0x00007f2e5943cc56:   mov    0x8(%rax),%edi
  0x00007f2e5943cc59:   add    $0x2,%edi
  0x00007f2e5943cc5c:   mov    %edi,0x8(%rax)
  0x00007f2e5943cc5f:   and    $0xffe,%edi
  0x00007f2e5943cc65:   cmp    $0x0,%edi
  0x00007f2e5943cc68:   je     0x00007f2e5943cd52           ;*iload {reexecute=0 rethrow=0 return_oop=0}
                                                            ; - java.math.BigInteger::implMulAdd@0 (line 3197)
  0x00007f2e5943cc6e:   movslq %r9d,%r9
  0x00007f2e5943cc71:   movabs $0xffffffff,%rax
  0x00007f2e5943cc7b:   and    %rax,%r9
...

Finding the Java Installation Path

In the above example, I used a Java build in a custom path. If you are using a Java build that is already installed, then a few extra steps might be needed to determine the JAVA_HOME path, e.g.

saint@ubuntuvm:~$ which java
/usr/bin/java
saint@ubuntuvm:~$ ls -l `which java`
saint@ubuntuvm:~$ ls -l /etc/alternatives/java
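
Putting those steps together, readlink -f follows the whole symlink chain in one shot, and stripping the trailing /bin/java recovers the JAVA_HOME path. A minimal sketch of my own:

# /usr/bin/java -> /etc/alternatives/java -> the real binary
readlink -f $(which java) | sed 's|/bin/java$||'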

Categories: Big Data

Hadoop Native Libraries for Apache Spark

The post on Diagnosing Hadoop Native Library Load Failures was focused on Hadoop being run as a standalone application. However, it can also be one component among many in an application with broader scope, such as Apache Spark. Having not used Spark before, I found the Quick Start – Spark 3.4.0 Documentation (apache.org) informative. It suggested downloading a packaged release of Spark from the Spark website, but I went with the CDN at https://dlcdn.apache.org/spark/spark-3.4.0/ since it was the same one I had downloaded my Hadoop build from.

Setting Up and Launching Spark

Download and extract Spark using these commands:

cd ~/java/binaries
mkdir spark
cd spark
curl -Lo spark-3.4.0-bin-hadoop3.tgz https://dlcdn.apache.org/spark/spark-3.4.0/spark-3.4.0-bin-hadoop3.tgz

tar xzf spark-3.4.0-bin-hadoop3.tgz
cd spark-3.4.0-bin-hadoop3

Spark needs JAVA_HOME to be set (otherwise the first message displayed will be ERROR: JAVA_HOME is not set and could not be found).

export JAVA_HOME=~/java/binaries/jdk/x64/jdk-11.0.19+7

Next, I started the Spark shell by running this command as per the Quick Start docs:

./bin/spark-shell

Notice that the same Hadoop warning from Diagnosing Hadoop Native Library Load Failures showed up again! However, we have already seen that the Hadoop logging level can be customized. The key question now is how to enable DEBUG logging in Spark.

saint@ubuntuvm:~/java/binaries/spark/spark-3.4.0-bin-hadoop3$ ./bin/spark-shell
23/06/01 10:31:38 WARN Utils: Your hostname, ubuntuvm resolves to a loopback address: 127.0.1.1; using 172.18.28.45 instead (on interface eth0)
23/06/01 10:31:38 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/06/01 10:31:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://ubuntuvm.mshome.net:4040
Spark context available as 'sc' (master = local[*], app id = local-1685637105440).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.4.0
      /_/
         
Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 17.0.6)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

Customizing the Spark Logging Level

The Overview – Spark 3.4.0 Documentation (apache.org) page states that …

Users can also download a “Hadoop free” binary and run Spark with any Hadoop version by augmenting Spark’s classpath.

Spark 3.4.0 Documentation (apache.org)

The classpath augmentation doc (Using Spark’s “Hadoop Free” Build) is what informs me that the way Spark uses Hadoop can be customized by entries in conf/spark-env.sh. Unfortunately, there are no log level settings in the spark-env.sh.template file in that directory. After a bit of a winding journey, I discover that the way to customize the logging level is to first create a conf/log4j2.properties file by running:

cp conf/log4j2.properties.template conf/log4j2.properties

Next, change the logging level by updating this line (warn in the template) to the desired level:

logger.repl.level = debug

Launching the Spark shell now displays a much more informative error message. It is now evident that the paths being searched for native libraries do not include the path we need.

23/06/01 11:16:31 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
23/06/01 11:16:31 DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
23/06/01 11:16:31 DEBUG NativeCodeLoader: java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib
23/06/01 11:16:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Fixing the Spark Hadoop Native Libraries

I searched for how to pass Spark extra Java options. The Tuning – Spark 3.4.0 Documentation mentioned the spark.executor.defaultJavaOptions and spark.executor.extraJavaOptions arguments, which I found documented at Configuration – Spark 3.4.0 Documentation. I tried passing these (and the corresponding extraLibraryPath options) to the Spark shell, but none of them loaded the Hadoop native library.

The flag that is actually required is --driver-library-path. It sounds like the extraLibraryPath options didn't work because the JVM has already started by the time they are processed.

./bin/spark-shell --driver-library-path=/home/saint/java/binaries/hadoop/x64/hadoop-3.3.5/lib/native

The --driver-library-path flag allows Spark to successfully load the Hadoop native libraries. The logging messages confirm this:

...
3/06/01 11:57:06 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
23/06/01 11:57:06 DEBUG NativeCodeLoader: Loaded the native-hadoop library
...
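
An alternative I would expect to work (untested here) is exporting the library path in conf/spark-env.sh so that every launch picks it up without the extra flag:

# conf/spark-env.sh (create it from conf/spark-env.sh.template if needed)
export LD_LIBRARY_PATH=/home/saint/java/binaries/hadoop/x64/hadoop-3.3.5/lib/native:$LD_LIBRARY_PATH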

Appendix: Resources Reviewed for Spark Logging Level Changes

It was the SPARK-7261 pull request that led me to look for the log4j2.properties file. Changing rootLogger.level did not have any effect but scrolling through revealed the key line setting logger.repl.level.


Diagnosing Hadoop Native Library Load Failures

Running a Basic Hadoop Command

The instructions for running Hadoop haven't changed much since I last used it over 5 years ago (see Setting up Apache Hadoop). Download a recent stable release from one of the Apache Download Mirrors. I picked hadoop-3.3.5-aarch64.tar.gz from https://dlcdn.apache.org/hadoop/common/hadoop-3.3.5/.

mkdir -p ~/java/binaries/hadoop
cd ~/java/binaries/hadoop

curl -Lo hadoop-3.3.5-aarch64.tar.gz https://dlcdn.apache.org/hadoop/common/hadoop-3.3.5/hadoop-3.3.5-aarch64.tar.gz

tar xzf hadoop-3.3.5-aarch64.tar.gz

I used the instructions at Apache Hadoop 3.3.5 – Hadoop: Setting up a Single Node Cluster to test the build by running the grep example. See the Grep source code for the implementation details of the example.

export JAVA_HOME=~/java/binaries/jdk/x64/jdk-11.0.19+7/

cd hadoop-3.3.5
mkdir testinput
cp etc/hadoop/*.xml testinput

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar grep testinput testoutput 'dfs[a-z.]+'

cat testoutput/*

When running this test code, I noticed this warning (first message displayed):

2023-05-31 12:31:33,686 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Checking for Loadable Native Libraries

The Apache Hadoop 3.3.5 – Native Libraries Guide explains that there is a NativeLibraryChecker that can be run using the command bin/hadoop checknative -a to show which native libraries can/cannot be loaded.

saint@ubuntuvm:~/java/binaries/hadoop/hadoop-3.3.5$ find . -name lib*.so
./lib/native/libhadoop.so
./lib/native/libhdfspp.so
./lib/native/libhdfs.so
./lib/native/libnativetask.so
saint@ubuntuvm:~/java/binaries/hadoop/hadoop-3.3.5$ uname -a
Linux ubuntuvm 5.19.0-41-generic #42~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 18 17:40:00 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
saint@ubuntuvm:~/java/binaries/hadoop/hadoop-3.3.5$ bin/hadoop checknative -a
2023-05-31 13:36:04,467 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Native library checking:
hadoop:  false 
zlib:    false 
zstd  :  false 
bzip2:   false 
openssl: false 
ISA-L:   false 
PMDK:    false 
2023-05-31 13:36:04,711 INFO util.ExitUtil: Exiting with status 1: ExitException

Diagnosing Native Library Load Errors

My assumption when seeing that none of these native libraries could be loaded was that I needed to install all those dependencies. I started with lib64z.

saint@ubuntuvm:~/java/binaries/hadoop/hadoop-3.3.5$ sudo apt install lib64z1
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  gcc-12-base:i386 krb5-locales libc6:i386 libc6-amd64:i386 libcom-err2:i386 libcrypt1:i386
  libgcc-s1:i386 libgssapi-krb5-2 libgssapi-krb5-2:i386 libidn2-0:i386 libk5crypto3 libk5crypto3:i386
  libkeyutils1:i386 libkrb5-3 libkrb5-3:i386 libkrb5support0 libkrb5support0:i386 libnsl2:i386
  libnss-nis:i386 libnss-nisplus:i386 libssl3 libssl3:i386 libtirpc3:i386 libunistring2:i386
Suggested packages:
  glibc-doc:i386 locales:i386 krb5-doc krb5-user krb5-doc:i386 krb5-user:i386
The following NEW packages will be installed:
  gcc-12-base:i386 krb5-locales lib64z1:i386 libc6:i386 libc6-amd64:i386 libcom-err2:i386
  libcrypt1:i386 libgcc-s1:i386 libgssapi-krb5-2:i386 libidn2-0:i386 libk5crypto3:i386
  libkeyutils1:i386 libkrb5-3:i386 libkrb5support0:i386 libnsl2:i386 libnss-nis:i386
  libnss-nisplus:i386 libssl3:i386 libtirpc3:i386 libunistring2:i386
The following packages will be upgraded:
  libgssapi-krb5-2 libk5crypto3 libkrb5-3 libkrb5support0 libssl3
5 upgraded, 20 newly installed, 0 to remove and 85 not upgraded.
Need to get 10.3 MB/12.2 MB of archives.
After this operation, 38.1 MB of additional disk space will be used.
Do you want to continue? [Y/n] 

Interestingly, rerunning checknative still showed false for all the native libraries! The next step was to inspect how the checknative argument is handled. It invokes the hadoop/NativeLibraryChecker.java class, which in turn calls hadoop/NativeCodeLoader.java. One of the most important observations in the latter file is the additional debug logging available when the library doesn't load!

Enabling Debug Logging

The logging code uses LoggerFactory, which is discussed in the Introduction to SLF4J | Baeldung. My question now is how to change the SLF4J level at runtime (how to change slf4j level at runtime? – Stack Overflow). A Google search for hadoop change log level leads me to another SO post, Setting the logging level in Hadoop to WARN – Stack Overflow, but that isn't as useful as the Hadoop commands guide: I just need to pass the --loglevel flag to hadoop.

bin/hadoop --loglevel DEBUG checknative -a

The debug output is now much more informative! Notice the warning about the possible platform mismatch of the native library!

saint@ubuntuvm:~/java/binaries/hadoop/hadoop-3.3.5$ bin/hadoop --loglevel DEBUG checknative -a
2023-05-31 14:47:32,624 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
2023-05-31 14:47:32,625 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: /home/saint/java/binaries/hadoop/hadoop-3.3.5/lib/native/libhadoop.so.1.0.0: /home/saint/java/binaries/hadoop/hadoop-3.3.5/lib/native/libhadoop.so.1.0.0: cannot open shared object file: No such file or directory (Possible cause: can't load AARCH64-bit .so on a AMD 64-bit platform)
2023-05-31 14:47:32,625 DEBUG util.NativeCodeLoader: java.library.path=/home/saint/java/binaries/hadoop/hadoop-3.3.5/lib/native
2023-05-31 14:47:32,625 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2023-05-31 14:47:32,836 DEBUG util.Shell: setsid exited with exit code 0
Native library checking:
hadoop:  false 
zlib:    false 
zstd  :  false 
bzip2:   false 
openssl: false 
ISA-L:   false 
PMDK:    false 
2023-05-31 14:47:32,847 DEBUG util.ExitUtil: Exiting with status 1: ExitException
1: ExitException
	at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:381)
	at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:369)
	at org.apache.hadoop.util.NativeLibraryChecker.main(NativeLibraryChecker.java:154)
2023-05-31 14:47:32,856 INFO util.ExitUtil: Exiting with status 1: ExitException

To determine the architecture for which the shared library was compiled, I started with the objdump -f command as suggested by a StackOverflow post. However, it outputs architecture: UNKNOWN!, which isn’t very useful. The file command from the same post proves to be exactly what I need.

saint@ubuntuvm:~/java/binaries/hadoop/aarch64/hadoop-3.3.5$ objdump -f lib/native/libhadoop.so

lib/native/libhadoop.so:     file format elf64-little
architecture: UNKNOWN!, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x0000000000005b80

saint@ubuntuvm:~/java/binaries/hadoop/aarch64/hadoop-3.3.5$ file lib/native/libhadoop.so
lib/native/libhadoop.so: symbolic link to libhadoop.so.1.0.0
saint@ubuntuvm:~/java/binaries/hadoop/aarch64/hadoop-3.3.5$ file lib/native/libhadoop.so.1.0.0
lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, BuildID[sha1]=19fbe9b0a7449eb05b687721548251af752b869f, with debug_info, not stripped

It turns out I was using an x86-64 Ubuntu VM instead of the aarch64 Ubuntu VM I had created, so naturally, Hadoop couldn't load the aarch64 native library! For the VM I had actually been using, I needed to get the x64 Hadoop build by running:

curl -Lo hadoop-3.3.5.tar.gz https://dlcdn.apache.org/hadoop/common/hadoop-3.3.5/hadoop-3.3.5.tar.gz

Checking the loading status of the native libraries now indicates that the hadoop native library can be successfully loaded:

saint@ubuntuvm:~/java/binaries/hadoop/x64/hadoop-3.3.5$ bin/hadoop checknative -a
2023-05-31 14:58:40,869 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
2023-05-31 14:58:40,877 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2023-05-31 14:58:40,887 WARN erasurecode.ErasureCodeNative: Loading ISA-L failed: Failed to load libisal.so.2 (libisal.so.2: cannot open shared object file: No such file or directory)
2023-05-31 14:58:40,887 WARN erasurecode.ErasureCodeNative: ISA-L support is not available in your platform... using builtin-java codec where applicable
2023-05-31 14:58:41,035 INFO nativeio.NativeIO: The native code was built without PMDK support.
Native library checking:
hadoop:  true /home/saint/java/binaries/hadoop/x64/hadoop-3.3.5/lib/native/libhadoop.so.1.0.0
zlib:    true /lib/x86_64-linux-gnu/libz.so.1
zstd  :  true /lib/x86_64-linux-gnu/libzstd.so.1
bzip2:   true /lib/x86_64-linux-gnu/libbz2.so.1
openssl: false Cannot load libcrypto.so (libcrypto.so: cannot open shared object file: No such file or directory)!
ISA-L:   false Loading ISA-L failed: Failed to load libisal.so.2 (libisal.so.2: cannot open shared object file: No such file or directory)
PMDK:    false The native code was built without PMDK support.
2023-05-31 14:58:41,056 INFO util.ExitUtil: Exiting with status 1: ExitException

Switching to the aarch64 Ubuntu VM showed the aarch64 Hadoop native library being loaded successfully on that platform. In hindsight, the i386 architecture references when I installed lib64z could have been a warning sign, had I not been blasting my way through these commands.
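
A quick architecture check before downloading would have caught the mismatch immediately. These are my own sanity-check commands, not from the original troubleshooting session:

# Both should agree with the architecture in the Hadoop tarball name
uname -m                    # e.g. x86_64 or aarch64
dpkg --print-architecture   # e.g. amd64 or arm64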


Categories: Security

Limiting SSH Identity File Permissions

I recently needed to use SSH keys to connect to Linux Azure VMs from my primary development machine, a Windows desktop. OpenSSH has been available on Windows since 2018, as per this OpenSSH for Windows overview. I downloaded my private key for the Azure VM to a file called my_key.pem. Just to be sure I knew which executable would run when I launched ssh, I used this command line:

C:\> where ssh
C:\Windows\System32\OpenSSH\ssh.exe

I then passed the -i my_key.pem option to ssh when connecting to the VM.

ssh -J user1@ipaddress1 -i my_key.pem user2@ipaddress2

It was then that I discovered that ssh on Windows checks the private key file permissions and considers the defaults too open. This is the error I got:

Bad permissions. Try removing permissions for user: BUILTIN\\Users (S-1-5-32-545) on file C:/.../my_key.pem.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions for 'my_key.pem' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "my_key.pem": bad permissions
someuser@0.0.0.0: Permission denied (publickey).

The security identifiers on Windows are well documented:

A security identifier is used to uniquely identify a security principal or security group. Security principals can represent any entity that can be authenticated by the operating system, such as a user account, a computer account, or a thread or process that runs in the security context of a user or computer account.

Security identifiers | Microsoft Learn

That page describes BUILTIN\Users (S-1-5-32-545) as "A built-in group. After the initial installation of the operating system, the only member is the Authenticated Users group."

Fortunately, someone else ran into this way before me: amazon web services – Set pem file permissions for AWS without chmod on Windows – Stack Overflow. To see the current file permissions, run icacls without any additional flags.

C:\> icacls my_key.pem
my_key.pem BUILTIN\Administrators:(I)(F)
           NT AUTHORITY\SYSTEM:(I)(F)
           BUILTIN\Users:(I)(RX)
           NT AUTHORITY\Authenticated Users:(I)(M)

Successfully processed 1 files; Failed processing 0 files

The solution from that StackOverflow post is to run the commands below.

icacls my_key.pem /reset
icacls my_key.pem /grant:r %username%:(R)
icacls my_key.pem /inheritance:r

The icacls docs explain that /reset “Replaces ACLs with default inherited ACLs for all matching files.” That doesn’t change anything on my system. The /grant option adds my personal account to the list of accounts with permission to the file.

Grants specified user access rights. Permissions replace previously granted explicit permissions.

Not adding the :r means that permissions are added to any previously granted explicit permissions.

icacls | Microsoft Learn

The /inheritance:r option removes the 4 security identifiers shown previously from the private key file’s DACL. SSH is now happy to get the authentication identity from this private key file.
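
Re-running icacls without flags is a quick way to verify the result; after the commands above, the output should list only your own account. This verification step is my own addition:

icacls my_key.pem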


Categories: Profiling

Experimenting with perf on Linux

In the post on Experimenting with Async Profiler, I mentioned the basic (trial division) integer factorization app I wrote. I’ve been experimenting with perf to see what the system looks like when running this application. On Ubuntu, I started with this command:

perf record -F 97 -a -g -- sleep 10

Turns out perf isn’t installed by default.

WARNING: perf not found for kernel 5.19.0-41

  You may need to install the following packages for this specific kernel:
    linux-tools-5.19.0-41-generic
    linux-cloud-tools-5.19.0-41-generic

  You may also want to install one of the following packages to keep up to date:
    linux-tools-generic
    linux-cloud-tools-generic

Interestingly, running sudo apt install linux-tools-generic only picks up 5.15:

...
The following NEW packages will be installed:
  linux-tools-5.15.0-72 linux-tools-5.15.0-72-generic linux-tools-generic
...

which perf now shows /usr/bin/perf, but even perf -v fails with the above warning, so I have to run:

sudo apt install linux-tools-5.19.0-41-generic

...
The following NEW packages will be installed:
  linux-hwe-5.19-tools-5.19.0-41 linux-tools-5.19.0-41-generic
...

Once that completes, perf can now run but perf version doesn’t display anything meaningful. Back to the original command:

perf record -F 97 -a -g -- sleep 10

This fails with an error about restricted access. It makes for interesting reading, but I just use sudo and carry on.

Error:
Access to performance monitoring and observability operations is limited.
Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
access to performance monitoring and observability operations for processes
without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
More information can be found at 'Perf events and tool security' document:
https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
perf_event_paranoid setting is 4:
  -1: Allow use of (almost) all events by all users
      Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow raw and ftrace function tracepoint access
>= 1: Disallow CPU event access
>= 2: Disallow kernel profiling
To make the adjusted perf_event_paranoid setting permanent preserve it
in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
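
For repeated profiling sessions, the message above suggests relaxing the setting instead of using sudo for every run. A sketch based on that guidance (-1 is the most permissive value):

# Allow profiling without extra capabilities until the next reboot
sudo sysctl kernel.perf_event_paranoid=-1

# Or make it permanent, as the message suggests
echo 'kernel.perf_event_paranoid = -1' | sudo tee -a /etc/sysctl.conf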

Once the command completes, a perf.data file is created in the current directory. To generate a report, run this command. See the sample perf-report.txt file on GitHub.

perf report -n --stdio > perf-report.txt

To generate a flame graph, use Brendan Gregg’s scripts:

cd ~/repos
git clone https://github.com/brendangregg/FlameGraph

cd -

perf script --header > stacks.txt

~/repos/FlameGraph/stackcollapse-perf.pl < stacks.txt | ~/repos/FlameGraph/flamegraph.pl --hash > myflamegraph.svg

Categories: Java, Profiling

Experimenting with Async Profiler

I have been studying the performance of a simple Java application (for integer factorization) using async-profiler. The application’s source code is on GitHub.

async-profiler is a low overhead sampling profiler for Java that does not suffer from Safepoint bias problem.

async-profiler repo

There is also a 3-part talk about async-profiler demo-ing how it works and how to use it.

I downloaded the Linux x64 async-profiler build with these commands…

mkdir -p ~/java/binaries/async-profiler
cd ~/java/binaries/async-profiler

curl -Lo async-profiler-2.9-linux-x64.tar.gz https://github.com/jvm-profiling-tools/async-profiler/releases/download/v2.9/async-profiler-2.9-linux-x64.tar.gz

tar xzf async-profiler-2.9-linux-x64.tar.gz

… then started the application with these:

# macos:
export JAVA_HOME=~/java/binaries/jdk/x64/jdk-17.0.7+7/Contents/Home

# Linux:
export JAVA_HOME=~/java/binaries/jdk/x64/jdk-17.0.7+7/

cd ~/repos/scratchpad/demos/java/FindPrimes/
$JAVA_HOME/bin/java Factorize 91278398257125987

Once the application is running, use the profiler.sh script to attach to the Java process and start profiling it. I was interested in wall clock profiling. This is specified using the -e wall argument (see Part 2: Improving Performance with Async-profiler by Andrei Pangin. – YouTube). The command line below will profile the Java application with a 5ms sampling interval for a duration (-d) of 10 seconds.

# macos:
cd ~/java/binaries/async-profiler-2.9-macos

# Linux:
cd ~/java/binaries/async-profiler/async-profiler-2.9-linux-x64

./profiler.sh -e wall -t -i 5ms -d 10 -f result.html jps

The jps argument above lets the profiler.sh script determine which Java process is running by invoking the jps command (see The jps Command (oracle.com)). If there are multiple Java processes, run jps first to determine the process id of the one to be profiled, then explicitly pass that pid to profiler.sh, e.g.

jps
./profiler.sh -e wall -t -i 5ms -d 10 -f result.html 53361

Async-profiler can also be attached at application startup.

$JAVA_HOME/bin/java -agentpath:$HOME/java/binaries/async-profiler-2.9-macos/build/libasyncProfiler.dylib=start,event=wall,threads,file=out.html Factorize
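
When the agent is started at launch like this, profiling continues until it is stopped explicitly. If I am reading the profiler.sh actions correctly, the accumulated results can be dumped later with the stop action (the output file name below is mine):

./profiler.sh stop -f startup-result.html jps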

Other Event Types

The event type we have used so far is the wall clock event. Other event types include cpu and lock. The latter mode is mentioned at 32:15 in Part 2: Improving Performance with Async-profiler by Andrei Pangin. – YouTube. Here’s a sample command:

~/java/binaries/async-profiler/async-profiler-2.9-linux-x64/profiler.sh -e lock -i 5ms -d 10 -f result.html jps

Output Formats

The profiling data can be written in various formats. Here is an example command line used to explore what the traces output looks like. See scratchpad/factorization-profile.sh · swesonga/scratchpad · GitHub for additional examples of output formats.

cd ~/java/binaries/async-profiler/async-profiler-2.9-linux-x64
./profiler.sh -e wall -i 5ms -d 10 -o flat,traces -f Factorize-flat+traces.html 69490

To convert the async-profiler data from one format to another, use converter.jar from the downloaded tar file:

$JAVA_HOME/bin/java -cp ~/java/binaries/async-profiler/async-profiler-2.9-linux-x64/build/converter.jar FlameGraph rawdata.txt output.html
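
converter.jar bundles more than the FlameGraph converter. For example, if a profile was captured in JFR format (-o jfr), the jfr2flame converter should turn the recording into a flame graph (the file names below are mine):

$JAVA_HOME/bin/java -cp ~/java/binaries/async-profiler/async-profiler-2.9-linux-x64/build/converter.jar jfr2flame recording.jfr flamegraph.html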

Other Notes

To find out file types on macOS, run file -I rawdata. In my case, I had flame graph data that was shared as application/gzip (causing unzip to fail with “End-of-central-directory signature not found”). I needed to use gzip -d rawdata instead.


Categories: Fluid Flow, Simulation

Building the VM2D Source Code

One of the challenges I have run into when reading through numerical methods/simulation papers is the dearth of source code. I was therefore pleasantly surprised to find VM2D, a vortex method program for 2D flow simulation. Of course, my luck is that the comments and docs are in Russian, which I do not understand. However, the availability of the source code more than makes up for that. The publication The VM2D Open Source Code for Incompressible Flow Simulation by Using Meshless Lagrangian Vortex Methods on CPU and GPU | IEEE Conference Publication | IEEE Xplore turns out to be the documentation. To build the source code, open a developer command prompt.

git clone https://github.com/vortexmethods/VM2D
cd VM2D
cmake .

This fails with an error about MPI missing:

-- -------------------------2D CODE-------------------------
-- Checking for module 'mpi-c'
--   Can't find mpi-c.pc in any of C:/software/strawberry/c/lib/pkgconfig
use the PKG_CONFIG_PATH environment variable, or
specify extra search paths via 'search_paths'
-- Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS)
-- Checking for module 'mpi-cxx'
--   Can't find mpi-cxx.pc in any of C:/software/strawberry/c/lib/pkgconfig
use the PKG_CONFIG_PATH environment variable, or
specify extra search paths via 'search_paths'
-- Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS)
CMake Error at C:/Program Files/CMake/share/cmake-3.25/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find MPI (missing: MPI_C_FOUND MPI_CXX_FOUND)
Call Stack (most recent call first):
  C:/Program Files/CMake/share/cmake-3.25/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
  C:/Program Files/CMake/share/cmake-3.25/Modules/FindMPI.cmake:1837 (find_package_handle_standard_args)
  src/VMlib/CMakeLists.txt:31 (find_package)

StackOverflow answers, e.g. Installing MPI for Windows – Stack Overflow, suggest installing Microsoft MPI. I download Microsoft MPI v10.0 from the Official Microsoft Download Center, but why on earth does the download page have an IE warning when I’m already using Edge?

Microsoft MPI v10.0 Download Page

How do I know which one I want? I’ll start with the SDK MSI.

Choose the download you want
MPI SDK Setup

The publisher certificate expired in December 2021. Shouldn’t there be a warning about that? I guess the publisher is well known and not revoked? Oh well, plough ahead and install it. Reopen the developer command prompt and run cmake . again. This time, there are no errors (and I ignore all the warnings since I have things to do). A Visual Studio solution is generated. Open VM.sln in Visual Studio 2022. Building fails with these errors:

...
3>C:\repos\edu\VM2D\src\VMcuda>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc.exe"  --use-local-env ... -Wno-deprecated-gpu-targets -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -std=c++14 ... -o VMcuda.dir\Debug\cuLib2D.obj "C:\repos\edu\VM2D\src\VMcuda\Cuda\cuLib2D.cu"
3>nvcc fatal   : Unsupported gpu architecture 'compute_35'
...
3>Done building project "VMcuda.vcxproj" -- FAILED.
...
4>LINK : fatal error LNK1104: cannot open file 'VMcuda.lib'
4>Done building project "VM2D.vcxproj" -- FAILED.
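
Before editing anything, it can help to confirm which architectures this nvcc build actually accepts. Assuming CUDA 11 or later (where this flag exists), the supported compute capabilities can be listed with:

nvcc --list-gpu-arch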

nvcc in CUDA 12.1 no longer supports the Kepler-era compute_35 and compute_37 targets, so the solution is to remove the segment -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 from VM2D/CMakeLists.txt · vortexmethods/VM2D (github.com). Launching the application (VM2D.exe) then fails with this message: The code execution cannot proceed because msmpi.dll was not found. Reinstalling the program may fix this problem.

VM2D.exe Failing to Launch

The SDK doesn’t have that DLL, so I guess that’s what the other setup executable is for.

MPI 10.0.12498 Setup

I don’t see any DLLs in that installation directory. However, mpi – msmpi.dll error message in Visual Studio C++ – Stack Overflow says reinstalling MPI is the solution. Before doing so, I run the application in Visual Studio once again and this time it launches successfully. This message is displayed: queue ERROR: file problems is not found. That file is in the VM2D/run folder, but other input files will still not be found if the run folder is used as the current directory.

I guess I need to translate VM2D/03_starting.rst. This is the first time I have needed to translate a web page. It looks like the problems file is expected to be in the tutorials directory. Copying the file from the VM2D/run folder to the VM2D/tutorials directory allows the program to find the expected files, and it appears to run now. Task Manager shows 100% CPU usage on my AMD Ryzen 7 5800X 8-Core Processor. The program runs for an hour and, as per VM2D/05_run.rst, creates a csv file and a snapshots directory containing vtk files. To view these files, download ParaView and install it. Launch ParaView, then open the snapshots directory (it should recognize all the vtk files as a group).

Opening snapshots

Click on the Apply button on the Properties pane (see image below).

Properties of the Snapshots

Finally, click on the Play button on the toolbar to see the animation of the snapshots. The next step will be figuring out how to use the GPU to generate these snapshots in (hopefully) much less than an hour.

Snapshots in Paraview