Categories: ZIP

Building zlib for Windows

Ensure these components are installed before building zlib:

  1. The MSVC v140 – VS 2015 C++ build tools (v14.00) individual component in the Visual Studio Installer
  2. The Windows 8.1 SDK from the Windows SDK and emulator archive

Once these are installed, run the commands below in a Visual Studio 2022 Developer Command Prompt to build zlib for Windows. The x64 directory will contain zlibwapi.dll, which can be renamed to zlib.dll according to zlib/readme.txt (from the latest commit as of this post).

git clone https://github.com/madler/zlib
cd zlib/contrib/vstudio/vc14
git switch --detach v1.2.11
msbuild zlibvc.vcxproj /p:Configuration=Release /p:Platform=x64
msbuild zlibvc.vcxproj /p:Configuration=Release /p:Platform=Win32

mkdir C:\dev\software\zlib\win32
copy x86\ZlibDllRelease\zlibwapi.dll C:\dev\software\zlib\win32\zlib.dll
copy x86\ZlibDllRelease\zlibwapi.lib C:\dev\software\zlib\win32\zdll.lib

Why Build zlib?

As part of Tracking Down Missing Headers in LLVM for Windows, I ran into NSIS compiler errors and decided to create a debug build of NSIS to debug them myself since there was no definitive solution online. Turns out, zlib is one of the prereqs for NSIS as per Code / [r7368] /NSIS/branches/WIN64/INSTALL (sourceforge.net).

Unfortunately (or maybe fortunately?), I didn’t see any binaries at zlib. There is a link to the zlib GitHub repo though and zlib/DLL_FAQ.txt at master · madler/zlib (github.com) says to review the zlib site for an alternative download location. Sure enough, it does have a link to zlib for Windows 9x/NT/2000/XP/2003 (DLL version, plus related utilities). That doesn’t inspire much confidence in the binaries though… Might as well build them myself.

Investigating How to Build zlib

Open a Visual Studio Developer Command Prompt then build the project I saw in the docs:

git clone https://github.com/madler/zlib
cd zlib/contrib/vstudio/vc14
msbuild zlibvc.vcxproj

There are other prereqs, apparently:

MSBuild version 17.3.1+2badb37d1 for .NET Framework
Build started 9/29/2022 11:01:01 AM.
Project "D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj" on node 1 (default targets).
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\Microsoft.CppBuild.targets(460,5): error MSB8020: The build tools for Visual Studio 2015 (Platform Toolset = 'v140') cannot b
e found. To build using the v140 build tools, please install Visual Studio 2015 build tools.  Alternatively, you may upgrade to the current Visual Studio tools by selecting the Project menu or right-click the
 solution, and then selecting "Retarget solution". [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
Done Building Project "D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj" (default targets) -- FAILED.


Build FAILED.

The v140 build tools component is 3.63 GB, but that’s not the only prereq! This toolset requires the Windows 8.1 SDK! As per visual studio 2019 – VS2019 without Windows 8.1 SDK – Stack Overflow, it needs to be downloaded from the Windows SDK and emulator archive.

D:\dev\repos\zlib\contrib\vstudio\vc14> msbuild zlibvc.vcxproj
MSBuild version 17.3.1+2badb37d1 for .NET Framework
Build started 9/29/2022 11:23:13 AM.
Project "D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj" on node 1 (default targets).
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\Platforms\Win32\PlatformToolsets\v140\Toolset.targets(34,5): error MSB8036: The Windows SDK version 8.1 was not found. Install the required version of Wi
ndows SDK or change the SDK version in the project property pages or by right-clicking the solution and selecting "Retarget solution". [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
Done Building Project "D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj" (default targets) -- FAILED.


Build FAILED.

After all that effort, the reward is the error below. Searching for “bat” is helpful: Issues · madler/zlib (github.com)

D:\dev\repos\zlib\contrib\vstudio\vc14> msbuild zlibvc.vcxproj
MSBuild version 17.3.1+2badb37d1 for .NET Framework
Build started 9/29/2022 11:51:10 AM.
Project "D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj" on node 1 (default targets).
PrepareForBuild:
  Creating directory "x86\ZlibDllDebug\Tmp\".
  Creating directory "x86\ZlibDllDebug\Tmp\zlibvc.tlog\".
InitializeBuildStatus:
  Creating "x86\ZlibDllDebug\Tmp\zlibvc.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
PreBuildEvent:
  cd ..\..\masmx86
  bld_ml32.bat
  :VCEnd
  The system cannot find the path specified.
  'bld_ml32.bat' is not recognized as an internal or external command,
  operable program or batch file.
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\Microsoft.CppCommon.targets(123,5): error MSB3073: The command "cd ..\..\masmx86 [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\Microsoft.CppCommon.targets(123,5): error MSB3073: bld_ml32.bat [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\Microsoft.CppCommon.targets(123,5): error MSB3073: :VCEnd" exited with code 9009. [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
Done Building Project "D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj" (default targets) -- FAILED.


Build FAILED.

To find out which commit removed this batch file, run this from the root of the repo:

git log --full-history -1 -- contrib/masmx86/bld_ml32.bat

This looks like really bad development on the zlib repo: scripts were removed without removing the outdated documentation, much less documenting the new way to build. The problem is reported in Please fix 1.2.12 compile · Issue #631 · madler/zlib (github.com). Instead of using the workarounds there, just build 1.2.11 and let the zlib folks deal with that mess.

git switch --detach v1.2.11
cd contrib/vstudio/vc14
msbuild zlibvc.vcxproj

Linking fails with error LNK2026:

match686.obj : error LNK2026: module unsafe for SAFESEH image. [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
inffas32.obj : error LNK2026: module unsafe for SAFESEH image. [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
     Creating library x86\ZlibDllDebug\zlibwapi.lib and object x86\ZlibDllDebug\zlibwapi.exp
x86\ZlibDllDebug\zlibwapi.dll : fatal error LNK1281: Unable to generate SAFESEH image. [D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj]
Done Building Project "D:\dev\repos\zlib\contrib\vstudio\vc14\zlibvc.vcxproj" (default targets) -- FAILED.

This option is on by default as per /SAFESEH (Image has Safe Exception Handlers) | Microsoft Learn. However, it only affects the x86 platform. I’m only interested in x64 so this command suffices:

msbuild zlibvc.vcxproj /p:Configuration=Release /p:Platform=x64

The compiler generates the zlibwapi DLL in the x64 directory. Unfortunately, NSIS requires a Win32 build so run this as well:

msbuild zlibvc.vcxproj /p:Configuration=Release /p:Platform=Win32

Deploy the binaries to the desired location, e.g.

mkdir C:\dev\software\zlib\win32
mkdir C:\dev\software\zlib\x64
copy x86\ZlibDllRelease\zlibwapi.dll C:\dev\software\zlib\win32\zlib.dll
copy x64\ZlibDllRelease\zlibwapi.dll C:\dev\software\zlib\x64\zlib.dll
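
The deployment step can also be scripted so that each platform's DLL comes from its own output folder. This is only a rough sketch: the function name deploy_zlib and the PLATFORM_DIRS mapping are made up for illustration, and it assumes the x64 output folder mirrors the x86 one (x64\ZlibDllRelease).

```python
import shutil
from pathlib import Path

# Hypothetical mapping: build output folder -> deployment subfolder.
PLATFORM_DIRS = {"x86": "win32", "x64": "x64"}


def deploy_zlib(build_root, dest_root, config="Release"):
    """Copy each platform's zlibwapi.dll to <dest_root>/<platform>/zlib.dll."""
    copied = []
    for out_dir, dest_name in PLATFORM_DIRS.items():
        src = Path(build_root) / out_dir / f"ZlibDll{config}" / "zlibwapi.dll"
        if not src.is_file():
            continue  # this platform has not been built, so skip it
        dest = Path(dest_root) / dest_name / "zlib.dll"
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
        copied.append(dest)
    return copied
```

Calling deploy_zlib(r"D:\dev\repos\zlib\contrib\vstudio\vc14", r"C:\dev\software\zlib") would then fill both the win32 and x64 folders, provided both platforms were built.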

Categories: hsdis, Installers, LLVM, Windows

Tracking Down Missing Headers in LLVM for Windows

hsdis is a plugin for disassembling code dynamically generated by the Java Virtual Machine. On Linux and macOS, it uses GNU binutils. Support for the LLVM disassembly backend was recently added to hsdis in https://github.com/openjdk/jdk/pull/7531. This was motivated by the fact that GNU binutils is not distributed with the JDK (due to licensing reasons mentioned at https://github.com/openjdk/jdk/pull/5920#issuecomment-942398786) and the LLVM disassembly may be preferable in certain circumstances. Unfortunately, the official Windows LLVM distribution does not have the header files necessary to build the hotspot disassembler. This prevents Windows developers from easily using the LLVM disassembler backend because they now have to build LLVM themselves as well – see hsdis LLVM backend for Windows ARM64 and Building LLVM for Windows ARM64, for example. In this post, we investigate why the LLVM Windows build does not have the necessary header files. The llvm-c directory in the Windows build contains these 2 files only:

C:\Program Files\LLVM\include\llvm-c>dir
 Volume in drive C is OSDisk
 Volume Serial Number is c070-2ac0

 Directory of C:\Program Files\LLVM\include\llvm-c

01/08/2022  11:54 AM    <DIR>          .
01/08/2022  11:54 AM    <DIR>          ..
09/24/2021  10:18 AM            29,760 lto.h
09/24/2021  10:18 AM             9,632 Remarks.h
               2 File(s)         39,392 bytes
               2 Dir(s)  62,273,200,128 bytes free

I created a local LLVM build (see Building LLVM with CMake) and confirmed that it has all the header files.

C:\dev\repos\llvm-project\build_llvm\install_local\include\llvm-c>dir /w
 Volume in drive C is OSDisk
 Volume Serial Number is 0087-4c48

 Directory of C:\dev\repos\llvm-project\build_llvm\install_local\include\llvm-c

[.]                   [..]                  Analysis.h
BitReader.h           BitWriter.h           blake3.h
Comdat.h              Core.h                DataTypes.h
DebugInfo.h           Deprecated.h          Disassembler.h
DisassemblerTypes.h   Error.h               ErrorHandling.h
ExecutionEngine.h     ExternC.h             Initialization.h
IRReader.h            Linker.h              LLJIT.h
lto.h                 Object.h              Orc.h
OrcEE.h               Remarks.h             Support.h
Target.h              TargetMachine.h       [Transforms]
Types.h
              28 File(s)        382,361 bytes
               3 Dir(s)  59,158,138,880 bytes free
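
Rather than comparing the two directory listings by eye, the difference can be computed. A minimal sketch, assuming both directories are reachable from the same machine; the helper name missing_headers is made up for illustration.

```python
from pathlib import Path


def missing_headers(reference_dir, installed_dir):
    """Return the .h file names found in reference_dir but not in installed_dir."""
    def header_names(directory):
        return {p.name for p in Path(directory).glob("*.h")}

    return sorted(header_names(reference_dir) - header_names(installed_dir))
```

Passing the local install's include\llvm-c folder as reference_dir and C:\Program Files\LLVM\include\llvm-c as installed_dir should list every header except lto.h and Remarks.h.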

Does this problem still exist in the latest Windows LLVM release? I went to Releases · llvm/llvm-project (github.com) to find the latest LLVM installer for Windows but couldn’t find it. Turns out it’s because the 15.0.1 release is only 14 hours old so some of the assets probably haven’t been uploaded. Notice that 15.0.0 has 47 assets. I can successfully download and install LLVM-15.0.0-win64.exe and see that the header files are still missing.

Interestingly, trying to install LLVM-15.0.0-win32.exe before uninstalling LLVM-15.0.0-win64.exe gives this dialog and clicking Yes uninstalls before the actual installation of the 32-bit build starts!

LLVM is already installed.

I assumed that would happen at this stage:

All the same, these dialogs have strings that can lead us to the sources that create the installer! The installer looks very similar to the one from Building the Elmer Install Folder, so I searched the llvm codebase for “nsis”. That gives only a handful of hits, leading to the key discovery of build_llvm_release.bat! (I later learned that this needs to be executed in a (2019) developer command prompt so that the ninja command can be found.) That script requires 7-Zip though, and it fails on my machine because it can’t find it. The failure seems to be coming from the for-statement (see for | Microsoft Learn for usage). The for command uses the escape character (^) as explained at set | Microsoft Learn.

C:\dev\repos\llvm-project\llvm\utils\release> build_llvm_release.bat 15.0.0
Check 7-zip version and/or administrator permissions.
'7z.exe' is not recognized as an internal or external command,
operable program or batch file.
You need to modify the paths below:
Revision: llvmorg-15.0.0
Package version: 15.0.0
Build dir: C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0

Press any key to continue . . .

Why does the script not exit gracefully now? Git blame shows that the checking code was added by Update the Windows packaging script. · llvm/llvm-project@df7c577 (github.com). Adding the 7-Zip installation path to my user environment variables addresses this issue but the script should check for this!
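
The kind of up-front check I would expect looks roughly like the sketch below. It is written in Python rather than batch purely for illustration, and require_tool is a made-up helper, not something in the LLVM script; it relies only on the standard shutil.which lookup.

```python
import shutil


def require_tool(name, search_path=None):
    """Return the full path of `name`, or fail early with a clear message."""
    found = shutil.which(name, path=search_path)
    if found is None:
        raise FileNotFoundError(
            f"{name} was not found on PATH; install it or add its folder to PATH"
        )
    return found
```

A packaging script could call require_tool("7z") (and the same for ninja) before doing any work, so a missing prerequisite stops the run immediately instead of surfacing later as a confusing error.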

Installing 7-Zip allows me to check the syntax of the command used by the script to ensure that it will work.

C:\Program Files\7-Zip> 7z.exe | findstr /r "2[1-9].[0-9][0-9]"
7-Zip 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15

Surprisingly, the script still fails, this time because

'mv' is not recognized as an internal or external command,
operable program or batch file.

This must be line 81 introduced by Update the Windows packaging script. · llvm/llvm-project@83e9225 (github.com). Changing it to “move” now displays an error but the script continues executing until this error:

-- Looking for CrashReporterClient.h
-- Looking for CrashReporterClient.h - not found
-- Looking for pfm_initialize in pfm
-- Looking for pfm_initialize in pfm - not found
-- Could NOT find ZLIB (missing: ZLIB_LIBRARY ZLIB_INCLUDE_DIR)
CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find LibXml2 (missing: LIBXML2_INCLUDE_DIR)
Call Stack (most recent call first):
  C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/FindLibXml2.cmake:108 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
  cmake/config-ix.cmake:156 (find_package)
  CMakeLists.txt:774 (include)


-- Configuring incomplete, errors occurred!
See also "C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/CMakeFiles/CMakeOutput.log".
See also "C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/CMakeFiles/CMakeError.log".

The script downloads and extracts LibXml in the build directory. It also changes the libxmldir path separators from \ to /. To see the exact command failing, comment out the echo off line.

cmake
 -GNinja
 -DCMAKE_BUILD_TYPE=Release
 -DLLVM_ENABLE_ASSERTIONS=OFF
 -DLLVM_INSTALL_TOOLCHAIN_ONLY=ON
 -DLLVM_BUILD_LLVM_C_DYLIB=ON
 -DCMAKE_INSTALL_UCRT_LIBRARIES=ON
 -DPython3_FIND_REGISTRY=NEVER
 -DPACKAGE_VERSION=15.0.0
 -DLLDB_RELOCATABLE_PYTHON=1
 -DLLDB_EMBED_PYTHON_HOME=OFF
 -DCMAKE_CL_SHOWINCLUDES_PREFIX="Note: including file: "
 -DLLVM_ENABLE_LIBXML2=FORCE_ON
 -DLLDB_ENABLE_LIBXML2=OFF
 -DCMAKE_C_FLAGS="-DLIBXML_STATIC"
 -DCMAKE_CXX_FLAGS="-DLIBXML_STATIC"
 -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;lld;compiler-rt;lldb;openmp"
 -DLLDB_TEST_COMPILER=C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0/build32_stage0/bin/clang.exe
 -DPYTHON_HOME=C:\Users\saint\AppData\Local\Programs\Python\Python310-32
 -DPython3_ROOT_DIR=C:\Users\saint\AppData\Local\Programs\Python\Python310-32
 -DLIBXML2_INCLUDE_DIRS=C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/libxmlbuild/install/include/libxml2
 -DLIBXML2_LIBRARIES=C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/libxmlbuild/install/lib/libxml2s.lib ..\llvm-project\llvm

Looking through FindPackageHandleStandardArgs.cmake leads me to the simple realization that the wrong define is being used on the command line. Could this be because I’m using a newer CMake? I’ve been using the VS 2022 Preview Developer Command Prompt thus far. My VS 2019 (16.11.19) installation uses CMake 3.20. Both FindLibXml2.cmake in 3.20 and FindLibXml2.cmake in 3.24 require the LIBXML2_INCLUDE_DIR variable. However, they also claim (at the top) to set these variables.

A quick review of the history of build_llvm_release.bat shows that Build Windows releases with libxml enabled, to unbreak llvm-mt · llvm/llvm-project@145835c (github.com) introduced -DLIBXML2_INCLUDE_DIR but the next commit Pass -DLIBXML2_INCLUDE_DIRS in the Windows release package script · llvm/llvm-project@7735019 (github.com) changed it to plural. Adding the singular form to the script finally unblocks the build. Now to see how packaging happens.
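
The singular/plural mismatch is easy to miss in a command line that long. As an illustration only, here is a small sketch that reports which required cache variables a list of CMake arguments never sets. The helper name missing_cache_vars is hypothetical, and the list of required names has to come from whatever the find module reports as missing.

```python
def missing_cache_vars(cmake_args, required):
    """Return the names in `required` that no -DNAME=VALUE argument defines."""
    defined = set()
    for arg in cmake_args:
        if arg.startswith("-D") and "=" in arg:
            # Handle both -DNAME=VALUE and -DNAME:TYPE=VALUE.
            name = arg[2:].split("=", 1)[0].split(":", 1)[0]
            defined.add(name)
    return [name for name in required if name not in defined]
```

With the arguments shown earlier, missing_cache_vars(args, ["LIBXML2_INCLUDE_DIR"]) reports the singular name as missing even though -DLIBXML2_INCLUDE_DIRS is present.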

Packaging a Regular LLVM Build

In my build folder (build_llvm), there is a CPackConfig.cmake file that sets variables like CPACK_PACKAGE_FILE_NAME and CPACK_NSIS_DISPLAY_NAME. Since NSIS (see NSIS Wiki (sourceforge.io)) is the packaging tool in use, I wonder about running the package target myself in a manner similar to that used to create my local build. I switch back to a previous build directory (created without build_llvm_release.bat) and run:

cmake --build . --config Release --target package

The resulting failure below indicates that NSIS is required.

MSBuild version 17.4.0-preview-22466-03+48ab5664b for .NET Framework
  PipSqueak.vcxproj -> C:\dev\repos\llvm-project\build_llvm\unittests\Support\DynamicLibrary\Release\PipSqueak.dll
  SecondLib.vcxproj -> C:\dev\repos\llvm-project\build_llvm\unittests\Support\DynamicLibrary\Release\SecondLib.dll
  obj.llvm-tblgen.vcxproj -> C:\dev\repos\llvm-project\build_llvm\utils\TableGen\obj.llvm-tblgen.dir\Release\obj.llvm-tblgen.lib
  LLVMDemangle.vcxproj -> C:\dev\repos\llvm-project\build_llvm\Release\lib\LLVMDemangle.lib
...
  verify-uselistorder.vcxproj -> C:\dev\repos\llvm-project\build_llvm\Release\bin\verify-uselistorder.exe
  yaml-bench.vcxproj -> C:\dev\repos\llvm-project\build_llvm\Release\bin\yaml-bench.exe
  yaml2obj.vcxproj -> C:\dev\repos\llvm-project\build_llvm\Release\bin\yaml2obj.exe
EXEC : CPack error : Cannot find NSIS compiler makensis: likely it is not installed, or not in your PATH [C:\dev\repos\llvm-project\build_llvm\package.vcxproj]
EXEC : CPack error : Could not read NSIS registry value. This is usually caused by NSIS not being installed. Please install NSIS from http://nsis.sourceforge.net [C:\dev\repos\llvm-proje
ct\build_llvm\package.vcxproj]
EXEC : CPack error : Cannot initialize the generator NSIS [C:\dev\repos\llvm-project\build_llvm\package.vcxproj]

After installing NSIS, the previous command successfully creates an LLVM for Windows installer.

...
  verify-uselistorder.vcxproj -> C:\dev\repos\llvm-project\build_llvm\Release\bin\verify-uselistorder.exe
  yaml-bench.vcxproj -> C:\dev\repos\llvm-project\build_llvm\Release\bin\yaml-bench.exe
  yaml2obj.vcxproj -> C:\dev\repos\llvm-project\build_llvm\Release\bin\yaml2obj.exe
  CPack: Create package using NSIS
  CPack: Install projects
  CPack: - Install project: LLVM [Release]
  CMake Warning (dev) at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/GNUInstallDirs.cmake:243 (messa
  ge):
    Unable to determine default CMAKE_INSTALL_LIBDIR directory because no
    target architecture is known.  Please enable at least one language before
    including GNUInstallDirs.
  Call Stack (most recent call first):
    C:/dev/repos/llvm-project/llvm/cmake/modules/LLVMInstallSymlink.cmake:5 (include)
    C:/dev/repos/llvm-project/build_llvm/tools/llvm-ar/cmake_install.cmake:48 (include)
    C:/dev/repos/llvm-project/build_llvm/tools/cmake_install.cmake:39 (include)
    C:/dev/repos/llvm-project/build_llvm/cmake_install.cmake:71 (include)
  This warning is for project developers.  Use -Wno-dev to suppress it.

  CPack: Create package
  CPack: - package: C:/dev/repos/llvm-project/build_llvm/LLVM-16.0.0git-win64.exe generated.

This installer generates the LLVM includes on disk as expected. The issue must therefore be confined to the installer generated by the script.

Reviewing Ninja NSIS Packaging

At this point, I ran build_llvm_release.bat to create an installer. Once packaging is complete, the install_manifest.txt file can be used to determine which files are in the installer. The batch file also runs lots of tests and this was annoying when trying to generate installers. Once the tests failed on the build I was creating and I had CTRL+C’d a couple of times, I ran ninja package myself (taken from the batch file):
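
Since install_manifest.txt is a plain list of installed paths, one per line, a few lines of Python are enough to pull out the entries of interest. This is a sketch under that assumption; manifest_entries is a made-up helper name.

```python
def manifest_entries(manifest_path, fragment):
    """Return manifest lines whose path contains `fragment`, ignoring case and slash style."""
    wanted = fragment.replace("\\", "/").lower()
    with open(manifest_path, encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]
    return [line for line in lines if wanted in line.replace("\\", "/").lower()]
```

For example, manifest_entries("install_manifest.txt", "include/llvm-c") would show which llvm-c headers the package step actually installed.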

C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0\build32_stage0>ninja package
[0/1] Run CPack packaging tool...CPack: Create package using NSIS
CPack: Install projects
CPack: - Install project: LLVM []
CMake Warning (dev) at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/GNUInstallDirs.cmake:243 (message):
  Unable to determine default CMAKE_INSTALL_LIBDIR directory because no
  target architecture is known.  Please enable at least one language before
  including GNUInstallDirs.
Call Stack (most recent call first):
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/llvm-project/llvm/cmake/modules/LLVMInstallSymlink.cmake:5 (include)
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/tools/llvm-ar/cmake_install.cmake:40 (include)
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/tools/cmake_install.cmake:39 (include)
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/cmake_install.cmake:114 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

CPack: Create package
CPack: - package: C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/LLVM-15.0.0-win32.exe generated.

I was curious about these warnings but found it really annoying that I can’t open these paths by pasting them into the VS Code File/Open dialog. Looks like Windows: Allow to open file paths that contain slashes · Issue #15270 · microsoft/vscode (github.com) but that points to [Windows] Bug with open file dialog with forward slash (`file_dialog::ShowOpenDialog`) · Issue #7954 · electron/electron (github.com). Unfortunately, this sounds like a Windows Open dialog limitation since Notepad is not using the same dialog as Notepad++.

I then try to find a package target in build.ninja. Search for CMakeFiles\package.util.+ include since we’re interested in include files. There are some interesting differences in the include directories of the build created manually from the local install and the one created by the script, e.g.

Directory of C:\dev\repos\llvm-project\build_llvm\include\llvm\Support
[.]
[..]
[CMakeFiles]
cmake_install.cmake
Extension.def
INSTALL.vcxproj
INSTALL.vcxproj.filters
llvm_vcsrevision_h.vcxproj
llvm_vcsrevision_h.vcxproj.filters
PACKAGE.vcxproj
PACKAGE.vcxproj.filters
VCSRevision.h
[x64]
               9 File(s)         47,877 bytes
               4 Dir(s)  34,980,511,744 bytes free

Directory of C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0\build32_stage0\include\llvm\Support

[.]                   [..]                  [CMakeFiles]
cmake_install.cmake   Extension.def         VCSRevision.h
               3 File(s)          1,293 bytes
               3 Dir(s)  34,981,122,048 bytes free

Try searching in build.ninja for the 2 header files the installer creates in the (broken) shipping LLVM for Windows build. Nothing there but searching the file system for remarks.h gives interesting results, e.g. the existence of an NSIS project file: project.nsi. Looks like there are some tutorials showing how to create .nsi files at Invoking NSIS run-time commands on compile-time – NSIS (sourceforge.io). The way NSIS is used with CPack when building is documented at Packaging With CPack — Mastering CMake

Directory of C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0\build32_stage0\_CPack_Packages\win32\NSIS

09/21/2022  06:41 PM    <DIR>          .
09/21/2022  06:41 PM    <DIR>          ..
09/21/2022  06:41 PM    <DIR>          LLVM-15.0.0-win32
09/21/2022  06:54 PM       256,557,945 LLVM-15.0.0-win32.exe
09/21/2022  06:41 PM               631 NSIS.InstallOptions.ini
09/21/2022  06:41 PM            55,204 project.nsi
               3 File(s)    256,613,780 bytes
               3 Dir(s)  35,416,317,952 bytes free

Directory of C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0\build32_stage0\_CPack_Packages\win32\NSIS\LLVM-15.0.0-win32\include\llvm-c

09/21/2022  06:41 PM    <DIR>          .
09/21/2022  06:41 PM    <DIR>          ..
09/05/2022  03:48 AM            30,109 lto.h
09/05/2022  03:48 AM             9,632 Remarks.h
               2 File(s)         39,741 bytes
               2 Dir(s)  35,416,289,280 bytes free

The natural hypothesis is that NSIS is simply packing the whole LLVM-15.0.0-win32 directory into the installer. I had been comparing these two files earlier…

C:\dev\repos\llvm-project\build_llvm\cmake_install.cmake
C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0\build32_stage0\cmake_install.cmake

… but I completely missed the fact that the 2nd didn’t have these lines from the first.

if(CMAKE_INSTALL_COMPONENT STREQUAL "llvm-headers" OR NOT CMAKE_INSTALL_COMPONENT)
  file(INSTALL DESTINATION "${CMAKE_INSTALL_PREFIX}/include" TYPE DIRECTORY FILES
    "C:/dev/repos/llvm-project/llvm/include/llvm"
    "C:/dev/repos/llvm-project/llvm/include/llvm-c"
    FILES_MATCHING REGEX "/[^/]*\\.def$" REGEX "/[^/]*\\.h$" REGEX "/[^/]*\\.td$" REGEX "/[^/]*\\.inc$" REGEX "/license\\.txt$")
endif()

if(CMAKE_INSTALL_COMPONENT STREQUAL "llvm-headers" OR NOT CMAKE_INSTALL_COMPONENT)
  file(INSTALL DESTINATION "${CMAKE_INSTALL_PREFIX}/include" TYPE DIRECTORY FILES
    "C:/dev/repos/llvm-project/build_llvm/include/llvm"
    "C:/dev/repos/llvm-project/build_llvm/include/llvm-c"
    FILES_MATCHING REGEX "/[^/]*\\.def$" REGEX "/[^/]*\\.h$" REGEX "/[^/]*\\.gen$" REGEX "/[^/]*\\.inc$" REGEX "/cmakefiles$" EXCLUDE REGEX "/config\\.h$" EXCLUDE)
endif()
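
To get a feel for what the FILES_MATCHING patterns in the first block select, here is a rough Python imitation. It is not how CMake implements file(INSTALL): Python's re module and CMake's regex dialect differ in general, although these particular patterns use only constructs common to both. Treating the comparison as case-insensitive is my assumption, and would_install is a made-up name.

```python
import re

# Patterns copied from the first file(INSTALL ...) block, with the CMake
# string escaping (\\.) reduced to a single regex escape (\.).
HEADER_PATTERNS = [
    r"/[^/]*\.def$",
    r"/[^/]*\.h$",
    r"/[^/]*\.td$",
    r"/[^/]*\.inc$",
    r"/license\.txt$",
]


def would_install(path):
    """Return True if `path` matches one of the header install patterns."""
    # Assumption: compare case-insensitively by lower-casing the path.
    normalized = path.replace("\\", "/").lower()
    return any(re.search(pattern, normalized) for pattern in HEADER_PATTERNS)
```

Under these assumptions, a path ending in llvm-c/Disassembler.h is selected while a CMakeLists.txt in the same tree is not.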

Searching the codebase for “llvm-headers” leads to the llvm-headers component definition. That whole code block is gated by the LLVM_INSTALL_TOOLCHAIN_ONLY variable, and build_llvm_release.bat explicitly turns that variable on (-DLLVM_INSTALL_TOOLCHAIN_ONLY=ON), which skips the header installation! I rerun the batch file with the flag turned off and see tests failing after the build succeeds. CTRL+C to kill the processes so that I can get to the root issue: does turning off that flag fix the includes? makensis fails, probably because I killed the build and some things might still have been in use?

C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0\build32_stage0>ninja package
[0/1] Run CPack packaging tool...CPack: Create package using NSIS
CPack: Install projects
CPack: - Install project: LLVM []
CMake Warning (dev) at C:/Program Files/Microsoft Visual Studio/2022/Preview/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.24/Modules/GNUInstallDirs.cmake:243 (message):
  Unable to determine default CMAKE_INSTALL_LIBDIR directory because no
  target architecture is known.  Please enable at least one language before
  including GNUInstallDirs.
Call Stack (most recent call first):
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/llvm-project/llvm/cmake/modules/LLVMInstallSymlink.cmake:5 (include)
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/tools/llvm-ar/cmake_install.cmake:40 (include)
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/tools/cmake_install.cmake:39 (include)
  C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/cmake_install.cmake:128 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

CPack: Create package
CPack Error: Problem running NSIS command: "C:/Program Files (x86)/NSIS/makensis.exe" "C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/_CPack_Packages/win32/NSIS/project.nsi"
Please check C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/_CPack_Packages/win32/NSIS/NSISOutput.log for errors
CPack Error: Problem compressing the directory
CPack Error: Error when generating package: LLVM

FAILED: CMakeFiles/package.util
cmd.exe /C "cd /D C:\dev\repos\llvm-project\llvm\utils\release\llvm_package_15.0.0\build32_stage0 && "C:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cpack.exe" --config ./CPackConfig.cmake"
ninja: build stopped: subcommand failed.

NSISOutput.log shows that makensis failed due to an “Internal compiler error #12345: error mmapping datablock to 17235001.” However, the include files are now present in the source directory being packaged by NSIS.

Turning Off Tests

There are many tests that the build script runs and some of them are failing. Testing is not on my critical path since all I need is to generate installers so I modify the scripts to enable me to package the build without running all the tests. I then start my build without tests and go to bed only to wake up the next morning to find that I need to rerun it because there are no running programs when I log in. Event Viewer doesn’t show any reboot-related events and sure enough, Task Manager shows over 9 days of uptime still. Turns out the Desktop Window Manager crashed (C:\WINDOWS\system32\dwm.exe)! Curse you dwmcore.dll. Well, time to install those updates I’ve been putting off and reboot before jumping back in. Now on the new Windows 10.0.22621.521. The build still fails:

-- LLVM host triple: i686-pc-windows-msvc
-- LLVM default target triple: i686-pc-windows-msvc
-- Using Release VC++ CRT: MD
-- Looking for os_signpost_interval_begin
-- Looking for os_signpost_interval_begin - not found
CMake Error at C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find Python3 (missing: Python3_EXECUTABLE Interpreter) (Required
  is at least version "3.6")

      Reason given by package:
          Interpreter: Cannot use the interpreter "C:/Python310/python.exe"

Call Stack (most recent call first):
  C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPython/Support.cmake:3165 (find_package_handle_standard_args)
  C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPython3.cmake:485 (include)
  CMakeLists.txt:817 (find_package)


-- Configuring incomplete, errors occurred!
See also "C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/CMakeFiles/CMakeOutput.log".
See also "C:/dev/repos/llvm-project/llvm/utils/release/llvm_package_15.0.0/build32_stage0/CMakeFiles/CMakeError.log".

When I interrupted the tests before modifying the batch file to skip them, I noticed that they were being run by %LOCALAPPDATA%\Microsoft\WindowsApps\python3.9.exe. This is still present on my machine. Ah, turns out I’m now using the 2019 developer command prompt (and therefore an older CMake). The only difference between CMake 3.20 FindPython3.cmake and CMake 3.24 FindPython3.cmake is a comment about static libraries, so this failure is a mystery.

Diagnosing Build Failures

Since this issue also bit me when I moved to my Surface Book, it is worth understanding why it happens.

Missing CMake in Visual Studio 17.3.4 Developer Command Prompt

Here is the VS 2022 Preview vs VS 2022 Enterprise path to CMake:

C:\Program Files (x86)\Microsoft Visual Studio\Installer> where cmake
C:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe

C:\dev\repos\llvm-project\llvm\utils\release> where cmake
INFO: Could not find files for the given pattern(s).

Wait… why is there no CMake in VS 2022 Enterprise on my desktop? The Visual Studio Installer shows CMake to be installed. It also shows warnings, and it’s only now that I’m learning that there are not just troubleshooting tips but also ways to create a local layout from the command line: Create an offline installation – Visual Studio (Windows) | Microsoft Learn.

The View Logs link opens the Documents folder under This PC – not particularly useful. Interestingly though, clicking on the Modify button shows a Total space required 1.63 GB. How is there space required before I’ve selected anything? Something similar happens with 16.11.19 though. Without making any individual component selections, I start the install process. CMake gets (re-?)installed as shown below. This fixes the setup warnings as well and cmake is now usable in the VS2022 command prompt.

Missing Python3 in VS 17.3.4 Developer Command Prompt

This is the error I got when trying to build LLVM on my Surface Book 2 in the VS 2022 developer command prompt:

CMake Error at C:/Program Files/CMake/share/cmake-3.17/Modules/FindPackageHandleStandardArgs.cmake:164 (message):
  Could NOT find Python3 (missing: Python3_EXECUTABLE Interpreter) (Required
  is at least version "3.6")

      Reason given by package:
          Interpreter: Cannot use the interpreter "C:/Python310/python.exe"

Call Stack (most recent call first):
  C:/Program Files/CMake/share/cmake-3.17/Modules/FindPackageHandleStandardArgs.cmake:445 (_FPHSA_FAILURE_MESSAGE)
  C:/Program Files/CMake/share/cmake-3.17/Modules/FindPython/Support.cmake:2437 (find_package_handle_standard_args)
  C:/Program Files/CMake/share/cmake-3.17/Modules/FindPython3.cmake:309 (include)
  CMakeLists.txt:817 (find_package)

Here is the (fixed up) output from where python:

C:\Python310\python.exe
%LOCALAPPDATA%\Microsoft\WindowsApps\python.exe

I modify build_llvm_release.bat to pass the --trace-expand --trace-redirect=cmake_trace.txt CMake option as recommended by cmake Python: Cannot use the interpreter – Stack Overflow. That’s when I notice that the list of python versions CMake is looking for does not contain 3.10: Modules/FindPython/Support.cmake · v3.17.5 · CMake / CMake · GitLab (kitware.com). My suspicion is that this is the cause of the above error. It looks like I installed CMake a while back on this laptop.

Uninstalling CMake enables the command line to pick up the CMake distributed with Visual Studio. Python3 is now found successfully in the path below (I’ve shortened it using %LOCALAPPDATA%).

-- Found Python3: %LOCALAPPDATA%/Microsoft/WindowsApps/python3.8.exe (found suitable version "3.8.10", minimum required is "3.6") found components: Interpreter

Missing Python3 in VS 16.11.19 Developer Command Prompt

Interestingly, I still get the same error in VS 2019 despite uninstalling CMake 3.17. My earlier hypothesis is therefore invalid.

CMake Error at C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find Python3 (missing: Python3_EXECUTABLE Interpreter) (Required
  is at least version "3.6")

      Reason given by package:
          Interpreter: Cannot use the interpreter "C:/Python310/python.exe"

Call Stack (most recent call first):
  C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPython/Support.cmake:3165 (find_package_handle_standard_args)
  C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.20/Modules/FindPython3.cmake:485 (include)
  CMakeLists.txt:817 (find_package)

Using --trace-expand --trace-redirect=cmake_trace.txt again (and searching for “execute_process”) reveals how the Python version is determined using the execute_process CMake command in Modules/FindPython/Support.cmake · v3.20.0:

C:/Python310/python.exe -c "import sys; sys.stdout.write('.'.join([str(x) for x in sys.version_info[:3]]))"
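
To reproduce this check outside of CMake, the same one-liner can be run against each candidate interpreter. The sketch below uses a hypothetical helper, probe_interpreter, which is not part of CMake or the LLVM scripts. It captures stdout and stderr instead of discarding them, and it applies a timeout so that a hanging interpreter does not block the command prompt.

```python
import subprocess
import sys

# The same snippet CMake passes to the interpreter (copied from the trace above).
VERSION_SNIPPET = (
    "import sys; "
    "sys.stdout.write('.'.join([str(x) for x in sys.version_info[:3]]))"
)


def probe_interpreter(path, timeout=30):
    """Run the version snippet with the interpreter at `path`.

    Returns (returncode, stdout, stderr). returncode is None when the
    process could not be started or did not finish within `timeout` seconds.
    """
    try:
        proc = subprocess.run(
            [path, "-c", VERSION_SNIPPET],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return None, "", str(exc)
    return proc.returncode, proc.stdout, proc.stderr
```

For example, probe_interpreter("C:/Python310/python.exe") should return a nonzero return code plus the interpreter's stderr text whenever CMake would report "Cannot use the interpreter" for that path.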

I comment out the ERROR_QUIET line to reveal the stdout and stderr output from python since the return code from the python process is causing the CMake error to be raised. Running with --trace-expand --trace-redirect=cmake_trace.txt now reveals the root cause (paths below cleaned up using %LOCALAPPDATA%):

Python path configuration:
  PYTHONHOME = '%LOCALAPPDATA%\Programs\Python\Python310-32'
  PYTHONPATH = (not set)
  program name = 'C:/Python310/python.exe'
  isolated = 0
  environment = 1
  user site = 1
  import site = 1
  sys._base_executable = 'C:\\Python310\\python.exe'
  sys.base_prefix = '%LOCALAPPDATA%\\Programs\\Python\\Python310-32'
  sys.base_exec_prefix = '%LOCALAPPDATA%\\Programs\\Python\\Python310-32'
  sys.platlibdir = 'lib'
  sys.executable = 'C:\\Python310\\python.exe'
  sys.prefix = '%LOCALAPPDATA%\\Programs\\Python\\Python310-32'
  sys.exec_prefix = '%LOCALAPPDATA%\\Programs\\Python\\Python310-32'
  sys.path = [
    'C:\\Python310\\python310.zip',
    '%LOCALAPPDATA%\\Programs\\Python\\Python310-32\\DLLs',
    '%LOCALAPPDATA%\\Programs\\Python\\Python310-32\\lib',
    'C:\\Python310',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00003174 (most recent call first):
  <no Python frame>

django – init_fs_encoding: failed to get the Python codec of the filesystem encoding – Stack Overflow is a hint that the PYTHONHOME is wrong. Sure enough, I didn’t change it in build_llvm_release.bat so the paths in the configuration above do not exist! This now raises another question: how on earth does this work in VS 2022? I notice on my desktop that python.exe does not even appear in the CMake tracing output! The difference in behavior stems from the fact that the find_program command in Modules/FindPython/Support.cmake · v3.20.0 finds python 3.10 first in the VS 2019 environment. This path is then assigned to _Python3_EXECUTABLE, preventing the 3.8 path from being used. One important difference between CMake 3.20 and 3.23 that I notice is FindPython: fix typo error (fff8d5b2) · Commits · CMake / CMake · GitLab (kitware.com). Since the fix for the build_llvm_release.bat script is straightforward and it is clear that there are some CMake implementation differences at work, we no longer need to dig into why this behavior could be happening.
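
A stale PYTHONHOME like this one can be caught before CMake ever runs. The following is a minimal sketch, assuming the Windows layout in which the standard library lives under PYTHONHOME/Lib. The check_pythonhome helper is hypothetical and only looks for the encodings package that the fatal error above complains about.

```python
import os
from pathlib import Path


def check_pythonhome(environ=None):
    """Return a list of problems found with PYTHONHOME (empty if none).

    Assumes the Windows layout: the standard library is expected in
    <PYTHONHOME>/Lib, with the encodings package in <PYTHONHOME>/Lib/encodings.
    """
    environ = os.environ if environ is None else environ
    home = environ.get("PYTHONHOME")
    if not home:
        # Unset is fine: Python then locates its own standard library.
        return []
    problems = []
    home_path = Path(home)
    if not home_path.is_dir():
        problems.append(f"PYTHONHOME points to a missing directory: {home}")
    elif not (home_path / "Lib" / "encodings").is_dir():
        problems.append(f"No Lib/encodings package under PYTHONHOME: {home}")
    return problems
```

Calling check_pythonhome() in the same environment that build_llvm_release.bat sets up would report the nonexistent Python310-32 directory directly, without needing to edit Support.cmake.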

Python Hangs

One of my build attempts successfully completes stage0 but hangs when CMake tries to detect the python version. Manually running the same command (copied from Process Explorer) also hangs. Even %LOCALAPPDATA%/Microsoft/WindowsApps/python3.9.exe --version hangs. Inspecting the full dump created by Task Manager reveals that python3.9.exe made a call to get (what looks like) the Package.InstalledLocation Property (Windows.ApplicationModel) – Windows UWP applications | Microsoft Learn

...
-- Looking for os_signpost_interval_begin
-- Looking for os_signpost_interval_begin - not found

Windows becomes pretty unusable as I investigate this behavior (the mouse doesn’t work: I can change the program in focus but can’t click on anything). A reboot fixes these issues (e.g. --version now works). Can’t believe we have to deal with this in 2022!

The support link is https://www.python.org/doc/ and the product link is https://www.python.org/. The privacy policy is https://www.python.org/privacy/ and the license terms link is https://docs.python.org/3.9/license.html.

I’m tempted to just remove this store app but I’m also curious about how to get symbols and see exactly where it hangs.

Comparison with macOS/Linux includes

On Windows, it is easy to get the Linux and macOS LLVM builds using curl (added to Windows in build 17063 as per Tar and Curl Come to Windows! | Microsoft Learn).

curl -L https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.1/clang+llvm-15.0.1-aarch64-linux-gnu.tar.xz -o clang+llvm-15.0.1-aarch64-linux-gnu.tar.xz

curl -L https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.1/clang+llvm-15.0.1-arm64-apple-darwin21.0.tar.xz -o clang+llvm-15.0.1-arm64-apple-darwin21.0.tar.xz

curl -L https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.1/clang+llvm-15.0.1-x86_64-apple-darwin.tar.xz -o clang+llvm-15.0.1-x86_64-apple-darwin.tar.xz

To use 7zip to extract these XZ Files:

7z x clang+llvm-15.0.1-aarch64-linux-gnu.tar.xz
7z x clang+llvm-15.0.1-arm64-apple-darwin21.0.tar.xz
7z x clang+llvm-15.0.1-x86_64-apple-darwin.tar.xz

tar xf clang+llvm-15.0.1-aarch64-linux-gnu.tar
tar xf clang+llvm-15.0.1-arm64-apple-darwin21.0.tar
tar xf clang+llvm-15.0.1-x86_64-apple-darwin.tar

To use XZ Utils (tukaani.org) to extract these XZ Files, run these commands but note that they remove the .xz files!

xz -d clang+llvm-15.0.1-aarch64-linux-gnu.tar.xz
xz -d clang+llvm-15.0.1-arm64-apple-darwin21.0.tar.xz
xz -d clang+llvm-15.0.1-x86_64-apple-darwin.tar.xz

tar xf clang+llvm-15.0.1-aarch64-linux-gnu.tar
tar xf clang+llvm-15.0.1-arm64-apple-darwin21.0.tar
tar xf clang+llvm-15.0.1-x86_64-apple-darwin.tar
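
If Python is available, the two-step extraction above can also be done in one step with the standard tarfile module, which reads .tar.xz archives directly and leaves the .xz file in place. This is a minimal sketch: extract_tar_xz is a hypothetical helper name, and it should only be pointed at archives from a trusted source since it extracts every member as-is.

```python
import tarfile
from pathlib import Path


def extract_tar_xz(archive, dest="."):
    """Extract a .tar.xz archive into `dest` without deleting the archive."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:xz" tells tarfile to decompress with LZMA while reading.
    with tarfile.open(archive, mode="r:xz") as tar:
        tar.extractall(dest)
    return dest
```

For example, extract_tar_xz("clang+llvm-15.0.1-aarch64-linux-gnu.tar.xz") would unpack the Linux build into the current directory.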

Here are the ARM64 llvm include directory listings for macOS and Linux LLVM builds.

.../Downloads/clang+llvm-15.0.1-arm64-apple-darwin21.0/include
c++
clang
clang-c
clang-tidy
flang
lld
lldb
llvm
llvm-c
mlir
mlir-c
polly

.../Downloads/clang+llvm-15.0.1-aarch64-linux-gnu/include
aarch64-unknown-linux-gnu
c++
clang
clang-c
clang-tidy
flang
lld
lldb
llvm
llvm-c
mlir
mlir-c
ompt-multiplex.h
polly

Here are the directories in the include folder before the installer is created. There are also 28 include files in the include/llvm-c/ directory as desired.

Directory of llvm\utils\release\llvm_package_15.0.0\build32_stage0\_CPack_Packages\win64\NSIS\LLVM-15.0.0-win64\include
 clang
 clang-c
 clang-tidy
 lld
 lldb
 llvm
 llvm-c
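
The differences between the three listings are easier to see as set differences. The sketch below hard-codes the directory names exactly as they appear in the listings above; the variable and function names are only for illustration.

```python
# Top-level entries of each include folder, copied from the listings above.
MACOS = {
    "c++", "clang", "clang-c", "clang-tidy", "flang", "lld",
    "lldb", "llvm", "llvm-c", "mlir", "mlir-c", "polly",
}
LINUX = {
    "aarch64-unknown-linux-gnu", "c++", "clang", "clang-c", "clang-tidy",
    "flang", "lld", "lldb", "llvm", "llvm-c", "mlir", "mlir-c",
    "ompt-multiplex.h", "polly",
}
WINDOWS = {"clang", "clang-c", "clang-tidy", "lld", "lldb", "llvm", "llvm-c"}


def only_in(first, second):
    """Return the sorted entries present in `first` but absent from `second`."""
    return sorted(first - second)


print("macOS but not Windows:", only_in(MACOS, WINDOWS))
print("Linux but not macOS:  ", only_in(LINUX, MACOS))
print("Windows but not macOS:", only_in(WINDOWS, MACOS))
```

The Windows package contains nothing that the macOS package lacks, so the interesting entries are the ones that are missing from Windows and the two extras in the Linux build.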

Outstanding Questions

  1. Why does the NSIS project fail to build? Why are there test failures and build errors?
  2. Why does the Linux build have ompt-multiplex.h and the aarch64-unknown-linux-gnu directory?
  3. How is the Windows ARM64 installer generated?
  4. Why doesn’t the Windows build have c++, flang, mlir, mlir-c, and polly?
  5. How do we get symbols to the Python app in the Microsoft Store?

Categories: Assembly, Visual C++

Building & Disassembling ARM64 Code using Visual C++

The folder C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build contains various scripts to set up a command window, as documented at Use the Microsoft C++ toolset from the command line | Microsoft Docs. If vcvarsx86_arm64.bat and vcvarsamd64_arm64.bat are missing from that folder on your Windows x64 machine, install the MSVC v143 – VS 2022 C++ ARM64 build tools (Latest) component in the Visual Studio 2022 installer.

Selecting the ARM64 Build Tools in the VS Installer

Once it is installed, open a new cmd.exe window and run this command to set up the build environment:

"C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsamd64_arm64.bat"

To verify that the ARM64 compiler will be used when cl or dumpbin is executed:

D:\> where cl
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\arm64\cl.exe
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\x64\cl.exe

D:\> where dumpbin
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\arm64\dumpbin.exe
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\Hostx64\x64\dumpbin.exe

To see the command Visual Studio uses to build the project, create a C++ console application and use the Configuration Manager to change the Active solution platform to ARM64. Next, go to Tools > Options then expand the Projects and Solutions node. Select Build And Run then change the MSBuild project build output verbosity to Detailed. Building the project should now show the full command line used to invoke the compiler. For example, here are the command lines used in the Debug and Release configurations, respectively.

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\HostX86\arm64\CL.exe /c /Zi /JMC /nologo /W3 /WX- /diagnostics:column /sdl /Od /Oy- /D _DEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 /D _UNICODE /D UNICODE /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- /Fo"ARM64\Debug\\" /Fd"ARM64\Debug\vc143.pdb" /external:W3 /Gd /TP /analyze- /FC /errorReport:prompt ConsoleApplication1.cpp

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.32.31326\bin\HostX86\arm64\CL.exe /c /Zi /nologo /W3 /WX- /diagnostics:column /sdl /O2 /Oi /Oy- /GL /D NDEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 /D _UNICODE /D UNICODE /Gm- /EHsc /MD /GS /Gy /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- /Fo"ARM64\Release\\" /Fd"ARM64\Release\vc143.pdb" /external:W3 /Gd /TP /analyze- /FC /errorReport:prompt ConsoleApplication1.cpp

Notice the /O2 flag (maximize speed) in the release build instead of the /Od flag (no optimizations) in the debug build. The debug build also uses the Just My Code (/JMC) and runtime error checks (/RTC1) flags, and links against the debug multithreaded DLL version of the run-time library (/MDd) rather than the release version (/MD). For our testing purposes, we can ignore most of these flags.
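
Spotting the differences between two long command lines by eye is error-prone, so here is a small sketch that compares them as sets. The flag strings are copied from the two command lines above, with the /Fo and /Fd output paths and the source file name left out for brevity. split_flags is a hypothetical helper, not an MSBuild feature.

```python
DEBUG = (
    "/c /Zi /JMC /nologo /W3 /WX- /diagnostics:column /sdl /Od /Oy- "
    "/D _DEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 "
    "/D _UNICODE /D UNICODE /Gm- /EHsc /RTC1 /MDd /GS /fp:precise "
    "/Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- "
    "/external:W3 /Gd /TP /analyze- /FC /errorReport:prompt"
)
RELEASE = (
    "/c /Zi /nologo /W3 /WX- /diagnostics:column /sdl /O2 /Oi /Oy- /GL "
    "/D NDEBUG /D _CONSOLE /D _ARM64_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE=1 "
    "/D _UNICODE /D UNICODE /Gm- /EHsc /MD /GS /Gy /fp:precise "
    "/Zc:wchar_t /Zc:forScope /Zc:inline /std:c++17 /permissive- "
    "/external:W3 /Gd /TP /analyze- /FC /errorReport:prompt"
)


def split_flags(command_line):
    """Split a flag string into a set, keeping each "/D NAME" pair together."""
    flags = set()
    tokens = iter(command_line.split())
    for token in tokens:
        if token == "/D":
            # Join the define name to its /D so defines compare as one flag.
            token = f"/D {next(tokens, '')}"
        flags.add(token)
    return flags


debug_only = sorted(split_flags(DEBUG) - split_flags(RELEASE))
release_only = sorted(split_flags(RELEASE) - split_flags(DEBUG))
print("Debug only:  ", debug_only)
print("Release only:", release_only)
```

Besides the flags already mentioned, this also surfaces /Oi, /GL, and /Gy as release-only flags, along with the _DEBUG versus NDEBUG defines.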

Calling Printf

Here is a simple program, aarch64-abi-test-printf.cpp, which calls printf with a format string and 4 additional arguments.

#include <stdio.h>

int main()
{
    int result = printf("%.4f,%.4f,%.4f,%s", 1.2345, 1.2345, 1.2345, "str");
}

Compiling a Debug Build

To compile and disassemble this program, run:

cl /c aarch64-abi-test-printf.cpp
dumpbin /disasm /out:printf-abi.asm aarch64-abi-test-printf.obj
dumpbin /all /out:printf-abi.txt aarch64-abi-test-printf.obj

The disassembly is shown below with some links to the documentation for the various instructions. See the Arm Architecture Reference Manual for A-profile architecture PDF for more details about these instructions. The overview of AArch64 state at ARM Compiler armasm User Guide Version 6.6.1 is also a useful resource.

Dump of file aarch64-abi-test-printf.obj

File Type: COFF OBJECT

main:
  0000000000000000: A9BE7BFD  stp         fp,lr,[sp,#-0x20]!
  0000000000000004: 910003FD  mov         fp,sp
  0000000000000008: 90000008  adrp        x8,$SG5571
  000000000000000C: 91000104  add         x4,x8,$SG5571
  0000000000000010: 58000183  ldr         x3,$LN3
  0000000000000014: 58000162  ldr         x2,$LN3
  0000000000000018: 58000141  ldr         x1,$LN3
  000000000000001C: 90000008  adrp        x8,$SG5572
  0000000000000020: 91000100  add         x0,x8,$SG5572
  0000000000000024: 94000000  bl          printf
  0000000000000028: 2A0003E0  mov         w0,w0
  000000000000002C: B90013E0  str         w0,[sp,#0x10]
  0000000000000030: 52800000  mov         w0,#0
  0000000000000034: A8C27BFD  ldp         fp,lr,[sp],#0x20
  0000000000000038: D65F03C0  ret
  000000000000003C: D503201F  nop
$LN3:
  0000000000000040: 126E978D
  0000000000000044: 3FF3C083

__local_stdio_printf_options:
  0000000000000000: 90000008  adrp        x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000004: 91000100  add         x0,x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000008: D65F03C0  ret

_vfprintf_l:
  0000000000000000: A9BD7BFD  stp         fp,lr,[sp,#-0x30]!
  0000000000000004: 910003FD  mov         fp,sp
  0000000000000008: F90017E0  str         x0,[sp,#0x28]
  000000000000000C: F90013E1  str         x1,[sp,#0x20]
  0000000000000010: F9000FE2  str         x2,[sp,#0x18]
  0000000000000014: F9000BE3  str         x3,[sp,#0x10]
  0000000000000018: 94000000  bl          __local_stdio_printf_options
  000000000000001C: F9400BE4  ldr         x4,[sp,#0x10]
  0000000000000020: F9400FE3  ldr         x3,[sp,#0x18]
  0000000000000024: F94013E2  ldr         x2,[sp,#0x20]
  0000000000000028: F94017E1  ldr         x1,[sp,#0x28]
  000000000000002C: F9400000  ldr         x0,[x0]
  0000000000000030: 94000000  bl          __stdio_common_vfprintf
  0000000000000034: 2A0003E0  mov         w0,w0
  0000000000000038: 2A0003E0  mov         w0,w0
  000000000000003C: A8C37BFD  ldp         fp,lr,[sp],#0x30
  0000000000000040: D65F03C0  ret

printf:
  0000000000000000: D10103FF  sub         sp,sp,#0x40
  0000000000000004: A9008BE1  stp         x1,x2,[sp,#8]
  0000000000000008: A90193E3  stp         x3,x4,[sp,#0x18]
  000000000000000C: A9029BE5  stp         x5,x6,[sp,#0x28]
  0000000000000010: F9001FE7  str         x7,[sp,#0x38]
  0000000000000014: A9BD7BFD  stp         fp,lr,[sp,#-0x30]!
  0000000000000018: 910003FD  mov         fp,sp
  000000000000001C: F90013E0  str         x0,[sp,#0x20]
  0000000000000020: 9100E3E8  add         x8,sp,#0x38
  0000000000000024: F9000FE8  str         x8,[sp,#0x18]
  0000000000000028: 52800020  mov         w0,#1
  000000000000002C: 94000000  bl          __acrt_iob_func
  0000000000000030: F9400FE3  ldr         x3,[sp,#0x18]
  0000000000000034: D2800002  mov         x2,#0
  0000000000000038: F94013E1  ldr         x1,[sp,#0x20]
  000000000000003C: 94000000  bl          _vfprintf_l
  0000000000000040: 2A0003E0  mov         w0,w0
  0000000000000044: B90013E0  str         w0,[sp,#0x10]
  0000000000000048: D2800008  mov         x8,#0
  000000000000004C: F9000FE8  str         x8,[sp,#0x18]
  0000000000000050: B94013E0  ldr         w0,[sp,#0x10]
  0000000000000054: A8C37BFD  ldp         fp,lr,[sp],#0x30
  0000000000000058: 910103FF  add         sp,sp,#0x40
  000000000000005C: D65F03C0  ret

  Summary

           8 .bss
          68 .chks64
          9C .debug$S
          62 .drectve
          18 .pdata
          1A .rdata
          F8 .text$mn
          10 .xdata

In the disassembly generated by dumpbin (printf-abi.asm), notice that all 5 arguments to printf are passed in general-purpose registers! x0 contains a pointer to the format string. x1-x3 each contain the 64-bit value loaded from the $LN3 label: ldr with a label operand is a PC-relative literal load, so it loads the data stored at the label rather than the label's address. The 64 bits at that label are the IEEE double-precision floating-point representation of 1.2345. x4 contains a pointer to the null-terminated string “str”.
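
To confirm that the two words at $LN3 really are 1.2345, they can be decoded with Python's struct module. This is a quick check rather than anything dumpbin provides. The two hex values are copied from the listing above, and the word at the lower address is the low half of the double because the target is little-endian.

```python
import struct

# The two 32-bit words dumpbin shows at $LN3, in address order
# (offset 0x40 first, then 0x44).
low_word = 0x126E978D
high_word = 0x3FF3C083

# Reassemble the 8 bytes as they sit in memory, then read them as a double.
raw = struct.pack("<II", low_word, high_word)
(value,) = struct.unpack("<d", raw)
print(value)  # prints 1.2345
```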

Which are the printf String Arguments?

To determine what symbols in instructions like adrp x8,$SG5571 mean, we use the output of dumpbin /all. The RELOCATIONS section shows $SG5571 to have symbol index 8. The COFF SYMBOL TABLE shows this symbol index 8 to be in SECT3. The raw data for section 3 contains the format string and the single string parameter passed to printf. I’m still not sure how the assembler knows the difference in offsets between these 2 strings?

.
.
.
SECTION HEADER #3
  .rdata name
       0 physical address
       0 virtual address
      1A size of raw data
     31A file pointer to raw data (0000031A to 00000333)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40400040 flags
         Initialized Data
         8 byte align
         Read Only

RAW DATA #3
  00000000: 73 74 72 00 00 00 00 00 25 2E 34 66 2C 25 2E 34  str.....%.4f,%.4
  00000010: 66 2C 25 2E 34 66 2C 25 73 00                    f,%.4f,%s.
.
.
.
RELOCATIONS #4
                                                Symbol    Symbol
 Offset    Type              Applied To         Index     Name
 --------  ----------------  -----------------  --------  ------
 00000008  PAGEBASE_REL21             90000008         8  $SG5571
 0000000C  PAGEOFFSET_12A             91000104         8  $SG5571
 0000001C  PAGEBASE_REL21             90000008         9  $SG5572
 00000020  PAGEOFFSET_12A             91000100         9  $SG5572
 00000024  BRANCH26                   94000000        16  printf
.
.
.
COFF SYMBOL TABLE
000 01057A64 ABS    notype       Static       | @comp.id
001 80010190 ABS    notype       Static       | @feat.00
002 00000000 SECT1  notype       Static       | .drectve
    Section length   62, #relocs    0, #linenums    0, checksum        0
004 00000000 SECT2  notype       Static       | .debug$S
    Section length   9C, #relocs    0, #linenums    0, checksum        0
006 00000000 SECT3  notype       Static       | .rdata
    Section length   1A, #relocs    0, #linenums    0, checksum B99D9667
008 00000000 SECT3  notype       Static       | $SG5571
009 00000008 SECT3  notype       Static       | $SG5572
00A 00000000 SECT4  notype       Static       | .text$mn
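
One likely answer to the offset question is in the second column of the symbol table. For a symbol defined in a section, the COFF Value field is the symbol's offset within that section, so $SG5571 sits at offset 0 of .rdata and $SG5572 at offset 8. The relocations then let the linker patch the adrp/add pairs with the final addresses. The sketch below applies those two offsets to the raw bytes from RAW DATA #3. The c_string_at helper is only for illustration.

```python
# Bytes copied from RAW DATA #3 (the .rdata section) in the dumpbin output.
RDATA = bytes.fromhex(
    "73 74 72 00 00 00 00 00 25 2E 34 66 2C 25 2E 34"
    "66 2C 25 2E 34 66 2C 25 73 00"
)

# Symbol name -> Value column from the COFF symbol table (offset in SECT3).
SYMBOLS = {"$SG5571": 0x0, "$SG5572": 0x8}


def c_string_at(data, offset):
    """Return the null-terminated ASCII string starting at `offset`."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("ascii")


for name, offset in SYMBOLS.items():
    print(name, repr(c_string_at(RDATA, offset)))
```

This matches the disassembly: $SG5571 (loaded into x4) is the “str” argument and $SG5572 (loaded into x0) is the format string.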

Compiling an Optimized Build

Specifying the /O2 flag for speed generates optimized code.

cl /c /O2 /Fo"printf-abi-o2.obj" aarch64-abi-test-printf.cpp
dumpbin /disasm /out:printf-abi-o2.asm printf-abi-o2.obj
dumpbin /all /out:printf-abi-o2.txt printf-abi-o2.obj

In the optimized code below, the IEEE double is loaded into d16 then copied to the x1-x3 registers by the FMOV instruction.

Dump of file printf-abi-o2.obj

File Type: COFF OBJECT

__local_stdio_printf_options:
  0000000000000000: 90000008  adrp        x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000004: 91000100  add         x0,x8,?_OptionsStorage@?1??__local_stdio_printf_options@@9@4_KA
  0000000000000008: D65F03C0  ret

_vfprintf_l:
  0000000000000000: A9BD53F3  stp         x19,x20,[sp,#-0x30]!
  0000000000000004: A9015BF5  stp         x21,x22,[sp,#0x10]
  0000000000000008: F90013FE  str         lr,[sp,#0x20]
  000000000000000C: AA0003F6  mov         x22,x0
  0000000000000010: AA0103F5  mov         x21,x1
  0000000000000014: AA0203F4  mov         x20,x2
  0000000000000018: AA0303F3  mov         x19,x3
  000000000000001C: 94000000  bl          __local_stdio_printf_options
  0000000000000020: F9400000  ldr         x0,[x0]
  0000000000000024: AA1303E4  mov         x4,x19
  0000000000000028: AA1403E3  mov         x3,x20
  000000000000002C: AA1503E2  mov         x2,x21
  0000000000000030: AA1603E1  mov         x1,x22
  0000000000000034: 94000000  bl          __stdio_common_vfprintf
  0000000000000038: F94013FE  ldr         lr,[sp,#0x20]
  000000000000003C: A9415BF5  ldp         x21,x22,[sp,#0x10]
  0000000000000040: A8C353F3  ldp         x19,x20,[sp],#0x30
  0000000000000044: D65F03C0  ret

main:
  0000000000000000: F81F0FFE  str         lr,[sp,#-0x10]!
  0000000000000004: 5C0001B0  ldr         d16,$LN4
  0000000000000008: 90000008  adrp        x8,??_C@_03OJMAPEGJ@str@
  000000000000000C: 91000104  add         x4,x8,??_C@_03OJMAPEGJ@str@
  0000000000000010: 90000008  adrp        x8,??_C@_0BC@OEIAMIIK@?$CF?44f?0?$CF?44f?0?$CF?44f?0?$CFs@
  0000000000000014: 91000100  add         x0,x8,??_C@_0BC@OEIAMIIK@?$CF?44f?0?$CF?44f?0?$CF?44f?0?$CFs@
  0000000000000018: 9E660203  fmov        x3,d16
  000000000000001C: 9E660202  fmov        x2,d16
  0000000000000020: 9E660201  fmov        x1,d16
  0000000000000024: 94000000  bl          printf
  0000000000000028: 52800000  mov         w0,#0
  000000000000002C: F84107FE  ldr         lr,[sp],#0x10
  0000000000000030: D65F03C0  ret
  0000000000000034: D503201F  nop
$LN4:
  0000000000000038: 126E978D
  000000000000003C: 3FF3C083

printf:
  0000000000000000: A9BA53F3  stp         x19,x20,[sp,#-0x60]!
  0000000000000004: A9017BF5  stp         x21,lr,[sp,#0x10]
  0000000000000008: A9028BE1  stp         x1,x2,[sp,#0x28]
  000000000000000C: A90393E3  stp         x3,x4,[sp,#0x38]
  0000000000000010: A9049BE5  stp         x5,x6,[sp,#0x48]
  0000000000000014: F9002FE7  str         x7,[sp,#0x58]
  0000000000000018: AA0003F4  mov         x20,x0
  000000000000001C: 52800020  mov         w0,#1
  0000000000000020: 9100A3F5  add         x21,sp,#0x28
  0000000000000024: 94000000  bl          __acrt_iob_func
  0000000000000028: AA0003F3  mov         x19,x0
  000000000000002C: 94000000  bl          __local_stdio_printf_options
  0000000000000030: F9400000  ldr         x0,[x0]
  0000000000000034: D2800003  mov         x3,#0
  0000000000000038: AA1403E2  mov         x2,x20
  000000000000003C: AA1303E1  mov         x1,x19
  0000000000000040: AA1503E4  mov         x4,x21
  0000000000000044: 94000000  bl          __stdio_common_vfprintf
  0000000000000048: A9417BF5  ldp         x21,lr,[sp,#0x10]
  000000000000004C: A8C653F3  ldp         x19,x20,[sp],#0x60
  0000000000000050: D65F03C0  ret

  Summary

           8 .bss
          70 .chks64
          94 .debug$S
          62 .drectve
          18 .pdata
          16 .rdata
          E8 .text$mn
           8 .xdata

The example we have reviewed in this post passed only 5 parameters to printf. To see how more than 8 parameters are handled, see the example printf call in aarch64-abi-test-printf-manyargs.cpp and printf-abi-many.asm (or for the optimized assembly code, printf-abi-many-o2.asm).

Additional resources on AArch64:


Categories: Crystal Growth

Crystal Growth Simulation on Linux

Crystal Growth Simulation – Part 1 walked through setting up a mesh for running crystal simulation in Elmer on Windows. Crystal Growth Simulation Failure described the process of trying to run the actual simulation on Windows and the resulting segmentation fault. What happens if we try this same experiment on Linux? First install Elmer using the instructions on the Elmer FEM blog.

sudo apt-add-repository ppa:elmer-csc-ubuntu/elmer-csc-ppa
sudo apt-get update
sudo apt-get install elmerfem-csc

I’m not sure if it’s because I’ve built other repos on my Ubuntu VM, but python3 is already installed. I still need to install pip though.

sudo apt-get install python3
sudo apt install python3-pip

pip install pyelmer
pip install objectgmsh

We need to clone 2 repos to execute the crystal growth experiment. The opencgs repo is a dependency of the test-cz-induction repo.

cd ~/repos/fem/elmer/research
git clone https://github.com/nemocrys/opencgs
git clone https://github.com/nemocrys/test-cz-induction

Next, install the opencgs module using pip then set up the crystal growth simulation mesh.

cd opencgs
pip install -e .

This installation output contains an error about an incompatible numpy version but ends with a message about all the components being successfully installed.

Requirement already satisfied: kiwisolver>=1.0.1 in /home/saint/.local/lib/python3.8/site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (1.4.4)
Requirement already satisfied: fonttools>=4.22.0 in /home/saint/.local/lib/python3.8/site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (4.34.4)
ERROR: pandas 1.4.3 has requirement numpy>=1.18.5; platform_machine != "aarch64" and platform_machine != "arm64" and python_version < "3.10", but you'll have numpy 1.17.4 which is incompatible.
...
Successfully installed commonmark-0.9.1 meshio-5.3.4 opencgs pandas-1.4.3 pygments-2.12.0 python-dateutil-2.8.2 pytz-2022.1 rich-12.5.1 typing-extensions-4.3.0
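
The requirement in pip's ERROR line can be checked directly before running anything else. Below is a minimal sketch using only the standard library. The helper names are hypothetical, and the version parsing is deliberately simple (it compares only the numeric major.minor.patch parts), so it is not a substitute for pip's own resolver.

```python
import re
from importlib import metadata


def version_tuple(text):
    """Turn "1.17.4" into (1, 17, 4), keeping only each part's leading digits."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)  # Pad "1.18" to (1, 18, 0).
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The versions from the pip output above: numpy 1.17.4 vs. numpy>=1.18.5.
print(meets_minimum("1.17.4", "1.18.5"))  # prints False
```

Calling installed_version("numpy") returns the version that pip would see in the current environment, which can then be passed to meets_minimum.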

I ignore the error and forge ahead with running the crystal simulation setup:

cd ../test-cz-induction
python3 setup.py

This fails with an error related to numpy.

python3 setup.py 
Traceback (most recent call last):
  File "setup.py", line 19, in <module>
    import opencgs.control as ctrl
  File "/home/saint/repos/fem/elmer/research/opencgs/opencgs/__init__.py", line 5, in <module>
    import opencgs.post
  File "/home/saint/repos/fem/elmer/research/opencgs/opencgs/post.py", line 2, in <module>
    import meshio
  File "/home/saint/.local/lib/python3.8/site-packages/meshio/__init__.py", line 1, in <module>
    from . import (
  File "/home/saint/.local/lib/python3.8/site-packages/meshio/_cli/__init__.py", line 1, in <module>
    from ._main import main
  File "/home/saint/.local/lib/python3.8/site-packages/meshio/_cli/_main.py", line 5, in <module>
    from . import _ascii, _binary, _compress, _convert, _decompress, _info
  File "/home/saint/.local/lib/python3.8/site-packages/meshio/_cli/_ascii.py", line 4, in <module>
    from .. import ansys, flac3d, gmsh, mdpa, ply, stl, vtk, vtu, xdmf
  File "/home/saint/.local/lib/python3.8/site-packages/meshio/ansys/__init__.py", line 1, in <module>
    from ._ansys import read, write
  File "/home/saint/.local/lib/python3.8/site-packages/meshio/ansys/_ansys.py", line 14, in <module>
    from .._helpers import register_format
  File "/home/saint/.local/lib/python3.8/site-packages/meshio/_helpers.py", line 7, in <module>
    from numpy.typing import ArrayLike

I guess I need to heed that error from pip. python – How can I upgrade NumPy? – Stack Overflow says to use the --upgrade flag.

$ pip install numpy --upgrade
Collecting numpy
  Downloading numpy-1.23.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
     |████████████████████████████████| 17.1 MB 3.6 MB/s 
Installing collected packages: numpy
  WARNING: The scripts f2py, f2py3 and f2py3.8 are installed in '/home/saint/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed numpy-1.23.1

That’s all that’s needed to get things to run. For example, setting up the mesh now works:

python3 setup.py

The crystal growth simulation can now be executed as well:

python3 run.py

Interestingly, the simulation completes successfully!

crucible
melt
crystal
inductor
seed
insulation
crucible_adapter
axis_bt
...
using material graphite-CZ3R6300 from self.materials_dict
using material insulation from self.materials_dict
using material steel-1.4541 from self.materials_dict
using material vacuum from self.materials_dict
Wrote sif-file.
Starting simulation  ./simdata/2022-07-31_13-35_ss_test-cz-induction_vacuum  ...
[] [] {'CPU-time': 55.87, 'real-time': 57.49}
Finished simulation  ./simdata/2022-07-31_13-35_ss_test-cz-induction_vacuum  .
Post processing...
evaluating heat fluxes

Finished post processing.

This implies that the segmentation fault I ran into on Windows was specific to the Windows Elmer build.

Visualization

Use ParaView to visualize the results of the simulation.

sudo apt install paraview

There are different variables that can be visualized. I selected temperature for the screenshot below. Other options included joule field, joule heating, potential im, newy, etc.

Visualizing the Crystal Growth Simulation

Categories: Visualization

Building ParaView

The Elmer Parallel Demo used ParaView to visualize the results of the simulation. It looks like such a useful tool so I decided to get the source code and build it to learn a little bit about it.

Building on Windows

The building documentation looks thorough. I used a similar directory structure and didn’t need the Git Bash window since I ran all the commands in the VS2019 x64 Native Tools Command Prompt.

cd \dev\repos
mkdir pv
cd pv
git clone --recursive https://gitlab.kitware.com/paraview/paraview.git

This is one of the more interesting clones I have done – a lot going on with these submodules.

Cloning into 'paraview'...
remote: Enumerating objects: 469980, done.
remote: Counting objects: 100% (4454/4454), done.
remote: Compressing objects: 100% (1787/1787), done.
remote: Total 469980 (delta 2771), reused 4233 (delta 2592), pack-reused 465526
Receiving objects: 100% (469980/469980), 198.52 MiB | 2.40 MiB/s, done.
Resolving deltas: 100% (349710/349710), done.
Updating files: 100% (9092/9092), done.
Submodule 'ThirdParty/IceT/vtkicet' (https://gitlab.kitware.com/paraview/icet.git) registered for path 'ThirdParty/IceT/vtkicet'
Submodule 'ThirdParty/QtTesting/vtkqttesting' (https://gitlab.kitware.com/paraview/qttesting.git) registered for path 'ThirdParty/QtTesting/vtkqttesting'
Submodule 'Utilities/VisItBridge' (https://gitlab.kitware.com/paraview/visitbridge.git) registered for path 'Utilities/VisItBridge'
Submodule 'VTK' (https://gitlab.kitware.com/vtk/vtk.git) registered for path 'VTK'
Cloning into 'C:/dev/repos/paraview/ThirdParty/IceT/vtkicet'...
remote: Enumerating objects: 4587, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 4587 (delta 1), reused 1 (delta 0), pack-reused 4582
Receiving objects: 100% (4587/4587), 1.15 MiB | 1.94 MiB/s, done.
Resolving deltas: 100% (3447/3447), done.
Cloning into 'C:/dev/repos/paraview/ThirdParty/QtTesting/vtkqttesting'...
remote: Enumerating objects: 2358, done.
remote: Counting objects: 100% (239/239), done.
remote: Compressing objects: 100% (103/103), done.
remote: Total 2358 (delta 136), reused 239 (delta 136), pack-reused 2119
Receiving objects: 100% (2358/2358), 670.37 KiB | 1.55 MiB/s, done.
Resolving deltas: 100% (1777/1777), done.
Cloning into 'C:/dev/repos/paraview/Utilities/VisItBridge'...
remote: Enumerating objects: 14424, done.
remote: Counting objects: 100% (235/235), done.
remote: Compressing objects: 100% (113/113), done.
remote: Total 14424 (delta 127), reused 186 (delta 122), pack-reused 14189
Receiving objects: 100% (14424/14424), 11.66 MiB | 1.99 MiB/s, done.
Resolving deltas: 100% (11036/11036), done.
Cloning into 'C:/dev/repos/paraview/VTK'...
remote: Enumerating objects: 647510, done.
remote: Counting objects: 100% (787/787), done.
remote: Compressing objects: 100% (418/418), done.
remote: Total 647510 (delta 450), reused 652 (delta 367), pack-reused 646723
Receiving objects: 100% (647510/647510), 231.49 MiB | 2.44 MiB/s, done.
Resolving deltas: 100% (497923/497923), done.
Submodule path 'ThirdParty/IceT/vtkicet': checked out '32816fe5592de3be664da6f8466a546f221d8532'
Submodule path 'ThirdParty/QtTesting/vtkqttesting': checked out '08d96e9277bc4c26804fd77ce1b4fa5c791605ae'
Submodule path 'Utilities/VisItBridge': checked out 'df098f4148a96d62c388861c1d476039e02224ae'
Submodule path 'VTK': checked out 'e38d93f8b9d7a9475593e502adefa9e02d5c60fe'
Submodule 'VTK-m' (https://gitlab.kitware.com/vtk/vtk-m.git) registered for path 'VTK/ThirdParty/vtkm/vtkvtkm/vtk-m'
Cloning into 'C:/dev/repos/paraview/VTK/ThirdParty/vtkm/vtkvtkm/vtk-m'...
remote: Enumerating objects: 86181, done.
remote: Counting objects: 100% (6317/6317), done.
remote: Compressing objects: 100% (1922/1922), done.
remote: Total 86181 (delta 4616), reused 5994 (delta 4377), pack-reused 79864
Receiving objects: 100% (86181/86181), 23.04 MiB | 2.56 MiB/s, done.
Resolving deltas: 100% (70045/70045), done.
Filtering content: 100% (102/102), 30.28 MiB | 3.84 MiB/s, done.
Submodule path 'VTK/ThirdParty/vtkm/vtkvtkm/vtk-m': checked out '982e965536b41b334bd2bbb765373e6503c823ec'
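To get a quick overview of what a recursive clone pulled in, the "registered for path" lines in the log above can be picked out with a regular expression. This is only a sketch that assumes the log has been saved as text; the function name parse_submodules is my own.

```python
import re

# Matches lines such as:
#   Submodule 'VTK' (https://gitlab.kitware.com/vtk/vtk.git) registered for path 'VTK'
SUBMODULE_LINE = re.compile(
    r"^Submodule '(?P<name>[^']+)' \((?P<url>[^)]+)\) registered for path '(?P<path>[^']+)'$"
)


def parse_submodules(log_text):
    """Return (name, url, path) tuples for each registered submodule in a clone log."""
    found = []
    for line in log_text.splitlines():
        match = SUBMODULE_LINE.match(line.strip())
        if match:
            found.append((match["name"], match["url"], match["path"]))
    return found


sample_log = """\
Submodule 'VTK' (https://gitlab.kitware.com/vtk/vtk.git) registered for path 'VTK'
Cloning into 'C:/dev/repos/paraview/VTK'...
Submodule 'VTK-m' (https://gitlab.kitware.com/vtk/vtk-m.git) registered for path 'VTK/ThirdParty/vtkm/vtkvtkm/vtk-m'
"""

for name, url, path in parse_submodules(sample_log):
    print(f"{name}: {path} <- {url}")
```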

Setting up the build is simple too.

mkdir build
cd build

cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON -DVTK_SMP_IMPLEMENTATION_TYPE=STDThread -DCMAKE_BUILD_TYPE=Release ..\paraview

Unfortunately, cmake fails on the first try because it cannot find MPI.

-- Check size of uintptr_t
-- Check size of uintptr_t - done
-- Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS)
-- Could NOT find MPI (missing: MPI_C_FOUND C)
CMake Error at VTK/CMake/vtkModule.cmake:4578 (message):
  Could not find the MPI external dependency.
Call Stack (most recent call first):
  VTK/CMake/vtkModule.cmake:5172 (vtk_module_find_package)
  VTK/Utilities/MPI/CMakeLists.txt:1 (vtk_module_third_party_external)

The error log contains a more specific error message:

C:\PROGRA~2\MIB055~1\2019\ENTERP~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe  /nologo   /DWIN32 /D_WINDOWS /bigobj   /Zi /Ob0 /Od /RTC1 -MDd /showIncludes /FoCMakeFiles\cmTC_77827.dir\test_mpi.c.obj /FdCMakeFiles\cmTC_77827.dir\ /FS -c C:\dev\repos\pv\paraview\VTK\CMake\patches\3.22\FindMPI\test_mpi.c

C:\dev\repos\pv\paraview\VTK\CMake\patches\3.22\FindMPI\test_mpi.c(1): fatal error C1083: Cannot open include file: 'mpi.h': No such file or directory
ninja: build stopped: subcommand failed.

Some folks have already run into this before, e.g.

The problem is that I already had Microsoft MPI installed (by Elmer) but I didn’t have the SDK. Gotta take those prerequisites seriously… However, it’s good to know that the Microsoft MPI source code is on GitHub! Weird that I can’t download the MPI SDK by itself. The SDK’s default install path is “C:\Program Files (x86)\Microsoft SDKs\MPI\”.

Microsoft MPI SDK Setup Wizard

This addresses that cmake failure but also points out my other dereliction of prerequisite installation…

...
-- Found MPI_C: C:/Program Files (x86)/Microsoft SDKs/MPI/Lib/x64/msmpi.lib (found version "2.0")
-- Found MPI: TRUE (found version "2.0") found components: C
...
CMake Warning at VTK/CMake/vtkModule.cmake:4572 (find_package):
  By not providing "FindQt5.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Qt5", but
  CMake did not find one.

  Could not find a package configuration file provided by "Qt5" (requested
  version 5.9) with any of the following names:

    Qt5Config.cmake
    qt5-config.cmake

  Add the installation prefix of "Qt5" to CMAKE_PREFIX_PATH or set "Qt5_DIR"
  to a directory containing one of the above files.  If "Qt5" provides a
  separate development package or SDK, be sure it has been installed.
Call Stack (most recent call first):
  VTK/GUISupport/Qt/CMakeLists.txt:43 (vtk_module_find_package)


CMake Error at VTK/CMake/vtkModule.cmake:4578 (message):
  Could not find the Qt5 external dependency.
Call Stack (most recent call first):
  VTK/GUISupport/Qt/CMakeLists.txt:43 (vtk_module_find_package)

Interestingly, there are only .zip files in the linked 5.15.3 archive. The installer I need is on the Download Offline Installers | Source Package Offline Installer | Qt page. This thing needs an account??? I create a personal account and try to install it to C:\Qt\Qt5.12.12.

Qt 5.12.12 Setup Dialog – Account Required

Unfortunately, I don’t see an MSVC 2019 component! Could this be why they require 5.15.3?

Qt 5.12.12 Setup Dialog – Component Selection

Even more unfortunate is the discovery that I can’t find 5.15 installers because they need a commercial license. Here’s the linked blog post: Qt offering changes 2020. Interestingly, there was pushback against the account requirement a while back too – Changing Qt Account to be Optional in the Online Installer. I had been considering learning more about Qt and perhaps porting some code to Qt, but this level of friction has me reconsidering doing anything with Qt. For now, I’m setting aside the Windows platform to see what the situation is on Linux.

Building on Linux

Building on Linux is straightforward on my Ubuntu 20.04 VM.

cd ~/repos/sci
mkdir pv
cd pv
git clone --recursive https://gitlab.kitware.com/paraview/paraview.git

sudo apt-get install git cmake build-essential libgl1-mesa-dev libxt-dev qt5-default libqt5x11extras5-dev libqt5help5 qttools5-dev qtxmlpatterns5-dev-tools libqt5svg5-dev python3-dev python3-numpy libopenmpi-dev libtbb-dev ninja-build

mkdir build
cd build
cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON -DVTK_SMP_IMPLEMENTATION_TYPE=TBB -DCMAKE_BUILD_TYPE=Release ../paraview

time ninja

ninja takes about 3h 45min on my machine but it completes successfully. Launch ParaView by running:

bin/paraview
ParaView Built on Ubuntu

Building on macOS

The approach I took was to start building using the same cmake configuration as I did on my Ubuntu VM.

brew install ninja

cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON -DVTK_SMP_IMPLEMENTATION_TYPE=TBB -DCMAKE_BUILD_TYPE=Release ../paraview

I hadn’t really thought much about what TBB is until cmake failed with this error:

-- Could NOT find TBB (missing: TBB_DIR)
CMake Error at VTK/CMake/vtkModule.cmake:4578 (message):
  Could not find the TBB external dependency.
Call Stack (most recent call first):
  VTK/Common/Core/vtkSMPSelection.cmake:42 (vtk_module_find_package)
  VTK/Common/Core/CMakeLists.txt:51 (include)

Searching through the source code for VTK_SMP_IMPLEMENTATION_TYPE leads me to the VTK build instructions, which list all the possible values. For now, I’ll just remove this define from the cmake command line.

cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON -DCMAKE_BUILD_TYPE=Release ../paraview

This fails because MPI cannot be found (just like on Windows).

CMake Error at VTK/CMake/vtkModule.cmake:4578 (message):
  Could not find the MPI external dependency.
Call Stack (most recent call first):
  VTK/CMake/vtkModule.cmake:5172 (vtk_module_find_package)
  VTK/Utilities/MPI/CMakeLists.txt:1 (vtk_module_third_party_external)

Installing Open MPI using brew addresses this.

brew install openmpi

The next error is about Qt5 missing:

CMake Warning at VTK/CMake/vtkModule.cmake:4572 (find_package):
  By not providing "FindQt5.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Qt5", but
  CMake did not find one.

  Could not find a package configuration file provided by "Qt5" (requested
  version 5.9) with any of the following names:

    Qt5Config.cmake
    qt5-config.cmake

  Add the installation prefix of "Qt5" to CMAKE_PREFIX_PATH or set "Qt5_DIR"
  to a directory containing one of the above files.  If "Qt5" provides a
  separate development package or SDK, be sure it has been installed.
Call Stack (most recent call first):
  VTK/GUISupport/Qt/CMakeLists.txt:43 (vtk_module_find_package)


CMake Error at VTK/CMake/vtkModule.cmake:4578 (message):
  Could not find the Qt5 external dependency.
Call Stack (most recent call first):
  VTK/GUISupport/Qt/CMakeLists.txt:43 (vtk_module_find_package)

Install it using the command from https://formulae.brew.sh/formula/qt@5

brew install qt@5

The summary contains more information than the ninja or Open MPI installations did.

==> Summary
🍺  /opt/homebrew/Cellar/qt@5/5.15.5_1: 10,846 files, 344.2MB
==> Running `brew cleanup qt@5`...
Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
==> Caveats
==> qt@5
We agreed to the Qt open source license for you.
If this is unacceptable you should uninstall.

qt@5 is keg-only, which means it was not symlinked into /opt/homebrew,
because this is an alternate version of another formula.

If you need to have qt@5 first in your PATH, run:
  echo 'export PATH="/opt/homebrew/opt/qt@5/bin:$PATH"' >> ~/.zshrc

For compilers to find qt@5 you may need to set:
  export LDFLAGS="-L/opt/homebrew/opt/qt@5/lib"
  export CPPFLAGS="-I/opt/homebrew/opt/qt@5/include"

The same Qt error is displayed though. I use the approach from https://github.com/Cockatrice/Cockatrice/issues/205#issuecomment-48705334

cmake -GNinja -DPARAVIEW_USE_PYTHON=ON -DPARAVIEW_USE_MPI=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=/opt/homebrew/opt/qt@5/ ../paraview
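Before passing a directory as CMAKE_PREFIX_PATH, it can help to confirm that one of the two file names from the CMake warning actually exists somewhere under it. Below is a minimal sketch of that check. The helper name find_qt5_config is hypothetical, and it only searches the directory tree; it does not reproduce CMake's real search procedure.

```python
from pathlib import Path

# File names taken from the CMake warning above.
QT5_CONFIG_NAMES = ("Qt5Config.cmake", "qt5-config.cmake")


def find_qt5_config(prefix):
    """Return the first Qt5 package configuration file found under prefix, or None."""
    root = Path(prefix)
    if not root.is_dir():
        return None
    for name in QT5_CONFIG_NAMES:
        for candidate in sorted(root.rglob(name)):
            return candidate
    return None


# Prefix from the brew caveats; adjust for your machine.
print(find_qt5_config("/opt/homebrew/opt/qt@5"))
```

If this prints None for a prefix, that prefix is unlikely to satisfy the Qt5 lookup either.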

We finally have a successful cmake run. Now we time the ninja build. It takes only 39 minutes on my MacBook M1.

date; time ninja; date
ParaView Built on macOS

Categories: Elmer, Simulation

Elmer Parallel Demo

Below is a video from the Elmer folks demonstrating how to use parallelization to improve the performance of Elmer. I followed along on Windows with the publicly downloadable Elmer installation. I’m listing the commands in this post for easy reference.

We first clone the demo repository then open the geo file that serves as the basis for the demo in gmsh. I ran these commands in a Windows command prompt and used full paths to the gmsh and elmer executables.

git clone https://github.com/tzwinger/elmer_parallel_demo
cd elmer_parallel_demo

rem Open the geo file in gmsh without blocking the prompt
start gmsh.exe elmer_flow.geo

rem Create elmer_flow.msh
gmsh.exe -3 elmer_flow.geo

ElmerGrid 14 2 elmer_flow.msh -autoclean

The ElmerGrid command converts the Gmsh mesh into an Elmer mesh here, and is used later to partition it. Running it without any arguments displays the available options. The options used in the example are input format 14 (Gmsh), output format 2 (ElmerSolver), and -autoclean.

This program can create simple 2D structured meshes consisting of
linear, quadratic or cubic rectangles or triangles. The meshes may
also be extruded and revolved to create 3D forms. In addition many
mesh formats may be imported into Elmer software. Some options have
not been properly tested. Contact the author if you face problems.

The program has two operation modes
A) Command file mode which has the command file as the only argument
   'ElmerGrid commandfile.eg'

B) Inline mode which expects at least three input parameters
   'ElmerGrid 1 3 test'

The first parameter defines the input file format:
1)  .grd      : ElmerGrid file format
2)  .mesh.*   : Elmer input format
3)  .ep       : Elmer output format
4)  .ansys    : Ansys input format
5)  .inp      : Abaqus input format by Ideas
6)  .fil      : Abaqus output format
7)  .FDNEUT   : Gambit (Fidap) neutral file
8)  .unv      : Universal mesh file format
9)  .mphtxt   : Comsol Multiphysics mesh format
10) .dat      : Fieldview format
11) .node,.ele: Triangle 2D mesh format
12) .mesh     : Medit mesh format
13) .msh      : GID mesh format
14) .msh      : Gmsh mesh format
15) .ep.i     : Partitioned ElmerPost format
16) .2dm      : 2D triangular FVCOM format

The second parameter defines the output file format:
1)  .grd      : ElmerGrid file format
2)  .mesh.*   : ElmerSolver format (also partitioned .part format)
3)  .ep       : ElmerPost format
4)  .msh      : Gmsh mesh format
5)  .vtu      : VTK ascii XML format

The third parameter is the name of the input file.
If the file does not exist, an example with the same name is created.
The default output file name is the same with a different suffix.

There are several additional in-line parameters that are
taken into account only when applicable to the given format.
-out str             : name of the output file
-in str              : name of a secondary input file
-decimals            : number of decimals in the saved mesh (eg. 8)
...
-removeintbcs        : remove internal boundaries if they are not needed
-removelowdim        : remove boundaries that are two ranks lower than highest dim
-removeunused        : remove nodes that are not used in any element
-bulkorder           : renumber materials types from 1 so that every number is used
-boundorder          : renumber boundary types from 1 so that every number is used
-autoclean           : this performs the united action of the four above
...

Keywords are related to mesh partitioning for parallel ElmerSolver runs:
...
-metiskway int       : mesh will be partitioned with Metis using graph Kway routine
-metisrec int        : mesh will be partitioned with Metis using graph Recursive routine
-metiscontig         : enforce that the metis partitions are contiguous
-metisseed int       : random number generator seed for Metis algorithms
-partdual            : use the dual graph in partition method (when available)
...
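The numeric format codes are easy to mix up, so here is a small sketch that builds the ElmerGrid argument list from names instead. The dictionaries copy only a subset of the codes from the help text above, and elmergrid_args is a hypothetical helper of mine, not part of Elmer or pyelmer.

```python
# Format codes copied from the ElmerGrid help text above (subset).
INPUT_FORMATS = {"elmergrid": 1, "elmer": 2, "gmsh": 14}
OUTPUT_FORMATS = {"elmergrid": 1, "elmersolver": 2, "gmsh": 4, "vtu": 5}


def elmergrid_args(in_format, out_format, mesh, *options, exe="ElmerGrid"):
    """Build the ElmerGrid argument list: exe, input code, output code, file, options."""
    return [
        exe,
        str(INPUT_FORMATS[in_format]),
        str(OUTPUT_FORMATS[out_format]),
        mesh,
        *options,
    ]


# Equivalent to: ElmerGrid 14 2 elmer_flow.msh -autoclean
convert = elmergrid_args("gmsh", "elmersolver", "elmer_flow.msh", "-autoclean")
print(" ".join(convert))

# Equivalent to: ElmerGrid 2 2 elmer_flow -partdual -metiskway 4
partition = elmergrid_args("elmer", "elmersolver", "elmer_flow", "-partdual", "-metiskway", "4")
print(" ".join(partition))
```

The resulting lists are in the shape that subprocess.run expects, should you want to drive ElmerGrid from a script.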

Now the solver can be invoked. The demo first does a serial run using the ElmerSolver command.

ElmerSolver elmer_flow_gcr.sif

This took just over 6 minutes on my machine. According to the demo video, the wall clock time is the second value (REAL) on the SOLVER TOTAL TIME line below.

...
MAIN: Version: 9.0 (Rev: Release, Compiled: 2020-11-10)
MAIN:  Running one task without MPI parallelization.
MAIN:  Running with just one thread per task.
MAIN:  Lua interpreted linked in.
...
ResultOutputSolver: Saving in unstructured VTK XML (.vtu) format
ResultOutputSolver: -------------------------------------
ElmerSolver: *** Elmer Solver: ALL DONE ***
ElmerSolver: The end
SOLVER TOTAL TIME(CPU,REAL):       369.31      369.31
ELMER SOLVER FINISHED AT: 2022/07/27 01:23:45
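To compare runs without eyeballing the log, the two numbers on the SOLVER TOTAL TIME line can be extracted with a regular expression. This sketch assumes the line format shown above; solver_times is a name I made up.

```python
import re

TIME_LINE = re.compile(
    r"SOLVER TOTAL TIME\(CPU,REAL\):\s+(?P<cpu>[0-9.]+)\s+(?P<real>[0-9.]+)"
)


def solver_times(log_text):
    """Return (cpu_seconds, real_seconds) from ElmerSolver output, or None if absent."""
    match = TIME_LINE.search(log_text)
    if match is None:
        return None
    return float(match["cpu"]), float(match["real"])


log = "SOLVER TOTAL TIME(CPU,REAL):       369.31      369.31"
cpu, real = solver_times(log)
minutes, seconds = divmod(real, 60)
print(f"wall clock: {int(minutes)} min {seconds:.2f} s")
```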

To increase parallelism, the demo continues with the serial mesh but uses OpenMP in multithreading mode by setting the OMP_NUM_THREADS environment variable. This does not appear to be sufficient to increase the number of threads it uses in my Windows setup so I need to get to the bottom of why OMP_NUM_THREADS is not being respected.

set OMP_NUM_THREADS=4
rem setx /M OMP_NUM_THREADS 4
ElmerSolver elmer_flow_gcr.sif
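One thing worth ruling out is whether the variable reaches the child process at all. The sketch below launches a stand-in child (a Python one-liner, not ElmerSolver) with OMP_NUM_THREADS set and prints what the child sees. The helper env_with_threads is hypothetical.

```python
import os
import subprocess
import sys


def env_with_threads(num_threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


# Stand-in child process: a Python one-liner that reports what it sees.
# Replacing it with ["ElmerSolver", "elmer_flow_gcr.sif"] would run the solver instead.
probe = [sys.executable, "-c", "import os; print(os.environ.get('OMP_NUM_THREADS'))"]
result = subprocess.run(
    probe, env=env_with_threads(4), capture_output=True, text=True, check=True
)
print(result.stdout.strip())
```

If the child reports the expected value and the solver still runs with one thread per task, the cause lies somewhere other than the environment.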

Next, the demo uses ElmerGrid to partition the input, this time using the arguments 2 2 instead of 14 2 since we already have an existing serial Elmer mesh. It was interesting learning about partitioning algorithms like METIS – KarypisLab/METIS: METIS – Serial Graph Partitioning and Fill-reducing Matrix Ordering (github.com).

ElmerGrid 2 2 elmer_flow -partdual -metiskway 4

rem Notice the partitioning.4 directory in elmer_flow
dir elmer_flow

The demo uses mpirun but the Windows equivalent is “C:\Program Files\Microsoft MPI\Bin\mpiexec.exe”.

"C:\Program Files\Microsoft MPI\Bin\mpiexec.exe" -np 4 "C:\Program Files\Elmer 9.0-Release\bin\ElmerSolver.exe" elmer_flow_gcr.sif

Using 4 processes on my box takes about 150s (148.65s in the log below), and increasing the number of processes to 6 drops the time to 121s.

MAIN: Version: 9.0 (Rev: Release, Compiled: 2020-11-10)
MAIN:  Running in parallel using 4 tasks.
MAIN:  Running with just one thread per task.
...
ResultOutputSolver: Saving in unstructured VTK XML (.vtu) format
ResultOutputSolver: -------------------------------------
ElmerSolver: *** Elmer Solver: ALL DONE ***
ElmerSolver: The end
SOLVER TOTAL TIME(CPU,REAL):       148.65      148.65
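These timings can be turned into speedup and parallel efficiency figures. The sketch below uses the usual definitions (serial time divided by parallel time, and that ratio divided by the process count) with the numbers from my runs. They work out to a speedup of roughly 2.5x on 4 processes and 3x on 6.

```python
def speedup(serial_seconds, parallel_seconds):
    """Ratio of serial to parallel wall-clock time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, processes):
    """Speedup divided by the number of processes (1.0 would be ideal scaling)."""
    return speedup(serial_seconds, parallel_seconds) / processes


serial = 369.31  # serial run, REAL column
runs = {4: 148.65, 6: 121.0}  # processes -> wall-clock seconds

for processes, seconds in runs.items():
    s = speedup(serial, seconds)
    e = efficiency(serial, seconds, processes)
    print(f"{processes} processes: speedup {s:.2f}x, efficiency {e:.0%}")
```

The falling efficiency from 4 to 6 processes shows that the extra processes help, but with diminishing returns.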

Visualization

The demo does not explicitly discuss how to visualize the results. However, this was covered in the Parallel Computing with Elmer talk.

Parallel Computing with Elmer

ParaView needs to be installed. Looks like they need to fix their installation dialog to not cut off text – or better yet, replace that long string with a user-friendly one. To visualize the ElmerSolver results, launch ParaView then open the parallel vtu file (*.pvtu).

Opening the Parallel VTU File

When the file opens, click on the green Apply button in the Properties panel. Notice that the pressure variable is preselected in the dropdown on the toolbar.

Visualizing Pressure in ParaView

Switching to the velocity variable shows an all-blue rendering due to the 0 velocity. To see the animated visualization:

  1. Click on the slice button (4th from left in screenshot).
  2. Click on “Z Normal” in the Plane Parameters section on the Properties pane.
  3. Click on Apply.
  4. Click on the play button on the toolbar to see the animation of the flow.

The last part of the demo shows how to view the partitions in the mesh:

  1. Deselect the eye on Slice1 in the Pipeline Browser.
  2. Select the eye on “flow_gcr_t00” in the Pipeline Browser.
  3. Go to the Filters menu > Alphabetical > Connectivity.
  4. Click on the Apply button.

Change the selection in the combobox in the representation toolbar from Surface to Surface with Edges to see how the partition seems to be based on the number of elements involved.

View of Surface with Edges

ParaView’s source code is available on GitLab so naturally, I’ll try to build it one of these fine days.


Categories: Crystal Growth, Elmer

Crystal Growth Simulation Failure

After completing the mesh generation in Crystal Growth Simulation – Part 1, I decided to investigate how to run the actual crystal growth simulation using Elmer. On Windows, Elmer needs to either be installed into “Program Files” or be present in the path to avoid this error:

$ python run.py
crucible
melt
crystal
inductor
...
using material steel-1.4541 from self.materials_dict
using material vacuum from self.materials_dict
Wrote sif-file.
Starting simulation  ./simdata/2022-07-25_13-17_ss_test-cz-induction_vacuum  ...
Traceback (most recent call last):
  File "D:\dev\repos\fem\research\test-cz-induction\run.py", line 32, in <module>
    sim.execute()
  File "d:\dev\repos\fem\research\opencgs\opencgs\sim.py", line 182, in execute
    run_elmer_grid(self.sim_dir, "case.msh")
  File "C:\Python310\lib\site-packages\pyelmer\execute.py", line 28, in run_elmer_grid
    subprocess.run(args, cwd=sim_dir, stdout=f, stderr=f)
  File "C:\Python310\lib\subprocess.py", line 501, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Python310\lib\subprocess.py", line 966, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Python310\lib\subprocess.py", line 1435, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
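The traceback ends in CreateProcess failing to find an executable, which is easier to diagnose with a check up front. The sketch below only covers the PATH case, using shutil.which from the standard library. The executable names are taken from the commands in this post, and missing_executables is a hypothetical helper, not something pyelmer provides.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be resolved on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Executable names assumed from the commands used in this post.
required = ["ElmerGrid", "ElmerSolver"]
missing = missing_executables(required)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All Elmer executables found.")
```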

I tried installing a locally built Elmer – see How to Build Elmer on Windows. Unfortunately, the layout created by setup defaulted to “C:\Program Files\Elmer 9.0-1b8e4f7ec” (ending with the hash of the commit I built, instead of “Release”). Worse still, it didn’t launch because of missing DLLs. I need to take a closer look at how NSIS is creating the setup executable, and also figure out how to set up the publisher so that a certificate is displayed with the UAC prompt. I ended up installing the publicly downloadable Elmer then retrying the python script.

using material steel-1.4541 from self.materials_dict
using material vacuum from self.materials_dict
Wrote sif-file.
Starting simulation  ./simdata/2022-07-25_17-23_ss_test-cz-induction_vacuum_1  ...
['ERROR:: systemc: Command exit status was 1'] [] {}
Finished simulation  ./simdata/2022-07-25_17-23_ss_test-cz-induction_vacuum_1  .
Post processing...
Traceback (most recent call last):
  File "D:\dev\repos\fem\research\test-cz-induction\run.py", line 32, in <module>
    sim.execute()
  File "d:\dev\repos\fem\research\opencgs\opencgs\sim.py", line 189, in execute
    self._postprocessing_probes()
  File "d:\dev\repos\fem\research\opencgs\opencgs\sim.py", line 139, in _postprocessing_probes
    names_data = self._read_names_file(self.sim_dir + "/results/probes.dat.names")
  File "d:\dev\repos\fem\research\opencgs\opencgs\sim.py", line 129, in _read_names_file
    with open(names_file) as f:
FileNotFoundError: [Errno 2] No such file or directory: './simdata/2022-07-25_17-23_ss_test-cz-induction_vacuum_1/02_simulation/results/probes.dat.names'

A dialog popped up asking whether I wanted to grant some Elmer process access to the domain or private network, and I’m not sure if I took too long to answer it, but the script continued with the above failure. This looks like a bug in opencgs: why continue post-processing when the simulation has failed? I filed this issue to let the author know: Do not start post-processing when simulation fails · Issue #2 · nemocrys/opencgs (github.com).
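The kind of guard I have in mind could look like the sketch below: refuse to post-process when the solver reported a non-zero exit status or when an expected result file is missing. This is a hypothetical helper of my own, not opencgs code, and the default file name is simply the one from the traceback above.

```python
from pathlib import Path


def ready_for_postprocessing(exit_code, results_dir, required_files=("probes.dat.names",)):
    """Return (ok, reason): ok is False if the solver failed or expected outputs are missing."""
    if exit_code != 0:
        return False, f"solver exited with status {exit_code}"
    missing = [name for name in required_files if not (Path(results_dir) / name).is_file()]
    if missing:
        return False, "missing result files: " + ", ".join(missing)
    return True, "ok"


ok, reason = ready_for_postprocessing(1, "results")
print(ok, reason)
```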

I had no idea why the simulation was failing, so I used Process Explorer to see which command lines were being used to invoke ElmerGrid and ElmerSolver. I was then able to invoke ElmerSolver manually and noticed a segmentation fault!

ELMER SOLVER (v 9.0) STARTED AT: 2022/07/25 17:23:13
ParCommInit:  Initialize #PEs:            1
MAIN: 
MAIN: =============================================================
MAIN: ElmerSolver finite element software, Welcome!
...
MAIN: Reading Model: case.sif
 Caught LUA error:[string "loadfile("C:/Program Files (x86)/Elmer/shar..."]:1: attempt to call a nil value
 Caught LUA error:[string "loadstring(readsif("case.sif"))()"]:1: attempt to call global 'readsif' (a nil value)
LoadInputFile: Scanning input file: case.sif
LoadInputFile: Scanning only size info
...
OptimizeBandwidth: Half bandwidth after optimization: 1447
OptimizeBandwidth: ---------------------------------------------------------
'ViewFactors' is not recognized as an internal or external command,
operable program or batch file.
ERROR:: systemc: Command exit status was 1
RadiationFactors: Message
RadiationFactors: All done time (s)                      4.9000E-02
...
      60 0.2041E-08
      61 0.3710E-09
ComputeChange: NS (ITER=11) (NRM,RELC): ( 0.17032920E-04 0.41334346E-05 ) :: statmagsolver
StatMagSolve: Convergence after 11 iterations
StatMagSolve: Joule Heating (W):      4.1375E+01
StatMagSolve: All done, exiting
StatMagSolve: ------------------------------------------------
ComputeChange: SS (ITER=1) (NRM,RELC): ( 0.17032920E-04  2.0000000     ) :: statmagsolver
HeatSolve:  Found Control Point at distance:   0.0000000000000000
HeatSolve:  Control Point Index:           0
HeatSolve: Using Steady-state Heater Control
HeatSolve: 
HeatSolve: 
HeatSolve: -------------------------------------
HeatSolve:  TEMPERATURE ITERATION           1
HeatSolve: -------------------------------------
HeatSolve: 
HeatSolve: Starting Assembly...

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0xffffffff
#1  0xffffffff
#2  0xffffffff
...
#19  0xffffffff
#20  0xffffffff

Looking at the simulation folders such as “simdata\2022-07-25_22-52_ss_test-cz-induction_vacuum_1\02_simulation” revealed that there are ElmerGrid and ElmerSolver log files present with this info, so I didn’t even need Process Explorer! At this point, I have to save the SIGSEGV investigation for another day.
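Since the log files are right there in the simulation folder, scanning them for known failure markers is a quick first step. The sketch below uses three markers that appear in the output above; the marker list and the find_failures name are my own choices, and the list is not exhaustive.

```python
FAILURE_MARKERS = ("ERROR::", "SIGSEGV", "is not recognized as an internal or external command")


def find_failures(log_text):
    """Return (line_number, line) pairs for lines containing a known failure marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in FAILURE_MARKERS):
            hits.append((number, line.strip()))
    return hits


sample = """\
OptimizeBandwidth: Half bandwidth after optimization: 1447
'ViewFactors' is not recognized as an internal or external command,
ERROR:: systemc: Command exit status was 1
HeatSolve: Starting Assembly...
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
"""

for number, line in find_failures(sample):
    print(f"{number}: {line}")
```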


Crystal Growth Simulation – Part 1

Ever since I read chapter 2 of Fabrication Engineering at the Micro- and Nanoscale, I have been interested in simulating crystal growth, primarily because of the multidisciplinary nature of this problem: finite element modeling software, modeling using high performance computing, the physics of the problem (governing laws and equations), and visualization techniques. I had run across Elmer when looking up crystal growth examples a while back. About two weeks ago, I was watching this Elmer demo in which they used gmsh to show the geometry for the simulation.

I wanted to learn more about Gmsh and how it is used, so I took a detour into this Gmsh introductory video. It was an excellent tour of Gmsh and its features.

Gmsh for Crystal Growth

At this point, I wanted to see how Gmsh could be used for crystal growth simulation so I searched for “crystal growth gmsh”. I was very pleased to find this paper: Development and validation of a thermal simulation for the Czochralski crystal growth process using model experiments – ScienceDirect. It appeared to have a corresponding set of slides and a talk as well. It especially piqued my interest that there is a script to set up a mesh for a Czochralski furnace.

Running the Code

Thankfully, the code in this paper is open source. I already had python installed so the first thing I tried was executing run.py in Git Bash:

$ git clone https://github.com/nemocrys/test-cz-induction
$ cd test-cz-induction
$ python run.py
Traceback (most recent call last):
  File "D:\dev\repos\fem\research\test-cz-induction\run.py", line 5, in <module>
    import opencgs.control as ctrl
ModuleNotFoundError: No module named 'opencgs'

Without using my brain, I decided to install pyelmer since I knew it was a dependency (from the talk).

$ pip install pyelmer
Collecting pyelmer
  Downloading pyelmer-1.0.0-py3-none-any.whl (27 kB)
Collecting gmsh
  Downloading gmsh-4.10.5-py2.py3-none-win_amd64.whl (38.4 MB)
     |████████████████████████████████| 38.4 MB 285 kB/s
Collecting pyyaml
  Downloading PyYAML-6.0-cp310-cp310-win_amd64.whl (151 kB)
     |████████████████████████████████| 151 kB ...
Collecting matplotlib
  Downloading matplotlib-3.5.2-cp310-cp310-win_amd64.whl (7.2 MB)
     |████████████████████████████████| 7.2 MB 6.4 MB/s
Collecting pillow>=6.2.0
  Downloading Pillow-9.2.0-cp310-cp310-win_amd64.whl (3.3 MB)
     |████████████████████████████████| 3.3 MB ...
Collecting cycler>=0.10
  Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
Collecting numpy>=1.17
  Downloading numpy-1.23.1-cp310-cp310-win_amd64.whl (14.6 MB)
     |████████████████████████████████| 14.6 MB 6.4 MB/s
Collecting packaging>=20.0
  Downloading packaging-21.3-py3-none-any.whl (40 kB)
     |████████████████████████████████| 40 kB 2.5 MB/s
Collecting python-dateutil>=2.7
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     |████████████████████████████████| 247 kB ...
Collecting kiwisolver>=1.0.1
  Downloading kiwisolver-1.4.4-cp310-cp310-win_amd64.whl (55 kB)
     |████████████████████████████████| 55 kB 1.6 MB/s
Collecting pyparsing>=2.2.1
  Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)
     |████████████████████████████████| 98 kB 6.8 MB/s
Collecting fonttools>=4.22.0
  Downloading fonttools-4.34.4-py3-none-any.whl (944 kB)
     |████████████████████████████████| 944 kB 6.4 MB/s
Collecting six>=1.5
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six, pyparsing, python-dateutil, pillow, packaging, numpy, kiwisolver, fonttools, cycler, pyyaml, matplotlib, gmsh, pyelmer
  WARNING: Failed to write executable - trying to use .deleteme logic
ERROR: Could not install packages due to an OSError: [WinError 2] The system cannot find the file specified: 'C:\\Python310\\Scripts\\f2py.exe' -> 'C:\\Python310\\Scripts\\f2py.exe.deleteme'

WARNING: You are using pip version 21.2.4; however, version 22.2 is available.
You should consider upgrading via the 'C:\Python310\python.exe -m pip install --upgrade pip' command.

Looks like I need to run this as an administrator.

C:\dev>pip install pyelmer
Collecting pyelmer
  Using cached pyelmer-1.0.0-py3-none-any.whl (27 kB)
Collecting pyyaml
  Using cached PyYAML-6.0-cp310-cp310-win_amd64.whl (151 kB)
Collecting gmsh
  Using cached gmsh-4.10.5-py2.py3-none-win_amd64.whl (38.4 MB)
Collecting matplotlib
  Using cached matplotlib-3.5.2-cp310-cp310-win_amd64.whl (7.2 MB)
Requirement already satisfied: python-dateutil>=2.7 in c:\python310\lib\site-packages (from matplotlib->pyelmer) (2.8.2)
Requirement already satisfied: packaging>=20.0 in c:\python310\lib\site-packages (from matplotlib->pyelmer) (21.3)
Requirement already satisfied: fonttools>=4.22.0 in c:\python310\lib\site-packages (from matplotlib->pyelmer) (4.34.4)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\python310\lib\site-packages (from matplotlib->pyelmer) (1.4.4)
Requirement already satisfied: pillow>=6.2.0 in c:\python310\lib\site-packages (from matplotlib->pyelmer) (9.2.0)
Requirement already satisfied: pyparsing>=2.2.1 in c:\python310\lib\site-packages (from matplotlib->pyelmer) (3.0.9)
Requirement already satisfied: numpy>=1.17 in c:\python310\lib\site-packages (from matplotlib->pyelmer) (1.23.1)
Collecting cycler>=0.10
  Using cached cycler-0.11.0-py3-none-any.whl (6.4 kB)
Requirement already satisfied: six>=1.5 in c:\python310\lib\site-packages (from python-dateutil>=2.7->matplotlib->pyelmer) (1.16.0)
Installing collected packages: cycler, pyyaml, matplotlib, gmsh, pyelmer
Successfully installed cycler-0.11.0 gmsh-4.10.5 matplotlib-3.5.2 pyelmer-1.0.0 pyyaml-6.0
WARNING: You are using pip version 21.2.4; however, version 22.2 is available.
You should consider upgrading via the 'C:\Python310\python.exe -m pip install --upgrade pip' command.

Installing pyelmer is not sufficient to avoid the “No module named ‘opencgs’” error. The solution is to clone the opencgs repo and use pip to install (as administrator) from that repo.

git clone https://github.com/nemocrys/opencgs
cd opencgs
D:\...\research\opencgs>pip install -e .
Obtaining file:///D:/dev/repos/fem/research/opencgs
Collecting meshio
  Using cached meshio-5.3.4-py3-none-any.whl (167 kB)
Collecting pandas
  Using cached pandas-1.4.3-cp310-cp310-win_amd64.whl (10.5 MB)
Requirement already satisfied: pyyaml in c:\python310\lib\site-packages (from opencgs==0.3.1) (6.0)
Requirement already satisfied: pyelmer in c:\python310\lib\site-packages (from opencgs==0.3.1) (1.0.0)
Collecting rich
  Using cached rich-12.5.1-py3-none-any.whl (235 kB)
Requirement already satisfied: numpy in c:\python310\lib\site-packages (from meshio->opencgs==0.3.1) (1.23.1)
Requirement already satisfied: python-dateutil>=2.8.1 in c:\python310\lib\site-packages (from pandas->opencgs==0.3.1) (2.8.2)
Collecting pytz>=2020.1
  Using cached pytz-2022.1-py2.py3-none-any.whl (503 kB)
Requirement already satisfied: six>=1.5 in c:\python310\lib\site-packages (from python-dateutil>=2.8.1->pandas->opencgs==0.3.1) (1.16.0)
Requirement already satisfied: matplotlib in c:\python310\lib\site-packages (from pyelmer->opencgs==0.3.1) (3.5.2)
Requirement already satisfied: gmsh in c:\python310\lib\site-packages (from pyelmer->opencgs==0.3.1) (4.10.5)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\python310\lib\site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (1.4.4)
Requirement already satisfied: pyparsing>=2.2.1 in c:\python310\lib\site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (3.0.9)
Requirement already satisfied: cycler>=0.10 in c:\python310\lib\site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (0.11.0)
Requirement already satisfied: pillow>=6.2.0 in c:\python310\lib\site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (9.2.0)
Requirement already satisfied: fonttools>=4.22.0 in c:\python310\lib\site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (4.34.4)
Requirement already satisfied: packaging>=20.0 in c:\python310\lib\site-packages (from matplotlib->pyelmer->opencgs==0.3.1) (21.3)
Collecting commonmark<0.10.0,>=0.9.0
  Using cached commonmark-0.9.1-py2.py3-none-any.whl (51 kB)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in c:\python310\lib\site-packages (from rich->meshio->opencgs==0.3.1) (2.12.0)
Installing collected packages: commonmark, rich, pytz, pandas, meshio, opencgs
  Running setup.py develop for opencgs
Successfully installed commonmark-0.9.1 meshio-5.3.4 opencgs-0.3.1 pandas-1.4.3 pytz-2022.1 rich-12.5.1
WARNING: You are using pip version 21.2.4; however, version 22.2 is available.
You should consider upgrading via the 'C:\Python310\python.exe -m pip install --upgrade pip' command.

Trying to run gives an error about a missing module named objectgmsh. My assumption was that installing pyelmer would bring in all such modules.

D:\dev\repos\fem\research\test-cz-induction>python run.py
Traceback (most recent call last):
  File "D:\dev\repos\fem\research\test-cz-induction\run.py", line 5, in <module>
    import opencgs.control as ctrl
  File "d:\dev\repos\fem\research\opencgs\opencgs\__init__.py", line 3, in <module>
    import opencgs.geo
  File "d:\dev\repos\fem\research\opencgs\opencgs\geo.py", line 1, in <module>
    from objectgmsh import *
ModuleNotFoundError: No module named 'objectgmsh'

I notice while reviewing the paper yet again that it mentions the setup being in a Dockerfile. Sure enough, objectgmsh is a separate package that needs to be installed using pip. I upgrade pip for good measure since those warnings are getting annoying.

\test-cz-induction>pip install objectgmsh
Collecting objectgmsh
  Downloading objectgmsh-0.9-py3-none-any.whl (25 kB)
Requirement already satisfied: gmsh in c:\python310\lib\site-packages (from objectgmsh) (4.10.5)
Requirement already satisfied: numpy in c:\python310\lib\site-packages (from objectgmsh) (1.23.1)
Installing collected packages: objectgmsh
Successfully installed objectgmsh-0.9
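A quick way to catch this kind of gap before launching the script is to ask Python which top-level modules it can actually find. This is only a minimal sketch: the helper name missing_modules and the list of module names are my own illustration (they are the packages this post ran into, not an official dependency list for opencgs).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Illustrative list only: the packages mentioned in this post.
    wanted = ["opencgs", "pyelmer", "objectgmsh", "gmsh"]
    for name in missing_modules(wanted):
        print(f"missing: {name} (try: pip install {name})")
```

find_spec only locates the module without importing it, so the check is cheap and has no side effects for top-level names.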

I realize I should probably be running setup.py instead of run.py. Here’s the output now!

\test-cz-induction>python setup.py
crucible
melt
crystal
inductor
seed
insulation
crucible_adapter
axis_bt
vessel
axis_top
crucible 0.001999999999999999
melt 0.003
crystal 0.0004
inductor 0.001
seed 0.00017999999999999998
insulation 0.005
crucible_adapter 0.0039
axis_bt 0.0025
vessel 0.0025
axis_top 0.0025
filling 0.0263
Warning: Mesh size = 0 for bnd_melt. Ignoring this shape...
Warning: Mesh size = 0 for bnd_seed. Ignoring this shape...
Warning: Mesh size = 0 for bnd_crystal_side. Ignoring this shape...
Warning: Mesh size = 0 for bnd_crystal_top. Ignoring this shape...
Warning: Mesh size = 0 for bnd_axtop_side. Ignoring this shape...
Warning: Mesh size = 0 for bnd_axtop_bt. Ignoring this shape...
Warning: Mesh size = 0 for bnd_crucible_bt. Ignoring this shape...
Warning: Mesh size = 0 for bnd_crucible_outside. Ignoring this shape...
Warning: Mesh size = 0 for bnd_ins. Ignoring this shape...
Warning: Mesh size = 0 for bnd_adp. Ignoring this shape...
Warning: Mesh size = 0 for bnd_axbt. Ignoring this shape...
Warning: Mesh size = 0 for bnd_vessel_inside. Ignoring this shape...
Warning: Mesh size = 0 for bnd_vessel_outside. Ignoring this shape...
Warning: Mesh size = 0 for bnd_inductor_outside. Ignoring this shape...
Warning: Mesh size = 0 for bnd_inductor_inside. Ignoring this shape...
Warning: Mesh size = 0 for bnd_symmetry_axis. Ignoring this shape...
Warning: Mesh size = 0 for if_crucible_melt. Ignoring this shape...
Warning: Mesh size = 0 for if_melt_crystal. Ignoring this shape...
Warning: Mesh size = 0 for if_crystal_seed. Ignoring this shape...
Warning: Mesh size = 0 for if_seed_axtop. Ignoring this shape...
Warning: Mesh size = 0 for if_axtop_vessel. Ignoring this shape...
Warning: Mesh size = 0 for if_crucible_ins. Ignoring this shape...
Warning: Mesh size = 0 for if_ins_adp. Ignoring this shape...
Warning: Mesh size = 0 for if_adp_axbt. Ignoring this shape...
Warning: Mesh size = 0 for if_axbt_vessel. Ignoring this shape...
Info    : Meshing 1D...
Info    : [  0%] Meshing curve 3 (Line)
Info    : [ 10%] Meshing curve 4 (Line)
Info    : [ 10%] Meshing curve 15 (Line)
Info    : [ 10%] Meshing curve 16 (Line)
Info    : [ 10%] Meshing curve 17 (Line)
Info    : [ 10%] Meshing curve 18 (BSpline)
Info    : [ 20%] Meshing curve 19 (Circle)
...
Info    : [100%] Meshing surface 12 order 2
Info    : [100%] Meshing surface 13 order 2
Info    : [100%] Meshing surface 14 order 2
Info    : Surface mesh: worst distortion = 0.526353 (0 elements in ]0, 0.2]); worst gamma = 0.252838
Info    : Done meshing order 2 (Wall 0.990998s, CPU 0.75s)
-------------------------------------------------------
Version       : 4.10.5
License       : GNU General Public License
Build OS      : Windows64-sdk
Build date    : 20220701
Build host    : gmsh.info
Build options : 64Bit ALGLIB[contrib] ANN[contrib] Bamg Blas[petsc] Blossom Cgns DIntegration DomHex Eigen[contrib] Fltk Gmm[contrib] Hxt Jpeg Kbipack Lapack[petsc] MathEx[contrib] Med Mesh Metis[contrib] Mmg Mpeg Netgen NoSocklenT ONELAB ONELABMetamodel OpenCASCADE OpenCASCADE-CAF OpenGL OpenMP OptHom PETSc Parser Plugins Png Post QuadMeshingTools QuadTri Solver TetGen/BR Voro++[contrib] WinslowUntangler Zlib
FLTK version  : 1.4.0
PETSc version : 3.15.0 (real arithmtic)
OCC version   : 7.6.1
MED version   : 4.1.0
Packaged by   : nt authority system
Web site      : https://gmsh.info
Issue tracker : https://gitlab.onelab.info/gmsh/gmsh/issues
-------------------------------------------------------

Gmsh launches (although not as a separate process) with the generated mesh:

Crystal Growth Gmsh Model

setup.py outputs the path to the mesh after closing Gmsh.

Info    : Writing './simdata/_test/case.msh'...
Info    : Done writing './simdata/_test/case.msh'
Gmsh model created with objectgmsh.
Shapes: ['crucible', 'melt', 'crystal', 'inductor', 'seed', 'insulation', 'crucible_adapter', 'axis_bt', 'vessel', 'axis_top', 'filling', 'bnd_melt', 'bnd_seed', 'bnd_crystal_side', 'bnd_crystal_top', 'bnd_axtop_side', 'bnd_axtop_bt', 'bnd_crucible_bt', 'bnd_crucible_outside', 'bnd_ins', 'bnd_adp', 'bnd_axbt', 'bnd_vessel_inside', 'bnd_vessel_outside', 'bnd_inductor_outside', 'bnd_inductor_inside', 'bnd_symmetry_axis', 'if_crucible_melt', 'if_melt_crystal', 'if_crystal_seed', 'if_seed_axtop', 'if_axtop_vessel', 'if_crucible_ins', 'if_ins_adp', 'if_adp_axbt', 'if_axbt_vessel']

The next step will be to figure out how to simulate crystal growth using pyelmer.


Categories: CUDA, Graphics

Testing nVidia Cuda Samples

I have been toying around with the idea of doing a fluid dynamics or crystal growth simulation using nVidia CUDA. I decided to try out nVidia’s CUDA samples to see what their approach looks like, in particular when rendering using OpenGL. I am using Visual Studio 2022, so I simply cloned the CUDA samples repo, opened the fluidsGL_vs2022.sln solution, right-clicked the fluidsGL project, then selected Build.

Build started...
1>------ Build started: Project: fluidsGL, Configuration: Debug x64 ------
1>D:\dev\...\cuda-samples\Samples\5_Domain_Specific\fluidsGL\fluidsGL_vs2022.vcxproj(37,5): error MSB4019: The imported project "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 11.6.props" was not found. Confirm that the expression in the Import declaration "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\\BuildCustomizations\CUDA 11.6.props" is correct, and that the file exists on disk.
1>Done building project "fluidsGL_vs2022.vcxproj" -- FAILED.
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

The prerequisites section does mention that the CUDA Toolkit 11.6 is required, so I close VS and install it. I end up with version 11.7 though.

When reopening the fluidsGL solution, I still get the same error about CUDA 11.6.props not being found. A quick look at the directory this file is expected to be in reveals that this is a simple version mismatch problem – see the hard-coded version in the fluidsGL_vs2022.vcxproj file. Instead of fixing every example .vcxproj file to match CUDA 11.7, we can patch the VS folder by running these commands from an admin command prompt:

cd "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\BuildCustomizations"

copy "CUDA 11.7.props" "CUDA 11.6.props"
copy "CUDA 11.7.targets" "CUDA 11.6.targets"
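The same patch can be scripted, which helps if more than one Visual Studio installation needs it. The following is a rough sketch rather than a supported procedure: the function name alias_cuda_build_files is hypothetical, it copies only the two file types used in the commands above, and it still has to run from an elevated prompt because the folder lives under Program Files.

```python
import shutil
from pathlib import Path


def alias_cuda_build_files(folder, installed, expected,
                           suffixes=(".props", ".targets")):
    """Copy 'CUDA <installed><suffix>' to 'CUDA <expected><suffix>' in folder.

    Returns the list of files created. Existing destination files are left
    untouched so the script is safe to run more than once.
    """
    created = []
    for suffix in suffixes:
        src = Path(folder) / f"CUDA {installed}{suffix}"
        dst = Path(folder) / f"CUDA {expected}{suffix}"
        if src.is_file() and not dst.exists():
            shutil.copyfile(src, dst)
            created.append(dst)
    return created


# Example call (path taken from the commands above):
# alias_cuda_build_files(
#     r"C:\Program Files\Microsoft Visual Studio\2022\Enterprise"
#     r"\MSBuild\Microsoft\VC\v170\BuildCustomizations",
#     installed="11.7", expected="11.6")
```

Keep in mind that this only makes the 11.6 file names resolve; the samples are still built with the 11.7 toolkit.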

The code now builds in Visual Studio and I can now oooh, aaaah over the demo. Visual Studio does seem a bit sluggish at opening the entire samples solution though… I get this information about my device in the console window after the demo launches:

GPU Device 0: "Pascal" with compute capability 6.1

CUDA device [Quadro P1000] has 5 Multi-Processors

After clicking and dragging the mouse around

Categories: Compilers, Fortran, LLVM

Failing to Build Flang with Visual C++

Background

Elmer is the first codebase that I have dug into that has a substantial (or really any) amount of Fortran code. I used GFortran to build it but went digging around for a Clang-based compiler. I found llvm-project/flang and, since I had been building LLVM earlier this year, I figured it should be straightforward to build flang and perhaps explore it in a debugger.

My first attempt to build flang (on Windows, my primary OS) resulted in many build errors. Unfortunately, I was using a preview Visual Studio build, so I didn’t want to compare the errors with those from a different machine because it wasn’t the same compiler version in use. I decided to use an RTM Visual Studio compiler (VS 17.2.5) to avoid possible compiler bugs present only in VS preview builds since most people would not be using preview VS builds anyway.

Without giving it much thought, my suspicion was that any build failures probably arose from not using the correct C++ version. The source code I was trying to build (commit c0702ac0) states that it uses C++17. I set this in CMake by defining the CXX_STANDARD property. Here is the full cmake command line I used to set up the build.

cd llvm-project
mkdir build
cd build

cmake \
  -G Ninja \
  ../llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DFLANG_ENABLE_WERROR=On \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_TARGETS_TO_BUILD=host \
  -DCMAKE_INSTALL_PREFIX=../install \
  -DLLVM_LIT_ARGS=-v \
  -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" \
  -DLLVM_ENABLE_RUNTIMES="compiler-rt" \
  -DCXX_STANDARD=17

# Shown here without \ to be executable in cmd.exe
cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DFLANG_ENABLE_WERROR=On -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD=host -DCMAKE_INSTALL_PREFIX=../install -DLLVM_LIT_ARGS=-v -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" -DLLVM_ENABLE_RUNTIMES="compiler-rt" -DCXX_STANDARD=17

That took about 2 minutes on my machine, after which I ran ninja to start the build:

ninja
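The configure command above is long enough that a dropped line continuation or a stray quote is easy to miss. One option is to keep the options in a small script and generate the single-line form for cmd.exe. This is just a sketch: cmake_command and OPTIONS are names I made up, and the values are copied from the command above (including CXX_STANDARD exactly as I passed it).

```python
def cmake_command(source_dir, generator, options):
    """Build a single-line cmake command from a dict of -D options."""
    parts = ["cmake", "-G", generator, source_dir]
    for name, value in options.items():
        # Quote values that contain ';' or spaces (e.g. project lists).
        text = f'"{value}"' if (";" in value or " " in value) else value
        parts.append(f"-D{name}={text}")
    return " ".join(parts)


# Values copied from the command line used in this post.
OPTIONS = {
    "CMAKE_BUILD_TYPE": "Release",
    "FLANG_ENABLE_WERROR": "On",
    "LLVM_ENABLE_ASSERTIONS": "ON",
    "LLVM_TARGETS_TO_BUILD": "host",
    "CMAKE_INSTALL_PREFIX": "../install",
    "LLVM_LIT_ARGS": "-v",
    "LLVM_ENABLE_PROJECTS": "clang;mlir;flang",
    "LLVM_ENABLE_RUNTIMES": "compiler-rt",
    "CXX_STANDARD": "17",
}

if __name__ == "__main__":
    print(cmake_command("../llvm", "Ninja", OPTIONS))
```

Adding or removing an option is then a one-line change to the dictionary, and the printed command can be pasted straight into the prompt.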

Unfortunately, the build failed! The first error I encountered was in fold-real.cpp. Here is the command line used to invoke the compiler (shown with newlines to simplify interpretation, see Compiler options listed alphabetically | Microsoft Docs for the complete list of compiler options).

 C:\PROGRA~1\MIB055~1\2022\ENTERP~1\VC\Tools\MSVC\1432~1.313\bin\Hostx64\x64\cl.exe
 /nologo
 /TP
 -DFLANG_LITTLE_ENDIAN=1
 -DGTEST_HAS_RTTI=0 -DUNICODE
 -D_CRT_NONSTDC_NO_DEPRECATE
 ...
 -D__STDC_LIMIT_MACROS
 -ID:\dev\repos\llvm-project\build-cpp17\tools\flang\lib\Evaluate
 ...
 -ID:\dev\repos\llvm-project\llvm\include
 -external:I D:\dev\repos\llvm-project\llvm\..\mlir\include
 ...
 -external:I D:\dev\repos\llvm-project\llvm\..\clang\include
 -external:W0
 /DWIN32
 /D_WINDOWS
 /Zc:inline
 /Zc:__cplusplus
 /Oi
 /bigobj
 /permissive-
 /W4
 -wd4141
 ...
 -wd4324
 -w14062
 -we4238
 /Gw
 /WX
 /MD
 /O2
 /Ob2
 /EHs-c-
 /GR-
 -UNDEBUG
 -std:c++17
 /showIncludes
 /Fotools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\fold-real.cpp.obj
 /Fdtools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\
 /FS
 -c
 D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-real.cpp
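Breaking a compiler command line into one option per line, as done above, does not have to be done by hand. Here is a minimal sketch using Python's shlex module; split_command is a hypothetical helper name, and the sample is a shortened version of the command above. Note that a two-part option such as -external:I followed by a path ends up as two tokens.

```python
import shlex


def split_command(command_line):
    """Split a Windows-style command line into tokens, keeping backslashes."""
    # posix=False stops shlex from treating '\' as an escape character,
    # which would otherwise mangle Windows paths. Quoted arguments are kept
    # as single tokens, with their quotes left in place.
    return shlex.split(command_line, posix=False)


if __name__ == "__main__":
    # Shortened version of the cl.exe invocation shown above.
    sample = (r"cl.exe /nologo /TP -DUNICODE /W4 -wd4141 /WX -std:c++17 "
              r"-c D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-real.cpp")
    for token in split_command(sample):
        print(token)
```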

I had to look up the meaning of the C++ syntax on the failing line to understand why the build could be failing. It turns out to be a lambda, as explained at c++ – What is a lambda expression in C++11? – Stack Overflow.

I tried manually creating a repro for this compiler issue by creating a new Visual C++ project in Visual Studio and recreating the structure of the code failing to build. One of the questions I had was how to set conformance mode in a Visual Studio CMake project. I still haven’t figured this out. However, one of the issues I ran into was that my CMake project was building the code without the /permissive- flag! I ended up switching to a regular Visual C++ project (.vcxproj) since I knew how to change the compiler options reliably for such projects. After struggling with recreating the code, I realized that I would make more progress by removing code from flang’s fold-real.cpp instead. Here are some of the other searches and concepts I had to look up to understand the code while trying to create a minimal repro of the build failure.

  1. c++11 – how to remove error : X is not a class template – Stack Overflow
  2. c++ – What does template <unsigned int N> mean? – Stack Overflow
  3. what does this … (three dots) means in c++ – Stack Overflow
  4. create a reference in c++ -> Reference of Reference in C++ – Stack Overflow
  5. c++ using namespace -> Namespaces (C++) | Microsoft Docs
  6. using typedef -> Aliases and typedefs (C++) | Microsoft Docs
  7. move mechanics -> C++ Move Semantics Introduction | hacking C++ (hackingcpp.com)
  8. c++ – What is std::decay and when it should be used? – Stack Overflow

I was eventually able to create a simpler test case showing that the flang code could not build with my RTM compiler.

cl /std:c++17 /permissive- flang-msvc-clang-test.cpp

Microsoft (R) C/C++ Optimizing Compiler Version 19.32.31332 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

flang-msvc-clang-test.cpp
flang-msvc-clang-test.cpp(159): error C2065: 'T': undeclared identifier
flang-msvc-clang-test.cpp(48): note: see reference to function template instantiation 'auto FoldIntrinsicFunction::<lambda_1>::operator ()<_First>(const _T1 &) const' being compiled
        with
        [
            _First=Expr<Type<TypeCategory::Real,1>>,
            _T1=Expr<Type<TypeCategory::Real,1>>
        ]
flang-msvc-clang-test.cpp(171): note: see reference to function template instantiation 'Expr<Type<TypeCategory::Real,2>> FoldIntrinsicFunction<2>(FoldingContext &,FunctionRef<Type<TypeCategory::Real,2>> &&)' being compiled
flang-msvc-clang-test.cpp(159): error C2923: 'Scalar': 'T' is not a valid template type argument for parameter 'T'
flang-msvc-clang-test.cpp(159): note: see declaration of 'T'

So after all that, the RTM LTS Visual C++ compiler turned out to have a bug. The Visual C++ folks had already fixed this issue, so the way to unblock myself was to switch to the preview Visual Studio build :(! The irony…

Suppressing Warnings

Armed with a preview build that correctly compiled the test case, the next obstacle in the build process was a set of warnings that were treated as errors: C4661 and C4101.

FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold.cpp.obj
C:\...\cl.exe ... -c D:\dev\repos\llvm-project\flang\lib\Evaluate\fold.cpp
D:\dev\repos\llvm-project\flang\include\flang\Evaluate\expression.h(101): error C2220: the following warning is treated as an error
D:\dev\repos\llvm-project\flang\include\flang\Evaluate\expression.h(101): warning C4661: 'std::optional<Fortran::evaluate::DynamicType> Fortran::evaluate::ExpressionBase<Fortran::evaluate::SomeDerived>::GetType(void) const': no suitable definition provided for explicit template instantiation request
...
FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-complex.cpp.obj
C:\...\cl.exe ... -c D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-complex.cpp
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-implementation.h(1583): error C2220: the following warning is treated as an error
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-implementation.h(1583): warning C4101: 'buffer': unreferenced local variable

I tried to suppress them to keep marching forward:

cd \dev\repos\llvm-project
mkdir build-nowarn
cd build-nowarn

cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DFLANG_ENABLE_WERROR=On -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD=host -DCMAKE_INSTALL_PREFIX=../install -DLLVM_LIT_ARGS=-v -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" -DLLVM_ENABLE_RUNTIMES="compiler-rt" -DCXX_STANDARD=17 -DCXX_FLAGS="-wd4661 -wd4101"

ninja

Defining CXX_FLAGS like that did not work, so I ended up looking around for how to disable warnings in CMake. This was when I discovered that CMAKE_CXX_STANDARD is not necessary on the command line because flang/CMakeLists.txt already requires C++17. Trying to append the warning disable option /wdXXXX to that file didn’t work either. However, the comment on line 329 made me explore HandleLLVMOptions.cmake. There, I discovered support for setting the number of parallel jobs (via /MP for Visual C++). This file also contained the code that sets up most of the compiler options used when building! Closer to the task at hand was the discovery of the LLVM_ENABLE_WARNINGS option and the hard-coded list of MSVC warning flags! I therefore made this change (before running cmake and ninja) to get the warning flags to be respected:

diff --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index 56d05f5b5fce..589281b232f1 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -648,6 +648,8 @@ if (MSVC)
           # v15.8.8. Re-evaluate the usefulness of this diagnostic when the bug
           # is fixed.
       -wd4709 # Suppress comma operator within array index expression
+      -wd4101  # Suppress ...
+      -wd4661  # Suppress ...

       # Ideally, we'd like this warning to be enabled, but even MSVC 2019 doesn't
       # support the 'aligned' attribute in the way that clang sources requires (for

cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DFLANG_ENABLE_WERROR=On -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD=host -DCMAKE_INSTALL_PREFIX=../install -DLLVM_LIT_ARGS=-v -DLLVM_ENABLE_PROJECTS="clang;mlir;flang" -DLLVM_ENABLE_RUNTIMES="compiler-rt"
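The -wd flags needed for a build like this follow mechanically from the warning codes in the build log, so they can be collected with a few lines of script instead of by scrolling through ninja output. This is a sketch under my own assumptions: warning_codes and suppress_flags are hypothetical helper names, and the sample log is abbreviated from the failures shown earlier.

```python
import re

# Matches MSVC diagnostics of the form "warning C4661".
WARNING_RE = re.compile(r"\bwarning (C\d{4})\b")


def warning_codes(log_text):
    """Return the sorted, de-duplicated MSVC warning codes in a build log."""
    return sorted(set(WARNING_RE.findall(log_text)))


def suppress_flags(codes):
    """Turn codes such as 'C4661' into cl.exe flags such as '-wd4661'."""
    return [f"-wd{code[1:]}" for code in codes]


if __name__ == "__main__":
    # Abbreviated from the build output shown earlier in this post.
    sample_log = (
        "expression.h(101): error C2220: the following warning is treated as an error\n"
        "expression.h(101): warning C4661: no suitable definition provided\n"
        "fold-implementation.h(1583): warning C4101: 'buffer': unreferenced local variable\n"
    )
    print(" ".join(suppress_flags(warning_codes(sample_log))))
```

The C2220 lines are skipped on purpose: they are the "treated as an error" notices, not the warnings themselves.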

Another Compiler Failure

With the aforementioned change, the build proceeded to a different build failure, this time in fold-integer.cpp.

FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-integer.cpp.obj
C:\PROGRA~1\MIB055~1\2022\Preview\VC\Tools\MSVC\1433~1.316\bin\Hostx64\x64\cl.exe  /nologo /TP -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -ID:\dev\repos\llvm-project\build-vsmain\tools\flang\lib\Evaluate -ID:\dev\repos\llvm-project\flang\lib\Evaluate -ID:\dev\repos\llvm-project\flang\include -ID:\dev\repos\llvm-project\build-vsmain\tools\flang\include -ID:\dev\repos\llvm-project\build-vsmain\include -ID:\dev\repos\llvm-project\llvm\include -external:ID:\dev\repos\llvm-project\llvm\..\mlir\include -external:ID:\dev\repos\llvm-project\build-vsmain\tools\mlir\include -external:ID:\dev\repos\llvm-project\build-vsmain\tools\clang\include -external:ID:\dev\repos\llvm-project\llvm\..\clang\include -external:W0 /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Oi /bigobj /permissive- /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4101 -wd4661 -wd4324 -w14062 -we4238 /Gw /WX /MD /O2 /Ob2  /EHs-c- /GR- -UNDEBUG -std:c++17 /showIncludes /Fotools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\fold-integer.cpp.obj /Fdtools\flang\lib\Evaluate\CMakeFiles\obj.FortranEvaluate.dir\ /FS -c D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(771): error C2672: 'invoke': no matching overloaded function found
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.33.31627\include\type_traits(1552): note: could be 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(771): note: Failed to specialize function template 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.33.31627\include\type_traits(1552): note: see declaration of 'std::invoke'
...

By this point, I knew that simplifying the function containing the error was the fastest path to a repro. One of the little problems I ran into was how to figure out the type of fptr since it is declared using the auto keyword. I ended up assigning it to a new temporary variable of an unrelated type, e.g. char, so that the resulting conversion error would spell out the real type. Et voilà!

D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(505): error C2440: 'initializing': cannot convert from 'int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const' to 'char'

I then removed the temporary assignment and explicitly specified this type as the type of fptr:

using T2 = int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const;

T2 fptr{&Scalar<TI>::LEADZ};

The build then failed because the function pointer types are not the same, which was really confusing given that I had just checked the type of fptr.

D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(504): error C2440: 'initializing': cannot convert from
'int (__cdecl Fortran::evaluate::value::Integer<16,true,16,unsigned short,unsigned int>::* )(void) const' to
'int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const'

I switched the type of fptr and got a different error:

D:\dev\repos\llvm-project\flang\include\flang\Evaluate\integer.h(66): error C2607: static assertion failed
D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(490): note: see reference to class template instantiation 'Fortran::evaluate::value::Integer<16,true,16,unsigned char,unsigned short>' being compiled

Here is a different change I tried:

using T2 = int (__cdecl Fortran::evaluate::value::Integer<8>::* )(void) const;

T2 fptr{&Scalar<TI>::LEADZ};

That still failed with the following error:

D:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(504): error C2440: 'initializing': cannot convert from
'int (__cdecl Fortran::evaluate::value::Integer<8,true,8,unsigned char,unsigned short>::* )(void) const' to
'int (__cdecl Fortran::evaluate::value::Integer<32,true,32,unsigned int,unsigned __int64>::* )(void) const'

It was at this point that I realized that it was time to learn a bit more about decay. What is decay and array-to-pointer conversion? | C++ FAQ (64.github.io) had a good explanation of why the term decay is used. Perhaps a reexamination of std::decay – cppreference.com might lead to some insight. I wasn’t sure what Result referred to in the statement using TI = typename std::decay_t<decltype(n)>::Result; One idea I got was to append a number to the typename and examine the compiler error. Here’s the new line 752 of llvm-project/fold-integer.cpp and the resulting compiler error showing that this name cannot be arbitrary.

using TI = typename std::decay_t<decltype(n)>::Result3;


C:\dev\repos\llvm-project\flang\lib\Evaluate\fold-integer.cpp(502): error C2039: 'Result3': is not a member of 'Fortran::evaluate::Expr<Fortran::evaluate::Type<Fortran::common::TypeCategory::Integer,1>>'

Aha, so what it was referring to is the using statement in llvm-project/expression.h!

template <int KIND>
class Expr<Type<TypeCategory::Integer, KIND>>
    : public ExpressionBase<Type<TypeCategory::Integer, KIND>> {
public:
  using Result = Type<TypeCategory::Integer, KIND>;

...

The problematic lambda is therefore expecting a Scalar<Type<TypeCategory::Integer, KIND>>. Scalar is defined using decay and Type<TypeCategory::Integer, KIND>::Scalar is defined in llvm-project/type.h as the type value::Integer<8 * KIND>. This is when I see the reason for the previous build errors about mismatched Integer sizes no matter which size I picked – the fixed type I was using didn’t allow for the different template instantiations! Note that the problematic lambda is defined as a ScalarFunc.
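To make the mismatch concrete, here is a toy model of the mapping from KIND to scalar type described above. It is not flang code: scalar_type is a hypothetical helper that only mirrors the value::Integer<8 * KIND> rule, and the list of kinds is an assumption for illustration.

```python
def scalar_type(kind):
    """Name of the scalar type for an integer KIND, per the 8 * KIND rule."""
    return f"value::Integer<{8 * kind}>"


# Assumed set of integer kinds, for illustration only.
KINDS = (1, 2, 4, 8, 16)


def kinds_matching(fixed_type):
    """Return the kinds whose scalar type equals a hard-coded type name."""
    return [kind for kind in KINDS if scalar_type(kind) == fixed_type]


if __name__ == "__main__":
    fixed = scalar_type(1)  # what a hard-coded fptr type pins down
    for kind in KINDS:
        status = "ok" if kind in kinds_matching(fixed) else "mismatch"
        print(f"KIND={kind:2}  {scalar_type(kind):22} {status}")
```

Whichever single type is hard-coded, it can match only one of the instantiations, which is consistent with the conversion errors jumping between Integer<8>, Integer<16>, and Integer<32>.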

By this point, I had a self-contained repro of the compiler bug which, ironically, compiled successfully on the RTM C++ compiler. That meant I could use neither the preview nor the RTM compiler to build the flang code.

cl /c /TP /std:c++17 /permissive- flang-msvc-clang-test-02.cpp

This compiler invocation gives the same error seen when compiling the flang code:

Microsoft (R) C/C++ Optimizing Compiler Version 19.33.31627.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

flang-msvc-clang-test-02.cpp
flang-msvc-clang-test-02.cpp(193): error C2672: 'invoke': no matching overloaded function found
C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.34.31721\include\type_traits(1552): note: could be 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'
flang-msvc-clang-test-02.cpp(193): note: Failed to specialize function template 'unknown-type std::invoke(_Callable &&,_Ty1 &&,_Types2 &&...) noexcept(<expr>)'

I ended up reporting this compiler bug via the Visual Studio feedback system – see C++17 lambda fails to compile on latest VS preview compiler.