Categories: Compilers

Ubuntu VM Setup for OpenJDK Development

I’m using a Windows 10 physical machine for my OpenJDK 17 development. Unfortunately, I ran into some issues getting the environment set up to build the JDK on Windows. To work around this, I created a Linux virtual machine. Although the instructions for building on Linux are on the OpenJDK site, I would like to have all the instructions in one spot, hence this post.

Creating an Ubuntu VM in Hyper-V

  1. Download an LTS Ubuntu .iso from the Ubuntu Desktop download page. I selected Ubuntu 20.04.3 LTS.
  2. Go to New > Virtual Machine in Hyper-V manager.
    1. Enter your VM name, generation, memory amount and type
    2. Select the connection type (Default Switch) and create a new virtual hard disk
    3. Select “Install an operating system from a bootable CD/DVD-ROM” then enter the path to the downloaded .iso file then click on Finish.
    4. Before starting the VM, set the number of virtual processors (it defaults to 1, which is less than ideal)!
  3. Perform a normal Ubuntu installation including erasing the disk

Let us now review the more interesting steps – those related to configuring the Ubuntu environment.

Increase the Resolution of the Ubuntu Guest OS

The default 1024×768 screen resolution of the Ubuntu guest is rather restrictive. The solution to this comes from https://askubuntu.com/questions/384602/ubuntu-hyper-v-guest-display-resolution. We need to configure the Hyper-V Synthetic Video Frame Buffer Driver by adding ” video=hyperv_fb:1680×1050” to the GRUB_CMDLINE_LINUX_DEFAULT value in the /etc/default/grub file.

sudo apt-get install linux-image-extra-virtual
sudo apt-get install vim
sudo vim /etc/default/grub
sudo update-grub
reboot

Install the development dependencies

The table below lists the JDK build dependencies and the commands to install them.

ComponentInstallation Command
autoconfsudo apt-get install autoconf
Gitsudo apt-get install git
C Compilersudo apt-get install build-essential
X11 librariessudo apt-get install libx11-dev libxext-dev libxrender-dev libxrandr-dev libxtst-dev libxt-dev
cupssudo apt-get install libcups2-dev
fontconfigsudo apt-get install libfontconfig1-dev
alsasudo apt-get install libasound2-dev
JDK Build Dependencies

This single command suffices to install all these components.

sudo apt-get install autoconf git build-essential libx11-dev libxext-dev libxrender-dev libxrandr-dev libxtst-dev libxt-dev libcups2-dev libfontconfig1-dev libasound2-dev

Install a code editor

Download the Visual Studio Code .deb file from https://code.visualstudio.com/Download. We can then install VS Code by running:

sudo apt install ~/Downloads/code_1.62.3-1637137107

Install a Boot JDK

I use the Microsoft OpenJDK build as the boot JDK. Here are the Ubuntu instructions for Installing the Microsoft Build of OpenJDK:

# Valid values are only '18.04' and '20.04'
# For other versions of Ubuntu, please use the tar.gz package
ubuntu_release=`lsb_release -rs`
cd ~/Downloads/
wget https://packages.microsoft.com/config/ubuntu/${ubuntu_release}/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
sudo apt-get install apt-transport-https
sudo apt-get update
sudo apt-get install msopenjdk-17

Verify that everything is working by running “java -version”

Clone and Build the JDK

Clone the JDK. Note that cloning a fork might be much slower than cloning the upstream Github repo! I was averaging about 60KiB/s on my rork whereas cloning the upstream OpenJDK was averaging 6 MiB/s when receiving objects!

mkdir ~/repos
cd ~/repos
git clone https://github.com/openjdk/jdk

The JDK repo can now be configured and built

cd jdk
bash configure
make images

The configure command should display any missing dependencies that it needs and a suggestion for how to install them.

To try out your new build, switch to the bin folder and check the Java version:

cd ~/repos/jdk/build/linux-x86_64-server-release/jdk/bin
./java -version

To browse through the contents of the build folder in a file manager:

xdg-open ./build


Categories: Compilers

Entering the Compiler Space

Last week was my first week in the Java engineering group. It has been about 11 years since I took a compiler course (while in the CS MS program at BYU). A quick review of the history of Java was in order. Turns out I last used Java in 2012 in grad school. That must have been Java SE 7 from 2011 and Java SE 6 before that. Since I have not been in the compiler space since then, I have a steep learning curve ahead. That is the exciting thing about technology though – there is always more to learn!

I am currently a programmer in the developer division at Microsoft so it was helpful going through some of the Java development with Microsoft documentation for a high level overview of all our offerings. Also informative given my long absence from Java-land were the docs on how to Transition from Java 7 to Java 8 and from Java 8 to Java 11. It hadn’t yet dawned on me by the time I read through these that the reason references to 8, 11, and 17 keep coming up is because they are LTS releases.

As a newbie to the Java development world, I started by watching this 2019 OpenJDK Development talk on how to become an OpenJDK contributor. It is a great overview of concepts like project roles (author, committer, reviewer, etc), the contributor agreement, and (perhaps most importantly to me), how to find an issue to work on and build the OpenJDK. The breakdown of commonly used terminology and abbreviations was great to have as well.

For an introduction to the hotspot compiler, I started going through “A Simple Graph-Based Intermediate Representation“. I ended up watching Cliff Click’s talk on The Sea of Nodes and the HotSpot JIT before I got that far along in the paper. It was fascinating seeing details such as the CPU L1/L2 cache size playing into the design! Some of the concepts that I need to review after that talk include:

The sea of nodes talk also revealed to me how little I know about companies in the Java space. I don’t think I had heard of Azul before, for example. In fact, it’s not just companies but also technologies! I was going through some build documentation when I ran into mentions of AdoptOpenJDK and Adoptium, both of which were foreign to me. I was glad though to see my old friend Eclipse doing well.

One of the most enjoyable things about being a programmer is working with very skilled people, especially watching them in action! I always learn a lot! My colleagues David and Mat were kind enough to pull me into their triage and reporting of [JDK-8277299] STACK_OVERFLOW in Java_sun_awt_shell_Win32ShellFolder2_getIconBits – Java Bug System so I could get my feet wet with how things are done in OpenJDK development.

The OpenJDK process is certainly different from the other open source communities I’ve been a part of (.NET and Mozilla Firefox). My manager and I poked around the bug DB to see what compiler starter bugs are out there. I picked bug [JDK-7077093] labelOper::label() should return Label& but since I must start out as an author, issues cannot be assigned to me. Unusual to me but the logic appears sound. Here is the query for C2 starter bugs.

Other highlights of the week were setting up my dev box to build the OpenJDK source code (unsuccessfully), discovering that compiler explorer is a thing (and an open source one at that), learning from my teammates how to investigate a failure of a fairly complex test on MacOS (they were using LLDB). I hope to write follow-up entries on these at some point.


Categories: Big Data

Exploring Apache’s Hadoop MapReduce Tutorial

In the last post, I described the straightforward process of setting up and Ubuntu VM in which to run Hadoop. Once you can successfully run the Hadoop MapReduce example in the MapReduce Tutorial, you may be interested in examining the source code using an IDE like Eclipse. To do so, install eclipse:

sudo apt-get install eclipse-platform

Some common Eclipse settings to adjust:

  1. Show line numbers (Window > Preferences > General > Editors > Text Editors > Show Line Numbers
  2. To make Eclipse use spaces instead of tabs (or vice versa), see this StackOverflow question.
  3. To auto-remove trailing whitespace in Eclipse, see this StackOverflow question.

To generate an Eclipse project for the Hadoop source code, the src/BUILDING.txt file lists these steps (which we cannot yet run):

cd ~/hadoop-2.7.4/src/hadoop-maven-pluggins
mvn install
cd ..
mvn eclipse:eclipse -DskipTests

To be able to run these commands, we need to install the packages required for building Hadoop. They are also listed in the src/BUILDING.txt file. For the VM we set up, we do not need to install the packages listed under Oracle JDK 1.7. Instead, run these commands to install Maven, native libraries, and ProtocolBuffer:

sudo apt-get -y install maven
sudo apt-get -y install build-essential autoconf automake libtool cmake zlib1g-dev pkg-config libssl-dev
sudo apt-get -y install libprotobuf-dev protobuf-compiler

Now here’s where things get interesting. The last command installs version 2.6.1 of the ProtocolBuffer. The src/BUILDING.txt file states that version 2.5.0 is required. Turns out they aren’t kidding – if you try generating the Eclipse project using version 2.6.1 (or some non 2.5.0 version), you’ll get an error similar to this one:

As suggested here and here, you can check the version by typing:

protoc --version

How do we install 2.5.0? Turns out we have to build ProtocolBuffer 2.5.0 from the source code ourselves but we need to grab the sources from Github now (unlike those now outdated instructions): https://github.com/google/protobuf/releases/tag/v2.5.0

mkdir ~/protobuf
cd ~/protobuf
wget https://github.com/google/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
tar xvzf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0

Now follow the instructions in the README.txt file to build the source code.

./configure --prefix=/usr
make
make check
sudo make install
protoc --version

The output from the last command should now be “libprotoc 2.5.0“. Note: you most likely need to pass the –prefix option to ./configure to avoid errors like the one below.

Now we can finally generate the Eclipse project files for the Hadoop sources.

cd ~/hadoop-2.7.4/src/hadoop-maven-plugins
mvn install
cd ..
mvn eclipse:eclipse -DskipTests

Once project-file generation is complete:

  1. Type eclipse to launch the IDE.
  2. Go to the File > Import… menu option.
  3. Select the Existing Projects into Workspace option under General.
  4. Browse to the ~/hadoop-2.7.4/src folder in the Select root directory: input. A list of the projects in the src folder should be displayed.
  5. Click Finish to import the projects.

You should now be able to navigate to the WordCount.java file and inspect the various Hadoop classes.


Categories: Big Data

Setting up Apache Hadoop

As part of my Dynamic Big Data course, I have to set up a distributed file system to experiment with various mapreduce concepts. Let’s use Hadoop since it’s widely adopted. Thankfully, there are instructions on how to set up Apache Hadoop – we’re starting with a single cluster for now.

Once installation is complete, log onto the Ubuntu OS. Set up shared folders and enable the bidirectional clipboard as follows:

  1. From the VirtualBox Devices menu, choose Insert Guest Additions CD image… A prompt will be displayed stating that “VBOXADDITIONS_5.1.26_117224” contains software intended to be automatically started. Just click on the Run button to continue and enter the root password. When the guest additions installer completes, press Return to close the window when prompted.
  2. From the VirtualBox Devices menu, choose Shared Clipboard > Bidirectional. This enables two way clipboard functionality between the guest and host.
  3. From the VirtualBox Devices menu, choose Shared Folders > Shared Folders Settings… Click on the add Shared Folder button and enter a path to a folder on the host that you would like to be shared. Optionally select Auto-mount and Make Permanent.
  4. Open a terminal window. Enter these commands to mount the shared folder (assuming you named it vmshare in step 3 above):
mkdir ~/vmshare
sudo mount -t vboxsf -o uid=$UID,gid=$(id -g) vmshare ~/vmshare

To start installing the software we need, enter these commands:

sudo apt-get update
sudo apt install default-jdk

Next, get a copy of the Hadoop binaries from an Apache download mirror.

cd ~/Downloads
wget http://apache.cs.utah.edu/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz
mkdir ~/hadoop-2.7.4
tar xvzf hadoop-2.7.4.tar.gz -C ~/
cd ~/hadoop-2.7.4

The Apache Single Node Cluster Tutorial says to

export JAVA_HOME=/usr/java/latest

in the etc/hadoop/hadoop-env.sh script. On this Ubuntu setup, we end up needing to

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/

If you skip setting up this export, running bin/hadoop will give this error:

Error: JAVA_HOME is not set and could not be found.

Note: I found that setting JAVA_HOME=/usr caused subsequent processes (like generating Eclipse projects from the source using mvn) to fail even though the steps in the tutorial worked just fine.

To verify that Hadoop is now configured and ready to run (in a non-distributed mode as a single Java process), execute the commands listed in the tutorial.

$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar grep input output 'dfs[a-z.]+'
$ cat output/*

The bin/hadoop jar command runs the code in the .jar file, specifically the code in Grep.java, passing it the last 3 arguments. The output should resemble this summary:

If you’re interested in the details of this example (e.g. to inspect Grep.java), examine the src subfolder. If you don’t need the binaries and just want to look at the code, you can wget it from a download mirror, e.g.:

wget http://apache.cs.utah.edu/hadoop/common/stable/hadoop-2.7.4-src.tar.gz

Categories: C++

Common C++ Standard Library Compiler Errors (by Rusty C++ Coders)

The first programming assignment in the Operating Systems course can be a challenge for students that haven’t written C++ code in a while. While working with the std::queue data structure in C++, it’s easy to make certain types of mistakes (especially if C++ isn’t your native tongue):

  1. Not “using namespace std” when using standard library containers. This can result in some ugly error messages in Visual Studio, e.g. error/warning codes C2143, C4430, and C2238 for the class member array below (is there a better way for students/developers to find out what is happening when they make such a trivial mistake)?
  2. Not understanding the assignment operator semantics on a container like a queue. If we write queue<type> myqueue = array[i]; we get a copy of the queue array[i] (we might have simply wanted a reference/alias). For such a mistake, the code obviously compiles but doesn’t function as intended.
  3. Declaring a fixed-sized data structure to hold all values from a variable-sized container! Runtime errors take care of informing students about this bug (if they’re not lucky enough to have almost empty variable-size containers). The correct declaration of dynamic arrays of templated items is also not usually obvious: T* all_elements = new T[dynamic_integer_size];

Categories: Windows

Fighting Windows Update Failures

For the past month or so, I have been unable to update my 7.5 year-old machine from Windows 10 Build 14393 because “we couldn’t connect to the update service”! Some folks online suggested trying the “Fix problems with Windows Update” wizard. Unfortunately (or fortunately), this Wizard identified a Potential Windows Update Database Error – but it couldn’t fix it!

The next idea was trying the system file checker as mentioned here. This tool did not find any issues on my filesystem. The DISM.exe commands seemed irrelevant so I took a quick peek at the Event Viewer. Lo and behold, a long list of WindowsUpdateFailure3 events. The supposed cure? These commands suggested in this thread (found by Googling “WindowsUpdateFailure3”).

net stop wuauserv
net stop cryptSvc
net stop bits
net stop msiserver
ren C:\Windows\SoftwareDistribution SoftwareDistribution.old
ren C:\Windows\System32\catroot2 Catroot2.old
net start wuauserv
net start cryptSvc
net start bits
net start msiserver

These didn’t solve the problem either. Digging around in log files led me to error code 0x80240438, but then again, no useful hints from queries on this code. The registry keys mentioned didn’t even exist on my machine. OK Google. Now there is a list of connection error codes, but of course it doesn’t include the one from the update failure log.

Desperation: what else haven’t I tried? This page with connection error codes mentions ensuring that http://*.update.microsoft.com is reachable. http://update.microsoft.com redirects to a page describing how to get the Windows 10 Creators Update now. Aha! I end up downloading and running the Windows 10 Upgrade tool. The update hangs (or appears to, after a couple of hours) at 25% so I simply hit the reset button hoping for the best. Sure enough, I end up back at build 14393 after a while. Desperation led me to try this tool again yesterday. I was encouraged to see that it actually downloaded an update to the Windows 10 Upgrade tool. When I popped in after rebooting to update, it was still at 25% but I decided to let it run overnight. Quite pleased was I this morning to come and find a Welcome screen prompting me to select my telemetry, etc, settings. I’m now finally running an up to date OS!


Common Verilog Mistakes by SystemVerilog Newbies

As part of my HDL Digital Design course at the University of Wyoming, I have to implement various modules using Verilog. Interestingly, the course textbook (Digital Design and Computer Architecture) uses SystemVerilog. This leaves an HDL newbie like me in an interesting spot since I tend to make assumptions about Verilog based on what I’ve seen in SystemVerilog, even though the latter is newer. Some common newbie issues I ran into (that were easily resolved by finding examples of Verilog modules):

  1. Declaring inputs as “input xyz[3:0]” instead of “input [3:0]xyz”.
  2. Trying to use assertions. It was quite surprising to me that assertions as explained in SystemVerilog weren’t “a thing” in Verilog. Some digging around led me to this article discussing assertions (and implying there wasn’t a direct way to assert facts in Verilog).
  3. Concatenating values
  4. No assertions? What… oh, I guess I already mentioned that, but I’m new to this HDL arena.

I guess I could have been more diligent in my search for Verilog tutorials for beginners (like this one) to alleviate all the annoyances I ran into assuming that learning SystemVerilog would translate in a completely seamless transition into writing Verilog code. Oh well, I guess learning a new language doesn’t always involve taking the most efficient path from A to B.


Resolving iverilog __gxx_personality_v0 errors

I ran into a strange error while trying to compile a Verilog module using the Icarus Verilog compiler:

iverilog DLL error

A quick lookup of the __gxx_personality_v0 error on StackOverflow revealed to me that the cause of this strange error is a mismatched/missing libstdc++-6.dll, fixable with a simple command:

copy C:\dev\tools\iverilog\bin\libstdc++-6.dll C:\dev\tools\iverilog\lib\ivl

I still need to investigate exactly why this libstdc++-6.dll problem occurs. It might be related to my Git environment (Git Bash?)


Categories: Theorem Proving

Installing Isabelle 2011

I’ve spent the past few days making a few skills strides – I finally buckled down and learned how to use Emacs. The tutorial felt somewhat long but was worthwhile. Having discovered how easily I can set margins and have my text automatically wrap at 80 characters has made me a convert! Interesting since Emacs is a program I simply never thought was worth the time until I decided to try out the Isabelle theorem prover. I grabbed the 32-bit Linux bundle, ran the simple commands listed on the download page, and was up and running.

tar -xzf Isabelle2011-1_bundle_x86-linux.tar.gz
Isabelle2011-1/bin/isabelle emacs

Having completed my Emacs tutorial, I then worked through the Basic Script Management portion of the Proof General User Manual. These tasks were sufficient to prepare me for my task of analysing some proofs from The Archive of Formal Proofs.


Simple Regex for Finding C++ Blocks

Not particularly accurate (obviously) but sufficient for what I needed:

(?s:\{[^{]+\})

Enables single line matching mode (that matches line termination characters as well), followed by the left curly brace, followed by all subsequent characters that aren’t the right curly brace, and finally ending with a curly brace.