Categories: Compilers

Entering the Compiler Space

Last week was my first week in the Java engineering group. It has been about 11 years since I took a compiler course (while in the CS MS program at BYU). A quick review of the history of Java was in order. Turns out I last used Java in 2012 in grad school. That must have been Java SE 7 from 2011 and Java SE 6 before that. Since I have not been in the compiler space since then, I have a steep learning curve ahead. That is the exciting thing about technology though – there is always more to learn!

I am currently a programmer in the developer division at Microsoft, so it was helpful to go through some of the Java development with Microsoft documentation for a high-level overview of all our offerings. Also informative, given my long absence from Java-land, were the docs on how to Transition from Java 7 to Java 8 and from Java 8 to Java 11. It hadn’t yet dawned on me by the time I read through these that the reason references to 8, 11, and 17 keep coming up is that they are LTS releases.

As a newbie to the Java development world, I started by watching this 2019 OpenJDK Development talk on how to become an OpenJDK contributor. It is a great overview of concepts like project roles (author, committer, reviewer, etc.), the contributor agreement, and (perhaps most importantly to me) how to find an issue to work on and build the OpenJDK. The breakdown of commonly used terminology and abbreviations was great to have as well.

For an introduction to the HotSpot compiler, I started going through “A Simple Graph-Based Intermediate Representation”. I ended up watching Cliff Click’s talk on The Sea of Nodes and the HotSpot JIT before I got that far along in the paper. It was fascinating seeing details such as the CPU L1/L2 cache size playing into the design! There are several concepts that I need to review after that talk.

The sea of nodes talk also revealed to me how little I know about companies in the Java space. I don’t think I had heard of Azul before, for example. In fact, it’s not just companies but also technologies! I was going through some build documentation when I ran into mentions of AdoptOpenJDK and Adoptium, both of which were foreign to me. I was glad though to see my old friend Eclipse doing well.

One of the most enjoyable things about being a programmer is working with very skilled people, especially watching them in action! I always learn a lot! My colleagues David and Mat were kind enough to pull me into their triage and reporting of [JDK-8277299] STACK_OVERFLOW in Java_sun_awt_shell_Win32ShellFolder2_getIconBits – Java Bug System so I could get my feet wet with how things are done in OpenJDK development.

The OpenJDK process is certainly different from the other open source communities I’ve been a part of (.NET and Mozilla Firefox). My manager and I poked around the bug DB to see what compiler starter bugs are out there. I picked bug [JDK-7077093] labelOper::label() should return Label& but since I must start out as an author, issues cannot be assigned to me. Unusual to me but the logic appears sound. Here is the query for C2 starter bugs.

Other highlights of the week were setting up my dev box to build the OpenJDK source code (unsuccessfully), discovering that Compiler Explorer is a thing (and an open source one at that), and learning from my teammates how to investigate a failure of a fairly complex test on macOS (they were using LLDB). I hope to write follow-up entries on these at some point.


Categories: C++

Common C++ Standard Library Compiler Errors (by Rusty C++ Coders)

The first programming assignment in the Operating Systems course can be a challenge for students who haven’t written C++ code in a while. While working with the std::queue data structure in C++, it’s easy to make certain types of mistakes (especially if C++ isn’t your native tongue):

  1. Forgetting “using namespace std” (or the std:: prefix) when using standard library containers. This can result in some ugly error messages in Visual Studio, e.g. error/warning codes C2143, C4430, and C2238 for the class member array below. Is there a better way for students/developers to find out what is happening when they make such a trivial mistake?
  2. Not understanding the copy semantics of a container like a queue. If we write queue<type> myqueue = array[i]; the new queue is copy-constructed from array[i], so we get a separate copy (we might have simply wanted a reference/alias, i.e. queue<type>& myqueue = array[i];). For such a mistake, the code obviously compiles but doesn’t function as intended.
  3. Declaring a fixed-sized data structure to hold all values from a variable-sized container! Runtime errors take care of informing students about this bug (if they’re not lucky enough to have almost empty variable-size containers). The correct declaration of dynamic arrays of templated items is also not usually obvious: T* all_elements = new T[dynamic_integer_size];
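
The three pitfalls above can be sketched in a few lines. This is only an illustration, not the assignment's code: the function names drainCopy, drainAlias, and collectAll, and the choice of int elements, are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Pitfall 1: every standard library name is qualified with std::, so the
// code does not depend on a "using namespace std" being in scope.

// Pitfall 2 (copy): this copy-constructs a new queue, so popping from the
// copy leaves queues[i] untouched.
inline void drainCopy(std::vector<std::queue<int>>& queues, std::size_t i) {
    assert(i < queues.size());
    std::queue<int> copy = queues[i];
    while (!copy.empty()) copy.pop();
}

// Pitfall 2 (alias): binding a reference makes an alias, so popping
// through it really empties queues[i].
inline void drainAlias(std::vector<std::queue<int>>& queues, std::size_t i) {
    assert(i < queues.size());
    std::queue<int>& alias = queues[i];
    while (!alias.empty()) alias.pop();
}

// Pitfall 3: size the destination from the containers themselves instead
// of using a fixed-size array. std::vector also frees its own storage,
// unlike a raw new T[n].
inline std::vector<int> collectAll(const std::vector<std::queue<int>>& queues) {
    std::size_t total = 0;
    for (const auto& q : queues) total += q.size();

    std::vector<int> all;
    all.reserve(total);
    for (std::queue<int> q : queues) {  // q is a copy, so the input is unchanged
        while (!q.empty()) {
            all.push_back(q.front());
            q.pop();
        }
    }
    return all;
}
```

Calling drainCopy on a queue holding two items leaves both items in place, while drainAlias empties it. The std::vector in collectAll is one possible alternative to the raw new T[n] declaration mentioned in item 3.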

My Resolved Mozilla Bugs

Before starting my masters, I worked on a couple of Mozilla/Gecko items on Bugzilla. Here is a list of all tasks I tackled.


Categories: Valgrind

Quick Introduction to Valgrind

For my Computer Aided Geometric Design course, a simple program called CPLOT is used for some of the project work. Its documentation is on the course website. After my initial draft of my project 1 implementation crashed, I was inspired to try out Valgrind on CPLOT (even though I was able to find the bugs using my debugger). All that’s needed to install Valgrind on Ubuntu is sudo apt-get install valgrind. The CPLOT code supplied needed a minor change in order to compile, so I changed

#include <string.h>;

into

#include <string.h>;

and all was well with the world again. The updated CPLOT code is available in my public repo. Line 1 below compiles the code. The Valgrind Quick Start Guide recommends using the -g flag to produce debugging information so that Memcheck’s error messages include exact line numbers.

g++ -g -o cplot.out cplot.cpp
valgrind --leak-check=yes ./cplot.out eg1.dat eg1.eps

The eg1.dat file I used is the one on page 3 of the CPLOT documentation. Below is the output from valgrind:

==1594== Memcheck, a memory error detector
==1594== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==1594== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==1594== Command: ./cplot.out eg1.dat eg1.eps
==1594==
In initps: eg1.eps
File processed successfully!
==1594==
==1594== HEAP SUMMARY:
==1594==     in use at exit: 120 bytes in 6 blocks
==1594==   total heap usage: 8 allocs, 2 frees, 824 bytes allocated
==1594==
==1594== 120 (8 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
==1594==    at 0x402641D: operator new(unsigned int) (vg_replace_malloc.c:255)
==1594==    by 0x8049224: readCurve() (cplot.cpp:182)
==1594==    by 0x8048D89: main (cplot.cpp:127)
==1594==
==1594== LEAK SUMMARY:
==1594==    definitely lost: 8 bytes in 1 blocks
==1594==    indirectly lost: 112 bytes in 5 blocks
==1594==      possibly lost: 0 bytes in 0 blocks
==1594==    still reachable: 0 bytes in 0 blocks
==1594==         suppressed: 0 bytes in 0 blocks
==1594==
==1594== For counts of detected and suppressed errors, rerun with: -v
==1594== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 17 from 6)

This program can be improved by changing the return statements in the body of the main for(;;) loop into break statements. By so doing, all the cleanup code can be placed after that loop, just before the program exits. This also prevents the duplication of the cleanup code. The code after the for(;;) loop then becomes:

    // Deallocate all allocated memory
    for (int i=0; i < NCURVES; i++) {
        delete curves[i];
    }

    // Close the file resources
    fclose(in);
    fclose(ps);
    return 0;

The Curve class destructor then becomes:

Curve::~Curve() {
    for (int j=0; j <= degree; j++)
        delete points[j];

    delete [] points;
}
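
To see why the destructor needs both the per-element delete and the delete [], here is a small self-contained sketch. CPLOT's actual class definitions are not shown in this post, so the Point type, its live counter, and the Curve constructor below are all hypothetical stand-ins; only the destructor body mirrors the code above.

```cpp
#include <cassert>

// Hypothetical point type that counts how many instances are alive.
struct Point {
    static int live;
    double x, y;
    Point(double x_, double y_) : x(x_), y(y_) { ++live; }
    ~Point() { --live; }
};
int Point::live = 0;

// Hypothetical curve that owns degree + 1 heap-allocated control points.
class Curve {
public:
    explicit Curve(int degree_)
        : degree(degree_), points(new Point*[degree_ + 1]) {
        assert(degree_ >= 0);
        for (int j = 0; j <= degree; j++)
            points[j] = new Point(j, j);
    }

    ~Curve() {
        // Free each control point, then the array of pointers itself.
        for (int j = 0; j <= degree; j++)
            delete points[j];
        delete [] points;
    }

    // Copying would lead to a double delete, so it is disabled here.
    Curve(const Curve&) = delete;
    Curve& operator=(const Curve&) = delete;

private:
    int degree;
    Point** points;
};
```

With this sketch, a Curve of degree 3 creates four points, and Point::live drops back to zero once the curve goes out of scope. Leaving out the loop in the destructor would leave those four points allocated, which is the kind of indirect loss Memcheck reports.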

Running Valgrind on the updated program then gives the following output.

==6539== Memcheck, a memory error detector
==6539== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==6539== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==6539== Command: ./cplot.out eg1.dat eg1.eps
==6539==
In initps: eg1.eps
File processed successfully!
==6539==
==6539== HEAP SUMMARY:
==6539==     in use at exit: 0 bytes in 0 blocks
==6539==   total heap usage: 8 allocs, 8 frees, 824 bytes allocated
==6539==
==6539== All heap blocks were freed -- no leaks are possible
==6539==
==6539== For counts of detected and suppressed errors, rerun with: -v
==6539== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 6)

There was another bug in the program: because of a missing else, it could attempt to plot control polygons for uninitialized curves. I pushed the trivial fix for that. The program also crashed if the specified input file did not exist, because fscanf would be called with a NULL file pointer. The fix is a simple NULL check.
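
The NULL check might look roughly like the sketch below. It is not the CPLOT fix itself: the helper name readFirstInt and the idea of reading a single integer are assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: returns true only if the file could be opened and
// an integer could be read from it.
inline bool readFirstInt(const char* path, int* out) {
    assert(path != nullptr && out != nullptr);

    std::FILE* in = std::fopen(path, "r");
    if (in == nullptr) {
        // Without this check, fscanf below would receive a NULL stream.
        std::fprintf(stderr, "Could not open %s\n", path);
        return false;
    }

    bool ok = (std::fscanf(in, "%d", out) == 1);
    std::fclose(in);
    return ok;
}
```

Calling readFirstInt with a path that does not exist now reports the problem and returns false instead of crashing.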


Modelling the Infamous Dining Philosophers

A sample implementation of the dining philosophers problem is provided in the JPF core source code. The solution I gave to this problem was to use the following locking scheme: if a thread has an even id i, then it should lock fork i first and then fork (i + 1) % N. If its id i is odd, then it should lock fork (i + 1) % N first and then lock fork i. Replacing

new Philosopher(forks[i], forks[(i + 1) % N]);

with

Fork leftFork, rightFork;
if (i % 2 == 0) {
    leftFork = forks[i];
    rightFork = forks[(i + 1) % N];
} else {
    leftFork = forks[(i + 1) % N];
    rightFork = forks[i];
}
new Philosopher(leftFork, rightFork);

should do the trick. DiningPhil.jpf can then be launched to verify the absence of deadlock with this new locking scheme.
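
One way to reason about why the parity scheme helps, independently of JPF, is to look at the lock order it induces. The sketch below is C++ rather than the Java of the JPF example, and the names naiveOrder, parityOrder, and lockOrderHasCycle are made up for the illustration. It draws an edge from each philosopher's first fork to the second and checks for a cycle. An acyclic lock-order graph is a sufficient condition for the absence of deadlock, but the JPF run remains the actual verification.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Original scheme: every philosopher locks fork i, then fork (i + 1) % n.
inline std::pair<int, int> naiveOrder(int i, int n) {
    assert(n >= 2 && i >= 0 && i < n);
    return std::make_pair(i, (i + 1) % n);
}

// Parity scheme: even ids lock fork i first, odd ids lock (i + 1) % n first.
inline std::pair<int, int> parityOrder(int i, int n) {
    assert(n >= 2 && i >= 0 && i < n);
    int a = i;
    int b = (i + 1) % n;
    return (i % 2 == 0) ? std::make_pair(a, b) : std::make_pair(b, a);
}

// Depth-first search helper. state: 0 = unvisited, 1 = on the current
// path, 2 = finished.
inline bool visit(int u, const std::vector<std::vector<int>>& next,
                  std::vector<int>& state) {
    state[u] = 1;
    for (int v : next[u]) {
        if (state[v] == 1) return true;  // back edge: cycle found
        if (state[v] == 0 && visit(v, next, state)) return true;
    }
    state[u] = 2;
    return false;
}

// Returns true if the "first fork -> second fork" graph contains a cycle.
template <typename OrderFn>
bool lockOrderHasCycle(int n, OrderFn order) {
    std::vector<std::vector<int>> next(n);
    for (int i = 0; i < n; i++) {
        std::pair<int, int> p = order(i, n);
        next[p.first].push_back(p.second);
    }
    std::vector<int> state(n, 0);
    for (int u = 0; u < n; u++) {
        if (state[u] == 0 && visit(u, next, state)) return true;
    }
    return false;
}
```

For five philosophers, the original scheme yields the cycle 0 -> 1 -> 2 -> 3 -> 4 -> 0, whereas with the parity scheme fork 1 is always the second fork for both of its neighbours, so no cycle can close.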


Categories: JPF

Copying Image Files to Build Folder in Eclipse

Part of my Google Summer of Code project this summer involved building a visualization tool to help examine trace files used by JPF Guided Test and a related under-approximation scheduler. The visualization tool is in the Guided Test repository. I was using Eclipse to manage the project. Everything built correctly but the program didn’t work because the icons were not in the build folder as expected. A quick hint from Stack Overflow was that the solution is to use Ant.

<target name="-pre-jar" description="Copy Images">
  <property name="images.dir" value="build/main/edu/byu/cs/guided/search/visualize/graphics" />

  <copy todir="${images.dir}">
    <fileset dir="./src/main/edu/byu/cs/guided/search/visualize/graphics" />
  </copy>
  <echo level="info" message="Visualization icons were copied."/>
</target>

The key is to use the “-pre-jar” target in the build.xml file to copy the files into the right spot. This change was part of the visualization check-in into the repository. Some documentation on Ant targets is available in the Apache Ant User Manual. This StackOverflow entry on adding resource files to a jar file with Ant has some useful hints as well.


Categories: Eclipse

Attaching Java Source Files to an Eclipse JRE

As per this Ubuntu thread, the solution is:

  1. Window > Preferences > Java > Installed JREs.
  2. Double click on the JRE for which you want to attach source code.
  3. Under “JRE System Libraries”, select rt.jar.
  4. Click on the “Source Attachment…” button.
  5. Supply a path to the source zip file, e.g. C:/Program Files/Java/jdk1.6.0_23/src.zip

That zip file was already on my system (since the JDK was installed) but some projects were using the JRE rather than the JDK. Therefore, I needed to specify the JDK source zip file for the JREs as well.


Building Racket in Linux

The Racket website has documentation on how to clone the PLT repository.

git clone git://git.racket-lang.org/plt.git

Next, the src/README file has all the gory details on how to build Racket. The procedure is rather straightforward for Linux:

mkdir build
cd build
../configure
make
make install

I have yet to figure out how to successfully build Racket in Visual C++ (2008 Professional), so that’s the next item on my list.

Update (03/11/11): It is actually rather straightforward for Visual C++ as well. Run vsvars32.bat to ensure that devenv and other commands are in the path:

cd plt\src\worksp\
"C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\Tools\vsvars32.bat"
build

I got the hint from this thread.


Implementing canvas putImageData’s optional arguments

Bug 498826 is about the HTML canvas putImageData method. It did not implement the optional arguments specified in the WHATWG spec. These optional arguments specify the dirty rectangle for the image data transfer (specifically, these arguments are the coordinates of the top left corner and the dimensions of the dirty rectangle, any of which are allowed to be negative). A quick glance at the description of the algorithm for handling of the optional arguments may not reveal the overall intent of the algorithm. Some of its key aspects are:

  1. Adjusting the dimensions of the rectangle to be positive by shifting the top left corner if necessary (step 2).
  2. Ensuring the top left corner of the dirty rectangle is in the first quadrant (step 3) which effectively eliminates all negative arguments.
  3. Clipping the dirty rectangle to ensure its lower right corner does not extend beyond the bounds of the incoming imagedata’s dimensions (step 4).
  4. Verifying that the newly adjusted dirty rectangle has positive dimensions (step 5), and if so, using the region bounded by the dirty rectangle on the incoming imagedata object as the source for the operation.
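
The four steps above can be sketched as a single function. This is not the code from the patch: the DirtyRect struct and the normalizeDirtyRect name are hypothetical, and 64-bit integers are used here so that sums of 32-bit inputs cannot overflow, whereas the patch relies on CheckedInt32 for that.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the adjusted rectangle plus a flag that says
// whether anything is left to draw.
struct DirtyRect {
    std::int64_t x, y, width, height;
    bool empty;
};

inline DirtyRect normalizeDirtyRect(std::int64_t x, std::int64_t y,
                                    std::int64_t w, std::int64_t h,
                                    std::int64_t imageWidth,
                                    std::int64_t imageHeight) {
    assert(imageWidth >= 0 && imageHeight >= 0);

    // Step 2: make the dimensions positive by shifting the top left corner.
    if (w < 0) { x += w; w = -w; }
    if (h < 0) { y += h; h = -h; }

    // Step 3: move the top left corner into the first quadrant, shrinking
    // the rectangle by the amount that was cut off.
    if (x < 0) { w += x; x = 0; }
    if (y < 0) { h += y; y = 0; }

    // Step 4: clip the lower right corner to the image data's bounds.
    if (x + w > imageWidth) w = imageWidth - x;
    if (y + h > imageHeight) h = imageHeight - y;

    // Step 5: nothing is drawn unless both dimensions are still positive.
    bool empty = (w <= 0 || h <= 0);
    return DirtyRect{x, y, w, h, empty};
}
```

For example, a dirty rectangle given as (10, 10, -20, 5) on 100 x 100 image data is first flipped to start at x = -10 with width 20, and then clipped to x = 0 with width 10.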

The patch is rather straightforward (although admittedly, it was not as straightforward to create on my part). If there are enough arguments to specify a dirty rectangle, then the JS_ValueToECMAInt32 function is used to convert the JavaScript values into integers. The CheckedInt32 and the gfxRect classes do most of the heavy lifting in the patch, and then only the dirty rectangle is redrawn.


Remove unused nsCookieService::SetCookieString argument

Bug 520914 is the only Bugzilla item I got to look at this month – since it required the least amount of time :). All that needed to be done was to remove the aPrompt argument currently in nsICookieService’s SetCookieString method. The locations in need of change were easily located with a simple MXR search.