Saint's Log

27Jan/120

Installing Isabelle 2011

I've spent the past few days making a few skills strides - I finally buckled down and learned how to use Emacs. The tutorial felt somewhat long but was worthwhile. Having discovered how easily I can set margins and have my text automatically wrap at 80 characters has made me a convert! Interesting since Emacs is a program I simply never thought was worth the time until I decided to try out the Isabelle theorem prover. I grabbed the 32-bit Linux bundle, ran the simple commands listed on the download page, and was up and running.

tar -xzf Isabelle2011-1_bundle_x86-linux.tar.gz
Isabelle2011-1/bin/isabelle emacs

Having completed my Emacs tutorial, I then worked through the Basic Script Management portion of the Proof General User Manual. These tasks were sufficient to prepare me for my task of analysing some proofs from The Archive of Formal Proofs.

11Nov/110

Simple Regex for Finding C++ Blocks

Not particularly accurate (obviously) but sufficient for what I needed:

(?s:\{[^{]+\})

Enables single line matching mode (that matches line termination characters as well), followed by the left curly brace, followed by all subsequent characters that aren't the right curly brace, and finally ending with a curly brace.

7Oct/110

My Resolved Mozilla Bugs

Before starting my masters, I worked on a couple of Mozilla/Gecko items on Bugzilla. Here is a list of all tasks I tackled.

17Sep/110

Quick Introduction to Valgrind

For my Computer Aided Geometric Design course, a simple program called CPLOT is used for some of the project work. Its documentation is on the course website. After my initial draft of my project 1 implementation crashed, I was inspired to try out Valgrind on CPLOT (even though I was able to find the bugs using my debugger). All that's needed to install Valgrind on Ubuntu is sudo apt-get install valgrind. The CPLOT code supplied needed a minor change in order to compile, so I changed

#include <string.h>;

into

#include <string.h>;

and all was well with the world again. The updated CPLOT code is available in my public repo. Line 1 below compiles the code. The Valgrind Quick Start Guide recommends using the -g flag to produce debugging information so that Memcheck's error messages include exact line numbers.

g++ -g -o cplot.out cplot.cpp
valgrind --leak-check=yes ./cplot.out eg1.dat eg1.eps

The eg1.dat file I used is the one on page 3 of the CPLOT documentation. Below is the output from valgrind:

==1594== Memcheck, a memory error detector
==1594== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==1594== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==1594== Command: ./cplot.out eg1.dat eg1.eps
==1594==
In initps: eg1.eps
File processed successfully!
==1594==
==1594== HEAP SUMMARY:
==1594==     in use at exit: 120 bytes in 6 blocks
==1594==   total heap usage: 8 allocs, 2 frees, 824 bytes allocated
==1594==
==1594== 120 (8 direct, 112 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
==1594==    at 0x402641D: operator new(unsigned int) (vg_replace_malloc.c:255)
==1594==    by 0x8049224: readCurve() (cplot.cpp:182)
==1594==    by 0x8048D89: main (cplot.cpp:127)
==1594==
==1594== LEAK SUMMARY:
==1594==    definitely lost: 8 bytes in 1 blocks
==1594==    indirectly lost: 112 bytes in 5 blocks
==1594==      possibly lost: 0 bytes in 0 blocks
==1594==    still reachable: 0 bytes in 0 blocks
==1594==         suppressed: 0 bytes in 0 blocks
==1594==
==1594== For counts of detected and suppressed errors, rerun with: -v
==1594== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 17 from 6)

This program can be improved by changing the return statements in the body of the main for(;;) loop into break statements. By so doing, all the cleanup code can be placed after that loop, just before the program exits. This also prevents the duplication of the cleanup code. The code after the for(;;) loop then becomes:

    // Deallocate all allocated memory
    for (int i=0; i < NCURVES; i++) {
        delete curves[i];
    }

    // Close the file resources
    fclose(in);
    fclose(ps);
    return 0;

The Curve class destructor then becomes:

Curve::~Curve() {
    for (int j=0; j <= degree; j++)
        delete points[j];

    delete [] points;
};

Running Valgrind on the updated program then gives the following output.

==6539== Memcheck, a memory error detector
==6539== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==6539== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==6539== Command: ./cplot.out eg1.dat eg1.eps
==6539==
In initps: eg1.eps
File processed successfully!
==6539==
==6539== HEAP SUMMARY:
==6539==     in use at exit: 0 bytes in 0 blocks
==6539==   total heap usage: 8 allocs, 8 frees, 824 bytes allocated
==6539==
==6539== All heap blocks were freed -- no leaks are possible
==6539==
==6539== For counts of detected and suppressed errors, rerun with: -v
==6539== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 6)

There was another bug in the program since it could attempt to plot control polygons for uninitialized curves because of a missing else. I pushed the trivial fix to that. The program also crashed if the input file specified did not exist because fscanf would try to use a NULL parameter. The fix is a simple NULL check.

Filed under: Valgrind No Comments
6Sep/110

Modelling the Infamous Dining Philosophers

A sample implementation of the dining philosophers problem is provided in the JPF core source code. The solution I gave to this problem was to use the following locking scheme: if a thread has an even id i, then it should lock fork i first and then fork (i + 1) % N. If its id i is odd, then it should lock fork (i + 1) % N first and then lock fork i. Replacing

new Philosopher(forks[i], forks[(i + 1) % N]);

with

Fork leftFork, rightFork;
if (i % 2 == 0) {
    leftFork = forks[i];
    rightFork = forks[(i + 1) % N];
} else {
    leftFork = forks[(i + 1) % N];
    rightFork = forks[i];
}
new Philosopher(leftFork, rightFork);

Should do the trick. DiningPhil.jpf can then be launched to verify the absence of deadlock with this new locking scheme.

26Aug/110

Getting into Motorcycling

I've had the desire to learn how to ride a motorcycle for a few years now. This summer, I decided to act on this bucket list item and enrolled in the Basic Rider Course with UtahRiderEd. I signed up for the weekend option. The classes ran all day Saturday and Sunday (until about 5pm). We were given a Motorcycle Safety Foundation text (from which the written portion of the test administered on Sunday came). My classes were at Murray High School and my bike for the course was a blue Yamaha XT225. Below are a few pictures of the setup. I now have my 649cc and below endorsement on my DL. It was nice only having to take the written test at the DLD since the MSF course eliminates the need for a road test. Now, to get a bike... a Ninja perhaps, anyone?

31Jul/110

Copying Image Files to Build Folder in Eclipse

Part of my Google Summer of Code project this summer involved building a visualization tool to help examine trace files used by JPF Guided Test and a related under-approximation scheduler. The visualization tool is in the Guided Test repository. I was using Eclipse to manage the project. Everything built correctly but the program didn't work because the icons were not in the build folder as expected. A quick hint from stackoverflow was that the solution is to use Ant.

<target name="-pre-jar" description="Copy Images">
  <property name="images.dir" value="build/main/edu/byu/cs/guided/search/visualize/graphics" />

  <copy todir="${images.dir}">
    <fileset dir="./src/main/edu/byu/cs/guided/search/visualize/graphics" />
  </copy>
  <echo level="info" message="Visualization icons was copied."/>
</target>

The key is to use the "-pre-jar" target in the build.xml file to copy the files into the right spot. This change was part of the visualization check-in into the repository. Some documentation on Ant targets is available in the Apache Ant User Manual. This StackOverflow entry on adding resource files to a jar file with Ant has some useful hints as well.

Filed under: JPF No Comments
11May/110

Attaching Java Source Files to an Eclipse JRE

As per this ubuntu thread, the solution is:

  1. Window > Preferences > Java > Installed JREs.
  2. Double click on the JRE for which you want to attach source code.
  3. Under "JRE System Libraries", select rt.jar.
  4. Click on the "Source Attachment..." button.
  5. Supply a path to the source zip file, e.g. C:/Program Files/Java/jdk1.6.0_23/src.zip

That zip file was already on my system (since the JDK was installed) but some projects were using the JRE rather than the JDK. Therefore, I needed to specify the JDK source zip file for the JREs as well.

Filed under: Eclipse No Comments
6May/110

Creating a Rudimentary Tokenizer for the Delicious Data Set

In order to tokenize all the files in the Delicious data set to facilitate searching, I came up with the following basic scheme:

  1. Decode all XML entities leaving only the actual character they represent (see html_entity_decode).
  2. Strip out all HTML tags (see strip_tags).
  3. Convert everything to lower case (see strtolower).
  4. Strip out all numbers using the \b\d+\b regular expression (see preg_replace).
  5. Remove all non-alphabetic characters left.
  6. Collapse multiple spaces into one space.

This strategy translates very cleanly into PHP as shown below:

class WordTokenizer
{
	public static function tokenizeFile($file)
	{
		$contents = file_get_contents($file);
		
		$contents = html_entity_decode($contents);
		
		$contents = strtolower(strip_tags($contents));
		
		// strip numbers
		$contents = preg_replace('/\b\d+\b/', ' ', $contents);
		
		// strip punctuation symbols & non-alphabetic symbols
		$contents = preg_replace('/[^a-z]/', ' ', $contents);
		
		// replace multiple spaces with just one
		$contents = preg_replace('/\s+/', ' ', $contents);
		
		return $contents;
	}
}

The full source code is available in the information retrieval repository.

html_entity_decode
30Apr/110

Setting Up a Tag Database for the Delicious Data Set

One of my assignments this past semester involved performing search operations on the tags supplied with the delicious data set. I created a database named "contentindex" and then a table for the tag information from the XML file using the following SQL:

CREATE TABLE `tag` (
	`docname` VARCHAR(255) NOT NULL,
	`tagname` VARCHAR(255) NOT NULL,
	`weight` INT NOT NULL,
	UNIQUE index_doc_tag_pair_weight (`docname`, `tagname`)
)
ENGINE = InnoDB;

I set up an Eclipse Java project for the assignment. The MySQL Connector/J JDBC driver is required for interaction with the database. You will need to add it to the unzipped JAR file to the Delicious Eclipse project. To do so, right click on the project -> Build Path -> Add External Archives...

A list of XML parsers that could be used on the tag info XML file is available on the SAX website. I settled on the Xerces parser. Setup was a straightforward extraction into a folder, creation of a corresponding Java project in Eclipse, and finally adding the Xerces project to the Delicious Eclipse project's build path - Right click on the project -> Build Path -> Configure Build Path.... -> "Projects" Tab -> Add...

The source code for the XML tag info parser is available on my github repo.