Monthly Archives: February 2009

Parser in action

Here is a screen shot of my parser in action. I have run a sample log that was generated by some of my patches:

Isn’t that just the most interesting thing you have ever seen ;)

Some lines spill over, but that isn’t highly important right now. There is also tons of bogus information.

And now for something completely different

Today I did my second demo for Dave. I showed him the beginnings of my parser and went over my plans. My parser currently has a couple of issues with some tests and is waiting for timestamps from Reftests and Mochitests. I have also posted this to my bug as a WIP link. Dave and I discussed how my parser is going to get invoked. Right now my plan is to have my parser work both as a standalone program and as a library. As a standalone program it will be able to parse output from tests, and as a library it will be integrated into the special Mozilla BuildBot steps to parse their output. I have noticed that there is a Log File Observer in BuildBot which might prove fruitful. Eventually the parsed data will go into a DB, which will likely be SQLite. Dave suggested that I queue up all my transactions in memory and then do one large commit so I don’t add too much overhead. I feel that I am on track as far as the parser goes for scanning passing unit tests. I wouldn’t mind writing some simple tests that I know are going to fail so that I can verify my parser’s handling of real failed steps.
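Dave’s suggestion about batching could look roughly like the sketch below. It is only an illustration using Python’s built-in sqlite3 module; the table name, the column names and the store_results helper are made up for this example and are not part of my parser.

```python
import sqlite3


def store_results(conn, rows):
    """Write all queued rows in one transaction with a single commit.

    The 'results' table and its columns are hypothetical placeholders.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "timestamp INTEGER, test TEXT, status TEXT, message TEXT)"
    )
    # executemany sends every queued row before anything is committed.
    conn.executemany(
        "INSERT INTO results (timestamp, test, status, message) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    # One commit for the whole batch instead of one per row.
    conn.commit()


# Queue parsed results in memory while reading the log...
pending = [
    (1234489225, "test_bug321706.js", "TEST-PASS", "all tests passed"),
    (1234489225, "test_bug331825.js", "TEST-PASS", "all tests passed"),
]

# ...then flush them in one go. An in-memory database keeps the example
# self-contained; a real run would pass a file path to connect().
connection = sqlite3.connect(":memory:")
store_results(connection, pending)
```

Committing once avoids paying the per-transaction cost for every parsed line, which is where most of the overhead would come from.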

On the patch front I am having more success with adding timestamps. My first patch got an r- for a number of totally valid reasons. Ted, who did the review, mentioned that there is little value in being able to specify the timestamp program and format. I think that is right, so I have decided that I will be using Unix time as my format. This makes sense because every language I need to interface with supports this format (JavaScript new Date().getTime()/1000, Python time.time() and Bash date +%s). This makes my life a TON easier! Ted recommended renaming --enable-timestamp to --enable-unit-test-timestamps as the former is very generic sounding. This will be rolled into one big hyper-mega-super-patch I hope to have done very soon. I asked if there was even any value in being able to disable timestamps. Ted said he is OK with that, but that I should talk to the Mochitest and Reftest owners about the possibility.

Regarding the option of enabling/disabling unit test timestamps, I spoke to Dave and Nino about this and found out that I can have JavaScript files preprocessed by adding them to the “EXTRA_PP_COMPONENTS” section of Makefile.in. This would allow me to use standard preprocessor directives in JavaScript, with which I could surround my special printing.

This morning I finally got around to asking about the anomalous testing output from the feed unit tests and it seems that it is indeed a bug. I have filed bug 479976 along with a patch which corrects this and adds timestamps to unit test output.

I have had a lot of fun working on this over the past two weeks and feel that I have made some progress. I am also LOVING Python. It is a really neat language. I plan on looking at some sort of Python web framework for the web interface to my parser. Anyone have any recommendations? Ideally it would be something already in use by Mozilla to simplify deployment.

Unit Tests

After reading Benjamin Smedberg’s blog post on pymake, I realised that I should be writing unit tests for my parser. I didn’t realise that Python had a built-in unit testing framework. I am glad I found this out while still in the early stages of my Python learning!
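For anyone else who hasn’t seen it, here is a minimal sketch of what a test with the built-in unittest module could look like. The parse_status helper and the “STATUS | test | message” line layout are assumptions made up for this example; they are not my parser’s actual code.

```python
import unittest


def parse_status(line):
    """Hypothetical helper: split 'STATUS | test | message' into a tuple.

    Returns None for lines that do not look like a test result.
    """
    parts = [part.strip() for part in line.split(" | ", 2)]
    if len(parts) != 3 or not parts[0].startswith("TEST-"):
        return None
    return tuple(parts)


class ParseStatusTest(unittest.TestCase):
    def test_passing_line(self):
        line = "TEST-PASS | test_bug321706.js | all tests passed"
        self.assertEqual(
            parse_status(line),
            ("TEST-PASS", "test_bug321706.js", "all tests passed"),
        )

    def test_unrelated_line_is_ignored(self):
        self.assertIsNone(parse_status("make: Entering directory"))
```

Saved in a file, these tests can be run with python -m unittest followed by the module name.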

0.6 – The end of ridiculousness

I give up on my 1/7th release numbers. I will be following the standard numbers from now on.

0.6

This release involved getting a patch submitted to Bugzilla, doing a ton of thinking and starting to write my parser.

Initial Patch Submitted and Thinking

I have submitted a patch to Bugzilla on my bug. This patch:

  • Adds mozconfig options to enable timing, choose the timing program and set the timestamp format
  • Puts timestamps using the above options into some tests. Right now, it does this for XPCShell tests that use test_all.sh and for the check:: make target (minus the feeds test suite)

Working on this patch has shown me how varied the testing frameworks are. I am a little nervous about continuing along this path without confirmation of my approach. Originally I was hoping to customize the timestamp formatting, but I now realise that there is very little payoff for a ton of effort. From now on, I am going to use epoch-based times exclusively. This makes things easier, because continuing with configurable formats would mean getting a make option visible in JavaScript, which I don’t know how to do.

My other option would be to modify BuildBot. This option would involve overriding the existing LogFile class to have one which did timing. This new type of LogFile would only be used on Mozilla testing buildsteps. This would make it a lot easier to add the timestamps but would be harder to parse and would be useless outside of BuildBot.

If you are at all familiar with Mozilla’s testing framework I’d love your opinion.

Parser

I have started writing my log file parser. I have most standard output being parsed fine. Right now, everything but Mochitest based tests and some obscure tests are able to be parsed. I will be making a screen capture this week to highlight my parser for my second Demo.

It works by reading all the lines in the specified file into an array. It then goes through each line and decides if it is part of a unit test. If it is, it is narrowed down to the type of test it is (reftest, xpcshell test, etc). Once that is done, the line is sent to the initialiser of the class for that type of test’s output. These classes understand each type of test output and parse accordingly. Once this is done, the parser spits out a list of output as specified in my bug.
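The flow described above can be sketched roughly as follows. This is a simplified illustration, not my actual code: the class names, the HANDLERS table and the marker strings used to recognise each test type are all assumptions made for the example.

```python
class XpcshellResult:
    """Hypothetical class that would understand xpcshell test output."""
    kind = "xpcshell"

    def __init__(self, line):
        self.line = line.strip()


class ReftestResult:
    """Hypothetical class that would understand reftest output."""
    kind = "reftest"

    def __init__(self, line):
        self.line = line.strip()


# Assumed markers: a substring that identifies each type of test line.
HANDLERS = [
    ("xpcshell", XpcshellResult),
    ("REFTEST", ReftestResult),
]


def classify(line):
    """Return the class for this line's test type, or None if the line
    is not part of a unit test."""
    for marker, handler in HANDLERS:
        if marker in line:
            return handler
    return None


def parse_lines(lines):
    """Send every recognised line to the initialiser for its test type."""
    results = []
    for line in lines:
        handler = classify(line)
        if handler is not None:
            results.append(handler(line))
    return results


def parse_file(path):
    """Read the whole log into a list of lines, then parse it."""
    with open(path) as log:
        return parse_lines(log.readlines())
```

Keeping the type detection in one table means supporting another kind of test only needs a new class and one more entry in HANDLERS.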

Some things this parser doesn’t do

  • There are obscurely formatted tests early on in the “make -k check” output. They aren’t looked at
  • Mochitest output is not parsed because there are no timestamps.
  • The Log class doesn’t have information about itself and won’t until it is injected into the log somehow
  • Some test failures generate multi-line output bounded by “<<<<<<” and “>>>>>>” lines. This is not being handled right now, but I have an idea that will take care of it
  • Parse all non-epoch date/time information into Unix time

Sample output

1/1 | 20090213-014025 | ../../_tests/xpcshell-simple/test_necko/unit/test_bug321706.js | local-osx | TEST-PASS | all tests passed | http://www.example.com/waterfall
1/1 | 20090213-014025 | ../../_tests/xpcshell-simple/test_necko/unit/test_bug331825.js | local-osx | TEST-PASS | all tests passed | http://www.example.com/waterfall
1/1 | 20090213-014025 | ../../_tests/xpcshell-simple/test_necko/unit/test_bug336501.js | local-osx | TEST-PASS | all tests passed | http://www.example.com/waterfall
1/1 | 20090213-014025 | ../../_tests/xpcshell-simple/test_necko/unit/test_bug337744.js | local-osx | TEST-PASS | all tests passed | http://www.example.com/waterfall

The 1/1, local-osx and url are placeholder values.
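To show how a line like the ones above could be read back in, here is a small sketch that splits it into fields and converts the date/time field into Unix time. The field names and the Record type are made up for this example, and the conversion assumes the stamp is in UTC, which may not match the machine that produced the log.

```python
import calendar
import time
from collections import namedtuple

# Field names are guesses for illustration only.
Record = namedtuple(
    "Record", "position timestamp test machine status message url"
)


def to_unix_time(stamp):
    """Convert 'YYYYMMDD-HHMMSS' to Unix seconds, assuming UTC."""
    return calendar.timegm(time.strptime(stamp, "%Y%m%d-%H%M%S"))


def parse_output_line(line):
    """Split one ' | ' separated output line into a Record."""
    fields = [field.strip() for field in line.split(" | ")]
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    position, stamp, test, machine, status, message, url = fields
    return Record(
        position, to_unix_time(stamp), test, machine, status, message, url
    )


record = parse_output_line(
    "1/1 | 20090213-014025 | "
    "../../_tests/xpcshell-simple/test_necko/unit/test_bug321706.js | "
    "local-osx | TEST-PASS | all tests passed | "
    "http://www.example.com/waterfall"
)
print(record.timestamp, record.status)
```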

I would like to investigate using a parser generator at some point in the coming week.

Project Update

I have put my first patch for timestamps on Mozilla’s Bugzilla. This is the option that changes the test code itself to emit timing information. Again, the other option is to modify BuildBot. This patch would still require a change to insert machine information into the log file, so we can figure out which machines are failing. I am at a crossroads and I need to decide if I am going to do this with a BuildBot-specific approach or modify the actual testing infrastructure.

My options include:

1) In BuildBot Subclass LogFile to TimedLogFile:
This option involves subclassing LogFile to a specialized LogFile which will implement the timing steps. Rather than adding this in the initial LogFile implementation, the subclassing removes the breakage of parallel, multi-line build task output. This subclass would only be used by specialized build steps.
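A very rough sketch of the idea is below. To keep it runnable on its own, LogFile here is a tiny stand-in that only models addStdout; it is not BuildBot’s real class, whose constructor and other methods are different. The clock parameter and the line buffering are also assumptions of this example.

```python
import time


class LogFile:
    """Stand-in for BuildBot's LogFile; only addStdout is modelled."""

    def __init__(self):
        self.chunks = []

    def addStdout(self, text):
        self.chunks.append(text)


class TimedLogFile(LogFile):
    """Prefix every complete stdout line with a Unix timestamp."""

    def __init__(self, clock=time.time):
        super().__init__()
        self._clock = clock
        self._partial = ""  # text received since the last newline

    def addStdout(self, text):
        # Output arrives in arbitrary chunks, so buffer until a newline
        # is seen and stamp whole lines only.
        self._partial += text
        lines = self._partial.split("\n")
        self._partial = lines.pop()
        for line in lines:
            stamped = "%d %s\n" % (self._clock(), line)
            super().addStdout(stamped)


log = TimedLogFile()
log.addStdout("TEST-PASS | test_bug3217")
log.addStdout("06.js | all tests passed\n")
print(log.chunks)
```

Because only the specialized build steps would use the subclass, ordinary steps would keep writing to the plain LogFile and their output would stay untouched.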

2) In BuildBot Subclass LogFile to DatabaseLogFile:
This option involves intercepting LogFile addStdout() calls and sending them straight to a database. This has the disadvantage of requiring DB libraries on the client machines. I don’t know if it would be possible to route the database connection through the slaveport or not, so it might even require a second socket connection. This option has the [advantage|disadvantage] of not changing the actual build output. This could also lead to simpler parser writing because the parser would only need to know where the database is, with no files involved.

3) Modify test harnesses:
This is what I have started doing with my patch. It requires a lot of changes all over the place so it might be hard to get landed. This also changes the build output for everyone. It also requires a single timestamp format/option that is shared across C, JS and Python.

4) Investigate pymake. This is not really an option until pymake is more mainstream.

I need to pick one of these methods very soon but I would like to get someone else’s opinion on this first.