Archive for February, 2009

Testing PDFs with Cucumber and Rails

Saturday, February 14th, 2009 by Alexander Lang

On a recent Rails project we have been working a lot with PDF documents. The application generates PDF invoices and account statements, places markers and comments into existing documents and also merges multiple single page PDFs into one. We do all this with a few PDF tools:

  • the PDF::Writer library for Ruby to generate PDFs
  • pdftk – to merge multiple PDFs into one or overlay them
  • pdftotext part of the xpdf library, to extract texts from PDFs

On OSX all of these can be installed via MacPorts. Debian has packages as well.

While PDF::Writer is a Ruby library pdftk is a command line tool. We simple call it using Kernel.system and check the return code via the $? variable.

unless system (“pdftk #{source_pdf} … output #{target_pdf}”)
raise PdfError(“pdftk returned error code #{$?}”)
end

With all this PDF processing the need for testing the contents of the generated documents arose. We have factored out all the PDF processing into a bunch of extra classes which we simply unit test with RSpec: make sure the parameters are passed to the command line correctly, that the right exception is thrown for each return code etc.

In addition to unit testing we also write customer driven acceptance tests with Cucumber, where we assert on a high level the outcome of certain actions. With HTML pages we can simply use the built-in steps that in turn use Webrat to parse the HTML like this:

Given a purchase over 200 EUR
And an invoice
When I go to the start page
And I follow “Invoice”
Then I should see “200 EUR”

Now in our case the invoice link links to a PDF but we still want to know what’s inside the document. The solution we came up with looks like this:

Given a purchase over 200 EUR
And an invoice
When I go to the start page
And I follow the PDF link “Invoice”
Then I should see “200 EUR”

What this does in the background is follow the link as usual, write the response into a temporary file, convert that to text using pdftotext and write the result back into the response. This way we can make assertions about the contents of the PDF almost as if it were an HTML page (except for tags of course). Here is the implementation:

When ‘I follow the PDF link “$label”‘ do |label|
click_link(label)
temp_pdf = Tempfile.new(‘pdf’)
temp_pdf << response.body
temp_pdf.close
temp_txt = Tempfile.new(‘txt’)
temp_txt.close
`pdftotext -q #{temp_pdf.path} #{temp_txt.path}`
response.body = File.read temp_txt.path
end

Bug fighting with Test Driven Development Follow-Up

Sunday, February 8th, 2009 by Thilo Utke

On Thursday I gave a talk about Bug fighting with TDD and how it will help you to write better software at the Ruby User Group in Berlin. Sorry, I won’t put the slides online, although people asked about it. I think they are a pretty useless collection of keywords and pictures without the talking. Instead I like to provide a collection of links to dive into the whole testing matters.

First a collection of tools you can choose from for various testing scenarios. They all have their pros and cons, it just comes down to which will do the job best for you.

Tools for Unit Testing

Tools for Mocking

Tools for Acceptance Testing

Tools to generate test objects

Misc tools

To save you the initial searching where to start, I collected some links where you find basic information about testing in general or with certain tools.

Getting started

For everyone who wants to get more into the subject, I drilled down my feed reader to collect some interesting blog posts on testing matters.

Opinions and In-sign

So much for my two cents to make the software world better.

Does your company need help with software testing? We can help: check out our new product Scene Investigation.

Hacking CouchDB, learning Erlang and testing in Javascript

Sunday, February 1st, 2009 by Alexander Lang

Last week was my first Erlang gig. I paired up with Jan Lehnardt (core committer of CouchDB). Our goal was to implement a new statistics module for CouchDB within one week. Jan knew more about the CouchDB code and Erlang, I contributed my knowledge about testing and pair programming.

Integration Testing in JavaScript

Since the testing infrastructure in CouchDB left some room for improvement we decided to introduce some new concepts. The existing tests were written in JavaScript penetrating the database via its HTTP/JSON interface. The tests were really long and coarse grained which lead to a problem: When one of the tests broke it wasn’t very easy to figure out which one. We didn’t want to replace or add too much to the existing, self-made test framework so we just added a bit of sugar: BDD style specs with meaningful names:

var response_count_tests = {
’should show the number of 404 responses’: function(name) {
// Given
restartServer();

// When
CouchDB.request(‘GET’, ‘/some_nonexistant_url’);

// Then
TEquals(1, CouchDB.stats_request(‘httpd’, ‘not_found’), name);
}
};

var all_tests = [response_count_tests, ...];

for(var i in all_tests) {
test_group = all_tests[i];
for(var j in test_group {
test_group[j](j);
}
}

The TEquals function is just our extension of an existing function T which is responsible for reporting errors when the assertion doesn’t match. By passing the name of the test spec we can now display that name along with the error message:

Error in ’should show the number of 404 responses’: ‘CouchDB.stats_request(‘httpd’, ‘not_found’)’ should be ‘1′ but was ‘0′.

Unit Testing in Erlang

In addition to those integration tests we wanted to add some unit testing on a lower level. Turns out the only unit testing framework for Erlang seems to be EUnit, which is a very barebones (at least when you come from something like RSpec) xUnit type of tool. You define a test by appending “_test” to your function name and then you have a bunch of assert statements at your disposal. Well that’s at least something and it worked. Here is an example:

should_return_zero_if_nothing_has_been_counted_yet_test() ->
test_setup(fun() ->
Result = ?MODULE:get({httpd, average_request_count}),
?assertEqual(< <"0.00">>, Result)
end).

We added the test_setup for starting and tearing down processes needed for our tests. Then we pass a fun with the actual test code. We got through the week pretty ok with this but I really missed a few things:

  • The test output was not red/green – ok I could live with this, at least for a while, if the output was at least formatted in a way that would allow me to spot the problems more easily. The error output of erlang seems to be a big mess in general.
  • No mocking/stubbing framework. I don’t even know if this is possible at all, at least I’m now aware of any way to change the behavior of Erlang code after compilation. (would that contradict Erlang’s share nothing philosophy?)
  • contexts for tests – I want to be able to group my tests, not sure how this would work in Erlang syntax as there are no nested namespaces or things you can use to group methods (except for modules, but I want something smaller and you can’t nest them). You would probably end up using lists of funs or something.
  • Erlang

    I’ve read the book, I’ve watched numerous screencasts but this was my first real life project in Erlang. Many people complain about its syntax and while this isn’t the most beautiful language I’ve ever seen I have to say it’s not really a big problem. I got used to it on the first day.

    What is really powerful in Erlang is pattern matching. Instead of simply assigning a value to a variable you can always match against the structure of that data to extract parts of it:

    Data = {“Hans”, “Wurst”, {born, 1980, 10, 23}}.
    {FirstName, _, {born, Year, _, Day} = Data.

    This example lets you pull out numerous details from the tuple in a single statement. You can do the same for case statements or when overloading functions. So this pattern matching is a powerful concept seen everywhere in Erlang code. It makes the code you write pretty dense which is cool on the one hand but sometimes also makes it hard to read so you have to be careful how much you want to cram into a single line of code.

    CouchDB is built using make – which felt like 1990. I’ve never used make much myself so I’m not really qualified to talk about it but we did spend a good amount of time in our MakeFile because of some whitespace problem, and we still haven’t managed to integrate our new tests into the build process. I’d love to see something like Rake being used instead.

    All in all this have been very pleasant first steps though. Pairing up for this turned out to be a very good idea, I have learned a lot about Erlang and am eager for more now.