
Okay, so my plans for weekly postings about the papers I've been reading have failed. Mostly because I haven't been reading as many these last few weeks as I had been. But I left off with a comment about testing in Haskell, and I figured I should pop that off my stack before going on to a different post.

One of the glorious things about having a pure, side-effect-free language is that you can state properties of your program as mathematical laws. When you, the programmer, know your code should adhere to some law like x + y == y + x, most languages force you to enumerate a few hundred unit tests stating the law over and over with different values for x and y. But what if you could just state the law itself and let your test harness make up the examples? While mathematics is obviously amenable to such laws, the realm of lawful programming extends far beyond mathematics.

Consider, for example, that you have some function to write which has one implementation that's entirely straightforward yet painfully slow, and another that's horrifically complex but fast. Naturally we'd like to use the fast version, but how can we know it's implemented correctly? The slow implementation is easy to decipher, so it's easy to spot-check for correctness. The easiest way to test the complex version is to implement both, dictate a law stating they're equal (i.e., they give the same output for the same inputs), and then let our test harness try a few thousand arbitrary inputs and check that the outputs are indeed the same.
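To make that concrete, here's a sketch with made-up names: a straightforward but quadratic list reverse, a trickier linear accumulator version, and the law saying they agree.

```haskell
-- Straightforward but O(n^2): repeatedly append to the end.
slowReverse :: [a] -> [a]
slowReverse []     = []
slowReverse (x:xs) = slowReverse xs ++ [x]

-- Trickier but O(n): build the result up in an accumulator.
fastReverse :: [a] -> [a]
fastReverse = go []
  where
    go acc []     = acc
    go acc (x:xs) = go (x:acc) xs

-- The law to hand to a test harness: both agree on every input.
prop_reverseAgree :: [Int] -> Bool
prop_reverseAgree xs = slowReverse xs == fastReverse xs
```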

If that idea sounds so outrageous, so unfathomable, so simple that it just might work, then QuickCheck is the test harness that made it happen. For those who've been around the Haskell community for a while, you may be wondering why I'd bring up this oft-trotted-out example of the beauty of pure functional programming. Well, other than introducing the new folks, QuickCheck ain't the only lawful kid on the block.
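For instance, the x + y == y + x law from earlier becomes a one-line property, and quickCheck invents the examples (a hundred random ones, by default):

```haskell
import Test.QuickCheck

-- The law, stated once; QuickCheck makes up the x's and y's.
prop_plusCommutes :: Int -> Int -> Bool
prop_plusCommutes x y = x + y == y + x

main :: IO ()
main = quickCheck prop_plusCommutes
```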

QuickCheck randomly generates values to feed into your laws. While this is good for informally testing code, there's a lot to beware of. For recursive types like lists or trees, the values QuickCheck generates are usually fairly small, so it may not actually be as rigorous as you think. Moreover, because the values are also random, often you'll be testing the same small inputs over and over. You can tweak the functions for generating these values to bias them toward larger ones, which is usually enough to feel secure that passing the trial is meaningful. But it would be nice to be more systematic about it all.

SmallCheck takes the same laws as QuickCheck but, instead of passing random inputs, it will exhaustively try all values up to some depth (e.g. list length, tree height). If there are any errors, even corner cases, chances are they'll show up within that search space. Unfortunately, most of the time the search space increases exponentially with the depth, which means you can only test up to a rather limited depth.
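A minimal SmallCheck sketch, exhaustively checking associativity of (++) over all Bool-lists up to depth 4:

```haskell
import Test.SmallCheck

-- Checked for *every* combination of lists up to the given depth,
-- so any corner case in that space is guaranteed to be hit.
prop_appendAssoc :: [Bool] -> [Bool] -> [Bool] -> Bool
prop_appendAssoc xs ys zs = (xs ++ ys) ++ zs == xs ++ (ys ++ zs)

main :: IO ()
main = smallCheck 4 prop_appendAssoc
```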

LazySmallCheck does the same thing as SmallCheck but it employs laziness to prune the search space and so it can search deeper, faster. Let's say you have a function like this: f xs = if null xs then False else (head xs == 1). Because the function only looks at the first element of the list, it's possible to simultaneously test every list with a given first element. In situations like this LazySmallCheck can test a potentially infinite number of values in constant time. (Eventually LazySmallCheck is intended to merge with the strict SmallCheck.)
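Using the f from above, here's a sketch of driving LazySmallCheck; the deliberately false property lets it report a counterexample while leaving the tail of the list undefined, since f never inspects it:

```haskell
import Test.LazySmallCheck

f :: [Int] -> Bool
f xs = if null xs then False else (head xs == 1)

-- Deliberately false: refuted by any list starting with 1.
-- LazySmallCheck can report a partially-defined counterexample
-- (first element 1, rest undefined) without enumerating the tails.
prop_fAlwaysFalse :: [Int] -> Bool
prop_fAlwaysFalse xs = not (f xs)

main :: IO ()
main = depthCheck 3 prop_fAlwaysFalse
```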

Speaking of laziness, it's often the case that being too strict means you're doing too much work and taking up too much memory. So how can we be sure a function we've written is as lazy as possible? It's still in an experimental state, but StrictCheck [hs, slides] aims to test just that. Moreover, if it finds that the function is not least-strict, it will suggest ways to make it less strict.

StrictCheck is built using the ChasingBottoms library. ChasingBottoms is certainly "unsafe" in the unsafePerformIO sense, because it gives you access to information about the state of Haskell's runtime environment (information which can allow you to break the semantics of the language), and it's GHC-specific (sorry Hugs, nhc98, Yhc, and Jhc folks), but when you're trying to test the laziness abstraction itself, sometimes you gotta murder a few people.
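A tiny sketch of what ChasingBottoms offers: its isBottom function evaluates a value to weak head normal form and reports whether that evaluation blew up with the kind of exception undefined and error raise (genuine nontermination is beyond it).

```haskell
import Test.ChasingBottoms (isBottom)

main :: IO ()
main = do
  -- Evaluating undefined throws, so it's detected as bottom.
  print (isBottom (undefined :: Int))
  -- A cons cell is already in weak head normal form, even though
  -- its tail is bottom, so the list as a whole is not bottom.
  print (isBottom (1 : undefined :: [Int]))
```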

But away from murder, let's get back to the lawful. One of the nice features of QuickCheck is that you can place guards on your laws to say "if these preconditions hold, then this law holds". One problem is that if the precondition is too restrictive, QuickCheck will exhaust its allocation of samples without finding enough valid inputs for testing the law. We can raise the allocation limit, but when the valid inputs are sparse, even that won't help enough. Enter SparseCheck. Using the paradigm of logic programming, SparseCheck lets you define generators of arbitrary values which pass the preconditions, without even trying any values which don't.

While formal validation is an excellent sledgehammer only pure languages can succeed at, sometimes it's not enough. Sometimes you have to deal with the RealWorld, and sometimes you want to write explicit regression tests to trap the known corner cases where bugs have been found. When it comes to such unit testing, HUnit is the framework to use.
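A minimal HUnit sketch, with the regression cases themselves made up for illustration:

```haskell
import Test.HUnit

-- A couple of explicit regression tests pinning down known corner
-- cases; (~:) names a test, (~?=) asserts actual == expected.
tests :: Test
tests = TestList
  [ "empty input" ~: reverse ([] :: [Int]) ~?= []
  , "singleton"   ~: reverse [42 :: Int]   ~?= [42]
  ]

main :: IO ()
main = runTestTT tests >> return ()
```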

This just in, delving one step further towards traditional testing methods, HTF offers a single framework for combining HUnit and QuickCheck along with doing file-based tests (`runMyProgram | diff - expectedOutput`).

So go forth and make agile with the test-driven development my friends!

Edit 2008.07.15: Oh, and don't forget HPC, the Haskell Program Coverage toolkit.

