The bytestring-lexing package offers extremely efficient bytestring parsers for some common lexemes: namely integral and fractional numbers. In addition, it provides efficient serializers for (some of) the formats it parses.
As of version 0.3.0, bytestring-lexing offers the best-in-show parsers for integral values. (According to the Warp web server's benchmark of parsing the Content-Length field of HTTP headers.) And as of this version (0.5.0) it offers (to my knowledge) the best-in-show parser for fractional/floating numbers.
Changes since 0.4.3 (2013-03-21)
I've completely overhauled the parsers for fractional numbers. The old Data.ByteString.Lex.Double and Data.ByteString.Lex.Lazy.Double modules have been removed, as has their reliance on Alex as a build tool. I know some users were reluctant to use bytestring-lexing because of that dependency, and forked their own version of bytestring-lexing-0.3.0's integral parsers. This is no longer an issue, and those users are requested to switch over to using bytestring-lexing.
The old modules are replaced by the new Data.ByteString.Lex.Fractional module. This module provides two variants of the primary parsers. The readExponential functions are very simple and should suffice for most users' needs. The readExponentialLimited functions are variants which take an argument specifying the desired precision limit (in decimal digits). With care, the limited-precision parsers can perform far more efficiently than the unlimited-precision parsers. Performance aside, they can also be used to intentionally restrict the precision of your program's inputs.
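As a sketch of how the two variants might be used (the module and function names are from the release notes above, but the exact type signatures here are my assumption, based on the usual bytestring-lexing convention of returning the parsed value together with the remainder of the input):

```haskell
-- Sketch only; assumes bytestring-lexing >= 0.5.0 with
--   readExponential        :: Fractional a => ByteString -> Maybe (a, ByteString)
--   readExponentialLimited :: Fractional a => Int -> ByteString -> Maybe (a, ByteString)
import qualified Data.ByteString.Char8 as BS
import Data.ByteString.Lex.Fractional (readExponential, readExponentialLimited)

-- Unlimited precision: consume as many digits as the input provides.
parseDouble :: BS.ByteString -> Maybe (Double, BS.ByteString)
parseDouble = readExponential

-- Limited precision: stop caring about digits beyond the given limit.
-- 17 significant decimal digits are enough to round-trip any Double.
parseDoubleLimited :: BS.ByteString -> Maybe (Double, BS.ByteString)
parseDoubleLimited = readExponentialLimited 17

main :: IO ()
main = do
    print (parseDouble        (BS.pack "2.5e3 rest"))
    print (parseDoubleLimited (BS.pack "1.25"))
```

Both parsers return `Nothing` on malformed input and otherwise hand back the unconsumed suffix, so they compose naturally with other bytestring lexers.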
The Criterion output of the benchmarks discussed below can be seen here. The main competitors we compare against are the previous version of bytestring-lexing (which already surpassed text and attoparsec/scientific) and bytestring-read, which was the previous best-in-show.
The unlimited-precision parsers provide 3.3× to 3.9× speedup over the readDouble function from bytestring-lexing-0.4.3.3, as well as being polymorphic over all Fractional values. For Double: these functions have essentially the same performance as bytestring-read on reasonable inputs (1.07× to 0.89×), but for inputs which have far more precision than Double can handle these functions are much slower than bytestring-read (0.30× 'speedup'). However, for Rational: these functions provide 1.26× to 1.96× speedup compared to bytestring-read.
The limited-precision parsers do even better, but require some care to use properly. For types with infinite precision (e.g., Rational) we can pass in an 'infinite' limit by passing the length of the input string plus one. For Rational: doing so provides 1.5× speedup over the unlimited-precision parsers (and 1.9× to 3× speedup over bytestring-read), because we can avoid intermediate renormalizations. Whether other unlimited-precision types would see the same benefit remains an open question.
For types with inherently limited precision (e.g., Double), we could pass in either an 'infinite' limit or the actual inherent limit. Passing in an 'infinite' limit degrades performance compared to the unlimited-precision parsers (0.51× to 0.8× 'speedup'), whereas passing in the actual inherent limit gives 1.3× to 4.5× speedup over the unlimited-precision parsers. The inherent-limit parsers also provide 1.2× to 1.4× speedup over bytestring-read; for a total of 5.1× to 14.4× speedup over bytestring-lexing-0.4.3.3!
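The two ways of choosing a limit can be sketched as follows (again assuming the `readExponentialLimited` signature, with the limit as the first argument; the choice of 17 for Double is my own, justified by the fact that 17 significant decimal digits suffice to round-trip any Double):

```haskell
-- Sketch only; assumes
--   readExponentialLimited :: Fractional a => Int -> ByteString -> Maybe (a, ByteString)
import qualified Data.ByteString.Char8 as BS
import Data.ByteString.Lex.Fractional (readExponentialLimited)

-- Infinite precision (Rational): an 'infinite' limit is the input length
-- plus one, so no digit of the input is ever discarded.
parseRational :: BS.ByteString -> Maybe (Rational, BS.ByteString)
parseRational s = readExponentialLimited (BS.length s + 1) s

-- Inherently limited precision (Double): pass the actual inherent limit
-- rather than an 'infinite' one, which is what yields the big speedups.
parseDouble :: BS.ByteString -> Maybe (Double, BS.ByteString)
parseDouble = readExponentialLimited 17
```

The asymmetry is the point of the benchmark numbers above: the same function is a win for Rational with an 'infinite' limit, but for Double it only pays off when given the tight, type-appropriate limit.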
- Homepage: http://code.haskell.org/~wren/
- Hackage: http://hackage.haskell.org/package/
- Darcs: http://community.haskell.org/~wren/
- Haddock: Darcs version