6 Jun 2015

winterkoninkje: shadowcrane (clean) (Default)

bytestring-lexing 0.5.0

The bytestring-lexing package offers extremely efficient bytestring parsers for some common lexemes: namely integral and fractional numbers. In addition, it provides efficient serializers for (some of) the formats it parses.

As of version 0.3.0, bytestring-lexing offers the best-in-show parsers for integral values. (According to the Warp web server's benchmark of parsing the Content-Length field of HTTP headers.) And as of this version (0.5.0) it offers (to my knowledge) the best-in-show parser for fractional/floating numbers.

Changes since 0.4.3 (2013-03-21)

I've completely overhauled the parsers for fractional numbers.

The old Data.ByteString.Lex.Double and Data.ByteString.Lex.Lazy.Double modules have been removed, as has their reliance on Alex as a build tool. I know some users were reluctant to use bytestring-lexing because of that dependency, and forked their own version of bytestring-lexing-0.3.0's integral parsers. This is no longer an issue, and those users are requested to switch over to using bytestring-lexing.

The old modules are replaced by the new Data.ByteString.Lex.Fractional module. This module provides two variants of the primary parsers. The readDecimal and readExponential functions are very simple and should suffice for most users' needs. The readDecimalLimited and readExponentialLimited are variants which take an argument specifying the desired precision limit (in decimal digits). With care, the limited-precision parsers can perform far more efficiently than the unlimited-precision parsers. Performance aside, they can also be used to intentionally restrict the precision of your program's inputs.

Benchmarks

The Criterion output of the benchmark discussed below, can be seen here. The main competitors we compare against are the previous version of bytestring-lexing (which already surpassed text and attoparsec/scientific) and bytestring-read which was the previous best-in-show.

The unlimited-precision parsers provide 3.3× to 3.9× speedup over the readDouble function from bytestring-lexing-0.4.3.3, as well as being polymorphic over all Fractional values. For Float/Double: these functions have essentially the same performance as bytestring-read on reasonable inputs (1.07× to 0.89×), but for inputs which have far more precision than Float/Double can handle these functions are much slower than bytestring-read (0.30× 'speedup'). However, for Rational: these functions provide 1.26× to 1.96× speedup compared to bytestring-read.

The limited-precision parsers do even better, but require some care to use properly. For types with infinite precision (e.g., Rational) we can pass in an 'infinite' limit by passing the length of the input string plus one. For Rational: doing so provides 1.5× speedup over the unlimited-precision parsers (and 1.9× to 3× speedup over bytestring-read), because we can avoid intermediate renormalizations. Whether other unlimited precision types would see the same benefit remains an open question.

For types with inherently limited precision (e.g., Float/Double), we could either pass in an 'infinite' limit or we could pass in the actual inherent limit. For types with inherently limited precision, passing in an 'infinite' limit degrades performance compared to the unlimited-precision parsers (0.51× to 0.8× 'speedup'). Whereas, passing in the actual inherent limit gives 1.3× to 4.5× speedup over the unlimited-precision parsers. They also provide 1.2× to 1.4× speedup over bytestring-read; for a total of 5.1× to 14.4× speedup over bytestring-lexing-0.4.3.3!

Links

March 2017

S M T W T F S
    1234
567891011
12131415161718
19202122232425
262728293031 

Tags

Page generated 30 Mar 2017 12:38 pm
Powered by Dreamwidth Studios