winterkoninkje: shadowcrane (clean) (Default)

data-fin 0.1.0

The data-fin package offers the family of totally ordered finite sets, implemented as newtypes of Integer, etc. Thus, you get all the joys of:

data Nat = Zero | Succ !Nat

data Fin :: Nat -> * where
    FZero :: (n::Nat) -> Fin (Succ n)
    FSucc :: (n::Nat) -> Fin n -> Fun (Succ n)

But with the efficiency of native types instead of unary encodings.

Notes

I wrote this package for a linear algebra system I've been working on, but it should also be useful for folks working on Agda, Idris, etc, who want something more efficient to compile down to in Haskell. The package is still highly experimental, and I welcome any and all feedback.

Note that we implement type-level numbers using [1] and [2], which works fairly well, but not as nicely as true dependent types since we can't express certain typeclass entailments. Once the constraint solver for type-level natural numbers becomes available, we'll switch over to using that.

[1] Oleg Kiselyov and Chung-chieh Shan. (2007) Lightweight static resources: Sexy types for embedded and systems programming. Proc. Trends in Functional Programming. New York, 2–4 April 2007.

[2] Oleg Kiselyov and Chung-chieh Shan. (2004) Implicit configurations: or, type classes reflect the values of types. Proc. ACM SIGPLAN 2004 workshop on Haskell. Snowbird, Utah, USA, 22 September 2004. pp.33–44.

Links

winterkoninkje: shadowcrane (clean) (Default)

prelude-safeenum 0.1.0

The prelude-safeenum package offers a safe alternative to the Prelude's Enum class in order to render it safe. While we're at it, we also generalize the notion of enumeration to support types which can only be enumerated in one direction.

Description

The prelude-safeenum package offers an alternative to the notion of enumeration provided by the Prelude. For now it is just a package, but the eventual goal is to be incorporated into haskell prime. Some salient characteristics of the new type-class hierarchy are:

Removes partial functions
The Haskell Language Report section 6.3.4 defines pred, succ, fromEnum, and toEnum to be partial functions when the type is Bounded, but this is unacceptable. The new classes remove this problem by correcting the type signatures for these functions.
Generalizes the notion of enumeration
Rather than requiring that the type is linearly enumerable, we distinguish between forward enumeration (which allows for multiple predecessors) and backward enumeration (which allows for multiple successors).
Adds new functions: enumDownFrom, enumDownFromTo
One of the big problems with the partiality of pred is that there is no safe way to enumerate downwards since in the border case enumFromThen x (pred x) will throw an error rather than evaluating to [x] as desired. These new functions remove this problem.
Removes the requirement...
...that the enumeration order coincides with the Ord ordering (if one exists). Though, of course, it's advisable to keep them in sync if possible, for your sanity.
Ensures that the notion of enumeration is well-defined
This much-needed rigor clarifies the meaning of enumeration. In addition, it rules out instances for Float and Double which are highly problematic and often confuse newcomers to Haskell. Unfortunately, this rigor does render the instance for Ratio problematic. However, Ratio instances can be provided so long as the base type is enumerable (and Integral, naturally); but they must be done in an obscure order that does not coincide with Ord.
The obscure order required for well-defined enumeration of Ratio is provided.

Links

winterkoninkje: shadowcrane (clean) (Default)

unification-fd 0.7.0

The unification-fd package offers generic functions for single-sorted first-order structural unification (think of programming in Prolog, or of the metavariables in type inference)[1][2]. The library is sufficient for implementing higher-rank type systems à la Peyton Jones, Vytiniotis, Weirich, Shields, but bear in mind that unification variables are the metavariables of type inference— not the type-variables.

An effort has been made to make the package as portable as possible. However, because it uses the ST monad and the mtl-2 package it can't be H98 nor H2010. However, it only uses the following common extensions which should be well supported[3]:

Rank2Types
MultiParamTypeClasses
FunctionalDependencies -- Alas, necessary for type inference
FlexibleContexts       -- Necessary for practical use of MPTCs
FlexibleInstances      -- Necessary for practical use of MPTCs
UndecidableInstances   -- For Show instances due to two-level types

Changes (since 0.6.0)

This release is another major API breaking release. Apologies, but things are a lot cleaner now and hopefully the API won't break again for a while. The biggest change is that the definition of terms has changed from the previous:

    data MutTerm v t
        = MutVar  !v
        | MutTerm !(t (MutTerm v t))

To the much nicer:

    data UTerm t v
        = UVar  !v
        | UTerm !(t (UTerm t v))

The old mnemonic of "mutable terms" was inherited from the code's previous life implementing a logic programming language; but when I was playing around with implementing a type checker I realized that the names don't really make sense outside of that original context. So the new mnemonic is "unification terms". In addition to being a bit shorter, it should help clarify the separation of concerns (e.g., between unification variables vs lambda-term variables, type variables, etc.).

The swapping of the type parameters is so that UTerm can have instances for Functor, Monad, etc. This change should've been made along with the re-kinding of variable types back in version 0.6.0, since the UTerm type is the free monad generated by t. I've provided all the category theoretic instances I could imagine some plausible reason for wanting. Since it's free, there are a bunch more I haven't implemented since they don't really make sense for structural terms (e.g., MonadTrans, MonadWriter, MonadReader, MonadState, MonadError, MonadCont). If you can come up with some compelling reason to want those instances, I can add them in the future.

Since the order of type parameters to BindingMonad, UnificationFailure, Rank, and RankedBindingMonad was based on analogy to the order for terms, I've also swapped the order in all of them for consistency.

I've removed the eqVar method of the Variable class, and instead added an Eq superclass constraint. Again, this should've happened with the re-kinding of variables back in version 0.6.0. A major benefit of this change is that now you can use all those library functions which require Eq (e.g., many of the set-theoretic operations on lists, like (\\) and elem).

I've added new functions: getFreeVarsAll, applyBindingsAll, freshenAll; which are like the versions without "All", except they're lifted to operate over Foldable/Traversable collections of terms. This is crucial for freshenAll because it allows you to retain sharing of variables among the collection of terms. Whereas it's merely an optimization for the others (saves time for getFreeVarsAll, saves space for applyBindingsAll).

The type of the seenAs function has also changed, to ensure that variables can only be seen as structure rather than as any UTerm.

Thanks to Roman Cheplyaka for suggesting many of these changes.

Description

The unification API is generic in the type of the structures being unified and in the implementation of unification variables, following the two-level types pearl of Sheard (2001). This style mixes well with Swierstra (2008), though an implementation of the latter is not included in this package.

That is, all you have to do is define the functor whose fixed-point is the recursive type you're interested in:

    -- The non-recursive structure of terms
    data S a = ...

    -- The recursive term type
    type PureTerm = Fix S

And then provide an instance for Unifiable, where zipMatch performs one level of equality testing for terms and returns the one-level spine filled with pairs of subterms to be recursively checked (or Nothing if this level doesn't match).

    class (Traversable t) => Unifiable t where
        zipMatch :: t a -> t b -> Maybe (t (a,b))

The choice of which variable implementation to use is defined by similarly simple classes Variable and BindingMonad. We store the variable bindings in a monad, for obvious reasons. In case it's not obvious, see Dijkstra et al. (2008) for benchmarks demonstrating the cost of naively applying bindings eagerly.

There are currently two implementations of variables provided: one based on STRefs, and another based on a state monad carrying an IntMap. The former has the benefit of O(1) access time, but the latter is plenty fast and has the benefit of supporting backtracking. Backtracking itself is provided by the logict package and is described in Kiselyov et al. (2005).

In addition to this modularity, unification-fd implements a number of optimizations over the algorithm presented in Sheard (2001)— which is also the algorithm presented in Cardelli (1987).

  • Their implementation uses path compression, which we retain. Though we modify the compression algorithm in order to make sharing observable.
  • In addition, we perform aggressive opportunistic observable sharing, a potentially novel method of introducing even more sharing than is provided by the monadic bindings. Basically, we make it so that we can use the observable sharing provided by the modified path compression as much as possible (without introducing any new variables).
  • And we remove the notoriously expensive occurs-check, replacing it with visited-sets (which detect cyclic terms more lazily and without the asymptotic overhead of the occurs-check). A variant of unification which retains the occurs-check is also provided, in case you really need to fail fast.
  • Finally, a highly experimental branch of the API performs weighted path compression, which is asymptotically optimal. Unfortunately, the current implementation is quite a bit uglier than the unweighted version, and I haven't had a chance to perform benchmarks to see how the constant factors compare. Hence moving it to an experimental branch.

These optimizations pass a test suite for detecting obvious errors. If you find any bugs, do be sure to let me know. Also, if you happen to have a test suite or benchmark suite for unification on hand, I'd love to get a copy.

Notes and limitations

[1] At present the library does not appear amenable for implementing higher-rank unification itself; i.e., for higher-ranked metavariables, or higher-ranked logic programming. To be fully general we'd have to abstract over which structural positions are co/contravariant, whether the unification variables should be predicative or impredicative, as well as the isomorphisms of moving quantifiers around. It's on my todo list, but it's certainly non-trivial. If you have any suggestions, feel free to contact me. [back]

[2] At present it is only suitable for single-sorted (aka untyped) unification, à la Prolog. In the future I aim to support multi-sorted (aka typed) unification, however doing so is complicated by the fact that it can lead to the loss of MGUs; so it will likely be offered as an alternative to the single-sorted variant, similar to how the weighted path-compression is currently offered as an alternative. [back]

[3] With the exception of fundeps which are notoriously difficult to implement. However, they are supported by Hugs and GHC 6.6, so I don't feel bad about requiring them. Once the API stabilizes a bit more I plan to release a unification-tf package which uses type families instead, for those who feel type families are easier to implement or use. There have been a couple requests for unification-tf, so I've bumped it up on my todo list. [back]

References

Luca Cardelli (1987)
Basic polymorphic typechecking. Science of Computer Programming, 8(2): 147–172.
Atze Dijkstra, Arie Middelkoop, S. Doaitse Swierstra (2008)
Efficient Functional Unification and Substitution. Technical Report UU-CS-2008-027, Utrecht University.
Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, Mark Shields (2007)
Practical type inference for arbitrary-rank types. JFP 17(1). The online version has some minor corrections/clarifications.
Oleg Kiselyov, Chung-chieh Shan, Daniel P. Friedman, and Amr Sabry (2005)
Backtracking, Interleaving, and Terminating Monad Transformers. ICFP.
Tim Sheard (2001)
Generic Unification via Two-Level Types and Paramterized Modules, Functional Pearl. ICFP.
Tim Sheard and Emir Pasalic (2004)
Two-Level Types and Parameterized Modules. JFP 14(5): 547–587. This is an expanded version of Sheard (2001) with new examples.
Wouter Swierstra (2008)
Data types à la carte, Functional Pearl. JFP 18: 423–436.

Links

winterkoninkje: shadowcrane (clean) (Default)

bytestring-lexing 0.3.0

The bytestring-lexing package offers efficient reading and packing of common types like Double and Integral types.

Administrative Changes (since 0.2.1)

Change of maintainer. Don Stewart handed maintainership of the package over to myself when I voiced interest.

Change of repo type. The old repo for the package used Darcs-1 style patches. I've converted the repository to Darcs-2 hashed. This means that the new repository cannot exchange patches with the old Darcs-1 repo (or any other Darcs-2 conversions that may be floating around out there). So anyone who's interested in contributing should scrap their local copies and get the new repo.

Code Changes (since 0.2.1)

Added Data.ByteString.Lex.Integral which provides efficient implementations for reading and packing/showing integral types in ASCII-compatible formats including decimal, hexadecimal, and octal.

The readDecimal function in particular has been highly optimized. The new version is wicked fast and perfectly suitable for hot code locations like parsing headers for HTTP servers like Warp. In addition, attention has been paid to ensuring that parsing is efficient for larger than native types like Int64 on 32-bit systems (including 64-bit OS X), as well as Integer. The optimization of this function was done in collaboration with Erik de Castro Lopo, Vincent Hanquez, and Christoph Breitkopf following a blog post by Erik and ensuing discussion on Reddit.

A Criterion report is available for 64-bit Intel OS X running 32-bit GHC 6.12.1. The benchmark is included in the repo and has also been run on 64-bit GHC 7 systems, which differ primarily in not showing slowdown for Int64 vs Int (naturally). If you're curious about the different implementations:

  • readIntBS / readIntegerBS --- are the readInt and readInteger functions in Data.ByteString
  • readDecimalOrig (correct) --- was my original implementation, prior to collaboration with Erik, Vincent, and Christoph.
  • readIntegralMH (buggy) --- or rather a non-buggy version very much like it, is the implementation currently used in Warp.
  • readDecimal (current) --- is the current implementation used in this package.

Links

winterkoninkje: shadowcrane (clean) (Default)

exact-combinatorics 0.2.0

The exact-combinatorics package offers efficient exact computation of common combinatorial functions like the binomial coefficients and factorial. (For fast approximations, see the math-functions package instead.)

Description

Math.Combinatorics.Exact.Primes
Provides the prime numbers via Runciman's lazy wheel sieve algorithm. Provided here since efficient algorithms for combinatorial functions often require prime numbers. The current version memoizes the primes as an infinite list CAF, which could lead to memory leaks in long-running programs with irregular access to large primes. I'm looking into a GHC patch to allow resetting individual CAFs from within compiled programs so that you can explicitly decide when to un-memoize the primes. (In GHCi when you reload a module all the CAFs are reset. However, there's no way to access this feature from within compiled programs as yet.)
Math.Combinatorics.Exact.Binomial
Offers a fast computation of the binomial coefficients based on the prime factorization of the result. As it turns out, it's easier to compute the prime factorization of the answer than it is to compute the answer directly! And you don't even need the factorial function to do so. Albeit, with a fast factorial function, the naive definition of binomial coefficients gives this algorithm a run for its money.
Math.Combinatorics.Exact.Factorial
Offers a fast computation of factorial numbers. As Peter Luschny comments, the factorial function is often shown as a classic example of recursive functions, like addition of Peano numbers, however that naive way of computing factorials is extremely inefficient and does a disservice to those learning about recursion. The current implementation uses the split-recursive algorithm which is more than sufficient for casual use. I'm working on implementing the parallel prime-swing algorithm, which should be a bit faster still.

Links

winterkoninkje: shadowcrane (clean) (Default)

data-or 1.0.0

The data-or package offers a data type for non-exclusive disjunction. This is helpful for things like a generic merge function on sets/maps which could be union, mutual difference, etc. based on which Or value a function argument returns. Also useful for non-truncating zips (cf. zipOr) and other cases where you sometimes want an Either and sometimes want a pair.

Links

winterkoninkje: shadowcrane (clean) (Default)

pointless-fun 1.1.0

The pointless-fun package offers some common point-free combinators (common for me at least).

Description

Perhaps the most useful is that it packages up Matt Hellige's classic multicomposition trick[1]. These combinators allow you to easily modify the types of a many-argument function with syntax that looks like giving type signatures. For example,

    foo    :: A -> B -> C
    
    albert :: X -> A
    beth   :: Y -> B
    carol  :: C -> Z
    
    bar :: X -> Y -> Z
    bar = foo $:: albert ~> beth ~> carol

I've found this to be especially helpful for defining non-derivable type class instances for newtypes since it both abstracts away the plumbing and also makes explicit what you mean.

Other prevalent combinators include, (.:) for binary composition:

    (f .: g) x y = f (g x y)
    -- or,
    f .: g = curry (f . uncurry g)

This is the same as the common idiom (f .) . g but more easily extended to multiple uses, due to the fixity declaration.

And (.!) for function composition which calls the right-hand function eagerly; i.e., making the left-hand function strict in its first argument.

    (f .! g) x = f $! g x

This defines the composition for the sub-category of strict Haskell functions. If the Functor class were parameterized by the domain and codomain categories (e.g., a regular Functor f would be CFunctor (->) (->) f instead) then this would allow us to define functors CFunctor (->) (!->) f where fmap f . fmap g = fmap (f .! g)

[1] http://matt.immute.net/content/pointless-fun

Links

winterkoninkje: shadowcrane (clean) (Default)

unification-fd 0.5.0

The unification-fd package offers generic functions for first-order structural unification (think Prolog programming or Hindley–Milner type inference). I've had this laying around for a few years, so I figured I might as well publish it.

An effort has been made to try to make this package as portable as possible. However, because it uses the ST monad and the mtl-2 package it can't be H98 nor H2010. However, it only uses the following common extensions which should be well supported[1]:

Rank2Types
MultiParamTypeClasses
FunctionalDependencies
FlexibleContexts
FlexibleInstances
UndecidableInstances

[1] With the exception of fundeps which are notoriously difficult to implement. However, they are supported by Hugs and GHC 6.6, so I don't feel bad about requiring it. Once the API stabilizes a bit more I plan to release a unification-tf package which uses type families instead, for those who feel type families are easier to implement or use.

Description

The unification API is generic in the type of the structures being unified and in the implementation of unification variables, following the two-level types pearl of Sheard (2001). This style mixes well with Swierstra (2008), though an implementation of the latter is not included in this package.

That is, all you have to do is define the functor whose fixed-point is the recursive type you're interested in:

    -- The non-recursive structure of terms
    data S a = ...
    
    -- The recursive term type
    type PureTerm = Fix S

And then provide an instance for Unifiable, where zipMatch performs one level of equality testing for terms and returns the one-level spine filled with pairs of subterms to be recursively checked (or Nothing if this level doesn't match).

    class (Traversable t) => Unifiable t where
        zipMatch :: t a -> t b -> Maybe (t (a,b))

The choice of which variable implementation to use is defined by similarly simple classes Variable and BindingMonad. We store the variable bindings in a monad, for obvious reasons. In case it's not obvious, see Dijkstra et al. (2008) for benchmarks demonstrating the cost of naively applying bindings eagerly.

There are currently two implementations of variables provided: one based on STRefs, and another based on a state monad carrying an IntMap. The former has the benefit of O(1) access time, but the latter is plenty fast and has the benefit of supporting backtracking. Backtracking itself is provided by the logict package and is described in Kiselyov et al. (2005).

In addition to this modularity, unification-fd implements a number of optimizations over the algorithm presented in Sheard (2001)— which is also the algorithm presented in Cardelli (1987).

  • Their implementation uses path compression, which we retain. Though we modify the compression algorithm in order to make sharing observable.
  • In addition, we perform aggressive opportunistic observable sharing, a potentially novel method of introducing even more sharing than is provided by the monadic bindings. Basically, we make it so that we can use the observable sharing provided by the previous optimization as much as possible (without introducing any new variables).
  • And we remove the notoriously expensive occurs-check, replacing it with visited-sets (which detect cyclic terms more lazily and without the asymptotic overhead of the occurs-check). A variant of unification which retains the occurs-check is also provided, in case you really need to fail fast for some reason.
  • Finally, a highly experimental branch of the API performs weighted path compression, which is asymptotically optimal. Unfortunately, the current implementation is quite a bit uglier than the unweighted version, and I haven't had a chance to perform benchmarks to see how the constant factors compare. Hence moving it to an experimental branch.

I haven't had a chance to fully debug these optimizations, though they pass some of the obvious tests. If you find any bugs, do be sure to let me know. Also, if you happen to have a test suite or benchmark suite for unification on hand, I'd love to get a copy.

References

Luca Cardelli (1987)
Basic polymorphic typechecking. Science of Computer Programming, 8(2): 147–172.
Atze Dijkstra, Arie Middelkoop, S. Doaitse Swierstra (2008)
Efficient Functional Unification and Substitution, Technical Report UU-CS-2008-027, Utrecht University.
Oleg Kiselyov, Chung-chieh Shan, Daniel P. Friedman, and Amr Sabry (2005)
Backtracking, Interleaving, and Terminating Monad Transformers, ICFP.
Tim Sheard (2001)
Generic Unification via Two-Level Types and Paramterized Modules, Functional Pearl, ICFP.
Tim Sheard and Emir Pasalic (2004)
Two-Level Types and Parameterized Modules. JFP 14(5): 547–587. This is an expanded version of Sheard (2001) with new examples.
Wouter Swierstra (2008)
Data types a la carte, Functional Pearl. JFP 18: 423–436.

Links

winterkoninkje: shadowcrane (clean) (Default)

stm-chans 1.0.0

The stm-chans package offers a collection of channel types, similar to TChan but with additional features. In particular it offers these types:

TBChan: Bounded FIFO channels.
When the channel is full, writers will block/retry. This ensures that the writers do not get too far ahead of the readers, which helps to make sure that memory and cpu resources are used responsibly.
TMChan: Closeable FIFO channels.
This is like TChan (Maybe a) but with a monotonicity guarantee that once Nothing is returned all future reads will be Nothing as well.
TBMChan: Bounded Closeable FIFO channels.
This combines the capabilities of TBChan and TMChan.

In addition, the stm-chans package offers a (partial) compatibility layer for some API improvements still making their way into the stm package[1]. These new functions include:

tryReadTChan :: TChan a -> STM (Maybe a)
A version of readTChan which does not retry. Instead it returns Nothing if no value is available.
peekTChan :: TChan a -> STM a
Get the next value from the TChan without removing it, retrying if the channel is empty.
tryPeekTChan :: TChan a -> STM (Maybe a)
A version of peekTChan which does not retry. Instead it returns Nothing if no value is available.

Links

winterkoninkje: shadowcrane (clean) (Default)

unix-bytestring 0.3.2

The unix-bytestring package offers a full selection of Unix/Posix-specific functions for reading and writing ByteStrings to file descriptors.

The Story Behind It All

A while back I needed a ByteString-based version of the System.Posix.IO String-based API. I coded up the new versions and submitted a patch to the unix package for adding them in. But apparently, noone's much pleased with the System.Posix.IO API! In the ensuing discussion folks brought up their specific complaints and offered suggestions. So now, the couple of functions have become enough to release as its own package— with the hope that one day it'll be rolled into the unix package.

That's where you come in. The current API is more than good enough for anything I've needed, so I need your feedback on what you want from a ByteString-based Unix/Posix I/O library.

I've looked through Roel van Dijk's reverse dependency analysis to find the folks currently using the unix package. The maintainers of the following packages may be particularly interested in unix-bytestring:

HFuse has complained about the lack of bindings for pread(2) and pwrite(2), which are included in unix-bytestring. The only standing complaint I haven't addressed is one expressed in the iteratee, iteratee-mtl, and liboleg packages which would prefer a return type of (Either Errno _) instead of throwing exceptions.

Links