### John C. Reynolds (1935–2013)

12:43 pm

For those who haven't heard, John Reynolds (of separation logic fame) just passed on. I don't have anything particular to add to what's mentioned on LtU and Neel's blog, but I recommend reading them both.

### PSA: deduction proofs in LaTeX

01:29 am

Until recently I've been using semantic for typesetting my deduction rules and deduction proofs. But in writing the last few papers I've become acutely aware of its limitations: in particular, it offers no way to do dotted or doubled inference lines. After looking around for a while I found bussproofs; and I don't think I'll ever look back.

The semantic package is solid enough, and follows the traditional structural/nested style of macros. At first I was quite dubious of bussproofs' stack-machine style of macros, but after typing up a few proofs in it, I'm convinced. Because you don't have all the nested braces, it's much easier to restructure, clean up, or copypaste chunks of proofs. It's a bit verbose, all told, but that's easy enough to remedy by writing your own layer of macros on top of it. The one downside to bussproofs (compared to semantic) is that it doesn't support arbitrarily many premisses, and it doesn't allow vertical orientation of premisses. So if your typing rules are complex enough, you may need to stick with semantic; but for doing proofs in basic logics, or for doing CCG derivations, bussproofs is where it's at.
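For the curious, here's roughly what a small derivation looks like in the stack-machine style, dotted and doubled lines included (the particular judgments are just filler):

```latex
\documentclass{article}
\usepackage{bussproofs}
\begin{document}
\begin{prooftree}
  \AxiomC{$\Gamma \vdash A$}
  \AxiomC{$\Gamma \vdash A \supset B$}
  \BinaryInfC{$\Gamma \vdash B$}
  \doubleLine % a doubled inference line
  \UnaryInfC{$\Gamma \vdash B \wedge B$}
  \dottedLine % a dotted inference line, e.g. for admissible rules
  \UnaryInfC{$\Gamma \vdash B$}
\end{prooftree}
\end{document}
```

Each \AxiomC pushes onto the stack and each inference command pops its premisses off, which is exactly what makes chunks of proof so easy to move around.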

### Idle musings on CPPOs

04:36 pm

I was thinking recently about the design of a language which makes pointedness explicit in the type theory. That is, there are benefits to working in provably-total languages, especially when it comes to the equivalence theorems you can prove; however, there are also benefits to working in non-total languages (namely, not needing to prove totality). Previous approaches to this sort of thing have been based in domain theory; in particular, they've been based on pointed cpos (cppos)— i.e., a cpo with a least element, representing the total lack of information, which is often conflated with nontermination. But there's a problem with this.

The problem is, there are some cpos which have a least element (and hence are technically cppo) but where that least element cannot or should not be conflated with nontermination. In particular, product domains are formed by the tupling of elements from various domains. Thus, given A = \Pi_{i \in I} B_i, if all the B_i are cppo then it follows that A is cppo: the least element of A is the tuple taking the least element of each of the B_i domains. However, the least element of A is not nontermination— it's a tuple (whose components may be nonterminating). That is, we can perform eta expansion of A, without running into the various issues that arise for Haskell's tuples. In the limit we can consider the product of the empty family of domains (aka, the domain with a single uninformative value), and this should not be confused with the result of lifting the empty domain (aka, the domain with a single non-terminating "value").
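The "issues that arise for Haskell's tuples" can be seen directly: Haskell's tuples are lifted, so the bottom of the pair type is distinguishable from the pair of bottoms. A minimal sketch (the function name `outer` is mine):

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Haskell's tuples are lifted: undefined :: (a, b) sits strictly below
-- (undefined, undefined). Pattern matching forces the outer constructor,
-- so it can tell the two apart; if the least element of the product
-- domain were eta-expandable, it couldn't.
outer :: (Int, Int) -> String
outer p = case p of (_, _) -> "matched"

main :: IO ()
main = do
  r1 <- try (evaluate (outer (undefined, undefined))) :: IO (Either SomeException String)
  r2 <- try (evaluate (outer undefined)) :: IO (Either SomeException String)
  putStrLn (either (const "diverged") id r1)  -- matched
  putStrLn (either (const "diverged") id r2)  -- diverged
```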

So clearly, if we are to use domain theory to reason about termination, we must distinguish cppos which are pointed by nontermination from cppos whose least element is eta-expandable. However, I'm not sure that this distinction has been made before in the literature...

### (no subject)

07:54 pm

Last night, Licia made me the happiest girl in the world.

### Tear down this wall

02:00 am

So, a friend of mine wrote a scathing review of the ACM's recent refusal to open access. As he mentions, the ACM claims to be a non-profit organization with the purported mission of fostering the open interchange of information, and yet it refuses to open access because that would ~~cut into the bottom line~~ be "too hard". This is absurd when USENIX, ACL, NIPS, JMLR, etc are all open; to say nothing of the arxiv. If the ACM publicly admitted to being a for-profit organization that would be one thing. I'd still be upset with them, but at least they'd be honest. Until the ACM updates its out-of-date practices, I will not support them because they are not a professional organization that represents the ethical standards of the computer science community I am a part of. If you're also part of this community, then you can help tear down this paywall.

### Upping the signal

08:52 pm

We are not the mythical Hollywood Spartans Aaron and I laughed at together many years ago after watching 300. We were slowly, reluctantly, falling in love after both of us had rejected dayjob life at Wired. That night we were both amused and just a little bit horrified by this primitive notion of what makes virtue; by the fact that the heroes of this story would have killed us both as children. Aaron and I were part of a culture that prides itself on not slaughtering deformed or sickly children, or leaving unwanted babies to die of exposure. Instead we were the people that could go to the moon and build ADA ramps. We hold people like Stephen Hawking up as paragons, not of their virtues, but of ours. We contend that we live better and more wisely for keeping brilliant minds in useful arts and sciences not only alive, but offering them a place where they can thrive and enrich us all.

And we are lying.

The great enemy of the truth is very often not the lie — deliberate, contrived and dishonest — but the myth — persistent, persuasive, and unrealistic. Belief in myths allows the comfort of opinion without the discomfort of thought. —JFK

(This is followed by a list of myths and refutations.)

### Why integrate?

10:47 pm

Yesterday I was trying to explain some of the paradoxes of probability theory to a friend who disbelieves in the real numbers. It's not always clear whether this disbelief is actual, or if it's just an affectation; constructivist and devil's-advocate that he is, it could go either way really. In any case, he's always amusing to spar with (not that I have any especial concern for the un/reality of the reals). Midway through, Dylan Thurston came over to listen in and raised a question I've mulled over before but have been turning over again and again since then. What is it that I mean when describing a space (as opposed to a function etc) as "continuous"?

The knee-jerk response is that continuity is the antithesis of discreteness. That is, given some collection or space or other arrangement of things, often we are interested in accumulating some value over the lot of them. In the easiest setting, finite collections, we just sum over each element of that collection. But this process isn't limited to finite collections; we sum over infinite collections like the natural numbers with nary a care, and use the same large sigma notation to do so. So mere countable infinity isn't a problem for the notion of summation or accumulation. In programming we oft take our infinitudes even further. There's nothing special about the natural numbers. We can sum over the collection of trees, or lists, or any other polynomial type with just as little (or as much) concern for how many values are in these types as for how many natural numbers there are. But at some point this breaks down. Somewhere between the polynomial types and the real numbers, everything falls apart. We cannot in any meaningful sense use large sigma to accumulate a value over the vast majority of subsets of the reals. Instead we must turn to a different notion of accumulation: integration. For discrete collections summation is fine, but when we enter the continuous setting we must switch to integration.

The problem, of course, is that integrals are not really well-defined. Regardless of your choice of formalization, they all run into paradoxes and problems[1]. One of these problems rears its head in that probability theoretic paradox I was attempting to explain. Namely, the conception of inhabited sets of measure zero. The paradox arises even before probabilities rear their head. Colloquially, integrals are the area under a curve over some interval of the curve's domain. How do we get the area of some curvy shape? Well, we can approximate the shape by making a bunch of rectangles; and our approximation becomes better and better as those rectangles become thinner and thinner. In the limit, this approximation matches the actual shape and so we can get its area. But, in the limit, those rectangles have thickness zero; and thus, they must have area zero. So how is it that summing all those slivers with area zero can ever result in a non-zero total area? Such is the paradox.
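The rectangle approximation is easy to make concrete. A small sketch (function name and step count are my own choices), approximating the area under f(x) = x² on [0,1], whose true value is 1/3: each rectangle's area shrinks toward zero as n grows, yet the sum of those vanishing slivers converges to a non-zero total.

```haskell
-- Midpoint-rule Riemann sum: n rectangles of width h, each sampled at
-- the midpoint of its base.
riemann :: (Double -> Double) -> Double -> Double -> Int -> Double
riemann f a b n = sum [ f (a + (fromIntegral i + 0.5) * h) * h
                      | i <- [0 .. n - 1] ]
  where h = (b - a) / fromIntegral n

main :: IO ()
main = mapM_ (print . riemann (^2) 0 1) [10, 100, 1000]
-- each approximation is closer to 1/3 than the last
```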

But pulling things back to the original question: what does it mean for a space to be continuous in the first place? What is it ---exactly--- that causes summation to fail and forces us into this problematic regime of integration? Is the notion of continuity (or of the reals, or of infinite divisibility, however you want to phrase it) itself a hack? And if it is a hack, then how do we get away from it? Classical mathematicians are fond of hacks; but while I respect a good hack, as a constructivist I prefer to be on surer footing than simply believing something must be the case because the alternative is too absurd to conceive of. So, why do we integrate? I've yet to find a reason I can believe in...

[1] One can make the same complaint about logics (and other areas of mathematics) too. Impredicativity is much the same as the situation in probability theory; the idea is so simple and obvious that we want to believe in it, but to do so naively opens the door to demonstrable unsoundness. The liar's paradox is another close analogy, what with making perfect sense except in the limit where everything breaks down. Indeed, the paradoxes of impredicativity are of the exact same sort as the liar's paradox. But in spite of all these issues, we do not usually say that logic is ill-defined; so perhaps my judgment of calculus is unfair. Though, to my knowledge, people seem to have a better handle on the problems of logic. Or perhaps it's just that the lack of consensus has led to the balkanization of logic, with constructivists and classicalists avoiding one another, whereas in calculus the different sides exchange ideas more freely and so the confusion and disagreements are more in the open...

### on differences in worldview between C/C++/Java and Haskell

11:38 pm

Quoth Clifford Beshers:

As a test, ask your students to write down the definition of 'type'. I'll bet that their answers will be longer than 'set' and include some mention of bytes, words and big-endian. We were all trained to be mechanics instead of drivers, because all we had were go-carts.

### Finite sets

08:05 pm

So, I just encountered a most delicious type the other day:

 class Finite a where
     assemble :: Applicative f => (a -> f b) -> f (a -> b)

What's so nice about it is that the only way you can implement it is if the type a is in fact finite. (But see the notes.) So the questions are:

• Can you see why?
• Can you figure out how to implement it for some chosen finite type?
• Can you figure out how to implement it in general, given a list of all the values? (you may assume Eq a for this one)
• Can you figure out how to get a list of all the values, given some arbitrary implementation of assemble?
( A trivial note )
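In case you want to check your answers against something, here's one possible sketch (spoilers; the instance and the name enumerate are my own, answering the second and fourth questions):

```haskell
import Data.Functor.Const (Const(..))

class Finite a where
  assemble :: Applicative f => (a -> f b) -> f (a -> b)

-- An implementation for one chosen finite type: run the effect at every
-- inhabitant, then tabulate the results.
instance Finite Bool where
  assemble f = tab <$> f False <*> f True
    where tab fa ft b = if b then ft else fa

-- Recovering the list of all values from an arbitrary assemble: run it
-- in the Const [a] applicative, which merely records each point the
-- function gets applied at.
enumerate :: Finite a => [a]
enumerate = getConst (assemble (\a -> Const [a]))

main :: IO ()
main = print (enumerate :: [Bool])  -- [False,True]
```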

### Math envy

02:22 am

I have often derided those who are susceptible to math envy. Y'know, the idea that math=intelligence. This utter foolishness leads to the simultaneous fear and awe of anyone who throws math around, as if the presence of mere symbols and equations demonstrates the clear superiority of the author's throbbing, bulging,... intellect. This utter foolishness leads, therefore, to authors who feel the need to add superfluous "mathematics" to their writings in order to demonstrate that their... intelligence measures up to that of their colleagues.

Well, turns out, someone finally got around to doing a study on math envy: Kimmo Eriksson (2012) "The nonsense math effect", Judgment and Decision Making 7(6). As expected, those with less training in mathematics tend to rate utterly irrelevant "mathematical content" more highly than its absence. Lest anyone start feeling smugly superior, however, I'll note that I've seen this effect most strongly in those who should know better, i.e., those with just a little mathematical training. This includes, for example, computer scientists who are not formal theoreticians. Not to name names, but I've read more than one NLP paper that throws in some worthless equation just to try to look more worthwhile. (These papers are often fine, in and of themselves, but would have been better had they not succumbed to math envy.)

As Language Log points out in their coverage, this isn't limited just to math. Some people also have brain-scan envy and similar afflictions. That's definitely worth watching out for, but IME people seem more aware of their pernicious effects while being blind to math envy.

### Semirings (columbicubiculomania, pt.IV)

12:01 am

Apparently I messed up my link to the map of ring theory last time. It's fixed now, but the fact that no one commented on it means no one's reading this blog anymore (or those who do don't care). Ah well. Regardless, it's time to finish up this little series for now.

Anyone who's hung around me long enough knows that I'm a great fan of semirings. Partly this is because of their ubiquity in natural language processing, but mostly it's because they're the intersection of two of the most powerful and ubiquitous abstractions in mathematics. Semirings simultaneously generalize rings (by not necessarily having negation) and bounded lattices (by not necessarily being commutative nor having both operators distribute). Therefore, they generalize both arithmetic and ordering, two of the core components of mathematics. Also the fact that the natural numbers do not form a ring is a strong case for taking semirings seriously. I don't have a map this time, instead I have an extensive list of common examples. The examples fall into about half a dozen common patterns.

• Natural number-like semirings
• Integer-like rings
• Tropical and Viterbi semirings
• Bounded distributive lattices
• Boolean rings
• Regular languages
• Square matrices over a semiring

I'm sure you're already familiar with the natural numbers, integers, etc., so I won't spend much time talking about them. The one thing worth mentioning here, though, is what the heck that last column is for. It's for *-semirings (i.e., closed semirings), not to be confused with *-(semi)rings (i.e., (semi)rings with involution). Jacques Carette introduced me and Russell O'Connor to them, and Russell has written a bit about them. Since he does a good job of explaining them, I won't talk about them just now. But, because they're a bit obscure, I made sure to indicate what the asteration operation is when it exists. The one especially notable point here is that the Alexandroff compactification of the reals is no longer a ring! (Another reason to take semirings seriously.) We give up the ring-ness of the reals to obtain closed-ness and compact-ness.

The tropical and Viterbi semirings are a sort of cross between arithmetic and ordering, and you've probably never heard of them before. They are, however, rather common in computer science. They tend to arise whenever we use dynamic programming in order to find the "best" of some set of things: e.g., the most likely tag sequence, the shortest path, the longest path, etc. This is what part of speech tagging, matrix chain multiplication, and Dijkstra's algorithm are all about. And recognizing that these are semirings is what leads to generalizations like forward–backward and inside–outside, to say nothing of the Gauss–Jordan–Floyd–Warshall–McNaughton–Yamada algorithm that comes from focusing on *-semirings.
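To make the "best of some set of things" pattern concrete, here's a minimal sketch of the tropical semiring in Haskell. The class and the Tropical type below are my own minimal formulation, not any particular library's:

```haskell
-- A semiring: <+> combines alternatives, <.> combines sequenced steps.
class Semiring r where
  zero, one    :: r
  (<+>), (<.>) :: r -> r -> r

-- The tropical (min, +) semiring: <+> chooses the better total cost,
-- <.> extends a path by an edge; Inf is the unreachable path.
data Tropical = Inf | Trop Double
  deriving (Eq, Show)

instance Semiring Tropical where
  zero = Inf
  one  = Trop 0
  Inf    <+> y      = y
  x      <+> Inf    = x
  Trop x <+> Trop y = Trop (min x y)
  Inf    <.> _      = Inf
  _      <.> Inf    = Inf
  Trop x <.> Trop y = Trop (x + y)

-- Cost of the best of several paths, each given as its list of edge costs.
best :: Semiring r => [[r]] -> r
best = foldr (<+>) zero . map (foldr (<.>) one)

main :: IO ()
main = print (best [[Trop 1, Trop 4], [Trop 2, Trop 2], [Trop 7]])  -- Trop 4.0
```

Swapping in a (max, ×) "Viterbi" instance would find the most probable path with the same best function, which is the whole point of recognizing the semiring structure.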

The last major class I'll talk about are the bounded distributive lattices. The reason to discuss these is just to point out that this includes all Boolean algebras (hence classical logic and subset inclusion hierarchies), all Heyting algebras (hence intuitionistic logic), and all bounded totally ordered sets.

This chart is released under Creative Commons Attribution-ShareAlike 3.0. Any questions? Anything I missed? Anything that needs further explanation?

### columbicubiculomania (pt.III)

04:48 pm

Last time we talked about sets which support a single binary operator. That's all well and good, but things get really interesting once we have two of those operators. Thus, for lack of a better name, I present a map of ring theory. The majority of this isn't actually part of ring theory proper, of course; but that's as good a name as any, since we'll want to distinguish these from lattice theory which also has two interlocking operators.

In the center of the map are (unital) rings. For those who aren't aware, in older literature, what are now called pseudorings used to be called "rings", and what are now called rings used to be called "unital rings" or "rings with unit". I've run into very few proper pseudorings of interest, so I support this change in terminology. In any case, if we start from rings and move upward we get rid of properties. In addition to pseudorings we also get semirings, which are utterly ubiquitous. In the next post I'll give a few dozen of the most common semirings, so I'll save my rant for then. If we keep moving up then we start getting rid of associativities and distributivities, resulting in things like seminearrings (aka near semirings) which arise naturally in certain kinds of parsing problems. This area is poorly explored, hence the dearth of names in the upper part of the map. This is, however, where I've been spending a lot of time lately; so you'll probably hear more about seminearrings and their ilk in the near future. As the names suggest, we have both left-near and right-near versions of these various structures, though I only draw the right-near ones for clarity.

Moving downward from semirings there are two directions to head. Off to the left we run into Kleene algebras and lattice theory yet again. And to the south we run into the swamp along the way to fields. In spite of their apparent axiomatization, fields are not really algebraic objects, which is a big part of the reason for all this mess. In the lower left we see a chain of inclusions based on some peculiar properties like every ideal being principal, the existence of greatest common divisors (though not necessarily computable with Euclid's algorithm), the ascending chain condition on principal ideals, etc. These properties will be familiar to the actual ring theorists, as would numerous other properties I didn't bother putting on here. Off to the lower right we get a different sort of breakdown. In particular, before we get unique inverses for all non-zero elements, we can instead just have pseudoinverses or strong pseudoinverses. This is similar to the distinction between (von Neumann) regular semigroups vs inverse semigroups.

There's plenty more to say about ring theory and related areas, but that would be a whole series of posts on its own. This is actually the first map I started to make, because this is the region where we find so many mathematicians coming in from different areas and not being familiar with all that has been done before. As I'm sure you notice, quite a lot has been done, both in breadth and in depth. I brought this map up once when chatting with Amr Sabry, and he's the one who convinced me to finally get around to posting all these. So, hopefully these'll give y'all a better lay of the land.

There are some notable omissions from this map as it stands. In particular, complete semirings (aka *-semirings) are missing, as are rings with involution (aka *-rings), as are the various constructive notions of fields (like discrete fields, Heyting fields, residue fields, meadows, etc.). Next time I'll talk a bit about complete semirings; they were omitted mainly for lack of space, but should be included in future versions. The various constructive notions of fields were also omitted mainly for space reasons. I'll probably end up making a close-up map of the swamplands between rings and fields in order to do justice to them. Rings with involution were omitted mainly because I'm not entirely sure where the best place to put them is. As the name suggests, they've been primarily studied in contexts where you have a full-blown ring. However, there's nothing about the involution which really requires the existence of negation. I've started chatting with Jacques Carette about related topics recently, though; so perhaps I'll have more to say about them later.

This map is released under Creative Commons Attribution-ShareAlike 3.0. Any questions? Anything I missed? Anything that needs further explanation?

### columbicubiculomania (pt.II)

02:22 pm

Continuing the thread from last time, let's move on from relations and consider a map of individual binary operations. In a lot of ways this is even simpler than the binary relations from last time, though the map requires a bit more explanation. This time, rather than having definitions at the top, they're given as labels on the arcs. Arcs in the same color denote the same property, dashed lines represent things you get for free, and black lines are for the odd things; all just like last time.

Most of the study of individual binary operations falls under group theory, which forms the core of this map. The one interesting thing here is that if you have at least monoid structure (i.e., have an identity element) then the uniqueness of inverses follows from the presence of inverses. However, for semigroups which are not monoids, these two properties differ. This'll come up again next time when we start talking about rings and fields.
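The uniqueness argument is worth spelling out, since it shows exactly where the identity element gets used:

```latex
% If b and c are both inverses of a in a monoid with identity e, then
b \;=\; b e \;=\; b (a c) \;=\; (b a) c \;=\; e c \;=\; c.
% In a bare semigroup there is no e to anchor this chain, which is why
% the two properties come apart there.
```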

Off to the left we veer into lattices. And to the right we get the crazy stuff that comes from non-associative algebra. Quasigroups and loops are somewhat similar to groups in that they have an invertible structure, but unfortunately they don't have associativity. It turns out, there's a whole hierarchy of almost-but-not-quite-associative properties, which is shown on the second page. The strongest property you can get without being fully associative is Moufang, which can be phrased in four different ways. Below this we have left- and right-Bol (if you have both the Bols then you have Moufang). Below that we have alternativity where you choose two of three: left-alternativity, right-alternativity, and flexibility. Below that, of course, you can have just one of those properties. And finally, at the bottom, power associativity means that powers associate (and so "powers" is a well-formed notion) but that's it.

As I said, there's not a whole lot here, but I needed to bring this one up before getting into ring-like structures. This map is released under Creative Commons Attribution-ShareAlike 3.0. Any questions? Anything I missed? Anything that needs further explanation?

### Night of the Living Blog

09:42 pm

Friday I turned in the last of my papers, and last night I turned in the last of my grades. Which means I'm free for a whole two weeks or so. Which is good because for the last couple weeks I've been wanting desperately to blog as a means of escaping all those due dates. So now I get to flush some of the backlog from this semester. In particular, there's a cluster of posts I've been meaning to put up for a while, a sort of Intro to Maths for mathematicians, as it were. To save y'all's friendspages I'll spread these posts out over the next few days.

columbicubiculomania — The compulsion to stick things into pigeonholes. (Jim Matisoff 1990)

Ever since stumbling upon that fabulous word I've wanted to spread its popularity. As a geek with certain obsessive–compulsive tendencies, I'm a bit prone to columbicubiculomania; or rather, as the theoretician that I am, I'm prone to the dual of columbicubiculomania. I don't care so much about the pigeons, they can take care of themselves. But I do have a certain compulsion to design and arrange pigeonholes such that whenever someone feels the columbicubiculocompulsion, they'll have the right places to stuff all their pigeons into. Part of this is also tied up in me trying to figure out (or rather, to convey) where exactly I situate myself in the vast sea of competing academic fields. As the perennial outsider, I'm more interested (it seems) in seeing how everything is connected than are those stuck on the inside.

And so, over the past few years I've been accumulating the jargon from different subfields of mathematics and organizing them into a cohesive whole. In particular, for the next few posts, I'm concerned with the terminology for algebraic objects. Too often I read papers with practitioners of one algebraic field (e.g., group theory, ring theory,...) getting unwittingly caught up in another field and subsequently using such outrageous terms as "semigroup with identity" or "monoid without identity". Because of this ridiculousness, a lot of mathematicians and computer scientists end up not realizing when the thing they're studying has already been studied in depth under another name. So let's see all those names and how they connect. Let's produce a cartography of mathematical objects.

Perhaps the easiest place to start is one of the later maps I produced. Binary relations are ubiquitous in all areas of mathematics, logic, and computer science. Typically we don't care about all binary relations, we only care about relations with particular properties. Of course, the interaction between properties is nontrivial since properties A and B together can entail that property C must hold. Thus, a map of binary relations is helpful for keeping track of it all. This map requires relatively little explanation, which is why I started here.

All the common properties of interest are defined at the top, and color coded for use in the map. And, constructivist that I am, I've been sure to distinguish between strong and weak versions of the properties (which collapse in the classical setting). The arrowheads in the map are for keeping track of when we're talking about P, anti-P, or co-P (for some particular property P). And the big black dot is the starting point of the map (i.e., a binary relation with no special properties). The dashed lines indicate when some property will follow for free. So, for example, there's no name for a transitive irreflexive relation since trans+irrefl entails antisymmetry and trans+irrefl+antisym is a strict partial order. And the black lines are for when the named object requires some peculiar property that doesn't show up all over the place. These conventions for dashed and black lines are common to all my maps.
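That dashed-line example has a one-line justification:

```latex
% Transitivity plus irreflexivity entail antisymmetry:
% suppose a \mathrel{R} b and b \mathrel{R} a; transitivity gives
% a \mathrel{R} a, contradicting irreflexivity. Hence
% a \mathrel{R} b \land b \mathrel{R} a never holds, and antisymmetry
% holds vacuously.
```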

Once you're familiar with my conventions, the only thing really left unstated on this map is that the apartness relation used in the definition of a strongly connected binary relation is actually a tight apartness. The constructive significance of apartness relations vs equality relations is something I'll have to talk about another time.

Towards the top of the map we start veering away from binary relations per se and start heading into order theory. Because the complexities of order theory are a bit different from the complexities of binary relations, I've chosen to break them up into different maps. As yet, I haven't actually made the map for order theory, but I'll get around to it eventually. Lattice theory and domain theory also get tied up in all this (in the order theory particularly). I've started work on a lattice theory map, but haven't finished it just yet.

This map is released under Creative Commons Attribution-ShareAlike 3.0. Any questions? Anything I missed? Anything that needs further explanation?

### New favorite quote

12:40 am

Quoth Paul Taylor:

This result is folklore, which is a technical term for a method of publication in category theory. It means that someone sketched it on the back of an envelope, mimeographed it (whatever that means) and showed it to three people in a seminar in Chicago in 1973, except that the only evidence that we have of these events is a comment that was overheard in another seminar at Columbia in 1976. Nevertheless, if some younger person is so presumptuous as to write out a proper proof and attempt to publish it, they will get shot down in flames.

### Summer: finis

12:38 am

Hmm, so I was hoping to get back into blogging over the summer. Y'know, a sort of liveblogging on quals, like how people blog about their Google Summer of Code projects these days. Turns out, some part of my brain is all, "oh noes! blogging is a waste of time! You could be working or, y'know, playing videogames or something!" So that sucks. In part because my brain is stupid, in part because I've been reading so many awesome blogs by people at least as busy as I am and now I feel bad for not measuring up to some impossible standard I made up out of thin air.

So the summer was fun, albeit not relaxing in the slightest. I really need to work on that whole Day of Rest idea. Things that happened this summer:

• Grace and Jason got married! So that's the second California wedding I've been to in as many years. Now I can quit complaining about never having gone to a wedding. It was great to see friends from college again. Of the ones I lost touch with over the years, one works in the game industry (both indie and corporate), and another works for Wolfram Alpha (indeed, with Mr. Wolfram himself). So that's pretty cool.
• Went to NASSLLI in Austin. There were some awesome classes there. Craige Roberts is fabulous; definitely someone to keep an eye on. Got to meet Adam Lopez, who was recently working on stuff related to one of my quals. Adam was part of the Edinburgh SMT crew, who came to JHU shortly after I left so I hadn't met him before. And, of course, got to hang out with Ken and Oleg again. Also, awesome, someone there remembered my talk from NASSLLI 2010 and asked about followup work on it.
• Read a bunch of fun books, or rather had them read to me. Licia got a kindle and loves reading aloud; she's the best ever. Fun books include: Pride and Prejudice and Jane Eyre. Seriously, they are both delightful and if you haven't read them you should. Competent women are the best. Also, Look Me in the Eye: My Life with Asperger's (dude, what a life!), and most of God, No! and Drop Dead Healthy.
• Bought Tales of Graces F on a whim and loved every minute of it. It starts off as your very standard JRPG about childhood friends, but then jumps ahead a few years after everyone has separated and grown up. The prologue is, as the reviews say, the least entertaining part; but it does a good job of setting the background for making the main plot poignant. Just saying people were childhood friends pales in comparison to seeing it and then seeing how they've grown apart. I haven't played any of the other Tales games, but the system is pretty similar to the Star Ocean system. Better done though, IMO. You have the fusion/cooking thing, but it's done in a way that's both extremely helpful and not obnoxious, so you use it regularly and actually care. The combat system is vibrant and engaging, and the system of badges is really cool. Overall the system has a lot of depth but doesn't get in the way of just playing. Some of the reviews complained about uneven difficulty, but I have no idea what they're on about. 10/10
• And in a few weeks I'll be heading off to ICFP again. It'll be the first time I've been to Europe, can't wait.

### Summer: when work actually gets done

07:44 pm

This past semester was a real doozy, for a number of reasons. But now that classes are over, maybe I'll get a chance to talk about some of them. In any case, at least it's done. Now I get to do quals: three months to write three papers good enough to convince people I can write a thesis. I'm looking forward to it; it's been so long since I've been free to do my own research without feeling bad about it encroaching on the work I 'should' be doing.

### Worthless math insight of the day

06:01 pm

The integers mod 3 are better thought of as {-1, 0, 1} rather than the traditional {0,1,2}. This makes it clear that the two non-zero elements are additive inverses of one another. And it also makes it clear that multiplication of non-zero elements just returns an "is same/different" judgment, which comes for free from our usual arithmetic: (-1)*(-1) = 1 = 1*1 and (-1)*1 = -1 = 1*(-1).
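This is easy to play with in code. Here's a minimal sketch (the function name `balancedMod3` is mine, not anything standard) that maps an integer to its balanced representative in {-1, 0, 1} and lets you check the same/different reading of multiplication:

```haskell
-- Map an integer to its balanced representative mod 3, in {-1, 0, 1}.
-- (Haskell's `mod` always returns a result in {0, 1, 2}, so we only
-- need to shift the 2 case down to -1.)
balancedMod3 :: Int -> Int
balancedMod3 n = case n `mod` 3 of
    2 -> -1
    r -> r

-- Multiplying two non-zero representatives yields 1 when they are the
-- same and -1 when they differ, using nothing but ordinary arithmetic.
sameSign :: Int -> Int -> Int
sameSign x y = balancedMod3 (balancedMod3 x * balancedMod3 y)
```

For instance, 5 and 4 fall in different non-zero classes (representatives -1 and 1), so `sameSign 5 4` is -1.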

### unification-fd 0.7.0

11:46 am

The unification-fd package offers generic functions for single-sorted first-order structural unification (think of programming in Prolog, or of the metavariables in type inference)[1][2]. The library is sufficient for implementing higher-rank type systems à la Peyton Jones, Vytiniotis, Weirich, Shields, but bear in mind that unification variables are the metavariables of type inference— not the type-variables.

An effort has been made to make the package as portable as possible. However, because it uses the ST monad and the mtl-2 package it can be neither H98 nor H2010. That said, it only uses the following common extensions, which should be well supported[3]:

Rank2Types
MultiParamTypeClasses
FunctionalDependencies -- Alas, necessary for type inference
FlexibleContexts       -- Necessary for practical use of MPTCs
FlexibleInstances      -- Necessary for practical use of MPTCs
UndecidableInstances   -- For Show instances due to two-level types

### Changes (since 0.6.0)

This release is another major API breaking release. Apologies, but things are a lot cleaner now and hopefully the API won't break again for a while. The biggest change is that the definition of terms has changed from the previous:

 data MutTerm v t = MutVar !v | MutTerm !(t (MutTerm v t))

To the much nicer:

 data UTerm t v = UVar !v | UTerm !(t (UTerm t v))

The old mnemonic of "mutable terms" was inherited from the code's previous life implementing a logic programming language; but when I was playing around with implementing a type checker I realized that the names don't really make sense outside of that original context. So the new mnemonic is "unification terms". In addition to being a bit shorter, it should help clarify the separation of concerns (e.g., between unification variables vs lambda-term variables, type variables, etc.).

The swapping of the type parameters is so that UTerm can have instances for Functor, Monad, etc. This change should've been made along with the re-kinding of variable types back in version 0.6.0, since the UTerm type is the free monad generated by t. I've provided all the category theoretic instances I could imagine some plausible reason for wanting. Since it's free, there are a bunch more I haven't implemented since they don't really make sense for structural terms (e.g., MonadTrans, MonadWriter, MonadReader, MonadState, MonadError, MonadCont). If you can come up with some compelling reason to want those instances, I can add them in the future.
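Concretely, "UTerm is the free monad generated by t" means that substituting terms for unification variables is exactly monadic bind. A standalone sketch of that operation, redefining the type locally rather than importing the package (the name `bindTerm` is mine; in the library this is `>>=`):

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Local stand-in for the package's UTerm, to illustrate the free-monad
-- structure without depending on the library.
data UTerm t v = UVar v | UTerm (t (UTerm t v))

-- Bind for the free monad generated by t: replace every variable with
-- the term the substitution function assigns to it, leaving the
-- structural layers untouched.
bindTerm :: Functor t => UTerm t v -> (v -> UTerm t v') -> UTerm t v'
bindTerm (UVar v)  f = f v
bindTerm (UTerm t) f = UTerm (fmap (`bindTerm` f) t)
```

With `t = []` as a toy structure, `bindTerm (UTerm [UVar 1, UVar 2]) (\v -> UVar (v * 10))` rewrites each variable in place.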

Since the order of type parameters to BindingMonad, UnificationFailure, Rank, and RankedBindingMonad was based on analogy to the order for terms, I've also swapped the order in all of them for consistency.

I've removed the eqVar method of the Variable class, and instead added an Eq superclass constraint. Again, this should've happened with the re-kinding of variables back in version 0.6.0. A major benefit of this change is that now you can use all those library functions which require Eq (e.g., many of the set-theoretic operations on lists, like (\\) and elem).

I've added new functions: getFreeVarsAll, applyBindingsAll, freshenAll; which are like the versions without "All", except they're lifted to operate over Foldable/Traversable collections of terms. This is crucial for freshenAll because it allows you to retain sharing of variables among the collection of terms; for the others it's merely an optimization (saving time for getFreeVarsAll, and space for applyBindingsAll).

The type of the seenAs function has also changed, to ensure that variables can only be seen as structure rather than as any UTerm.

Thanks to Roman Cheplyaka for suggesting many of these changes.

### Description

The unification API is generic in the type of the structures being unified and in the implementation of unification variables, following the two-level types pearl of Sheard (2001). This style mixes well with Swierstra (2008), though an implementation of the latter is not included in this package.

That is, all you have to do is define the functor whose fixed-point is the recursive type you're interested in:

 -- The non-recursive structure of terms
 data S a = ...

 -- The recursive term type
 type PureTerm = Fix S

And then provide an instance for Unifiable, where zipMatch performs one level of equality testing for terms and returns the one-level spine filled with pairs of subterms to be recursively checked (or Nothing if this level doesn't match).

 class (Traversable t) => Unifiable t where
     zipMatch :: t a -> t b -> Maybe (t (a,b))
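To make that concrete, here's what an instance looks like for a toy term functor (the names `S`, `Leaf`, and `Branch` are illustrative, not from the library; the class is repeated so the sketch is self-contained):

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

-- The class as given above, repeated so this sketch compiles on its own.
class Traversable t => Unifiable t where
    zipMatch :: t a -> t b -> Maybe (t (a, b))

-- A toy term functor: integer constants and binary branches.
data S a = Leaf Int | Branch a a
    deriving (Functor, Foldable, Traversable)

instance Unifiable S where
    zipMatch (Leaf x) (Leaf y)
        | x == y    = Just (Leaf x)       -- constants match only if equal
        | otherwise = Nothing
    zipMatch (Branch l r) (Branch l' r') =
        Just (Branch (l, l') (r, r'))     -- pair up subterms for recursion
    zipMatch _ _ = Nothing                -- different constructors: no match
```

Note that `zipMatch` itself never recurses; it just reports whether the outermost layers agree and hands the paired subterms back to the unification algorithm.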

The choice of which variable implementation to use is defined by similarly simple classes Variable and BindingMonad. We store the variable bindings in a monad, for obvious reasons. In case it's not obvious, see Dijkstra et al. (2008) for benchmarks demonstrating the cost of naively applying bindings eagerly.

There are currently two implementations of variables provided: one based on STRefs, and another based on a state monad carrying an IntMap. The former has the benefit of O(1) access time, but the latter is plenty fast and has the benefit of supporting backtracking. Backtracking itself is provided by the logict package and is described in Kiselyov et al. (2005).

In addition to this modularity, unification-fd implements a number of optimizations over the algorithm presented in Sheard (2001)— which is also the algorithm presented in Cardelli (1987).

• Their implementation uses path compression, which we retain, though we modify the compression algorithm in order to make sharing observable.
• In addition, we perform aggressive opportunistic observable sharing, a potentially novel method of introducing even more sharing than is provided by the monadic bindings. Basically, we make it so that we can use the observable sharing provided by the modified path compression as much as possible (without introducing any new variables).
• And we remove the notoriously expensive occurs-check, replacing it with visited-sets (which detect cyclic terms more lazily and without the asymptotic overhead of the occurs-check). A variant of unification which retains the occurs-check is also provided, in case you really need to fail fast.
• Finally, a highly experimental branch of the API performs weighted path compression, which is asymptotically optimal. Unfortunately, the current implementation is quite a bit uglier than the unweighted version, and I haven't had a chance to perform benchmarks to see how the constant factors compare. Hence moving it to an experimental branch.
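For contrast, the classic occurs-check that the visited-set approach avoids is a full traversal of the term on every variable binding. A sketch against a locally redefined term type (not the library's internal code):

```haskell
-- Local stand-in for the package's UTerm, for illustration only.
data UTerm t v = UVar v | UTerm (t (UTerm t v))

-- The classic occurs-check: scan the entire term for the variable
-- before binding it. Running this eagerly on every binding is what
-- makes the occurs-check so notoriously expensive.
occursIn :: (Eq v, Foldable t) => v -> UTerm t v -> Bool
occursIn v (UVar v') = v == v'
occursIn v (UTerm t) = any (occursIn v) t
```

Visited-sets instead detect a cycle only if the same variable is actually revisited during unification, so well-behaved terms never pay the traversal cost.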

These optimizations pass a test suite for detecting obvious errors. If you find any bugs, do be sure to let me know. Also, if you happen to have a test suite or benchmark suite for unification on hand, I'd love to get a copy.

### Notes and limitations

[1] At present the library does not appear amenable to implementing higher-rank unification itself; i.e., for higher-ranked metavariables, or higher-ranked logic programming. To be fully general we'd have to abstract over which structural positions are co/contravariant, whether the unification variables should be predicative or impredicative, as well as the isomorphisms of moving quantifiers around. It's on my todo list, but it's certainly non-trivial. If you have any suggestions, feel free to contact me. [back]

[2] At present it is only suitable for single-sorted (aka untyped) unification, à la Prolog. In the future I aim to support multi-sorted (aka typed) unification, however doing so is complicated by the fact that it can lead to the loss of MGUs; so it will likely be offered as an alternative to the single-sorted variant, similar to how the weighted path-compression is currently offered as an alternative. [back]

[3] With the exception of fundeps, which are notoriously difficult to implement. However, they are supported by Hugs and GHC 6.6, so I don't feel bad about requiring them. Once the API stabilizes a bit more I plan to release a unification-tf package which uses type families instead, for those who feel type families are easier to implement or use. There have been a couple requests for unification-tf, so I've bumped it up on my todo list. [back]

### References

Luca Cardelli (1987)
Basic polymorphic typechecking. Science of Computer Programming, 8(2): 147–172.
Atze Dijkstra, Arie Middelkoop, S. Doaitse Swierstra (2008)
Efficient Functional Unification and Substitution. Technical Report UU-CS-2008-027, Utrecht University.
Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, Mark Shields (2007)
Practical type inference for arbitrary-rank types. JFP 17(1). The online version has some minor corrections/clarifications.
Oleg Kiselyov, Chung-chieh Shan, Daniel P. Friedman, and Amr Sabry (2005)
Backtracking, Interleaving, and Terminating Monad Transformers. ICFP.
Tim Sheard (2001)
Generic Unification via Two-Level Types and Parameterized Modules, Functional Pearl. ICFP.
Tim Sheard and Emir Pasalic (2004)
Two-Level Types and Parameterized Modules. JFP 14(5): 547–587. This is an expanded version of Sheard (2001) with new examples.
Wouter Swierstra (2008)
Data types à la carte, Functional Pearl. JFP 18: 423–436.

wren ng thornton
