LiveJournal changed their TOS recently and, well, I didn't really feel like reading it to see what changed so instead I've deleted my old LJ account. All the folks I used to follow on LJ have since moved on to other venues or stopped blogging entirely, and I've been using DW as my main blog for quite some time now, so it was just a matter of time. In the event you were still following the old LJ account, now's the time to switch.
The containers package contains efficient general-purpose implementations of various basic immutable container types. The declared cost of each operation is either worst-case or amortized, but remains valid even if structures are shared.
Changes since 0.5.8.1 (2016-08-31)
The headline change is adding
Data.IntMap. The versions for
Data.Map were introduced in 0.5.8.1, so this change restores parity between the interfaces. With this in place we hope this version will make it into GHC 8.2.
Other changes include:
- Add instances for
- Add lifted instances (from
Data.Tree. (Thanks to Oleg Grenrus for doing a lot of this work.)
- Properly deprecate functions in
Data.IntMaplong documented as deprecated.
- Rename several internal modules for clarity. Thanks to esoeylemez for starting this process.
Data.Map.fromDistinctDescListmore eager, improving performance.
- Plug space leaks in
Data.Map.Lazy.fromDescListby manually inlining constant functions.
Data.Mapas total alternatives to
Data.Mapas a total alternative to
- Avoid using
deleteFindMaxinternally, preferring total functions instead. New implementations of said functions lead to slight performance improvements overall.
The last week has been challenging for all of us. In the depths of my own fear and uncertainty, I reached for one of my favorite books —Pema Chödrön’s Comfortable with Uncertainty— and opened to a passage at random. On friday, a friend of mine asked how I’ve been able to deal with it all. I told him about the passage, and he (a non-buddhist) found it helpful in dealing with his own pain, so I wanted to share more broadly.
Before getting to the passage, I think it’s important for people to recognize that this pain we are feeling is a collective trauma. This is not our day-to-day pain, not our usual suffering. Everyone develops habits and skills for addressing the typical discomforts of life, but those skills are often inapplicable or ineffective for recovering from truly traumatic events. When someone is in a car wreck, or attacked, or raped, or abruptly loses a job or loved one— we recognize these things as traumas. We recognize that these events take some extra work to recover from. In the aftermath of the election I have seen many of the symptoms of trauma in the people around me. Depression, hypervigilance, difficulty concentrating, short tempers, and so on. When trauma hits, our usual coping mechanisms often fail or go haywire. A drink or two to unwind, turns into bleary drunkenness every night. Playing games to let go, turns into escapism to avoid thinking. Solitude, turns into reclusion. A healthy skepticism, turns into paranoia. If we do not recognize traumas for what they are, it becomes all too easy to find ourselves with even worse problems. Recognition is necessary for forming an appropriate response.
Now, the passage. As humans we have three habitual methods for relating to suffering. All three are ineffectual at reducing that suffering. These three ineffectual strategies are: attacking, indulging, and ignoring. And I’ve seen all three in great quantities in all the OpEd pieces floating around over the past week.
By “attacking” Pema Chödrön means not just lashing out, attacking Trump’s supporters or their ideals, but also all the ways we attack ourselves: We condemn ourselves, criticize ourselves for any indulgence, pity ourselves to the point of not getting out of bed. This strategy shows up in all those articles criticizing us for not having interpreted the polls correctly, or chastising us for not voting, or condemning the way the internet has formed these echo-chamber bubbles, and so on. But all this self-flagellation, all this beating ourselves up, does nothing to heal our pain. Now we suffer not only from our fears of what’s to come, but also because “it’s all our fault”. We refuse to “let ourselves off easy”, so whenever someone tries to address our pain we attack them and beat them away, protecting our pain because we feel like we deserve it.
Indulging is just as common. Though we are haunted by self-doubt, we condone our behavior. We say “I don’t deserve this discomfort. I have plenty of reasons to be angry or sleep all day.” We justify our pain to the point of turning it into a virtue and applauding ourselves. This strategy shows up in all those articles that relish in the details of how bad things will become, or congratulating ourselves for saying something like this would happen. But again, by cherishing our pain and presenting it as something to be praised, we are preventing ourselves from healing. Noone wants to give up something they cherish, nor give up on all the attention and sympathy they are lavished with.
Ignoring is no less common. “Ignoring” means not just refusing to acknowledge our pain and fear, but also pretending it doesn’t exist, dissociating, spacing out, going numb, acting on autopilot, or any of the other ways to try to keep our suffering out of sight and out of mind. This strategy is advocated by all those articles talking about how things actually aren’t that bad, or how this is just business as usual, or how it’ll all get better once the mid-term elections happen. While ignoring seems effective in the short term, it does nothing to address the suffering you feel. In addition to not healing that initial wound, it creates more pain as we inevitably force ourselves into tighter and tighter spaces in order to keep it out of mind.
There is an alternative to these three futile strategies. The enlightened strategy is to try fully experiencing whatever you’ve been resisting— without exiting in your habitual way. Become inquisitive about your habits. Recognize when you are pushing your suffering away, or embracing it, or denying it. Become inquisitive about your suffering. What is it, exactly, that you are denying? Why does it feel so urgent to push it away? Why does it feel so necessary to cling to it? Stop trying to justify your feelings, stop trying to explain them. Start instead to look at them, to see them for what they really are. Ask why it is they hurt, what part of your ego they compromise, what ideals they belie.
The passage on the three futile strategies follows a koan about “heaven and hell”. From a buddhist perspective, “hell” is not a place, it is all the feelings of pain and fear and suffering we experience. Nor is “heaven” a place, but rather all our feelings of gratitude and joy and understanding. Thus, the buddhist does not say “hell is bad and heaven is good” nor “get rid of hell and just seek heaven”. Rather, one should approach all things with an open mind, greeting both heaven and hell with that openness. In her words,
Only with this kind of equanimity can we realize that no matter what comes along, we’re always standing in the middle of a sacred space. Only with equanimity can we see that everything that comes into our circle has come to teach us what we need to know.
I find these words powerfully healing. It is healing to remember that no matter where we are or what befalls us, our life is a blessing, and in virtue of that blessing our bodies and the places we move through are sacred spaces. The sacred is not something which exists without us, but something which is created from within. Moreover, it is healing to step away from questions like “what did I do to deserve this?” and instead remember to ask what it is we can learn from the experience.
I have endured many traumas in my life, and I half expected the election outcome, but still it felt like a kick in the chest. This wound brought back all my darkest habits. Once I recovered from the shock enough to begin the rituals of healing and self-care, I reflected on the question of why this particular wound hurt so bad. In my experience (and not just because I’m buddhist), deep emotional pain always stems from some threat to one’s ego; so what part of my ego is on the line? For me, the reason the election hurt so much is because I had become complacent in believing that the world is steadily becoming a more just place and believing that people are by-and-large fundamentally good. With the election of Obama, the passing of the ACA, the supreme court ruling on Obergefell v. Hodges, and so on, I think a lot of us on the progressive side have been susceptible to those beliefs. The election hurt so much, for me, because it forced the recognition that it’s not just the legacy of systemic institutionalized hatred we must fight, but that over a quarter of the population actively supports the worst extremes of that hatred. Yes, the election itself was offensive. Yes, I fear for my life and the lives of those close to me. But the real root of the pain itself, the reason it hurt so bad, is this refutation of those optimistic beliefs about humanity and the path towards justice. Realizing that this was the root cause of my pain did a lot to help me process it and move on. It also gave a healthy way to shift focus from the pain itself, to something actionable. Having experienced the pain, I can accept it. And having learned what it has to teach me, I know what I must do.
So sit with your pain, and try to experience it fully. Stop pushing it away. Stop embracing it. Stop beating yourself up over it. Approach it with an open mind and let it pass through you. And, finally, ask yourself what you can learn from it.
I announced this on twitter a while back, but tomorrow I'm flying out to Nara Japan. I'll be out there all week for ICFP and all that jazz. It's been about a decade since last time I was in the Kansai region, and I can't wait. As I've done in the past, if you want to meet up for lunch or dinner, just comment below (or shoot me a tweet, email, etc).
This year's self-improvement goal was to get back into blogging regularly. Part of that goal was just to get back into writing regularly; the other part was specifically to publish more regularly.
I've done fairly well on the first half, actually. I'd hoped to do better, but then all year I've had to deal with spoon-draining circumstances, so I've probably done about as well as I can without sacrificing my health. One of my other self-improvement goals has been to take my health seriously, to listen to my body rather than pushing it beyond its limits. I'm on-track for improving at both of these, I just need to stop beating myself up over it.
For the second half, the publishing bit, that I've done poorly. I'd like to blame the spoon vortex here too, but really I think the biggest problem is my perfectionism. Perfectionism greatly amplifies the problem of lacking spoons: both the editing itself, as well as the emotional fallout of missing the mark or of having taken the entire day to hit it, both of these cost spoons. The real aim behind my goal to publish regularly wasn't to have more words to my name, but rather to “get out there” more, to be more productive in-and-of-itself rather than to have more products. So I've started thinking: the real target for this self-improvement goal should not be publishing regularly, but rather should be (working to) overcome perfectionism.
If perfectionism is a problem of fear, then the thing I must address is that fear. So how to do it? One of the suggestions in that article is to let yourself fail. Not to lower your unreasonable standards (the party-line for what to do), but rather to allow yourself to not meet those standards. One of my standards is to be thought provoking, and hence to focus overmuch on essays. To try and break free from this, I'm thinking to start posting summaries of my daily dissertation progress. A nanowrimo sort of thing, though without the focus on word-count per se. I've read a few articles suggesting one should start their day by summarizing the previous day's progress, but I've never tried it. So here goes nothing :)
Over the last few weeks I was interviewed for the Identity Function. The process was quite nice and got me thinking on a number of things. Some of them I may well flesh out into blog posts once I get the time. Of course, that likely won't be until the autumn given everything else going on the next couple months.
I'll be in New York from 28 June through 10 July. The first couple days are for a PI meeting, then I'll get a four-day weekend before LICS, NLCS, and LOLA. Originally the plan was to take a quick trip to Sacramento that weekend for a friend's wedding. (The wedding's still on, but plans changed.) At least this way I'll get a chance to relax, rather than running all over the place. Of course this also means I'll be spending the 4th in NYC. Historically the 4th has been one of my favorite holidays, because it was one I've always spent with friends. I don't know that any of my readers are in NYC, but if you'll be around drop me a line. Or if you used to live there and know fun things to do that weekend, let me know! (Especially any quiet end-of-Pride things.)
Me and L set the date for our final move to the Bay Area: 20 July. And then I start at Google on the 25th. Between now and then: dissertating!!
Life’s been really hectic lately, but I’ve been getting (slowly) back into working on my Haskell packages. In particular, since the switch from darcs to github I’ve started getting more comments and feature requests, which is nice. Over the next half-year or so, here’s what I’ll be up to in my free time between work on the dissertation and work on Hakaru:
containers — I’ve been appointed one of the new co-maintainers of our favorite venerable library. I prolly won’t be doing any major work until autumn (as mentioned when I was appointed), but I’ve had a number of conversations with David Feuer about where to take things in terms of cleaning up some old maintenance cruft.
bytestring-trie — A few years back I started reimplementing my tries to use Bagwell’s Array Mapped Tries in lieu of Okasaki’s Big-Endian Patricia Tries, but then got stalled because life. I’ve started up on it again, and it’s just about ready to be released after a few more tweaks. Also, now that I’m working on it again I can finally clear out the backlog of API requests (sorry folks!).
exact-combinatorics — A user recently pointed me towards a new fast implementation of factorial making waves lately. It’s not clear just yet whether it’ll be faster than the current implementation, but should be easy enough to get going and run some benchmarks.
unification-fd — This one isn’t hacking so much as dissemination. I have a backlog of queries about why things are the way they are, which I need to address; and I’ve been meaning to continue the tutorial about how to use this library for your unification needs.
logfloat — We’ve been using this a lot in Hakaru, and there are a few performance tweaks I think I can add. The main optimization area is trying to minimize the conditionals for detecting edge cases. The biggest issue has just been coming up with some decent benchmarks. The problem, of course, is that most programs making use of logfloats do a lot of other work too so it can be tricky to detect the actual effect of changes. I think this is something Hakaru can help a lot with since it makes it easy to construct all sorts of new models.
All this stuff is "well known", but I want to put it out there for folks who may not have encountered it, or not encountered it all together in one picture.
The Damas–Hindley–Milner type system (i.e., the type system that Algorithm W is inferring types for) is propositional logic extended with rank-1 second-order universal quantifiers. It is interesting because it is so particularly stable with respect to inference, decidability, etc. That is, we can come up with many other algorithms besides Algorithm W and they enjoy nice properties like the fact that adding type signatures won't cause inference to fail. (It's worth noting, that Algorithm W is DEXPTIME-complete; so while in practice it's often linear time, for pathological inputs it can take exponentially long. However, if we put a constant bound on the depth of nested let-bindings, then the upper bound becomes polynomial.)
The extension of DHM with rank-1 second-order existential quantifiers is strictly more powerful. It is interesting because it allows unrestricted use of both of the quantifiers in prenex position; thus, it is the limit/top of the alternating quantifier hierarchy (à la the arithmetical hierarchy) that starts with DHM. Surely there are other interesting properties here, but this system is understudied relative to the ones above and below. Edit: Although GHC gets by with encoding existentials away, it's worth noting that MLF allows existentials where the unpacking is implicit rather than requiring an "unseal" or case eliminator (Leijen 2006); and also that UHC does in fact offer first-class existentials (Dijkstra 2005).
The extension with rank-2 second-order universals (i.e., where the universal quantifier can appear to the left of one function arrow) is strictly more powerful still. Here we can encode rank-1 existentials, but my point in this whole post is to point out that rank-1 existentials themselves are strictly weaker than the rank-2 universals it takes to encode them! Also, one little-known fact: this type system is interesting because it is the last one in this progression where type inference is decidable. The decidability of rank-2 universal quantification is part of the reason why GHC distinguishes between
-XRankNTypes. Alas, although inference is decidable —and thus of mathematical interest— it is not decidable in the same robust way that DHM is. That is, if we care about human factors like good error messages or not breaking when the user adds type signatures, then we don't get those properties here. Still, the fact that this system is at the cusp of decidable inference is important to know. Edit: Also of interest, this system has the same typeable terms as simply-typed λ-calculus with rank-2 intersection types, and the type inference problem here is fundamentally DEXPTIME-complete (Jim 1995).
Things keep alternating back and forth between existentials and universals of each rank; so far as I'm aware, none of these systems are of any particular interest until we hit the limit: rank-ω (aka: rank-N) second-order quantification. This type system is often called "System F", but that's a misnomer. It is important to differentiate between the syntactic system (i.e., actual System F) we're inferring types for, vs the type system (aka: propositional logic with second-order quantifiers) in which the inferred types live. That is, we can perfectly well have a syntactic system which doesn't have explicit type abstractions/applications but for which we still ascribe rank-ω types. It so happens that the type inference problem is undecidable for that syntactic system, but it was already undecidable way back at rank-3 so the undecidability isn't particularly novel.
Usually whenever we think about gradschool we think about the switch from "doing classwork, quals, etc" to "dissertating" is a single-step process, which we call "becoming a PhD candidate" (as opposed to being a PhD student). In practice there are over half a dozen steps. Filing the paperwork declaring your completion of classwork, quals, etc is just the first few. (Okay, easy enough) Then there's the prospectus, for the graduate school. (The what for the who now?) Then the forming of your research committee. (Right, okay) Then the proposal. (Wait, how is this different from the other thing?) Then the proposal defense. (Um, okay, but not every department requires this?) Plus a few other steps I'm surely forgetting.
As of yesterday, I am officially finally totally completely absolutely done with all the paperwork, and can finally get back to actually working on the thesis itself!
It's official, I'm heading to Google at the end of July to work with Mark Larson on the Chrome OS security team (of all things!). Seems an unlikely match, but Mark was pretty keen on my background (including the gender stuff!) and wasn't put off by my fusion of linguistics, type-theory, constructive logic/maths, et al. I guess the security team is more concerned with semantic models and constructive correctness than the other teams I talked with. Makes sense, I suppose, but it's sad that these are still thought of as "security" concerns rather than "everything in programming" concerns.
I'm totally looking forward to it, but I am still thinking of it as a bit of an experiment. I'm an academic at heart, so I could see myself heading back to academia in a few years. (Wanting to be a professor is, afterall, what motivated me to start gradschool in the first place.) Then again, I've never really thought of industry as being interested in the sorts of theory I work on. In any case, I'll still be around at the various conferences I've been frequenting; I won't just disappear into industry.
I know a bunch of y'all are in the bay area. I'd love to hear any pointers you have on apartment hunting or on what neighborhoods (nearish Mt View) are nice— i.e., places an artistic disabled radical queer woman might like. Feel free to comment below or get in touch by email, twitter, etc.
Many of us with disabilities have more than one. This multiplicity is invisiblized by the abled community. When our lives are already assumed to be defined by disability, admitting multiplicity risks the hazard of letting the complexities of disabled lives enter one's mind. But even among the disabled, there's a tendency to focus on the one or two things which most obviously impact our lives. This is a coping mechanism. To cope with lacking spoons, we are always prioritizing our energies, and there is never enough to solve all the things. But also, ableism being what it is, we must be careful never to "complain too much" lest we loose whatever ears we've gained; so we consign our suffering to silence, that we might grasp at crumbs of compassion for hope that when things worsen there may still be someone who'll listen.
I have my "one or two things": depression and cPTSD. And I've mentioned my migraines on occasion, though they're seldom of bloggable interest. But there's one I've never talked about, one I've still not come to terms with myself. That's the thing about chronic pain. Noone ever teaches us about all the things that shouldn't hurt, about all the pains most people don't have. And consequently we come to normalize them, to unsee the ways they make us choose —in small ways at first— to restrict our lives. Last week I met a fabulous girl and we got to talking about disability. And with one sentence she cut through me like a thunderbolt, cut through a silence I hadn't even realized I'd been spinning for years. Her words, so simple:
I have a connective tissue disease
I've suspected it for a couple decades, known it for nearly a decade, but it's never been something I've been allowed to talk about. When a teen complains about joint pain, it is dismissed as an insignificance. When a twentysomething does, everyone older jests and jeers; "just wait till you're my age," they say. Sit down. Shut up. Respect your elders. If you're resilient enough to keep at it, to endure the shame and go to a doctor... well, doctors have ways of silencing things they can't cure. When I first saw a doctor for my knees, he acted like it was nothing, like I was a stupid kid bitching about nothing— despite saying, with surprise in his voice, how my x-rays looked like someone 15–20 years older. When I pressed, when I refused to be cowed, he told me there was nothing modern science could do: I could use a splint, but that'd weaken the muscles and exacerbate the problem; I could try working out to strengthen the muscles —at least, for as long as I could stand the pain— but that'd only slow the inevitable by a couple years at best; it wasn't bad enough for surgery, besides that'd just cause even more damage. "You're young," he said in flat monotone, like words rehearsed without meaning. Like pointing out something broken or left behind, when you really don't care if they hear you. Your coffee. Your wallet. Your tail light. You're young.
The thing about genetic issues is that they pervade everything. It's never a singular problem, it's a cascade of them, a death by ten-thousand papercuts. In my childhood, my mother always had issues with her knees. It was almost a joke how often she went in for surgeries on them; the kind of joke people only mumble and noone laughs at but they tell it anyways because they don't know what else to do. During my early college years, her shoulders started going out. A few years back my sister died spontaneously, and within a few months a cousin joined her. Aortic ruptures. In the last year or so, my mother had an aortic dissection. She survived, but more from luck than anything. I happened to be in Maryland when she was in the hospital, and I visited. She'd also been having catastrophic spinal problems. My parents didn't even bother mentioning it until she went in for the first surgery. It didn't go well. Three followup surgeries later and who knows if any of it did any good. Sitting next to her as she lay in that bed, her hands all locked up in pain, held in mine, I could barely look on her. Because I know I'll live to be crippled and die twisted in pain. She's had enough in-patient PT to be released, and is back home now on out-patient PT. Noone talks about it. But at least noone jokes anymore.
I can't say if it was her heart or her back that somehow managed to convince some doctor to take a closer look. He'd thought she had Marfan syndrome and ordered a genetic screening. Tests came back negative. Followups found it's actually Loeys-Dietz, something that wasn't even discovered until ten years ago, and the docs only knew of it because she'd been admitted to the hospital where they discovered it. There's no point in testing the dead, but there's little doubt about what did my sister and cousin in. I've been checked for aortic problems, and show no symptoms as yet. I'll have to get checked again every couple years.
(One of the funniest things about transitioning is how it's been the healthiest decision I've ever made. If I'd've known all the minor health issues it'd cure, I would've fought harder to do it when I was 18. Among the things it helped was my back. While uncommon, HRT can cause corrections in one's hips and lower ribs. Thanks to the changes in my hips and my center of gravity, I no longer have chronic back pain. Growing up I could never attain correct posture: it caused pain and felt unnatural; whereas now it comes freely and without thinking.)
But the litany of little pains isn't what hurts the most. I used to draw. It used to be my life. The fire in my heart, as maths is the breath in my chest. I'd do it when I wasn't thinking. I'd do it to focus my thinking. I'd come home and spend hours at it. I'd ignore eating to finish a piece. I won awards. I thought I'd make a vocation of it. By halfway through undergrad I could barely finish a small sketch in the margins of my notes. Many of my friends are artists (e.g.), and while I love their work, a hateful demon grows in me every time I see their successes or hear them praised. These days I can barely hold a pencil. My script an ever more illegible shorthand as I try to eke out a few more pages before I resign to sitting behind a computer. (The most creative parts of doing math, for me, needs being written. It is only once I have the sketch of a thing can I put it to pixels.) Just bringing up my art, acknowledging it as something lost rather than as something I lost time for, crushes me.
That girl, that blessed fabulous girl. A few days after we'd met I asked her about her ring, a beautiful curious thing, like two rings locked together at an angle. Turns out it's a surgical splint for preventing hyperextension. She told me where to get one, and on the bus yesterday I decided to check out their website. Reading through the descriptions of the rings they offer —I don't even... How do you name that emotion when a pain you've had so long you've forgotten it exists is suddenly eased, that lift, that release, that letting go. Like when you find someone who shares your very same marginalization, that feeling where you can just talk, can let words free without censor knowing they have already been understood before they are spoken. That sudden finding oneself not alone. That slow creeping into existence of a future worth looking toward. I had to turn off my browser. Can't be crying on busses. Can't be weak in public.
The next week+ I'll be in St. Petersburg Florida for PEPM, PADL, POPL, PPS, and PPAML PI (also CPP and OBT). Would've mentioned it sooner, but it's a bit of a last minute thing. I love reconnecting with old friends and meeting new folks, so feel free to come say hi. If you want to meet up for dinner or such, leave a comment with when/where to find you, or just look for the tall gal with the blue streak in her hair.
Last time I talked a bit about ABTs; in particular, I introduced the notion of strongly-typed ABTs (or "GABTs" if you prefer) and showed how we can extend the basic idea of ABTs to guarantee well-typedness in addition to well-aritiedness. However, I also made a note that ensuring this sort of well-typedness runs counter to what Neel and other CMUers often do. One of my colleagues here at IU noticed the reason, so I thought I'd write a bit more about it.
The issue at stake here is how general we can make our ABT library, to minimize the amount of boilerplate needed whenever inventing a new language. By encoding object-language type systems into the kinding of the ABT, we restrict the the possible object languages we can use the ABT implementation for (namely those object languages with type systems that can be embedded into whatever kinding the ABT has). To put a finer point on it, using the kinds presented in the previous post you cannot have binders in your type system. (Edit 2016.02.29: actually the details are more complicated.) This means no System F, and no dependent types. This is unfortunate as the whole point of ABTs is to capture binding structure once and for all!
However, I'd like to reiterate that, for our purposes in Hakaru this limitation is no restriction. Hakaru is simply-typed, so there are no type-level binders in sight. Moreover, we do a lot of program transformations in Hakaru. By using GABTs we can have GHC verify that our program transformations will never produce Hakaru code which is ill-typed, and that our program transformations will always produce Hakaru code of an appropriate type (e.g., the same type as the input term, for things like partial evaluation; but we have a number of type-changing transformations too). Thus, even though our GABT library could not be reused for implementing languages with type-level binders, it still provides a substantial benefit for those languages without type-level binders.
Although our GABTs cannot handle type-level binders, that does not mean we're restricted to only working with simply typed languages. For example, intersection types are not usually thought of as "simple types"; but they do not require binders and so they're fine. More generally, Larry Moss is engaged in a research project where he asks, "given infinite time, how far could Aristotle have gotten in logic?" By which he means, given the Aristotelian restriction to syllogistic logics (i.e., ones without the quantifiers introduced by Frege), what are the limits in what we can cover? It turns out that we can cover quite a lot. Some syllogistic logics go beyond the power of the "Peano–Frege" boundary: they can handle comparing cardinality of sets! A good pictorial summary of this line of research is on slide 2 of this talk; and a bit more about the complexity results is given in this talk (the requisite picture is on slide 39).
Edit 2016.02.29: In actuality, there's nothing inherent in type theory that prohibits having type-level binders for our object language; it's a limitation in GHC. In particular, GHC doesn't allow lifting GADTs into data kinds. If we could lift GADTs, then we could simply use ABTs to define the syntax of object-language type expressions, and lift those to serve as the type indices for using ABTs to define the syntax of object-language term expressions. This stratified approach is sufficient to handle System F and any other non-dependent quantifiers. To go further and handle dependent quantifiers as well, we'd also need to be able to define the object-language's terms and types in a mutually inductive way.↩
Edit 2015.10.29: Be sure to also read the followup post on the benefits and limitations of this approach compared to the usual untyped ABTs.
Earlier this year Neel Krishnaswami talked about abstract binding trees (ABTs) [part 1] [part 2]. IMO, the best way to think about ABTs is as a generalization of abstract syntax trees (ASTs), though this is not a perspective sanctioned by the CMUers I’ve talked to. CMUers oppose this way of phrasing things, in part, because the ABT libraries they’re familiar with make crucial use of the design pattern of two-level types; but I think the essential insights of ABTs and two-level types are quite different, and we ought to keep the benefits of these two techniques distinct.
Over the past year I’ve been working on the inferential language1 Hakaru, and in the new version of the compiler we’re using ABTs for our syntax trees. However, contrary to Neel’s stance against using strongly-typed internal representations for syntax, we extend the ABT approach to make use of GADTs to guarantee local well-typedness— since this in turn can be used to guarantee that program transformations are also well-typed. (If you don’t want those guarantees, then take a look at Jon Sterling’s abt library on Hackage2.) In this post I’m going to present a simplified version of our architecture, and then talk about some of the extra stuff bringing it closer to our production architecture.
First things first
Since we want everything to be well-typed, we first must introduce some universe,
U, of all the types in our language. (In Haskell we can implement such a universe by using the
-XDataKinds extension, so I’ll equivocate between calling
U a “universe” vs a “kind”.) For the rest of this post it doesn’t actually matter what lives in that universe3, just so long as things match up when they need to. Since the choice of universe is irrelevant, we could abstract over
U by turning on the
-XPolyKinds extension; but I avoid doing so below, just to help keep things more concrete.
The simplest way of thinking about well-typed ASTs is that they capture the set of terms generated by a (typed) signature; that is, the fixed point of some
Σ ∷ [U] → U → ⭑. Unpacking the type for
Σ, we have that every syntactic constructor
s ∈ Σ is associated with some arity (the length of the list), each argument to
s has some type in
U (the elements of the list), and applying
s to the right number of ASTs of the right types will generate a new AST with some type in
U (the second argument to
To implement this fixed point we define an
AST type which is parameterized by its signature. To ensure well-aritiedness (and well-typedness) of our ASTs with respect to that signature, we’ll need to introduce a helper type
SArgs4. And to ensure that we obtain the least fixed-point of the signature, we’ll make everything strict.
infix 4 :$ infixr 5 :* data SArgs ∷ (U → ⭑) → [U] → ⭑ where End ∷ SArgs ast  (:*) ∷ !(ast u) → !(SArgs ast us) → SArgs ast (u : us) data AST ∷ ([U] → U → ⭑) → U → ⭑ where (:$) ∷ !(σ us u) → !(SArgs (AST σ) us) → AST σ u
The problem with ASTs is that they have no notion of variables, and thus have no notion of variable binding. Naively we could implement binders like lambda-abstraction by having something like
λ ∷ Σ [u, v] (u :→ v) but then we’d need to do a post-hoc check to ensure that the first argument to
λ is in fact a variable. To build that check into the datatype itself we’d have to move
λ into the definition of
AST (since the first argument is of type
Variable u rather than
AST Σ u). If lambda-abstraction were the only binder we had, that might not be so bad; but any real-world language has a plethora of binders, and this approach doesn’t scale.
The essential idea behind ABTs is to abstract over the notion of binding itself. Given a single uniform definition of what it means to be a binding form, we don’t have to worry about adding a bunch of ad-hoc constructors to our AST datatype. Moreover, we can then provide single uniform definitions for things which mess with variables and are homomorphic over the signature. Things like capture-avoiding substitution and providing a HOAS API for our first-order representation.
The crucial step is to adjust our notion of what a signature contains. The basic signatures used above only contained applicative forms; i.e., things we can apply to locally-closed terms; i.e., what are called “functors” in the logic programming community. For ABTs we’ll want to allow our signatures to include any generalized quantifier. That is, our signatures will now be of type
Σ ∷ [[U] × U] → U → ⭑. Previously, the arguments were indexed by
U; now, they’re indexed by
[U] × U. The length of the list gives the number of variables being bound, the types in the list give the types of those variables, and the second component of the pair gives the type of the whole locally-open expression.
To implement this we need to extend our syntax tree to include variable bindings and variable uses:
data SArgs ∷ ([U] → U → ⭑) → [[U] × U] → ⭑ where End ∷ SArgs abt  (:*) ∷ !(abt vs u) → !(SArgs abt vus) → SArgs abt ((vs,u) : vus) data ABT ∷ ([[U] × U] → U → ⭑) → [U] → U → ⭑ where (:$) ∷ !(σ vus u) → !(SArgs (ABT σ) vus) → ABT σ  u Var ∷ !(Variable v) → ABT σ  v Bind ∷ !(Variable v) → !(ABT σ vs u) → ABT σ (v : vs) u
Time for an example of how this all fits together. To add lambda-abstraction to our language we’d have
λ ∷ Σ [([u],v)] (u :→ v): that is, the
λ constructor takes a single argument which is a locally-open term, binding a single variable of type
u, and whose body has type
v. So given some
x ∷ Variable u and
e ∷ ABT Σ  v we’d have the AST
(λ :$ Bind x e :* End) ∷ ABT Σ  (u :→ v).
“Local” vs “global” well-typedness
ABT definition above, every term of type
ABT Σ vs u must be locally well-typed according to the signature
Σ. I keep saying “locally” well-typed because we only actually keep track of local binding information. This is an intentional design decision. But only tracking local well-typedness does have some downsides.
So what are the downsides? Where could things go wrong? Given a locally-closed term (i.e., either
Var x or
f :$ e) any free variables that occur inside will not have their
U-types tracked by Haskell’s type system. This introduces some room for the compiler writer to break the connection between the types of a variable’s binder and its use. That is, under the hood, every variable is represented by some unique identifier like an integer or a string. Integers and strings aren’t
U-indexed Haskell types, thus it’s possible to construct a
Variable u and a
Variable v with the same unique identifier, even though
v differ. We could then
Variable u but
Variable v. In order to ensure global well-typedness we need to ensure this can’t happen.
One way is to keep track of global binding information, as we do in the paper presentation of languages. Unfortunately, to do this we’d need to teach Haskell’s typechecker about the structural rules of our language. Without a type-level implementation of sets/maps which respects all the axioms that sets/maps should, we’d be forced to do things like traverse our ASTs and rebuild them identically, but at different type indices. This is simply too hairy to stomach. Implementing the axioms ourselves is doubly so.
Or we could fake it, using
unsafeCoerce to avoid the extraneous traversals or the complicated pattern matching on axioms. But doing this we’d erase all guarantees that adding global binding information has to offer.
A third approach, and the one we take in Hakaru, is compartmentalize the places where variables can be constructed. The variable generation code must be part of our trusted code base, but unlike the
unsafeCoerce approach we can keep all the TCB code together in one spot rather than spread out across the whole compiler.
Stratifying our data types
The above definition of ABTs is a simplified version of what we actually use in Hakaru. For example, Hakaru has user-defined algebraic data types, so we also need case analysis on those data types. Alas, generic case analysis is not a generalized quantifier, thus we cannot implement it with
(:$). We could consider just adding case analysis to the
ABT definition, but then we’d start running into extensibility issues again. Instead, we can break the
ABT type apart into two types: one for capturing variable uses and bindings, and the other for whatever syntax we can come up with. Thus,
data Syntax ∷ ([[U] × U] → U → ⭑) → ([U] → U → ⭑) → U → ⭑ where (:$) ∷ !(σ vus u) → !(SArgs abt vus) → Syntax σ abt u data ABT ∷ ([U] → U → ⭑) → [U] → U → ⭑ where Syn ∷ !(Syntax σ (ABT σ) u) → ABT σ  u Var ∷ !(Variable v) → ABT σ  v Bind ∷ !(Variable v) → !(ABT σ vs u) → ABT σ (v : vs) u
Of course, since we’re going to be extending
Syntax with all our language-specific details, there’s not a whole lot of benefit to parameterizing over
σ. Thus, we can simplify the types considerably by just picking some concrete
Σ to plug in for
Syntax apart from
ABT we can now extend our notion of syntax without worrying about the details of variable binding (which can be defined once and for all on
ABT). But we could still run into extensibility issues. In particular, often we want to separate the fixed-point portion of recursive types from their generating functor so that we can do things like add annotations at every node in the recursive data type. A prime example of such annotations is keeping track of free variables, as in Neel’s original post. To allow this form of extensibility we need to break up the
ABT type into two parts: the recursion, and the
Bind view of the ABT.
data ABT ∷ ([U] → U → ⭑) → [U] → U → ⭑ where Unview ∷ !(View σ (ABT σ) vs u) → ABT σ vs u view ∷ ABT σ vs u → View σ (ABT σ) vs u view (Unview e) = e data View ∷ ([U] → U → ⭑) → [U] → U → ⭑ where Syn ∷ !(Syntax σ abt u) → View σ abt  u Var ∷ !(Variable v) → View σ abt  v Bind ∷ !(Variable v) → !(View σ abt vs u) → View σ abt (v : vs) u
Now, to allow arbitrary annotations we’ll replace the data type
ABT with an equivalent type class. Each instance of the
ABT class defines some sort of annotations, and we can use the
unview methods to move between the instance and the concrete
There’s one last form of extensibility we may want to add. Using fixed point combinators gives us a way of describing complete trees. A different way of introducing recursion is with free monads. The free-monad combinator is just like the fixed-point combinator, except that we have an additional type parameter for metavariables and we have a data constructor for using those metavariables instead of requiring the recursion to ground out with a complete syntax tree. The reasons why this might be nice to do are beyond the scope of this post, but the point is we might want to do that so we need to split the
ABT class into two parts: one for the recursion itself, and another for the annotations.
In the end, we have a four-level type: the
View, the annotations, and the recursion.
 In the accepted/current parlance, Hakaru is a “probabilistic programming language”; but I and a number of other folks working on such languages have become disaffected with that term of late, since it’s not entirely clear what should and should not count as a “probabilistic” PL. Over the course of a number of discussions on the topic, I’ve settled on “inferential” PL as describing what is (or, at least what I find) interesting about “probabilistic” PL. I’ve been meaning to write a post about the subject, and hopefully this footnote will remind me to do so.↩
 N.B., the indexing used in that package is what we get if we erase/compactify the universe
U. That is: the erasure of
U is a singleton set; the erasure of
[U] is isomorphic to the Peano numbers; the erasure of
[[U] × U] is isomorphic to a list of Peano numbers; etc.↩
 Ideally we’d be able to flatten this type to avoid all the overhead of the linked list implementation. In fact, the entire AST node of
(:$) together with its
SArgs should be flattened. These nodes have the same general layout as the heap objects in the STG machine: a record with a pointer to the data constructor (i.e., element of the signature) followed by an appropriate number of arguments; and so in principle we ought to be able to implement them directly with a single STG heap object.↩
One of the folks I've chatted with a bunch online and finally got to meet in-person this year was Gershom. Towards the end of the symposium, he mentioned the post I linked to last time about my pulling away from Haskell communities. But the way he phrased it really struck me. And I wanted to comment on that.
I guess the crux of it comes down to, what do we mean when we say "the Haskell community"? What I meant when writing that post over a year ago was the online community of Haskell learners/practitioners (i.e., the folks who dominate spaces like reddit). But what Gershom's comment suggested was the academic community of FP researchers who work on/with Haskell (i.e., the folks who dominate spaces like ICFP). These two communities have always been intertwined, but as Haskell gains popularity they feel more and more distinct.
FWIW, I do still feel welcome as part of the academic community. There are some things I could comment on, sure, but those have more to do (i think) the general issue of being a woman in tech than they do with Haskell per se. Of course, this only underscores the concern of my original post. The free exchange between academics and practitioners was always a big part of what I've loved about the (general) Haskell community. Were not for that exchange, the spark planted by Mark Jones would never have fanned into the flame of my current research. As that free exchange unravels, this pathway from interested novice to researcher is cut off. And that pathway is crucial for the vibrancy of the community, as well as for addressing those general issues of being a minority in tech.
Got back from Vancouver a couple days ago. The return flight was hell, but other than that the trip was pretty good. Got to meet a whole bunch of folks who read my blog (hi!), which was super cool. It always gives me warm fuzzies to know people are actually interested in my stuff. (You'd think I'd get used to it after a while, but no, evidently childhood scars still remain years after developing a solid sense of self-esteem. Go fig.) Of course, now I feel bad about not having written much of late. I have a few posts I've been meaning to write —mostly technical ones—, and I'm thinking I might should try to kick those out as a way to get back into the daily writing habit needed for finishing up my dissertation.
Since handing out my card at ICFP, I've been hacking around on my website. I'm still working on getting my publications/presentations to be properly formatted, and the timestamps in the footers on a couple pages are busted, but other than that I think it's in pretty good shape. I also took part in a keysigning party for the purpose of building a WoT as part of the developing story for how we securely deliver packages in Haskell. My key is up on Keybase as well as on my website (along with all my other contact info).
After a week of great talks, too much drinking, and too little sleep, I met up with a dear friend from undergrad. It's been far far too long since we've had a chance to hang out (alas, he couldn't make the wedding last year), but he's one of those friends you can just fall right back into. It's funny how much can change and yet stay the same. In one of our ambling walks I dragged him into a clothing store —something I never would've done pre-transition. He didn't seem to mind. He just went along like it's the sort of thing we've always done. And it felt to me like the sort of thing we could've always done. Recently he's been ref'ing for roller derby up in Victoria, and after talking about it I'm thinking I might try my hand at it. Sounds like a lot of fun. We have a team in Bloomington, and they're doing a training/recruiting thing, though alas I'll miss it since I'll be in Mountain View on the 14th. I'll see if I can't find some other way to get introduced.
In the spirit of Brent's post, I figure I'll make a public announcement that I'm in Vancouver all week attending HOPE, ICFP, and the Haskell Symposium. I love reconnecting with old friends, as well as meeting new folks. Even if I've met you at past ICFPs, feel free to re-introduce yourself as ...things have changed over the past few years. I know I've been pretty quiet of late on Haskell cafe as well as here, but word on the street is folks still recognize my name around places. So if you want to meet up, leave a comment with when/where to find you, or just look for the tall gal with the blue streak in her hair.
Unlike Brent, and unlike in years past, I might should note that I am looking to "advance my career". As of this fall, I am officially on the market for research and professorship positions. So if you're interested in having a linguistically-minded constructive mathematician help work on your problems, come say hi. For those not attending ICFP, you can check out my professional site; I need to fix my script for generating the publications page, but you can find a brief background/research statement there along with the other usuals like CV, recent projects, and classes taught.
The bytestring-lexing package offers extremely efficient bytestring parsers for some common lexemes: namely integral and fractional numbers. In addition, it provides efficient serializers for (some of) the formats it parses.
As of version 0.3.0, bytestring-lexing offers the best-in-show parsers for integral values. (According to the Warp web server's benchmark of parsing the Content-Length field of HTTP headers.) And as of this version (0.5.0) it offers (to my knowledge) the best-in-show parser for fractional/floating numbers.
Changes since 0.4.3 (2013-03-21)
I've completely overhauled the parsers for fractional numbers.
Data.ByteString.Lex.Lazy.Double modules have been removed, as has their reliance on Alex as a build tool. I know some users were reluctant to use bytestring-lexing because of that dependency, and forked their own version of bytestring-lexing-0.3.0's integral parsers. This is no longer an issue, and those users are requested to switch over to using bytestring-lexing.
The old modules are replaced by the new
Data.ByteString.Lex.Fractional module. This module provides two variants of the primary parsers. The
readExponential functions are very simple and should suffice for most users' needs. The
readExponentialLimited are variants which take an argument specifying the desired precision limit (in decimal digits). With care, the limited-precision parsers can perform far more efficiently than the unlimited-precision parsers. Performance aside, they can also be used to intentionally restrict the precision of your program's inputs.
The Criterion output of the benchmark discussed below, can be seen here. The main competitors we compare against are the previous version of bytestring-lexing (which already surpassed text and attoparsec/scientific) and bytestring-read which was the previous best-in-show.
The unlimited-precision parsers provide 3.3× to 3.9× speedup over the
readDouble function from bytestring-lexing-0.4.3.3, as well as being polymorphic over all
Fractional values. For
Double: these functions have essentially the same performance as bytestring-read on reasonable inputs (1.07× to 0.89×), but for inputs which have far more precision than
Double can handle these functions are much slower than bytestring-read (0.30× 'speedup'). However, for
Rational: these functions provide 1.26× to 1.96× speedup compared to bytestring-read.
The limited-precision parsers do even better, but require some care to use properly. For types with infinite precision (e.g.,
Rational) we can pass in an 'infinite' limit by passing the length of the input string plus one. For
Rational: doing so provides 1.5× speedup over the unlimited-precision parsers (and 1.9× to 3× speedup over bytestring-read), because we can avoid intermediate renormalizations. Whether other unlimited precision types would see the same benefit remains an open question.
For types with inherently limited precision (e.g.,
Double), we could either pass in an 'infinite' limit or we could pass in the actual inherent limit. For types with inherently limited precision, passing in an 'infinite' limit degrades performance compared to the unlimited-precision parsers (0.51× to 0.8× 'speedup'). Whereas, passing in the actual inherent limit gives 1.3× to 4.5× speedup over the unlimited-precision parsers. They also provide 1.2× to 1.4× speedup over bytestring-read; for a total of 5.1× to 14.4× speedup over bytestring-lexing-0.4.3.3!
- Homepage: http://code.haskell.org/~wren/
- Hackage: http://hackage.haskell.org/package/
- Darcs: http://community.haskell.org/~wren/
- Haddock: Darcs version
I've been working on culling a bunch of my old email addresses and unifying my username across the internet. I've mentioned a number of these changes before, but every few months I run into folks who've missed the memo. So, for those who keep up with my blog or follow Haskell Planet, here's the latest rundown:
- firstname.lastname@example.org —is dead
- email@example.com —is dead
- firstname.lastname@example.org —will soon be dead
- My Perl forwarding address lives— email@example.com
- My Haskell forwarding address lives— firstname.lastname@example.org
- My current work address is— email@example.com
- My personal email at gmail.com is also lives
- Code Hosting & Other User Names
I have a new academic/professional website: http://cl.indiana.edu/~wren/!
There are still a few unfinished areas (e.g., my publications page), but hopefully I'll be finished with them shortly. The new site it built with Hakyll instead of my old Google-Summer-of-Code static website generator. I'm still learning how to implement my old workflow in Hakyll, and if I get the chance I'll write a few posts on how I've set things up. Reading through other folks' posts on how they use Hakyll have been very helpful, and I'd like to give back to the community. I've already issued a pull request for adding two new combinators for defining template fields.
In the meantime, if you notice any broken links from my old blog posts, please let me know.