It would seem over the last year or two my blog has lapsed from obscurity into death. Not being one to let things rest, I figure this horse still has some beating left in it. About, what, a month ago I handed in the final project for my MSE and so I am now a masterful computer scientist. This means, in short, that I now know enough to bore even other computer scientists on at least one topic.
The funny thing is that both topics of my project (category theory and unification) are topics I knew essentially nothing about when I transferred to JHU from PSU a year ago. Of course, now I know enough to consider myself a researcher in both fields, and hence know more than all but my peers within the field. I know enough to feel I know so little only because I have a stack of theses on my desk that I haven't finished reading yet. I'm thinking I should finish reading those before recasting my project into a submission to a conference/journal. Since the project is more in the vein of figuring out how a specific language should work, rather than general theoretical work, I'm not sure exactly how that casting into publishable form should go; it seems too... particular to be worth publishing. But then maybe I'm just succumbing to the academic demon that tells me my work is obvious to everyone since it is to me.
One thing that still disappoints me is that, much as I do indeed love programming languages and type theory, when I transferred here my goal was to move away from programming languages and more towards computational linguistics. (If I were to stick with PL, I could have been working with the eminent Mark Jones or Tim Sheard back at PSU.) To be fair, I've also learned an enormous amount about computational linguistics, but I worry that my final project does not convey that learning to the admission committees for the PhD programs I'll be applying to over the next few months. Another problem that has me worried about those applications is, once again, in the demesne of internecine politics. For those who aren't aware, years ago a line was drawn in the dirt between computationally oriented linguists and linguistically oriented computer scientists, and over the years that line has evolved into trenches and concertina wire. To be fair, the concertina seems to have been taken down over the last decade, though there are still bundles of it lying around for the unwary (such as myself) to stumble into. There are individuals on both sides who are willing to reach across the divide, but from what I've seen the division is still ingrained for the majority of both camps.
My ultimate interests lie precisely along that division, but given the choice between the two I'd rather be thrown in with the linguists. On the CS side of things, what interests me most has always been the math: type theory, automata theory, etc. These are foundational to all of CS and so everyone at least dabbles, but the NLP and MT folks (in the States, less so in Europe) seem to focus instead on probabilistic models for natural language. I don't like statistics. I can do them, but I'm not fond of them. Back in my undergraduate days this is part of why I loved anthropology but couldn't stand sociology (again, barring the exceptional individual who crosses state lines). While in some sense stats are math too, they're an entirely different kind of math than the discrete and algebraic structures that entertain me. I can talk categories and grammars and algebra and models and logic, but the terminology and symbology of stats are Greek to me. Tied in somehow with the probabilistic models is a general tendency towards topics like data mining, information extraction, and text classification. And while I enjoy machine learning, once again, I prefer artificial intelligence. And none of these tendencies strikes me as meaningfully linguistic.
Baroque, obfuscatory terminology aside, my distaste for statistics is more a symptom than a cause. A unifying theme among all these different axes (computational linguistics vs NLP, anthropology vs sociology, mathematics vs statistics, AI vs machine learning) is that I prefer deep theoretical explanations of the universe over attempts to model observations about the universe. Sociology can tell you that some trend exists in a population, but it can make no predictions about an individual's behavior. Machine learning can generate correct classifications, but it rarely explains anything about category boundaries or human learning. An n-gram language model for machine translation can generate output that looks at least passingly like the language, but it can't generalize to new lexemes or to complex dependencies.
My latest pleasure reading is Karen Armstrong's The Battle for God: A History of Fundamentalism. In the first few chapters Armstrong presents a religious lens on the history of the late fifteenth through nineteenth centuries. Towards the beginning of this history the concepts of mythos and logos are considered complementary forces, each with its own sphere of prevalence. However, as Western culture is constructed over these centuries, logos becomes ascendant and mythos is cast aside and denigrated as falsity and nonsense. Her thesis is that this division is the origin of fundamentalist movements in the three branches of the Abrahamic tradition. It's an excellent book and you should read it, but I mention it more because it seems to me that my academic interests have a similar formulation.
One of the reasons I've been recalcitrant about joining the ranks of computer scientists is that, while I love the domain, I've always been skeptical of the people. When you take a group of students from the humanities they're often vibrant and interesting; multifaceted, whether you like them or not. But when you take a group of students from engineering and mathematical sciences, there tends to be a certain... soullessness that's common there. Some of this can be attributed to purely financial concerns: students go into engineering to make money, not because they love it; students go into humanities to do something interesting before becoming a bartender. When pitting workplace drudgery against passionate curiosity, it's no wonder the personalities are different. But I think there's a deeper difference. The mathematical sciences place a very high premium on logos and have little if any room for mythos, whereas the humanities place great importance on mythos (yet they still rely on logos as a complementary force). In the open source movement, the Jargon File, and other esoterica we can see that geeks have undeniably constructed countless mythoi. And yet the average computer geek is an entirely different beast than the average computer scientist or electrical engineer. I love computer geeks like I love humanists and humanitarians, so they're not the ones I'm skeptical of, though they seem to be sparse in academia.
I've always felt that it is important to have Renaissance men and women, and that modern science's focus on hyperspecialization is an impediment to the advancement of knowledge. This is one of the reasons I love systems theory (at least as Martin Zwick teaches it). While I think it's an orthogonal consideration, this breadth seems to be somewhat at odds with logocentric (pure) computer science. The disciplines that welcome diversity —artificial intelligence/life, cognitive science, systems theory, computational linguistics— seem to constantly become marginalized, even within the multidisciplinary spectrum of linguistics, computer science, et al. Non-coincidentally these are the same disciplines I'm most attracted to. It seems to me that the Renaissance spirit requires the complementary fusion of mythos and logos, which is why it's so rare in logocentric Western society.