Your Soapbox or Mine?
22 Nov 2008 04:59 am
Last night I went to a farewell dinner for Micha, who is heading back to Germany after a couple months at CLSP. About a dozen of us had delicious Ethiopian, and half hung around for drinks afterwards. Both establishments were quite nice, reminding me I should hang out in Mt Vernon more often. Micha's specialty is in "Deep MT", a variety of machine translation which makes use of linguistic factors rather than being purely statistical. To wit: MT done right. So there was some self-selection involved, but the company was, as always, what made the night.
Three of the folks who stuck around for drinks were the first years at CLSP: two from CS who share my MT seminar, and one from ECE who seemed more grounded than most ;) Add to that Micha, myself, and one of the old-timers. It's amazing what people'll say once you get them off campus, or once you get a few drinks in 'em. On campus it's all business all the time. Which is fitting, it's a job after all; but it does leave things rather dreary. And somehow it seems to lead to never really knowing what other folks are working on, or what they're interested in. It's nice to see the human side of people. It's also nice to see the business side of the business. But no, I need more humans in my life.
At Brewer's Art I spent most of my time talking with A. She was sitting next to me and I could hear her, two excellent points in her favor. At some point we got onto that topic: what we're really interested in. I said I'd just finished my degree and was sticking around for a year working on GALE ("so that's why you're always so together at MT seminar"), and that I'm working on PhD apps for next year. The follow-on question: the wheres and whys. I began to give the other face of my last rant, a presentation I've been polishing for those selfsame apps. I'm interested in morphology and its interfaces with syntax, semantics, and phonology. I think we need to be working on linguistically-aware tools, since SMT's ignorance of morphosyntax is one of its principal failures (a point Micha demonstrated fabulously in his seminar last Friday). And I think we need to be working on languages with few resources, for political reasons and also because tying ourselves to megacorpora means we will never break away from the need to invest millions to get enough training data to simulate knowledge, badly.
Shortly into my rant she said, "that's my soapbox!" For her undergrad thesis she worked on computational typology: measuring the distances between languages in typological space. The sort of work that would be essential for L3 to use a known system for translating between two languages to bootstrap translations between similar languages. When I told her the places I was thinking of heading she was surprised there were people working on our domain; she'd spent so long justifying this empirical-yet-linguistic approach, and I too know how hard it can be to convince the devout statisticians or the non-computationalists. Typology, more even than morphology, is a domain that gets a passing mention in undergrad years and yet never sees the light of day in modern research.
For her part she tried convincing me I should stick around CLSP, to join her in the battle. A tempting thought, though I worry it may be more uphill a battle than at the schools I've been thinking of. Though maybe it's worth another thought. All in all great food, great beer, great discussions, and intellectual vindication. What more could you ask for in a night?