The Decalogue of Nick #2: I’ve trained as a linguist, and I have done computational linguistics stuff

Post date: December 27, 2016
Posted in categories: Academia, Language, Personal

For Audrey Ackerman and Brian Collins and Zeibura S. Kathau.

Ask a Greek what they can tell you about Byzantium, and they won’t tell you what the millennium of the East Roman Empire achieved. They won’t tell you about the Palaeologan Renaissance, or the ambivalence about the Classical past, or the edifices of Roman Law, or the architectural marvels.

All they’ll tell you is that the Empire fell. And they don’t even pick the right instance when it fell. (The Empire at 1453 wasn’t worth saving.)

Well, so it is with my linguistics career. Scratch me just a little bit, and I will lament the defining woe of my life, that I did not become a professional linguist: Nick Nicholas’ answer to What is your personal experience with obtaining a linguistics degree? (If I’m feeling prudent, I’ll admit that the outcome was the right one: Nick Nicholas’ answer to What are your 3 worst mistakes? Would you fix any of them if you could go back in time? But only if I’m feeling prudent.)

What’s harder for me to do, as a glass half empty kinda guy, is admit how much I gained in the experience.

Linguistics gave me a sense of purpose when I had none. Linguistics gave me friends and companionship and stimulation. Linguistics gave me the place where I could act as glue between my peers—that trait that Clarissa Lohr continues to find approval-worthy in me. Linguistics gave me the opportunity to teach—and once I’d gotten to teach, nunc dimittis: I could have died a happy man, even if it was just three semesters.

And in truth, Linguistics gave me the opportunity to turn away from it, to say that no, I deserved better than to be strung along, and to regain myself even at the cost of losing myself.

I sidled into linguistics and out of electrical engineering towards the end of my undergraduate degree. Through a Master’s I did enough of the bridging undergraduate courses that they let me into the PhD programme. They recognised, I guess, that I had some talent.

The Master’s was in discourse theory: Rhetorical Structure Theory, to be exact. Discourse structure and implicature were my early fascinations in linguistics, and RST was all the rage back then (1994) in text generation. (Do people even call it text generation any more?)

When time came for me to pick a PhD topic, though, I wanted to go back to historical linguistics. Actually I wanted to go forward from discourse theory to historical linguistics: grammaticalisation gave a way for implicature to motivate language change that intrigued me. While I initially wanted to work on Tibetan (because Squiggles), I had a conversation with the doyen of grammaticalisation theory, Elizabeth C. Traugott, during which she asked, wouldn’t it be fascinating to look at how grammaticalisation interacted with diglossia.

Two years into my thesis, I’d worked out that no, that was a dumb idea. But I was too far into my work by then. My topic was the development of the Modern Greek relativiser pu, and how it had diversified in meaning. I’d intended to work backward, and unearth all sorts of awesome instances of implicature and analogy from Early Modern Greek.

But theses don’t go as you’d planned. On the way towards internal reconstruction, I became captive to the diversity of Greek dialect—nature’s historical linguistic laboratory: they have a common starting point with dozens of divergent endpoints, so you can get an amazing sense of what is possible in language change. By the time I’d worked out what was happening in all the dialects, and detoured into what was happening in the rest of the Balkans, I’d run out of time. And space: the Balkan chapter ended up on the cutting room floor.

From the thesis, I’d gained an encyclopaedic familiarity with modern Greek dialect; a good knowledge of Early Modern Greek anyway (which I put to use in my later coauthored monograph, An Entertaining Tale of Quadrupeds); and a smattering of Balkan linguistics. I’d planned to use my knowledge to write a reference grammar of Early Modern Greek by the time I was 50. That isn’t happening; and the guys who were working on it (Greek Grammar to fill the gap) have run out of funding and have retired.

What I did not get is any Ancient Greek; I don’t have any formal training in that, although once you’re a linguist, you can make sense of a grammar book just fine. (And I picked up what I needed to later.)

I also picked up a fair bit of linguistic typology from a decade of working as a research assistant, mainly under John Hajek. It was a rocky relationship, as you can well imagine from someone with my ego in a second fiddle role. But it was a good schooling too. And working on a phonological survey of Papua New Guinea, I got at least some of the phonology I did not get from the department.

I wrote a bunch of papers after I finished the PhD. Some got published. Some got submitted at the time journals got switched over from paper to electronic submission, and got lost in the mail. It was fun to write the papers; but it was also writing in a vacuum. I didn’t really have a network of peers to care about what I was writing (part of the problem of not being in Europe), and the problems I was working on seem to have been too obscure to have stimulated any interest anyway. In fact, the papers that generated the most interest were about social history (the Greek colony in Corsica). I have 8 finished unsubmitted papers, and 8 more incomplete, from when I stopped writing in 2008. I’m not strongly motivated to do anything with them.

I got more interaction, if anything, out of the Ἡλληνιστεύκοντος blog I used to do (and will do again, if Quora disappears in a puff of smoke). And some of my favourite questions on Quora are when I do my own detective work, to solve a linguistic problem I don’t already know the answer to.


For Amy Dakin.

I have also done some computational linguistics stuff. Most of it has been at the Thesaurus Linguae Graecae, where I worked from February 1999 through to June 2016.

I’ve been reluctant to go publicly into the specifics of why I’m no longer employed there, until now. But then again, my time at the TLG should not suffer the same fate as Byzantine History: what I achieved (what *I* achieved) is more important than the way I ceased to.

I have very high regard for my fellow programmer of 13 years Nishad Prakash; and if anything even more regard for my fellow programmer of 4 years, John Salatas, who is still working there. They are far better craftsmen than I am. And I don’t mean to take anything away from their achievements by what I’m about to say.

But anything you see at the TLG that involves linguistics? Me. Anything that involves stylometrics? Me. Anything that involves Natural Language Processing? Formatting? Peculiar sigla? Comparison of texts? Me.

There’s a lot of computer science things that I’m proud of working out while there. Some algorithmic refinements to recursive Longest Common Subsequence detection, to work out common phrases between passages. Some fiendish DFA and NDFA work, to deal with the quirky ASCII encoding of Greek we have in character-by-character and wildcard search. A lot of cleverness in contextual grammatical disambiguation, that I’m not confident will ever see the light of day (or will be highlighted for users if it does).
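For readers who haven't met Longest Common Subsequence, here is a minimal sketch of the textbook version, applied to words rather than characters. It is not the TLG's code and includes none of the refinements mentioned above. The function name `lcs_words` and the sample passages are made up for illustration, and it uses a plain dynamic-programming table rather than a recursive formulation.

```python
def lcs_words(a, b):
    """Return one longest common subsequence of two lists of words."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to recover the shared words in order.
    shared = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            shared.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical passages, split on whitespace (an assumption; real Greek text
# would need proper tokenisation and normalisation first).
first = "the quick brown fox jumps".split()
second = "a quick red fox jumps high".split()
print(lcs_words(first, second))
```

The shared words come back in their original order even when other words intervene, which is what makes the technique useful for spotting a phrase that one passage has quoted or adapted from another.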

And my crowning work: the morphological analyser of Greek. It originated in Perseus’ Morpheus, but I have stretched and pulled and broadened and narrowed and reranked it over the past 15 years, to deal with all the stages of Greek the TLG has thrown at it, from Homer through to misspelled 17th century Cretan land deeds—and to still yield some semblance of order. In the process, I dare say I have developed as intuitive a sense of what grammatical wackiness Byzantine authors could indulge in as anyone living: I’ve had to deal with it all.

I didn’t get to write the reference grammar of Early Modern Greek. But the morphological analyser I curated, with all the proper names of Athenian courtesans and Albanian chieftains, of Egyptian decans and minor saints, with all the mangled Byzantine optatives and grammarians’ fictional conjugations, with every last utterance of Sappho accounted for, and as much of Theodore Metochites’ as I could disentangle: that has been just as great an achievement.

Which I now no longer can contribute to.

But those of you with access to a TLG subscription: click on some words’ analyses, and do some parallel text comparisons, and look up some of the online lexica. And taste some of the joy of the Greek language and the Greek literary corpus, that I got to savour in my time.

And you’re welcome.
