The Ibycus mainframe
The Ibycus computer was the machine on which the Thesaurus Linguae Graecae did its data crunching throughout the 1980s and 1990s. It was the stuff of legend: an HP 1000 customised in David Packard Jr.’s garage, with spelling and format checkers and text editors written in assembler, that crunched through tens of millions of words of Greek in its own temperature-controlled room.
Ibycus is also the stuff of a different kind of legend: it features in the “Lernaean Text” (or Hellenic Quest text), the distorted and indefatigable urban legend that has been doing the rounds of the Net for years, claiming that Ibycus (or Imycus) has determined that Greek has 90 million* distinct words. The text also says that Bill Gates wants his programmers to program in Ancient Greek, that John Sculley is still running Apple and publishing with CNN (?) the Hellenic Quest software to teach the world Greek, and that Greek words have deep cabbalistic meanings and no arbitrariness of signs. That’s why Nikos Sarantakos calls it Lernaean: however many times you cut off its head (including refutations by the TLG itself), it keeps coming back, because enough people want it to be true.
In the real world, the TLG as of this writing has around 100 million tokens [instances of words], 1.5 million types [distinct word forms, so that run and ran, or ἄνθρωπος and ἀνθρώπου, count separately] and (depending on how you count them) closer to 200,000 lexemes [dictionary words]. The count will go up as the TLG expands coverage, and the lexeme count will vary with different decisions on how to treat words. The lemmatiser currently tends to conflate variants for more hits in searching, so the count will be on the low side (though I’m including some 30,000 proper names, which is strictly speaking cheating). Still, in the real world, I don’t see how anything will take that count from 200 thousand to 90 million. And for strictly classical Greek, it’d be only a little over half that…
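To make the token / type / lexeme distinction concrete, here is a minimal sketch in Python. The lemma table `TOY_LEMMAS` and the function `count_words` are made up for illustration; they are assumptions of this example and have nothing to do with how the TLG lemmatiser actually works.

```python
import unicodedata
from collections import Counter


def nfc(word):
    """Normalise to NFC so visually identical Greek forms compare equal."""
    return unicodedata.normalize("NFC", word)


# Hypothetical toy lemma table (inflected form -> dictionary word).
# A real lemmatiser is far more elaborate; this only illustrates the counting.
TOY_LEMMAS = {
    nfc(form): nfc(lemma)
    for form, lemma in {
        "run": "run",
        "ran": "run",
        "ἄνθρωπος": "ἄνθρωπος",
        "ἀνθρώπου": "ἄνθρωπος",
    }.items()
}


def count_words(tokens, lemmas=TOY_LEMMAS):
    """Return (tokens, types, lexemes) for a list of word forms.

    tokens  = every instance of a word
    types   = distinct word forms
    lexemes = distinct dictionary words (unknown forms count as their own lemma)
    """
    forms = [nfc(t) for t in tokens]
    types = Counter(forms)
    lexemes = {lemmas.get(form, form) for form in types}
    return len(forms), len(types), len(lexemes)


sample = ["run", "ran", "run", "ἄνθρωπος", "ἀνθρώπου", "ἀνθρώπου"]
print(count_words(sample))  # (6, 4, 2)
```

Six tokens collapse to four types and two lexemes. The more aggressively the lemma table conflates variants, the lower that last number gets, which is why the lexeme count depends on decisions about how to treat words.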
In the real world, too, Ibycus was just a mainframe, not the harbinger of a New Hellenic World Order; and my first task when I joined the TLG in 1999 was to help decommission it. The TLG texts were being copied off Ibycus to machines with 8-bit bytes in February when I got there, and my first job was to finish writing replacement format checkers and spell-checkers for the PC. Ibycus was unplugged and removed in late 1999, and the TLG page on Ibycus shows the removalists at work. (Unfortunately the machine got damaged in the process, and the TLG couldn’t find a museum willing to house it.)
I did have some other photos of the box from 1999, though:
That’s me, trying to lay down the law to the HP 1000…
[* EDIT: Had typo’d to “90 distinct words”, and thank you to dokiskaki for pointing that out. 90 distinct words would certainly have made life much easier…]