In 2002 I wrote an article entitled “Don’t Sit on a Phoenix” for Dance Gazette about the problems of using the web for research. It’s got nothing to do with music or dance, but it’s still one of my favourite essays, not least because the problem I was writing about (the constant unattributed quotation of “the mind is not a vessel to be filled, but a light to be kindled” on the web) has not gone away, even 18 years later. In fact, to my dismayed astonishment, it turned up only recently as a textual horseshoe-over-the-door of a publication by an institution that ought to know better. I offered £50 to the reader who could actually identify the original source of the quotation. Eventually, a classics scholar from the US wrote in with the answer, which was published in a follow-up issue that I no longer have to hand (though the answer is available in this Reddit posting from 2019).

A related problem in the article was that of optical character recognition (OCR) in the scanning of articles or books, hence the title of the article, “Don’t Sit on a Phoenix,” which was a reference to what turned out to be an OCR error: the text should have read “don’t sit on a choenix.” You’ll have to read the article to find out the rest.

And 18 years later, the OCR problem has not gone away either. Having forked out £99 on OCR software to make some photocopied articles into searchable PDFs, I’ve just created a similar error myself. In an article on Ashton’s use of music, the PDF produced by the OCR software, accurate in every other respect, made one fairly disastrous misreading: it replaced the title of one of Ashton’s works, Scènes de Ballet, with “Seines de Ballet.” And before you ask, yes, the software was primed to recognize French, among other languages. There is no way to tell that the transformation happened by OCR, because the page looks exactly the same as in the original PDF; the error exists only in the searchable text. In the immortal translation provided by Google Translate, “Seines de Ballet” comes out as “Ballet Boobs,” which, I probably don’t need to tell you, is not a ballet by Sir Frederick Ashton, but it would have made a great example for that Dance Gazette article.

