David Pierce | Matematik | M.S.G.S.Ü.

On transcription

This article is about my preparation of a webpage of two connected excerpts, in English and Turkish, from Robert Pirsig's Zen and the Art of Motorcycle Maintenance. As somebody who frequently reads old books, even two-thousand-year-old books, I am conscious of the difficulty of knowing whether what I am reading is really what the author intended. Editors, even (or especially) amateur online editors, should avoid compounding this difficulty. This is why I provide the present explanation.

My own sensitivity to typographical matters has been developed by use of the LaTeX typesetting program. More people could fruitfully use this program for writing of all kinds, in place of Microsoft Word.

The English transcription

The English text that I present from Pirsig's book has been cut and pasted from a pdf file found somewhere on the web. Unfortunately that file does not explain its origins. It does begin and end with the phrase, “back to the bookshelf,” underlined and in red as if it were a link. Thus the file may be from somebody's online collection of books; but that person is anonymous in the file itself.

I myself possess Pirsig's book in print in two forms.

  1. I have the Bantam Books paperback “New Age Edition” of October 1981. On its copyright page, this edition gives a complete “Printing History,” in the form of a list. The original William Morrow edition of April, 1974, had five printings. The first Bantam edition was April, 1975; there were 26 printings of this, the last being November, 1980. The New Age Edition was October, 1981, but this was still a Bantam publication and is implicitly counted as the 27th printing; for, next on the list is the 28th printing of May, 1982. This printing is the last in the Printing History. However, at the bottom of the whole copyright page, below the centered words “PRINTED IN THE UNITED STATES OF AMERICA,” there is a horizontal list of numerals, from 37 down to 30. If the numeral 29 were added at the end, then the list would be centered as well. I suppose the 29 was deleted to indicate that my copy of the book is from the 30th printing.
  2. I have the William Morrow hardback “25th Anniversary Edition” of 1999. Its copyright page has no printing or edition history. It just assigns copyright to Robert M. Pirsig in both 1974 and 1999; and at the bottom of the page, centered below the publisher's web address of www.williammorrow.com, there is a centered list of numerals from 40 down to 31. So perhaps I have a thirty-first printing; but of which edition or editions? The dust jacket labels the book as being the 25th Anniversary Edition, and after the epigraph from Plato's Phaedrus, the book itself begins with what is headed as “Introduction to the Twenty-fifth Anniversary Edition.” It seems unlikely that a hardback edition would go through 31 printings.
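The reading of the two rows of numerals can be put mechanically. What follows is only a sketch in Python, assuming the common publishing convention that one numeral is deleted from the row at each new printing, so that the lowest numeral remaining names the printing. Neither copyright page states this convention, and the function name is hypothetical.

```python
def printing_from_number_line(numerals):
    """Return the printing implied by a printer's row of numerals.

    Assumption (a common convention, not stated on the copyright
    pages themselves): one numeral is deleted for each new printing,
    so the lowest numeral still present names the printing.
    """
    if not numerals:
        raise ValueError("the row of numerals is empty")
    return min(numerals)


# The Bantam paperback: numerals from 37 down to 30.
bantam = list(range(37, 29, -1))
# The William Morrow hardback: numerals from 40 down to 31.
morrow = list(range(40, 30, -1))

print(printing_from_number_line(bantam))  # 30
print(printing_from_number_line(morrow))  # 31
```

The sketch says nothing about which edition the printings are counted from; that question remains open for the hardback.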

The transcription of Pirsig’s book found in the pdf file includes the author’s 1984 Afterword, though not the Introduction to the Twenty-fifth Anniversary Edition. Presumably then the pdf file is based on a printing between 1984 and 1999.

In the 25th Anniversary Edition, as the Introduction points out, Phaedrus’s own words are set in sans-serif type, to prevent certain misunderstandings of readers of the earlier editions. The online pdf file does not reflect this typographical innovation. Unfortunately this by itself is not good evidence that the transcriber was working with an earlier edition; for it is not clear that the transcriber would have bothered to identify and implement the changes of font shown in the 25th Anniversary Edition. The pdf transcript does not even make the typographical distinction between roman and italic type. Nor does it distinguish between opening (“) and closing (”) quotation marks: it just uses the symmetrical quotation mark (") found on an old-fashioned typewriter. Perhaps the transcriber used an OCR program that could not make these distinctions.

In this case, the transcriber should have made the distinctions by hand. No doubt this was thought to be too much trouble. However, the transcriber did trouble to make some unjustified changes to the original text. He (I imagine it is a “he”) has presumed to use the ligature æ in Phaedrus’s name, setting this as “Phædrus”; and in the Afterword, where the author's son Chris is quoted on the subject of his 23rd birthday (which he did not live to see), the ordinal suffix is raised, as in “23ʳᵈ.” Both of these innovations, the ligature and the raised suffix, are unwarranted; for the original publisher surely had the means to implement them, if they had been desired.
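A transcription can at least be searched for characters that the print original does not use in this way. The following Python sketch counts two such characters, the typewriter quotation mark and the ligature æ. The set of suspect characters is an assumption chosen for illustration, and the sample string is made up, not quoted from the book. A raised suffix is a matter of formatting rather than of characters, so a search of this kind would not find it.

```python
from collections import Counter

# Characters to look for. This set is an illustrative assumption;
# extend it to suit the text being checked.
SUSPECT = {
    '"': "straight double quotation mark",
    "\u00e6": "ligature ae (lower case)",
    "\u00c6": "ligature AE (upper case)",
}


def flag_suspect_characters(text):
    """Count each suspect character that occurs in the text."""
    counts = Counter(ch for ch in text if ch in SUSPECT)
    return {SUSPECT[ch]: n for ch, n in counts.items()}


# A made-up sample, not a quotation from the book.
sample = 'Ph\u00e6drus said, "Quality."'
for description, n in flag_suspect_characters(sample).items():
    print(f"{n} x {description}")
```

Such a count only locates the places to examine; deciding which quotation marks open and which close still has to be done by eye.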

There are some questions of typographical style that editors may settle as they please. These include questions of how to punctuate names, but not of how to spell them. The writer of the Arrant Pedantry blog argues persuasively that you can write his name as Jonathon Owen, or (in an index) as Owen, Jonathon; you can write Jonathon R. Owen or Jonathon R Owen; but you must not write Jonathan Owens.

It may be argued whether Phædrus is a different spelling or a different punctuation of Phaedrus. I think it is a different spelling. The ligature æ is not formed merely for the sake of visual harmony.

Back in 1926, in A Dictionary of Modern English Usage, H. W. Fowler sensibly condemned the æ and œ ligatures because

Writing Phaedrus’s name as Phædrus is like writing your rock band’s name with the metal umlaut of Blue Öyster Cult. It is worse, because you can write your own band’s name as you like, but you should not feel free to change somebody else’s name. I dwell on the point because I myself might once have written Phaedrus in the mysterious form Phædrus, if I had had the capability. At least, I might have done this at age fifteen.

I noted that Pirsig's transcriber did not bother to reproduce Pirsig's italics. It may be debated whether a writer ought to write in such a way that italics are not needed. Fowler himself condemns excessive use of italics, especially to emphasize whole sentences. Apparently this bad habit developed during what was for Fowler the last war, but is for us only the First World War. Italics do however have their place. In any case, Pirsig did use them, and a proper transcription of his work ought to respect this. Perhaps the pdf transcription was made with the help of an OCR program that was itself insensitive to italics. But anybody with the capability of raising the ordinal ending in “23rd birthday” to create “23ʳᵈ birthday” ought to be able to set text in italic, after locating by visual inspection the text that should be in italic, as I have done for my own presentation of the text.

The pdf transcription reflects a minor typographical change that must have been made in the 1984 edition; I make note of it because, if this small change was made in the print edition, larger changes may have been made as well. An unnamed reviewer from the Los Angeles Times Calendar quoted in the opening promotional pages of my paperback edition says, “This book should live a long life and go through many editions, and I hope Pirsig tinkers with it”; how much tinkering did Pirsig end up doing? An unhyphenated instance of the phrase “degree and grading system” on page 174 of the earlier edition (in the sentence beginning “Phaedrus’ argument for the abolition of the degree-and-grading system”) later became “degree-and-grading system.” (An unhyphenated instance in the same paragraph remains in the later edition: “‘Of course you can’t eliminate the degree and grading system. After all, that’s what we’re here for.’”)
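A change as small as this is easy to miss by eye, but a word-by-word comparison finds it at once. The sketch below uses Python's standard difflib module on the opening words of the sentence in question. It assumes that the two editions differ here only in the hyphens, and it writes the apostrophe in the typewriter form for simplicity; the function name is hypothetical.

```python
import difflib

# Opening words of the sentence, on the assumption that the two
# editions differ only in the hyphens.
earlier = "Phaedrus' argument for the abolition of the degree and grading system"
later = "Phaedrus' argument for the abolition of the degree-and-grading system"


def changed_words(a, b):
    """Return (old, new) pairs of word runs that differ between a and b."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(None, a_words, b_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a_words[i1:i2]), " ".join(b_words[j1:j2])))
    return changes


print(changed_words(earlier, later))
# [('degree and grading', 'degree-and-grading')]
```

Run over whole chapters of two electronic texts, the same function would list every place where they disagree, which is more than a reading of both can be trusted to do.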

I have not compared my own electronic text line by line with the printed original. I have just read both, without noticing discrepancies.

The Turkish transcription

I obtained the Turkish text by

  1. photocopying the relevant pages of the print book,
  2. photographing those photocopies to create jpg files,
  3. cropping each photo twice in order to isolate each original page of text,
  4. feeding each page through an online OCR program,
  5. matching each paragraph of OCR Turkish text with the corresponding English paragraph and correcting it.

Had I waited until I had access to a scanner, I could have combined the first two steps. When inadequacies in the original photocopying made the OCR text poor, I just had to retype it. I could just have typed up the whole text, but typing in Turkish is a pain because (at least by the method that I learned for English) five letters, namely P, Ğ, Ü, Ş, and İ, along with four punctuation marks (the period, colon, semicolon, and comma), are all given to the right pinkie.
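The matching in the last step of the list above can be prepared mechanically, though the correcting cannot. Here is a minimal Python sketch, assuming that both texts separate paragraphs by blank lines and that the translation keeps the paragraphs in the same order. The function names are hypothetical, and the sample strings are placeholders, not text from the book or its translation.

```python
from itertools import zip_longest


def split_paragraphs(text):
    """Split text into paragraphs at blank lines, joining wrapped lines."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def pair_paragraphs(english, turkish):
    """Pair paragraphs by position; None marks a missing counterpart."""
    return list(zip_longest(split_paragraphs(english), split_paragraphs(turkish)))


# Placeholder strings only, not text from the book or its translation.
english = "First paragraph.\n\nSecond paragraph,\ncontinued.\n\nThird paragraph."
turkish = "Birinci paragraf.\n\nİkinci paragraf,\ndevamı."

for number, (en, tr) in enumerate(pair_paragraphs(english, turkish), start=1):
    status = "ok" if en and tr else "CHECK BY HAND"
    print(number, status, "|", en, "|", tr)
```

A paragraph left without a partner shows where the OCR program has merged or dropped paragraphs, and so where retyping may be needed.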

Last modified: Friday, 01 April 2016, 16:30:15 EEST