HTML Monier-Williams Lexicon: Use of H-K and UTF-8 translit.
Last modified:
07/11/2008 03:26 AM
[Ex H-Buddhism mailing list]
Dear Readers, I am forwarding my response to Madhav Deshpande. His question over the use of Harvard-Kyoto and UTF-8 transliteration in the HTML M-W is often raised. Best regards, Richard MAHONEY -----Forwarded Message----- From: Richard MAHONEY <r.mahoney@ICONZ.CO.NZ> To: INDOLOGY@liverpool.ac.uk Subject: Re: HTML Monier-Williams Sanskrit-English Dictionary (Version 0.3 - Release Candidate 1) Date: Thu, 10 Jul 2008 16:50:55 +1200 Dear Madhav, On Wed, 2008-07-09 at 00:03, Deshpande, Madhav wrote: > Hi Richard, > > Thanks for making available this version of the MW HTML. As I > downloaded and looked at it, I see that the main word entrees are in > Harvard-Kyoto notation while the names of cited texts appear in full > diacritics (utf8?). Is that by design? Apologies for the delay in answering you. I've been a little bogged down. The response to my note was a little greater than anticipated ... Yes, using H-K rather than UTF-8 for the headwords and embedded Skt terms was intentional. I've uploaded a screen shot of a typical page viewed with the web browser Opera. The image is attached at the tail end of the following page (`zams-opera.png'): HTML Monier-Williams Sanskrit-English Dictionary (Version 0.3 - RC 1) http://indica-et-buddhica.org/sections/news/repositorium/html-monier-williams The primary and secondary headwords for `zaMs', and all the Skt terms appearing within the body of the definition, are consistently transliterated using H-K. They are also dark red for emphasis. I've done this as most users, myself included, appear to prefer to search for headwords using H-K. That said, I am also aware that there are a good number of others who prefer UTF-8 so to suit them I have written code that should be reasonably easy to modify. If you look at the HTML source you will see that all Skt terms are marked up in this manner: <span class="s">zaMs</span> It shouldn't be too difficult to write a script (Perl?) to convert anything appearing between this class of span into UTF-8. If someone would like to write such a thing then I would be happy to run it over the final release so that both a Harvard-Kyoto and UTF-8 version is available for download. Kind regards, Richard -- Richard MAHONEY | internet: http://indica-et-buddhica.org/ Littledene | telephone/telefax (man.): +64 3 312 1699 Bay Road | cellular: +64 27 482 9986 OXFORD, NZ | email: r.mahoney@indica-et-buddhica.org ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Indica et Buddhica: Materials for Indology and Buddhology Scholia: http://scholia.indica-et-buddhica.org/ Tabulae: http://tabulae.indica-et-buddhica.org/ |