Phonetics generator

Conlangers and historical linguists have created many online tools to model sound changes in languages over time. Using these tools, it is possible to create open-source programs that generate phonetic transcriptions of Tibetan from input in Wylie or Tibetan Unicode. The generator on this page is designed to be used with the online SCA2 tool and takes Wylie input.

Existing tools such as the Tibetan phonetics generator on LotsawaHouse have many advantages over this one, such as the ability to parse multiple words or accept input in Tibetan Unicode. I created this generator because it’s totally open source and can be customized fairly easily, without needing to know Perl. Also, this generator shows tone at the level of the syllable, which existing tools don’t do.

This generator is in beta. Report any issues here. I intend to create phonetics generators for other dialects later on. If you feel brave, go ahead and play around with the code to see if you can simplify any of the rules I used, or if you can tackle them differently. SCA2 can’t tell which position a letter takes in a syllable, so you’ll need to find workarounds for that.

Central Tibetan Phonetics generator

Try it on SCA2!

1) Download the following file:

2) Open SCA2, click “Browse…”, and open the file.

3) Then, put Tibetan text (in Wylie) into the “input” field, and click “Apply”!

Note that the tool can only handle single syllables, not connected speech. So, for example, “rdo rje” will give “do je” instead of “do rje” or “dorje”, and the second syllable will not be marked as high tone either. The tool also won’t notice or flag invalid syllables, and will simply try to process them as if they were valid syllables.
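The single-syllable behaviour can be pictured with a minimal Python sketch. This is not the generator's actual code: `transcribe_syllable`, `transcribe_text`, and the lookup table are hypothetical stand-ins, filled only with the two values quoted above, and passing unknown syllables through unchanged is an assumption made for illustration.

```python
# Hypothetical stand-in for the real SCA2 rule set. Only the two syllables
# from the "rdo rje" example are known; returning anything else unchanged
# is an assumption for this sketch, not the tool's documented behaviour.
KNOWN_SYLLABLES = {"rdo": "do", "rje": "je"}


def transcribe_syllable(wylie_syllable: str) -> str:
    """Transcribe one syllable in isolation, with no knowledge of its neighbours."""
    return KNOWN_SYLLABLES.get(wylie_syllable, wylie_syllable)


def transcribe_text(wylie_text: str) -> str:
    """Split on whitespace and transcribe each syllable independently."""
    return " ".join(transcribe_syllable(s) for s in wylie_text.split())


print(transcribe_text("rdo rje"))  # do je
```

Because each syllable is handled on its own, nothing in the second syllable can depend on the first, which is why effects of connected speech are lost.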

Screenshot:

Comparing systems of Tibetan phonetics

Many existing systems of Tibetan phonetics have serious issues that cause people to misunderstand Tibetan pronunciation.

Chief among these are:

  • lack of tone
  • transcribing the suffix letters ག་ and བ་ as “g” and “b”

Let’s address these topics one at a time.

Tone

Existing phonetics generators are based on Central Tibetan pronunciation, which uses tones to distinguish between different words. However, as far as I am aware, no existing Tibetan phonetics generator shows the tones of Tibetan syllables.

For example, the word for “to return” (ལོག་) is low-tone, and the word for “electricity” (གློག་) is high-tone, but otherwise they are pronounced the same way. However, both words are transcribed as “lok” in all of the following Tibetan phonetics systems:

  • LotsawaHouse Phonetics
  • Rigpa Phonetics
  • Padmakara’s International Simplified Phonetics
  • Samye Translations
  • THL Simplified Phonemic Transcription

You can enter these words into the Tibetan phonetics generator on LotsawaHouse to see for yourself.

However, in my phonetics generator, ལོག་ is transcribed “lok”, and གློག་ is transcribed “lōk”, with a macron to show high tone. As far as I am aware, it is the only phonetics generator that makes this crucial distinction in tone.
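The macron convention itself is easy to illustrate in Python. The sketch below only shows how a high-tone mark could be placed on a transcription. Working out whether a syllable is high-tone from its spelling is left to the caller, and the function name `mark_high_tone` and the choice to mark the first vowel are assumptions for this example, not the generator's actual rules.

```python
import unicodedata

VOWELS = "aeiou"
COMBINING_MACRON = "\u0304"


def mark_high_tone(syllable: str, high_tone: bool) -> str:
    """Put a macron on the first vowel when the syllable is high-tone."""
    if not high_tone:
        return syllable
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            # NFC composes vowel + combining macron into one character.
            marked = unicodedata.normalize("NFC", ch + COMBINING_MACRON)
            return syllable[:i] + marked + syllable[i + 1:]
    return syllable  # no vowel found: leave the syllable unchanged


print(mark_high_tone("lok", high_tone=False))  # lok
print(mark_high_tone("lok", high_tone=True))   # lōk
```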

The suffix letters ག་ and བ་

Existing phonetics generators often transcribe the suffix letter བ་ as “b” even though it is actually pronounced “p”. For example, LotsawaHouse Phonetics and Rigpa Phonetics both render སློབ་ཕྲུག་ as “lob truk”, and Padmakara’s International Simplified Phonetics renders it as “lobtruk”.

However, Samye Translations and THL Simplified Phonemic Transcription both render it correctly as “lop truk”.

The suffix letter ག་ is often transcribed as “g” instead of “k” in colloquial transcriptions, such as “Gelug” (which is actually pronounced “Geluk”) or “Dzogchen” (which is actually pronounced “Dzokchen”). These are also incorrect.
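As a rough illustration of the correction argued for here, the following Python sketch rewrites a syllable-final “b” as “p” and “g” as “k” in a phonetic rendering. It is a sketch under stated assumptions rather than the generator's rule set: the function names are hypothetical, the input is assumed to be already split into syllables by the caller, and a final “ng” is left alone on the assumption that it stands for a single nasal sound.

```python
FINAL_DEVOICING = {"b": "p", "g": "k"}


def devoice_final(syllable: str) -> str:
    """Rewrite a syllable-final b or g as p or k."""
    if syllable.endswith("ng"):
        return syllable  # assumed to be one nasal sound, not n + g
    if syllable and syllable[-1] in FINAL_DEVOICING:
        return syllable[:-1] + FINAL_DEVOICING[syllable[-1]]
    return syllable


def devoice_word(syllables: list[str]) -> list[str]:
    """Apply the final-consonant fix to each syllable of a word."""
    return [devoice_final(s) for s in syllables]


print(devoice_word(["lob", "truk"]))   # ['lop', 'truk']
print(devoice_word(["dzog", "chen"]))  # ['dzok', 'chen']
print(devoice_word(["ge", "lug"]))     # ['ge', 'luk']
```

The syllable split matters: in “Dzogchen” the “g” is in the middle of the word but at the end of its syllable, so a rule that only looks at the end of the whole word would miss it.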