Mandarin Diction Guide

The official romanization system used for Mandarin in China is Pinyin, and I provide a Pinyin transliteration (as well as the original Chinese characters) on the text page of my Mandarin songs. This system is not very intuitive for English speakers, however, so I make some modifications in the score that I hope will make learning these pieces easier. 


a is usually pronounced like the [a] in father, but in certain cases it is closer to [ɛ] as in bed. These include the combination ‑ian and, after y, j, q, and x, the combination ‑uan. In these instances, I replace a with ɛ in the score.

e alone is pronounced [ɤ], and I use this symbol in the score. Before i, it is closer to [e] as in egg. Before n and ng it is more like [ə] as in away, so I use ə in the score.* After i, it is like the [ɛ] in bed, and I use ɛ in the score.

* Some people pronounce ‑en more like [ɛn], and this is fine with me too if that is what your ensemble prefers.

i is usually like the [i] in bee, but after z, c, s, zh, ch, sh, and r it is [ɤ] and I use that symbol.

o alone (which only occurs after b, p, m, and f) is actually a diphthong, [uo], so I write that in the score. In other cases (-ao, -ou, and -ong, for example), it is close to [o] as in core.

u is usually pronounced like [u] as in food. After y, j, q, and x it is [y] like the German ü, and I replace it with ü in the score. 
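
For anyone preparing score text by computer, the i and u rules above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name respell_i_and_u and the two constant names are hypothetical, the input is assumed to be a single lowercase Pinyin syllable with tone marks removed, and the a, e, and o rules are deliberately left out.

```python
# Initials after which the guide writes "i" as ɤ (two-letter initials listed first).
I_RULE_INITIALS = ("zh", "ch", "sh", "z", "c", "s", "r")
# Initials after which the guide writes "u" as ü.
U_RULE_INITIALS = ("y", "j", "q", "x")


def respell_i_and_u(syllable: str) -> str:
    """Apply only the i and u rules to one toneless, lowercase Pinyin syllable.

    Hypothetical helper: every other rule in the guide is ignored here.
    """
    # i rule: an initial from the list followed by a bare "i" (e.g. "shi").
    for initial in I_RULE_INITIALS:
        if syllable == initial + "i":
            return initial + "ɤ"
    # u rule: "u" written directly after y, j, q, or x becomes "ü".
    if len(syllable) >= 2 and syllable[0] in U_RULE_INITIALS and syllable[1] == "u":
        return syllable[0] + "ü" + syllable[2:]
    # Anything else is returned unchanged.
    return syllable
```

With this sketch, "shi" becomes "shɤ" and "ju" becomes "jü", while "xi", "lu", and "you" come back unchanged because neither rule applies to them.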

There are also a few “abbreviations”:

  • ui is pronounced [uei], like the way in away.
  • iu is pronounced [iou], like the yo in yodeling.
  • un is pronounced [uɛn] (except after y, j, q, and x, when it is [yn]).

In these instances, I preserve the Pinyin spelling on the text page but write out the phonetic pronunciation in the score. 
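
The three abbreviations above amount to a small substitution rule, which could be sketched in Python along these lines. The function name expand_abbreviation is hypothetical, the input is assumed to be one lowercase Pinyin syllable without tone marks, and the spelling "ün" for the [yn] case is an assumption based on the u rule rather than something the list states.

```python
# Initials after which written "un" stands for [yn] rather than [uɛn].
FRONT_INITIALS = ("y", "j", "q", "x")


def expand_abbreviation(syllable: str) -> str:
    """Write out ui, iu, and un in full for one toneless Pinyin syllable.

    Hypothetical helper: syllables with other endings are returned unchanged.
    """
    if syllable.endswith("ui"):
        return syllable[:-2] + "uei"
    if syllable.endswith("iu"):
        return syllable[:-2] + "iou"
    if syllable.endswith("un"):
        # Assumed score spelling for [yn] after y, j, q, and x.
        if syllable.startswith(FRONT_INITIALS):
            return syllable[:-2] + "ün"
        return syllable[:-2] + "uɛn"
    return syllable
```

For example, "dui" would be written "duei", "liu" would be written "liou", and "chun" would be written "chuɛn", while "jun" falls under the exception and becomes "jün".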

For diphthongs, I prefer that singers emphasize the more open vowel. For example, ua should be sung like the wa in wander, without remaining very long on [u]; ia is like the ya in yard, without very much emphasis on the [i].


Mandarin consonants can be difficult to master, and I recommend working with a coach whenever possible. But for beginners who do not have access to a coach, I think the following pronunciations are acceptable:

  • j and zh are somewhat like the [dʒ] in jump, but with j the lips are pulled further back.
  • q and ch are close to the [tʃ] in church, but with q the lips are pulled further back.
  • x and sh are close to the [ʃ] in sheep, but with x the lips are pulled further back and the middle part of the tongue presses against the roof of the mouth.
  • c is like the [ts] in cats. z is similar, but not aspirated; I think of it more like the [dz] in kids.
  • h is similar to the [x] in Bach.
  • r (as an initial consonant) can be pronounced in a variety of ways, but my preference is for [ʐ], similar to the [ʒ] in measure. (I’m leaving discussion of erhua for another time, as my current Mandarin pieces do not include this sound.)
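
The consonant list above can also be kept as a small lookup table, for instance when generating a pronunciation cheat sheet for a choir. The following Python sketch is only illustrative: the names APPROXIMATIONS and approximate_initial are hypothetical, the input is assumed to be a lowercase Pinyin syllable without tone marks, and the values are the beginner approximations from the list, not precise Mandarin phonetics.

```python
from typing import Optional

# Beginner approximations from the list above, keyed by Pinyin initial.
APPROXIMATIONS = {
    "j": "dʒ", "zh": "dʒ",
    "q": "tʃ", "ch": "tʃ",
    "x": "ʃ", "sh": "ʃ",
    "c": "ts", "z": "dz",
    "h": "x",
    "r": "ʐ",
}


def approximate_initial(syllable: str) -> Optional[str]:
    """Return the listed approximation for the syllable's initial, if any."""
    # Try two-letter initials (zh, ch, sh) before one-letter ones.
    for length in (2, 1):
        ipa = APPROXIMATIONS.get(syllable[:length])
        if ipa is not None:
            return ipa
    # Initials the list does not cover (b, m, l, and so on) yield None.
    return None
```

Checking two-letter initials first matters because "zh", "ch", and "sh" would otherwise be read as "z", "c", and "s". The table cannot express the distinctions drawn in the list itself (lip position for j, q, and x, or aspiration for c versus z), so it is a reminder of the nearest English sound rather than a substitute for a coach.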