Urdu is an Indo-Aryan language. Its script is derived from Arabic and Persian, but it has been modified to suit the particular requirements of Indo-Aryan phonology, notably aspiration, retroflexion and nasalization. The script is cursive and is written from right to left. A characteristic feature of this script is that short vowels are not usually indicated, though they can be shown by superscript or subscript marks when necessary. Another important feature is that, among the consonants, it has duplicate and triplicate letters which stand for the same sound. These have been retained because they are distinctive in the Arabic and Persian loans of Urdu.
The letters are of two types, connectors and non-connectors. Connectors join with the following letter in the word or syllable, while non-connectors cannot join with the following letter. However, all letters join with a preceding connector. The writing style normally used for Urdu handwriting or printing is called Nastalīq, i.e., beautiful. Since the script is cursive, most letters have three shapes: initial, when they occur at the beginning of a word; medial, when they occur in the middle; and final joined, when they occur at the end. The final unjoined shape is the same as the basic letter.
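The connector and non-connector rule described above can be sketched in code. The following Python example is only a minimal illustration, not a rendering engine. The set `NON_CONNECTORS` and the function name `positional_forms` are assumptions introduced here (the text above does not list the non-connecting letters), and the input is assumed to be a single word written without vowel marks.

```python
# Letters that never join to the letter after them.
# Assumed set for illustration; it is not given in the text above.
NON_CONNECTORS = set("اآدڈذرڑزژوے")


def positional_forms(word):
    """Label each letter of a word (in reading order) with the shape it takes."""
    forms = []
    for i, letter in enumerate(word):
        # A letter joins backwards only if the previous letter is a connector.
        joins_prev = i > 0 and word[i - 1] not in NON_CONNECTORS
        # A letter joins forwards only if it is a connector and is not the last letter.
        joins_next = i < len(word) - 1 and letter not in NON_CONNECTORS
        if joins_prev and joins_next:
            form = "medial"
        elif joins_prev:
            form = "final joined"
        elif joins_next:
            form = "initial"
        else:
            form = "unjoined"
        forms.append((letter, form))
    return forms


for letter, form in positional_forms("کتاب"):
    print(letter, form)
```

For کتاب (kitāb) this labels kāf as initial, tē as medial, alif as final joined and bē as unjoined, because alif does not connect to the letter that follows it.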
Sequence of Urdu Letters
The long vowels in Urdu are indicated by alif ( ا ), alif-mad ( آ ), vāo ( و ), choṭī yē ( ی ) and baṛī yē ( ے ). The superscript mad ( ٓ ) written over alif, e.g., آ , denotes long /ā/ at the beginning of a word. In medial and final positions, however, alif ( ا ) by itself stands for long /ā/. Yē ( ی ) and vāo ( و ), when occurring initially, stand for the semi-vowels /y/ and /v/ respectively, as in /yahā̃/ ( یہاں ) and /vahā̃/ ( وہاں ). In other environments vāo ( و ), choṭī yē ( ی ) and baṛī yē ( ے ) denote long vowels, as detailed below.
Vao and Ye
In Urdu, vāo ( و ) serves the purpose of four sounds as indicated below:
|/v/||/ō/||/ū/||/au/|
|وہاں||جو||جوٗ||جَو|
Initially, vāo, like yē, always stands for a semi-vowel, as in /vahā̃/. In medial and final positions, however, vāo stands for three different long vowel sounds: /ō/, /ū/ or /au/. The vāo for the /ū/ sound is shown with an ultā pēsh ( وٗ ); the vāo for the /au/ sound is shown with a preceding zabar ( ـَو ); whereas the unmarked vāo ( و ) stands for the sound /ō/, as shown in the examples above.
Similarly, yē ( ی / ے ) in Urdu serves the purpose of four sounds as indicated below:
Initially, yē, like vāo, always stands for a semi-vowel, as in /yahā̃/. In medial and final positions, however, yē stands for three different long vowel sounds: /ē/, /ī/ or /ai/. The yē for the /ī/ sound is shown with a khaṛā zēr ( میٖرا ); the yē for the /ai/ sound is shown with a preceding zabar ( مَیرا ); whereas the unmarked yē ( میرا ) stands for the sound /ē/, as shown in these examples.
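The reading rules for vāo and yē stated above can be written as a small decision procedure. The Python sketch below is illustrative only. The function name `read_vao_or_ye` and the `RULES` table are hypothetical, and the code assumes that the ultā pēsh or khaṛā zēr is typed directly after the vāo or yē, while a "preceding zabar" is typed on the consonant before it. It follows the rules exactly as stated and does not attempt to handle words in which a medial vāo or yē is a consonant.

```python
# Unicode code points from the Arabic block.
VAO = "\u0648"        # و
CHOTI_YE = "\u06CC"   # ی
ZABAR = "\u064E"      # short stroke above the preceding consonant
ULTA_PESH = "\u0657"  # inverted pēsh, assumed to sit on the vāo
KHARA_ZER = "\u0656"  # upright zēr, assumed to sit on the yē

# letter -> (semi-vowel, long-vowel mark, marked vowel, vowel after zabar, unmarked vowel)
RULES = {
    VAO: ("v", ULTA_PESH, "ū", "au", "ō"),
    CHOTI_YE: ("y", KHARA_ZER, "ī", "ai", "ē"),
}


def read_vao_or_ye(word, i):
    """Return the sound of the vāo or yē at index i of a fully marked word."""
    letter = word[i]
    if letter not in RULES:
        raise ValueError("expected vāo or choṭī yē at this index")
    semi_vowel, long_mark, marked_vowel, after_zabar, plain_vowel = RULES[letter]
    if i == 0:
        return semi_vowel    # word-initial: semi-vowel
    if i + 1 < len(word) and word[i + 1] == long_mark:
        return marked_vowel  # ultā pēsh or khaṛā zēr on the letter
    if word[i - 1] == ZABAR:
        return after_zabar   # zabar on the preceding consonant
    return plain_vowel       # unmarked


# jīm + vāo with no marks
print(read_vao_or_ye("\u062C\u0648", 1))
```

Keeping the two letters in one table makes the parallel between vāo and yē explicit: both have a semi-vowel reading initially and three marked or unmarked vowel readings elsewhere.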
The short vowels in Urdu are indicated by superscript or subscript marks, as shown below:
A short slanting stroke above a consonant ( ـَ ) is called 'zabar'. It denotes a following /a/.
A short slanting stroke below a consonant ( ـِ ) is called 'zēr'. It denotes a following /i/.
A small comma-like curl above a consonant ( ـُ ) is called 'pēsh'. It denotes a following /u/.
Alif ( ا ) at the beginning of a word or a syllable denotes that the word or syllable begins with a vowel. The particular short vowel can be indicated by zēr, zabar or pēsh, e.g., اِس /is/, اَب /ab/, اُن /un/. However, short vowel signs are used only when necessary; the general practice is that Urdu readers read their language without short vowel marks.
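In digital text, zabar, zēr and pēsh are separate combining characters that follow the letter they mark. The short Python sketch below shows this under that assumption. The names `SHORT_VOWEL_MARKS`, `strip_short_vowels` and `initial_short_vowel` are hypothetical helpers introduced for illustration. It reads the short vowel on a word-initial alif and removes the marks to give the unmarked spelling that readers normally see.

```python
# Combining marks for the three short vowels (Unicode Arabic block).
SHORT_VOWEL_MARKS = {
    "\u064E": ("zabar", "a"),
    "\u0650": ("zēr", "i"),
    "\u064F": ("pēsh", "u"),
}

ALIF = "\u0627"  # ا


def strip_short_vowels(text):
    """Remove zabar, zēr and pēsh, leaving the usual unmarked spelling."""
    return "".join(ch for ch in text if ch not in SHORT_VOWEL_MARKS)


def initial_short_vowel(word):
    """Return the short vowel marked on a word-initial alif, or None if unmarked."""
    if len(word) >= 2 and word[0] == ALIF and word[1] in SHORT_VOWEL_MARKS:
        return SHORT_VOWEL_MARKS[word[1]][1]
    return None


marked = "\u0627\u0650\u0633"  # alif + zēr + sīn
print(initial_short_vowel(marked))
print(strip_short_vowels(marked) == "\u0627\u0633")
```

Once the marks are stripped, words that begin with different short vowels can share the same written shape, which is why the marks are added only when a reading would otherwise be unclear.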
Nūn-g͟hunna, dō cashmī hē and hamza
The following three letters, though traditionally not listed in the Urdu alphabet, are very important to learn:
- nūn-g͟hunna ( ں ), i.e., nūn without the dot, stands for nasalization of a vowel; medial nasalization, however, is indicated with the full nūn ( ن ), i.e., with the dot above, as in /jāū̃/ جوں , /ū̃ṭ/ اونٹ .
- dō cashmī hē ( ھ ) is a distinctive feature of Urdu and represents aspiration, such as in /ghōṛā/ گھوڑا , /thōṛā/ تھوڑا .
- hamza ( ء ) is a glottal stop in Arabic, but in Urdu, generally, it is an orthographic mark used as a superscript denoting the occurrence of two vowels in a word. Except for the vocalics with which it occurs, it has no phonetic value in Urdu.
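The roles of dō cashmī hē and nūn-g͟hunna described above can be illustrated with a short Python sketch. The function names `split_units` and `ends_nasalized` are hypothetical. The code assumes that dō cashmī hē always belongs to the consonant immediately before it and that nūn-g͟hunna is checked only in word-final position.

```python
DO_CASHMI_HE = "\u06BE"  # ھ, marks aspiration of the preceding consonant
NUN_GHUNNA = "\u06BA"    # ں, dotless nūn marking nasalization


def split_units(word):
    """Split a word into letters, keeping consonant + ھ together as one unit."""
    units = []
    for ch in word:
        if ch == DO_CASHMI_HE and units:
            units[-1] += ch  # attach the aspiration marker to its consonant
        else:
            units.append(ch)
    return units


def ends_nasalized(word):
    """True if the word ends in nūn-g͟hunna, i.e. in a nasalized vowel."""
    return word.endswith(NUN_GHUNNA)


print(split_units("گھوڑا"))
print(ends_nasalized("وہاں"))
```

Treating the aspirate as a single unit reflects the description above: گھوڑا splits into four units, the first of which is gāf together with dō cashmī hē.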
The following groups of letters stand for the same sound in Urdu:
- tē ( ت ) and tōe ( ط ) both represent /t/.
- sē ( ث ), sīn ( س ) and swād ( ص ) all represent /s/.
- choṭī hē ( ه ) and baṛī hē ( ح ) both represent /h/.
- zāl ( ذ ), zē ( ز ), zwād ( ض ) and zōe ( ظ ) all represent /z/.
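The groups listed above amount to a many-to-one mapping from letters to sounds. The following Python sketch encodes exactly those four groups. The names `SAME_SOUND`, `sound_of` and `same_sound` are illustrative, and the /h/ group is assumed to include both common encodings of choṭī hē (ه and ہ).

```python
# Sound -> letters that represent it, taken from the four groups listed above.
SAME_SOUND = {
    "t": "تط",
    "s": "ثسص",
    "h": "هہح",  # both encodings of choṭī hē are included (assumption)
    "z": "ذزضظ",
}

# Invert the table so that each letter points to its sound.
LETTER_TO_SOUND = {
    letter: sound
    for sound, letters in SAME_SOUND.items()
    for letter in letters
}


def sound_of(letter):
    """Return the shared sound for a letter in one of the groups, else None."""
    return LETTER_TO_SOUND.get(letter)


def same_sound(a, b):
    """True if two letters belong to the same group of duplicate letters."""
    sound = sound_of(a)
    return sound is not None and sound == sound_of(b)


print(same_sound("ث", "ص"))
print(sound_of("ظ"))
```

A lookup of this kind only goes from spelling to sound. The reverse direction is ambiguous, which is why the spelling of Arabic and Persian loans has to be learned word by word.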
ain ( ع )
The consonant ain ( ع ), which in Arabic is a pharyngeal fricative, is generally pronounced in Urdu as a vowel, combining with the other vowels in the word. It normally merges with the sound of a vowel character or vowel marker, such as:
- Initially: aurat ( عورت ), izzat ( عزت )
- Medially: malūm ( معلوم ), bād ( بعد ), shōla ( شعلہ )
- Finally: jama ( جمع ), mauzū ( موضوع )
he ( ه )
He ( ه ) stands for /h/, but in many cases in the final position it is pronounced softly and denotes a short vowel, e.g.,
But where a final /h/ is to be pronounced, it is shown with a hook, e.g.,
Silent vāo ( و )
Vāo ( و ) following kh ( خ ) occurs only in a few Persian and Arabic loan words where vāo ( و ) is not pronounced. It is marked with a subscript below the vāo, thus:
- Initially, choṭī hē ( ه ) + alif ( ا ) is written with a double hook, e.g.,
- kāf ( ک ) or gāf ( گ ) + alif ( ا ) are written with a small loop, e.g.,
- kāf ( ک ) or gāf ( گ ) before lām are also written similarly, e.g.,
- In the environment /i/ + semi-vowel /y/ + vowel, hamza ( ء ) is not written and instead the two dots of medial /y/ are indicated:
|cāhi + yē||چاہیے|
|dīji + yē||دیجیے|
- hamza ( ء ) is also not indicated in the environment /u/ + /ā/, e.g.,
The Urdu script here is reproduced from the book Let's Learn Urdu by Gopi Chand Narang, published by the National Council for Promotion of Urdu Language, New Delhi, with their permission.