The romanizations used on the Spice Pages are very close to both the common scientific transliteration (IAST) of Sanskrit and the enlarged ISO 15919 standard. Pure ASCII schemes like Harvard-Kyoto and ITRANS are very different.
Retroflex consonants of Sanskrit are marked by a dot below the letter (Ṭ, Ḍ, Ṣ, Ṇ). The same dot is used for the retroflex variety of L (Ḷ). For R, however, I have to use two different special characters: North Indian ड़ (DDDHA, equivalent to DDA with Nukta) is rendered with a dot (Ṛ); its equivalent letter in Bengali (ড়) and Oriya (ଡ଼) is called RRA, not DDDHA. South Indian RRA ऱ (e.g., Tamil ற) is romanized with an underline, Ṟ.
The Devanagari eyelash ra, which is characteristic of Marathi, has no codepoint of its own; instead, it is coded as RA+VIRAMA+ZWJ र् in Unicode 2.0, but Unicode 3.0 interprets the character as half form of RRA (and consequently writes RRA+VIRAMA ऱ्). For transliteration, I use Ř.
Vocalic R and L are usually marked with a dot in scientific Sanskrit romanization (IAST). However, since the dot is already used for retroflex consonants in ISO-15919, I have to use a ring instead: R̥ and L̥. The long versions have an additional macron, R̥̄ and L̥̄. These characters might be poorly supported on some browsers (but fortunately, all except R̥ are rare or even nonexistent in Sanskrit, and do not appear at all in the modern languages except in learned Sanskrit loans).
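The ring-below letters have no precomposed code points, which is one likely reason for the patchy support. As a minimal sketch (the helper name `vocalic` is made up for illustration), they can be built from a base letter plus combining marks:

```python
import unicodedata

RING_BELOW = "\u0325"  # COMBINING RING BELOW
MACRON = "\u0304"      # COMBINING MACRON

def vocalic(base, long=False):
    """Build a vocalic R or L from a base letter plus combining marks."""
    # Canonical ordering puts the ring (combining class 220)
    # before the macron (combining class 230).
    text = base + RING_BELOW + (MACRON if long else "")
    # NFC leaves the sequence decomposed, since no precomposed letter exists.
    return unicodedata.normalize("NFC", text)

for s in (vocalic("R"), vocalic("R", long=True), vocalic("L"), vocalic("L", long=True)):
    print(s, [unicodedata.name(c) for c in s])
```

Each letter therefore stays a sequence of two or three code points, and the renderer and font have to stack the marks themselves.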
The velar and palatal nasals are written Ṅ and Ñ, respectively. Note that in most Indic alphabets, both are frequently written as anusvara.
The implicit vowel is always rendered A, even if it is pronounced more like O. Most North Indian languages often write this vowel even if it is not spoken, and in such cases my transliteration and sorting are often slightly inconsistent.
Superscript H following a consonant (ʰ) marks aspirated sounds (which exist in all languages save Tamil and Sinhala).
Superscript N preceding a voiced stop (ⁿ) indicates prenasalized stops (this feature is restricted to Sinhala).
Long vowels (Ā, Ī, Ū) are distinguished by a macron. E and O are always long in Aryan languages, and thus not specifically marked (as in IAST, but not in ISO-15919). I write the Dravidian short E and O as Ĕ and Ŏ, respectively. Some North Indian languages also have a short E and O for writing English loanwords (transliterated Ê and Ô), but these do not appear in this index (example: Hindi kofi [कॉफ़ि]).
Sinhala has two additional vowels, which I transliterate as Æ and Ǣ, that is, as an AE ligature with a macron to indicate length. For sorting, these had to be conflated with A and Ā; I don’t know where they are put in the Sinhala collating sequence.
Malayalam script has a few unique features that make it difficult to integrate into this index. The chillu forms are pure consonants appearing syllable-finally and cannot be distinguished from intra-syllable consonant clusters; the word-final short U of Malayalam is written with a virama and is represented as a Schwa ə here.
The Gurmukhi and Devanagari letters KHHA and GHHA represent khah خ and ghain غ, respectively, in some Arabic loanwords. ISO-15919 suggests K͟H (KH with an underline) and Ġ as romanization. The former is in my opinion a poor choice, as it is not easily representable in Unicode. Thus, I use Ḫ and Ġ as in Arabic; however, neither of them appears in any spice name I could find. Also, the letters QA and ZA are used only for Arabic or Persian terms, and are consequently rare. FA, on the other hand, appears in a few native words plus Arabic, Persian and English loans.
The Tamil letters LLLA ழ (also in Malayalam ഴ) and NNNA ன are represented by L and N with an underline, in accord with the scientific transliteration of Tamil: Ḻ and Ṉ, respectively.
Additionally, Devanagari transliterations are given for languages that have another native alphabet. These Devanagarizations use a couple of special signs to accommodate sounds alien to most Aryan languages. In general, there is a 1:1 correspondence between Devanagari and native letters; as an exception, the anusvara has been used in Devanagari throughout even if the native alphabet expresses nasalization by means of a different character, and a few native signs not representable in Devanagari had to be replaced by near-matches. Since the Devanagari transliterations are produced programmatically by sed, some of them might be systematically flawed.
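The actual sed rules are not shown here. The sketch below uses Python instead and illustrates one way such a mapping could work for Bengali. It assumes only that the Indic Unicode blocks are laid out in parallel, so most letters can be shifted by a fixed offset. The function name and the exception table are hypothetical.

```python
BENGALI_START, BENGALI_END = 0x0980, 0x09FF
OFFSET = 0x0980 - 0x0900  # Bengali block start minus Devanagari block start

# Hypothetical exception table for signs whose parallel slot is not wanted.
# A real table would also cover Bengali signs with no Devanagari counterpart.
EXCEPTIONS = {
    "\u0981": "\u0902",  # Bengali candrabindu -> Devanagari anusvara
}

def to_devanagari(text):
    """Map Bengali-script text to Devanagari letter by letter."""
    out = []
    for ch in text:
        if ch in EXCEPTIONS:
            out.append(EXCEPTIONS[ch])
        elif BENGALI_START <= ord(ch) <= BENGALI_END:
            # Parallel block layout: same position, different block.
            out.append(chr(ord(ch) - OFFSET))
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out)

print(to_devanagari("\u09ac\u09be\u0982\u09b2\u09be"))
```

A purely mechanical mapping like this is exactly where systematic flaws come from: any sign missing from the exception table is shifted blindly, whether or not the result makes sense in Devanagari.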
Displaying this page correctly is currently a true challenge to all except the most recent computer systems, many of which will fail. You will not only need fonts for all scripts used, but you must also make sure that your browser (or the underlying operating system) can handle the complex rules of Indic typography correctly. Malayalam is an especially tricky case, and many current systems fail the following test: മഞ്ഞള് (maññaḷ) and എള്ള് (ĕḷḷə). You should see only one virama (a breve-like mark) over the last letter of the second word, while on some systems up to four viramas are shown, or characters are replaced with question marks or square boxes. Fortunately, Unicode 5.1 offers a new way to code the first word as മഞ്ഞൾ, which should be easier to interpret for renderers (as soon as the fonts are updated).
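The difference between the two codings of the final chillu can be made visible with a few lines of Python. This is only a sketch: the table lists just the LL chillu from the test word, and the function names are made up for illustration.

```python
import unicodedata

LLA, VIRAMA, ZWJ = "\u0d33", "\u0d4d", "\u200d"
CHILLU_LL = "\u0d7e"  # atomic chillu letter, added in Unicode 5.1

# Only the LL chillu is listed; a full table would cover the other chillus too.
LEGACY_TO_ATOMIC = {LLA + VIRAMA + ZWJ: CHILLU_LL}

def modernize_chillus(text):
    """Replace legacy consonant+virama+ZWJ sequences by atomic chillus."""
    for legacy, atomic in LEGACY_TO_ATOMIC.items():
        text = text.replace(legacy, atomic)
    return text

def describe(text):
    """List the code points of a string with their Unicode names."""
    return [f"U+{ord(c):04X} {unicodedata.name(c, '?')}" for c in text]

old = "\u0d2e\u0d1e\u0d4d\u0d1e" + LLA + VIRAMA + ZWJ
print(describe(old))
print(describe(modernize_chillus(old)))
```

With the atomic letter, the renderer no longer has to interpret an invisible joiner to decide whether the virama is shown.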
The entries are sorted according to the canonical Devanagari collating sequence, which is mimicked by all the other Indic scripts. Anusvara is sorted as if it were written as a nasal. The handling of the implicit vowel is still somewhat unsystematic; it is often ignored in sorting if it is written but not pronounced.
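Treating anusvara as a nasal for sorting can be sketched as follows for Devanagari. This is not the code behind the index. It assumes that an anusvara before a stop stands for the nasal of that stop's series, and it uses plain code point order as a rough stand-in for the collating sequence. All names are hypothetical.

```python
ANUSVARA, VIRAMA = "\u0902", "\u094d"

# (first, last) code point of each stop series: velar, palatal, retroflex,
# dental, labial. The last letter of each series is its class nasal.
SERIES = [(0x0915, 0x0919), (0x091A, 0x091E), (0x091F, 0x0923),
          (0x0924, 0x0928), (0x092A, 0x092E)]

def class_nasal(consonant):
    """Return the nasal of the stop series a consonant belongs to, if any."""
    cp = ord(consonant)
    for first, last in SERIES:
        if first <= cp <= last:
            return chr(last)
    return None

def expand_anusvara(word):
    """Rewrite anusvara before a stop as class nasal plus virama."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        nasal = class_nasal(nxt) if ch == ANUSVARA and nxt else None
        out.append(nasal + VIRAMA if nasal else ch)
    return "".join(out)

def sort_key(word):
    # Code point order only approximates the real collating sequence.
    return [ord(c) for c in expand_anusvara(word)]

words = ["\u0939\u093f\u0902\u0926\u0940", "\u092e\u0930\u093e\u0920\u0940"]
print(sorted(words, key=sort_key))
```

An anusvara before a non-stop, as in संस्कृत, is left unchanged by this sketch; how such cases sort is a separate decision.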
Depending on definition, several hundred to more than a thousand languages from five families are spoken in India alone. Yet most of these have no literary tradition, and some that had one in the past have now lost it (shockingly often in the course of the 19th and 20th centuries). Some North-Western Indian languages use the Arabic alphabet and must be excluded from this index; the same is true of many minority languages in the far North East, where the Latin alphabet is common (Khasi, Garo, Karbi etc., though some others use Bengali script, and Bodo is unique in using Devanagari). The traditional literary languages of India today have official status in the union states where they are spoken, and all of them are contained in this index with moderate to reasonable accuracy.
The past decades have witnessed a trend towards literacy in many languages that were previously mostly oral; some of these are large languages (Konkani, Dogri), but small minority languages from the North East have also undergone that process. Some of these have now acquired official status, or are approaching that goal. Usually, existing scripts (mostly Devanagari or Bengali) were adapted to the new languages, rather than archaic, disused autochthonous scripts being revived. In principle, many of these new literary languages could be included in this index, but information about their spice names is hard to come by on the web, and so I have to rely on fieldwork, which is resource-intensive and often gives poor results. Nevertheless, I can present spice names in a couple of languages of the far North East, and I harbour hope that the number might increase in future.
The following table is both a status quo and a to-do list. It contains all languages known to me which satisfy two conditions: (a) they are written with a Brahmi-derived script that is structurally close enough to Devanagari and supported in Unicode; and (b) they have official status, or are at the very least taught in school so that a standard orthography is defined. In India and its neighbouring countries, an estimated 30 to 40 languages from three families (Indo–European, Sino–Tibetan, Dravidian) remain as possible candidates for inclusion in this index. Some cases are really problematic: Rajasthani, although enjoying somewhat official status, has no normalized orthography; Tibetan and its relatives are written in a script that has developed pretty far from the Indian original (thus, Tibetan spice names are more easily found in the Tibetan Index). There are also some cases with no clear consensus on the script used for a language: Konkani (Latin/Devanagari/Kannada, although it is official in Goa with Devanagari script), Kokborok (Bengali/Latin), Kashmiri and Sindhi (Arabic/Devanagari).
The South East Asian Brahmi-derived scripts are perhaps impossible to add to this index, as they have come a long, long way from the common ancestor. There is currently a Thai and Lao Index available, and the two remaining ones (Khmer and Burmese) might follow in the future, once I have enough expertise for this task (or a friendly reader helps me).
|Script||Native name||Language (code)||Status and remarks|
|Devanagari||संस्कृत||Sanskrit (sa) β||classical tongue of lore, religion and philosophy|
|Devanagari||हिंदी||Hindi (hi)||lingua franca in Northern India; official language of the Indian Union and many Northern union states|
|Devanagari||मराठी||Marathi (mr)||official state language in Maharashtra|
|Devanagari||कॉशुर||Kashmiri (Koshur, ks)||regional language in Kashmir, now mostly (in Pakistan always) written in Arabic alphabet [کٲشر] (Dardic)|
|Devanagari||कोंकणी||Konkani (kok) α||official state language in Goa|
|Devanagari||नेपाली||Nepali (ne)||national language in Nepal; official second language in West Bengal|
|Devanagari||नेपालभाषा||Nepal Bhasa||regional language in Nepal; the traditional Newari script (Ranjana) enjoys a partial revival in Kathmandu, but Unicode support is still lacking (Sino–Tibetan)|
|Devanagari||बोड़ो||Bodo (brx)||official second language in the Western part of Assam (Sino–Tibetan)|
|Devanagari||मैथिली||Maithili (Bihari, bh)||official (?) regional language in Bihar and strong community in Southern Nepal. In the past, it was written in a variety of Bengali script known as Mithilakshar (there is no Unicode support yet) or in the Kaithi script [𑂍𑂶𑂘𑂲], but today Devanagari is used if the language is written at all.|
|Devanagari||डोगरी||Dogri (doi)||regional language in Jammu and Kashmir (co-official in the South of that state); spoken in Pakistan and there written in Arabic alphabet [ڈوگرى]. Culinary vocabulary is very close to Hindi.|
|Devanagari||सिन्धी||Sindhi (sd)||scattered over North Western India and Pakistan (where it is co-official); recognized by the Indian constitution, but no official status in any union state; in India partially and in Pakistan completely written in a variety of Arabic script with many additional characters [سنڌی]|
|Devanagari||राजस्थानी||Rajasthani (raj)||group of rather diverse dialects spoken in Rajasthan|
|Devanagari||दिवॆहिबस्||Dhivehi (dv)||national language of the Maldives and written with a unique script [ދިވެހިބަސް]. A few hundred speakers in India use Devanagari, but I do not know enough about their orthography rules to incorporate that language here.|
|Gurmukhi||ਪੰਜਾਬੀ||Punjabi (Panjabi, pa)||official state language in Haryana and Punjab; widely spoken as a vernacular in the eastern part of Pakistan where it is not official (written in|
|Gujarati||ગુજરાતી||Gujarati (gu)||official state language in Gujarat|
|Bengali (Eastern Nagari)||বাংলা||Bengali (Bangla, bn)||official state language in Western Bengal; national language of Bangladesh|
|Bengali (Eastern Nagari)||অসমীয়া||Assamese||official state language in Assam|
|Bengali (Eastern Nagari)||মণিপুরি (মৈতৈ লোন)||Manipuri (Meitei-lon, mni)||official state language in Manipur; unofficial minority language in Bangladesh. The native Meitei Mayek script [ꯃꯩꯇꯩ ꯃꯌꯦꯛ] was replaced by Bengali in the 18th century and is currently being revived; there is a separate spice index in that script available. (Sino–Tibetan)|
|Bengali (Eastern Nagari)||ককবরক||Kokborok (Tripuri, trp)||official regional language in Tripura (spoken also in Bangladesh). Is also written in Latin script (Sino–Tibetan)|
|Bengali (Eastern Nagari)||বিষ্ণুপ্রিয়া মণিপুরী||Bishnupriya Manipuri (bpy)||spoken in scattered communities in North East India and Bangladesh (not official)|
|Bengali (Eastern Nagari)||সিলটী||Sylheti (Siloti)||spoken in Bangladesh and North East India (not official); there is a native alphabet (Siloti Nagori [ꠡꠁꠟꠐꠡ ꠘꠀꠉꠠꠡ]), which is, however, mostly extinct.|
|Oriya||ଓଡ଼ିଆ||Oriya (or)||official state language in Orissa|
|Telugu||తెలుగు||Telugu (te)||official state language in Andhra Pradesh (Dravidian)|
|Tamil||தமிழ்||Tamil (ta)||official state language in Tamil Nadu; second national language of Sri Lanka (Dravidian)|
|Kannada||ಕನ್ನಡ||Kannada (kn)||official state language in Karnataka (Dravidian)|
|Kannada||ತುಳು ಬಾಸೆ||Tulu (tcy)||minority language in Karnataka, formerly written with a specific script but now mostly oral (Dravidian)|
|Malayalam||മലയാളം||Malayalam (ml)||official state language in Kerala (Dravidian)|
|Sinhala||සිංහල||Sinhala (Singhalese, si)||national language of Sri Lanka|
|see Tibetan Index||བོད་སྐད་||Tibetan (bo)||a group of interrelated languages spoken in Tibet, China, Nepal, India and Pakistan (Sino–Tibetan)|
|see Tibetan Index||གླེ་སྐད་||Ladakhi (ljb)||sublanguage of Tibetan located mainly on the North Western edge of India; has official status in Jammu & Kashmir state (Sino–Tibetan)|
|see Tibetan Index||རྫོང་ཁ་||Dzongkha (dz) β||national language of Bhutan. Also part of the Tibetan macrolanguage (Sino–Tibetan)|
|Ajhapat||𑄌𑄋𑄴𑄟||Chakma (Changma, ccp)||small minority language of North Eastern India and Bangladesh|
It should be noted that I obtained many of the names shown in this index from poorly legible, hand-written lists, and (as anyone with some knowledge of Indic scripts will confirm) it is quite difficult and error-prone to digitize such scribblings; moreover, one may easily fall victim to orthographic deficiencies of the individuals who wrote them. Whenever possible, I checked on the Internet, but due to a scarcity of sources (especially bilingual ones), this was only partially possible for some languages (notably, Gujarati, Bengali and Malayalam). Spellings in Sinhala, Punjabi, Assamese, Nepali and Konkani should therefore be taken with a grain of salt. Comments and corrections are of course welcome.
Some of the more exotic languages with no internet coverage were researched in the field, where poor literacy rates and a general lack of writing tradition pose significant problems. Thus, spice names in North East Indian minority languages, Maithili, Dogri and Newari are perhaps to be seen as approximations only (the number of different spellings for a word can well equal the number of people asked, which is particularly nasty if only one native speaker is available); Tibetan also proved hard work, although I was finally able to cross-check some of the names with written literature on medicinal herbs.
Sanskrit presents a different problem, as the standard dictionary of Monier Williams contains an unreasonable number of synonyms and polyvalencies (and invalid scientific plant names); neither ancient Indian writers nor modern linguists pay much heed to botany, it seems. For that reason, Sanskrit will probably remain in the β state forever.