Jayarava's Raves: Indo-European Languages

29 May 2009

Indo-European Languages

It occurs to me that I often go on about etymology and the links between Sanskrit and English words and yet I've never said much about that link. How can a Sanskrit and an English word possibly be linked, or even cognate? It's because Sanskrit and English are both members of a large family of languages known as Indo-European (IE). This includes most of the languages of Europe (the major exceptions are Basque, Finnish, Estonian, and Hungarian), and the languages of North India. These share many grammatical and morphological features.

We have to begin this story in the middle. During the period when Britain ruled over most of India many men were sent out to India as administrators. These men often had a classical education - that is they read Latin and Greek, and were familiar with the works of the classical authors. Sir William Jones (1746-1794) had gone much further and was a gifted linguist, having published translations from Persian and Arabic, and learned a number of other languages besides. However his livelihood was in law and he was appointed to be a Judge in Calcutta in 1783. Here he came into contact with Sanskrit and within a few years was publishing translations from Sanskrit. Jones reported to the Asiatic Society of Calcutta, which he had founded, in February 1786 that there was an apparent relationship between Latin, Greek and Sanskrit: "... no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists."

Subsequently much work has been done in comparative linguistics to demonstrate that the same kind of links exist between very many other languages. The 'Euro' of Indo-European includes the groups: Celtic, Germanic (including English), Italic (including the Romance languages), Slavic, Greek, Albanian, and Armenian. The 'Indo' stands in fact for Indo-Iranian - this branch includes Persian/Iranian, Panjabi, Hindi, Gujarati, Marathi, Bihari, and Bengali. Of the European exceptions, Basque is not related to any other known language - as such it is known as a language isolate - and may well be a remnant of the languages spoken in Europe before the Indo-European ancestors moved into that area. Finnish, Estonian, and Magyar (i.e. Hungarian) are members of the Finno-Ugric branch of the Uralic family, which seems to have had an ancestral homeland in and around the Ural mountains.

One of the areas where Indo-European languages show similarities is in kinship nouns. Consider for example the word for father.

Sanskrit	pitṛ (nominative pitā)
Greek	pater
Latin	pater
German	Vater
English	father
Hindi	pitā

One can see that there are changes from /p/ to /f/ (spelled v in German), and from /t/ to /d/ and to /th/ in English. These changes are typical of the kind that happen from language to language. So the sounds used for the word father are either the same, or related to each other via a known process.
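To make the idea of a regular correspondence a little more concrete, here is a minimal Python sketch that groups the words for father by their first letter. The function name group_by_initial is my own invention for illustration, the word list comes from the table above plus English father, and the grouping works on spelling only, so it is a rough stand-in for the sounds rather than a real phonological analysis.

```python
# Words for "father" from the table above, plus English "father" from the prose.
FATHER = {
    "Sanskrit": "pitṛ",
    "Greek": "pater",
    "Latin": "pater",
    "German": "Vater",
    "English": "father",
    "Hindi": "pitā",
}


def group_by_initial(cognates):
    """Group languages by the (lower-cased) first letter of their word.

    Hypothetical helper: it compares spellings, not actual pronunciation.
    """
    groups = {}
    for language, word in cognates.items():
        groups.setdefault(word[0].lower(), []).append(language)
    return groups


initial_groups = group_by_initial(FATHER)
for initial, languages in initial_groups.items():
    print(initial, ", ".join(languages))
```

The languages fall into three groups: those that keep the p, and German and English which have moved away from it. Bear in mind that the German v is pronounced like an English f, so the two Germanic languages really belong together.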

Another area where the links are clear is in numbers.

Sanskrit	Greek	Latin	German	English	Hindi
eka	eis	unus	eins	one	ek
dvi	duo	duo	zwei	two	do
tri	treis	tres	drei	three	tīn
catur	tettares	quattuor	vier	four	cār
pañca	pente	quinque	fünf	five	pañc
ṣaṣ	hex	sex	sechs	six	chah
sapta	hepta	septem	sieben	seven	sāt
aṣṭa	octo	octo	acht	eight	āṭh
nava	ennea	novem	neun	nine	nau
daśa	deca	decem	zehn	ten	das

Again the similarity in some cases is striking (two, eight) and in some cases less obvious but subject to understandable variation (four, five). Of course linguists have marshalled a lot more evidence over the two centuries since "India" Jones wrote his paper. So much so that it has been possible to tentatively reconstruct what the precursor language might have sounded like and how it worked. This language is called Proto-Indo-European (PIE). PIE is a best guess, as nothing in fact survives from that time which might indicate what language was spoken or how it was spoken. It seems likely that there was a range of related dialects, some of which had more input than others - which is what we see in later India and Europe. For instance some linguists think that Pāli is not a direct descendant of the Vedic language of the Ṛgveda, but of a closely related dialect.

The word āryan is often used in this context though Indo-Aryan is being replaced by Indo-Iranian in linguistic circles. There are two reasons for this. Firstly there are the unfortunate associations with the Nazi racial purity ideas. These are alive and well if the internet is anything to go by. Racist Europeans still try to show that they are a pure race descended from the āryans. They don't seem to realise that almost everyone else in Europe is as well, including that often hated group the Gypsies, whose Romani language is included amongst the IE family and whose origins are likely to have been in North West Rajasthan.

The second reason is more important for linguists. If anything āryan describes a linguistic group, not a racial group. This is very important. Speaking an IE language need have nothing to do with race. When peoples migrate they often end up speaking the tongue of their neighbours. In any case race is a rather vaguely defined concept these days. Genetics is showing that where we perceive racial difference there is often little evidence of this in DNA. In fact, for instance, for some genes all people in India are similar - including speakers of IE, Dravidian, and Munda languages. Race is not a natural category, it is one we impose on people.

Using the phrase Indo-Iranian refers to a geographic area. And what we see is that languages in proximity may share features that distant, but related, languages do not. A very important case is the use of retroflex consonants. These are pronounced with the tip of the tongue curled back to touch the roof of the palate, and are Romanised as ṭ ṭh ḍ ḍh ṇ and ṣ. Of the IE languages it is chiefly the languages of India that use them. This was well established by the time of the Ṛgveda (ca. 1500 BCE). The Dravidian languages, centred but not entirely confined to South India, also use retroflex consonants. It is a feature of Indian languages that transcends race or language family. Mind you, our English dentals (t th d dh n) sound retroflex to Indian speakers because we typically don't have our tongue on the teeth, but immediately behind them on the gum - the sound is less crisp than a true dental and so words like doctor, for instance, are transliterated with retroflexes: e.g. ḍŏkṭar (डॉक्टर) in Hindi.

The earlier IE languages are heavily inflected. This means that endings are added to a word to tell us what its function in the sentence is (grammar). So in a simple sentence in Sanskrit like:

rāmo bhaginyā saha tāṃ nagarīmagacchat
Rāma went to town with his sister.

There would be no ambiguity if we changed the words around (although the euphonic sandhi changes are slightly different, they don't affect the meaning):

agacchat tāṃ nagarīṃ bhaginyā saha rāmaḥ

We always know that it is Rāma who is the agent, that his sister is with him, and that they went to town, no matter the order. The trend in IE languages is away from the use of inflections and towards separate words such as prepositions and possessives (to, with, his) combined with a fixed word order. If we mix up the word order in English then we confuse the meaning of the sentence. This trend is not inevitable, but is a characteristic of IE languages. Tamil went the other way for instance, becoming more inflected.
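The point about inflection can be sketched in a few lines of Python. This is only a toy under stated assumptions: the FORMS lexicon, the role labels (agent, companion, destination, action), and the function assign_roles are all hypothetical names of my own, and I have written nagarīm and agacchat as separate words rather than joined by sandhi. Each inflected form is looked up on its own, so the position of the word in the sentence plays no part.

```python
# Hypothetical mini-lexicon: each inflected form maps to (stem, role).
# Sandhi variants of the same form are listed separately.
FORMS = {
    "rāmo": ("rāma", "agent"),             # nominative, sandhi variant
    "rāmaḥ": ("rāma", "agent"),            # nominative
    "bhaginyā": ("bhaginī", "companion"),  # instrumental, used with saha
    "saha": ("saha", "particle"),          # "with"
    "tāṃ": ("tad", "destination"),         # accusative
    "nagarīm": ("nagarī", "destination"),  # accusative
    "nagarīṃ": ("nagarī", "destination"),  # accusative, sandhi variant
    "agacchat": ("gam", "action"),         # past tense verb, "went"
}


def assign_roles(sentence):
    """Return {role: set of stems}, ignoring word order entirely."""
    roles = {}
    for word in sentence.split():
        stem, role = FORMS[word]
        roles.setdefault(role, set()).add(stem)
    return roles


first = assign_roles("rāmo bhaginyā saha tāṃ nagarīm agacchat")
second = assign_roles("agacchat tāṃ nagarīṃ bhaginyā saha rāmaḥ")
same_meaning = first == second
print(same_meaning)
```

Both orders give the same assignment of roles because the endings, not the positions, carry the grammar. An English version of the same lookup would not work, since a word like sister has the same form whether she is the agent or the companion.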

One thing which seems clear, although the scholarly debate rumbles on, is that the homeland of IE, or PIE, is not in India. The debate is largely kept up by Indian scholars who are keen to prove that Sanskrit was the indigenous language of North India. As sometimes happens where there are vested interests, the scholarly debate can be quite emotional, with Indian scholars accused of Hindu Nationalism, while they in turn use terms like 'Orientalist' and 'Cultural Imperialist' for their European detractors. However one must look to the evidence, which supports the view that the Indo-European languages originated from the Caspian Sea area in what is now Turkmenistan.

Another area of dispute is the Indus Valley civilisation. This is a rather large topic, so I'm only going to skim it. Basically from possibly as early as 7000 BCE up to about 1700 BCE there was a civilisation along the Indus River, and the now dried up Saraswati River. Their material remains were only discovered in the 20th century, but it seems clear that they were for a long time a successful culture - with possible trading links to the Middle East. The cities were abandoned gradually, rather than being over-run by invaders (scotching the Āryan invasion theory), probably due to a major shift in the patterns of the monsoons which caused the Saraswati to dry up. They left behind hundreds of little clay tablets with symbols on them. There are too few symbols for an ideographic script like Chinese, and too many for an alphabet. They most likely do represent some form of written language - similar to Egyptian hieroglyphs which mixed words, ideas, and sounds to create about 400 symbols in all. But what that language is remains a mystery - despite many attempts to decipher it. None of the many claims to have deciphered it stand up to scrutiny, and most exploit the ambiguity of having not texts but only very short sequences of characters to work with - a maximum of 20, but an average of about 8. It would have been nice if the Indus language turned out to be Sanskrit, but this seems not to be the case. Neither is it Dravidian - there is little evidence for Dravidian people having been driven out of the North by Āryan invaders either. One possibility is that it is related to Munda - the family of languages spoken by remnant tribal populations in parts of India and related to the Austroasiatic languages of South East Asia such as Khmer.

So some mysteries remain. The exact relationship of the speakers of Indo-European languages is still not entirely clear, though the answer most likely lies in the areas of geography and sociology rather than race. But the relationships between the IE languages themselves are clear, and the evidence very strong. They are all related, and probably all grew out of one language, or a very small number of closely related dialects. Some languages seem to be less changed than others. Slavic languages for instance are closer to PIE than the Germanic languages. Romance languages, whose roots in Latin are still obvious, often show considerable similarity. But even in English one can see the relationship if one is observant.
