29 May 2009

Indo-European Languages

It occurs to me that I often go on about etymology and the links between Sanskrit and English words and yet I've never said much about that link. How can a Sanskrit and an English word possibly be linked, or even cognate? It's because Sanskrit and English are both members of a large family of languages known as Indo-European (IE). This includes most of the languages of Europe (the major exceptions are Basque, Finnish, Estonian, and Hungarian), and the languages of North India. These share many grammatical and morphological features.

We have to begin this story in the middle. During the period when Britain ruled over most of India many men were sent out to India as administrators. These men often had a classical education - that is they read Latin and Greek, and were familiar with the works of the classical authors. Sir William Jones (1746-1794) had gone much further and was a gifted linguist, having published translations from Persian and Arabic, and learned a number of other languages besides. However, his livelihood was in law and he was appointed to be a judge in Calcutta in 1783. Here he came into contact with Sanskrit and within a few years was publishing translations from Sanskrit. In February 1786 Jones reported to the Asiatic Society of Calcutta, which he had founded, that there was an apparent relationship between Latin, Greek and Sanskrit: "... no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists."

Subsequently much work has been done in comparative linguistics to demonstrate that the same kind of links exist between very many other languages. The 'Euro' of Indo-European includes the groups: Celtic, Germanic (including English), Italic (which includes the Romance languages), Slavic, Greek, Albanian, and Armenian. The 'Indo' stands in fact for Indo-Iranian - this branch includes Persian, Panjabi, Hindi, Gujarati, Marathi, Bihari, and Bengali. Of the European exceptions, Basque is not related to any other known language - as such it is known as a language isolate - and may well be a remnant of the languages spoken in Europe before the Indo-European ancestors moved into that area. Finnish, Estonian, and Magyar (i.e. Hungarian) are members of the Finno-Ugric branch of the Uralic family, which seems to have had an ancestral homeland in and around the Ural mountains.

One of the areas where Indo-European languages show similarities is in kinship nouns. Consider for example the word for father.

Sanskrit pitṛ (nominative pitā)
Greek patēr
Latin pater
German Vater
English father
Hindi pitā

One can see that the initial /p/ has become /f/ in the Germanic languages (German spells it v), and that the /t/ has become /d/ in Old English fæder and /th/ in modern English father. Changes like these are typical of what happens from language to language. So the sounds used for the word father are either the same, or related to each other via a known process.
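
For the programmers among my readers, the idea of a "known process" can be sketched in a few lines of Python. This is only a toy, and the assumptions are mine: the table below is a simplified version of the consonant correspondences usually called Grimm's law, the Latin spelling stands in for the older consonants, vowels are left alone, and well-known exceptions are ignored. The names SHIFT and germanic_shift are made up for the illustration.

```python
# Toy consonant correspondences between an older (Latin-like) form and a
# Germanic-like form. A simplification for illustration, not a full account.
SHIFT = {"p": "f", "t": "th", "k": "h", "b": "p", "d": "t", "g": "k"}


def germanic_shift(word: str) -> str:
    """Apply each correspondence once, letter by letter.

    Working letter by letter means a shifted sound is never shifted again
    (d becomes t, but that new t does not go on to become th).
    Letters not in the table, including all vowels, pass through unchanged.
    """
    return "".join(SHIFT.get(letter, letter) for letter in word)


for latin in ("pater", "tres", "ped"):
    print(latin, "->", germanic_shift(latin))
# pater -> father
# tres -> thres
# ped -> fet
```

The outputs are not real words in every case, but they are recognisably close to father, three, and feet, which is the point: the differences are regular rather than random.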

Another area where the links are clear is in numbers.

Again the similarity in some cases is striking (two, eight) and in some cases less obvious but subject to understandable variation (four, five). Of course linguists have marshalled a lot more evidence over the two centuries since "India" Jones wrote his paper. So much so that it has been possible to tentatively reconstruct how the precursor language might have sounded and worked. This language is called Proto-Indo-European (PIE). PIE is a best guess, as nothing in fact survives from that time which might indicate what language was spoken or how it was spoken. It seems likely that there was a range of related dialects, some of which had more input than others - which is what we see in later India and Europe. For instance some linguists think that Pāli is not a direct descendant of the Vedic language of the Ṛgveda, but of a closely related dialect.

The word āryan is often used in this context, though in linguistic circles it has largely given way to Indo-Iranian (with Indo-Aryan kept for the Indian sub-branch). There are two reasons for this. Firstly there are the unfortunate associations with the Nazi racial purity ideas. These are alive and well if the internet is anything to go by. Racist Europeans still try to show that they are a pure race descended from the āryans. They don't seem to realise that, by the same reasoning, so is almost everyone else in Europe, including that often hated group the Gypsies, whose Romani language is included amongst the IE family and whose origins are likely to have been in North West Rajasthan.

The second reason is more important for linguists. If anything āryan describes a linguistic group, not a racial group. This is very important. Membership of the IE language family need have nothing to do with race. When peoples migrate they often end up speaking the tongue of their neighbours. In any case race is a rather vaguely defined concept these days. Genetics is showing that where we perceive racial difference there is often little evidence of this in DNA. In fact, for instance, for some genes all people in India are similar - including speakers of IE, Dravidian, and Munda languages. Race is not a natural category, it is one we impose on people.

The phrase Indo-Iranian refers to a geographic area. And what we see is that languages in proximity may share features that distant, but related, languages do not. A very important case is the use of retroflex consonants. These are pronounced with the tip of the tongue curled back to touch the top of the palate, and are Romanised as ṭ ṭh ḍ ḍh ṇ and ṣ. Among the IE languages they are characteristic of the languages of India. This is well established by the time of the Ṛgveda (ca. 1500 BCE). The Dravidian languages, centred on but not entirely confined to South India, also use retroflex consonants. It is a feature of Indian languages that transcends race or language family. Mind you, what we think of as our English dentals (t d n) sound retroflex to Indian speakers because we typically don't have our tongue on the teeth, but immediately behind them on the gum - the sound is less crisp than a true dental and so words like doctor, for instance, are transliterated with retroflexes: e.g. ḍôkṭar (डॉक्टर) in Hindi.

The earlier IE languages are heavily inflected. This means that endings are added to a word to tell us what its function in the sentence is (grammar). So in a simple sentence in Sanskrit like:
rāmo bhaginyā saha tāṃ nagarīmagacchat
Rāma went to town with his sister.
There would be no ambiguity if we changed the words around (the euphonic sandhi changes are slightly different, but they don't affect the meaning):
agacchat tāṃ nagarīṃ bhaginyā saha rāmaḥ
We always know that it is Rāma who is the agent, that his sister is with him, and that they went to town, no matter the order. The trend in IE languages is away from the use of inflections and towards separate function words such as: to, with, his. If we mix up the word order in English then we confuse the meaning of the sentence. This trend is not inevitable, but it is a characteristic of IE languages. Tamil went the other way for instance, becoming more inflected.
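
The point about word order can also be sketched in Python. Again this is only an illustration under my own assumptions: the ROLES table is a hand-written gloss of the sentence above (with the sandhi undone), not a real Sanskrit parser, and the names ROLES, parse, and readings are invented for the example. Because each phrase carries its role in its ending, every ordering of the phrases gives the same reading.

```python
from itertools import permutations

# Hand-written gloss of each phrase, keyed by its inflected form.
# An assumption for illustration only, not the output of a real parser.
ROLES = {
    "rāmaḥ": "agent",              # nominative: the one who acts
    "bhaginyā saha": "companion",  # instrumental + saha: "with (his) sister"
    "tāṃ nagarīm": "destination",  # accusative: "to that town"
    "agacchat": "action",          # verb: "went"
}


def parse(phrases):
    """Read the role of each phrase from its form, ignoring its position."""
    return {ROLES[phrase]: phrase for phrase in phrases}


# Try every possible ordering of the four phrases.
readings = [parse(order) for order in permutations(ROLES)]

# All 24 orderings produce exactly the same reading.
print(len(readings), all(reading == readings[0] for reading in readings))
# 24 True
```

An English version of the same exercise would need the position of each word as an input, since "Rāma" and "sister" look the same whichever of them is doing the going.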

One thing which seems clear, although the scholarly debate rumbles on, is that the homeland of IE, or PIE, is not in India. The debate is largely kept up by Indian scholars who are keen to prove that Sanskrit was the indigenous language of North India. As sometimes happens where there are vested interests, the scholarly debate can be quite emotional, with Indian scholars accused of Hindu Nationalism, while they in turn use terms like 'Orientalist' and 'Cultural Imperialist' for their European detractors. However one must look to the evidence, which supports the view that the Indo-European languages originated outside India, in the region of the Caspian Sea (the most widely held view places the homeland on the steppe to its north and west).

Another area of dispute is the Indus Valley civilisation. This is a rather large topic, so I'm only going to skim it. Basically from possibly as early as 7000 BCE up to about 1700 BCE there was a civilisation along the Indus River, and the now dried up Saraswati River. Their material remains were only discovered in the 20th century, but it seems clear that they were for a long time a successful culture - with possible trading links to the Middle East. The cities were abandoned gradually, rather than being over-run by invaders (scotching the Āryan invasion theory), probably due to a major shift in the patterns of the monsoons which caused the Saraswati to dry up. They left behind hundreds of little clay tablets with symbols on them. There are too few symbols for this to be an ideographic writing like Chinese, and too many for it to be an alphabet. They most likely do represent some form of written language, with about 400 symbols in all - perhaps mixing words, ideas, and sounds as Egyptian hieroglyphs did. But what that language is remains a mystery - despite many attempts to decipher it. None of the many claims to have deciphered it stand up to scrutiny, and most exploit the ambiguity of having not texts but only very short sequences of characters to work with - a maximum of 20, but an average of about 8. It would have been nice if the Indus language turned out to be Sanskrit, but this seems not to be the case. Neither is it Dravidian - there is little evidence for Dravidian people having been driven out of the North by Āryan invaders either. One possibility is that it is related to Munda - the family of languages spoken by remnant tribal populations in parts of India and related to the Mon-Khmer languages of South-East Asia.

So some mysteries remain. The exact relationship of the speakers of Indo-European languages is still not entirely clear, though the answer most likely lies in the areas of geography and sociology rather than race. But the relationships between the IE languages themselves are clear, and the evidence very strong. They are all related, and probably all grew out of one language, or a very small number of closely related dialects. Some languages seem to be less changed than others. Slavic languages for instance are closer to PIE than the Germanic languages. Romance languages, whose roots in Latin are still obvious, often show considerable similarity. But even in English one can see the relationship if one is observant.
