On multilingualism and language classification

White book shelves in a library.

There are many attitudes and beliefs about languages and their use that we repeat in our everyday lives and take to be true or reflective of reality. For example, we may listen to a speaker’s dialect and infer likely personality traits or their region of origin. However, many of these conclusions are based on history and culture and are not necessarily related to language in any fundamental way. One such belief concerns the classification of languages, which is also used in library classification systems.

Different classifications are used in the study of languages, but libraries have adopted genealogical classification, i.e. classifying languages according to their supposed relationships. This means that the world’s languages are divided into language families within which the languages are at least distantly related. This relationship cannot, however, be proven across language family borders. These classifications are made by examining both contemporary languages and the historical material that is available on languages, in an attempt to get an idea of what past languages were like. The genealogical classification of languages also draws on historical and archaeological research, which helps to locate speakers of particular historical language forms, both geographically and temporally. In addition, the current study of languages utilises genetic research, which can provide information on, for example, where the people living in a particular region originated.

Genealogical classification works relatively well for languages that are still in widespread use and for which plenty of historical linguistic material exists. One such language family is the Indo-European family, the largest in the world in terms of number of speakers: it includes most European languages (but not, for example, Finnish and its related languages) and many Middle Eastern and South Asian languages, such as Persian and Hindi. Comparative linguistics, that is, the study that aims to show the historical relationships between languages and classify them genealogically, began in the late 1700s, when Europeans first strengthened their position in South Asian trade and later subjected the region to colonial rule. Many trade and administrative officials became interested in local languages, and it was discovered that both Sanskrit, the classical language of the region, and many contemporary languages are related to European languages.

However, the genealogical classification of languages is always somewhat uncertain and does not necessarily provide an accurate picture of the history of languages. The family tree models often used in genealogical classification suggest that languages split into separate branches at a particular point in time. After all, that is how the human family tree, from which the model for languages was adapted, works. But in the case of languages, the division often cannot be made into completely separate branches; instead, speakers of different forms of language interact with each other, so that influences are transmitted from one language to another. Moreover, all living languages are constantly changing and are influenced by a wide range of non-linguistic factors.

Family tree models also give the misleading impression that languages are separate and distinct. This can sometimes be the case, for example, when speakers of the same language split into two or more groups and move in different directions. In all likelihood, these languages will also evolve in different directions. In today’s world, however, very few speakers of languages live in complete or even near isolation from speakers of other languages, meaning that, in practice, it is often impossible to draw a line between languages. Language borders are a political decision rather than a property of languages. Swedish and Norwegian are an example of this, as they could be classified as different varieties of the same language on linguistic grounds, but they do have a state border between them.

The world has probably been multilingual since the birth of human language. People have also been multilingual, speaking different languages as needed in different contexts and with different people. Naming languages and dividing them into separate languages often seems natural to us, but it is a relatively recent practice in human history, with roots that go back to colonialism.

As Europeans spread their power into new territories, they were met by a vast number of languages that were completely new to them. As in many other areas, Europeans wanted to ‘organise’ languages to make what looked like chaos more manageable. In many regions of Africa, for example, the languages in use were not isolated units; rather, there were large areas across which somewhat different varieties were spoken. Speakers from neighbouring areas and villages could understand each other, but as the geographical distance grew, understanding each other became more difficult. In many places, Europeans separated these dialects into different languages, often using rather artificial criteria. Of the official languages spoken in South Africa, for example, the so-called Nguni languages, isiZulu, isiXhosa, isiNdebele and siSwati, could be considered different varieties of the same language from the point of view of linguistics and practical comprehensibility, but they are now considered to be different languages. The language divide was deliberately reinforced under apartheid by the forced migration of people to certain areas on the basis of the language they spoke. As separate languages, none of the Nguni languages is large enough in terms of speakers to threaten the status of English or Afrikaans, but if counted as one language, the Nguni languages would be by far the largest language in the country.

Before the arrival of Europeans, there was also often no need to name languages; what mattered was, for example, the village you came from. This led to the names of villages or regions often ending up as names of languages. Names of languages were also formed through misunderstanding, as with the Ha language spoken in western Tanzania: when outsiders asked who the people in the area were and what language they spoke, the locals replied that they were abaha ‘locals’ and spoke igiha ‘in the local way’. So the language became Ha, even though the word just means ‘local’.

If languages were not originally separate and named entities, why do we often think of them as such? Colonial ambitions created the need to dominate others through defining and classifying languages. At the time of the birth of colonialism, Europe was also dominated by nationalist thinking, which helped to create the ideological basis for colonialism. Nationalist thinking has also played an important role in shaping our current understanding of language. Nationalism is the belief that there are undivided peoples who have the right to political power. According to nationalist thinking, the people make up the nation state, and a particular language is part of the national identity. Nationalist thinking obscures the inherent diversity of human history, culture and languages that always exists within any state. On the other hand, state borders do not create a sharp boundary for human activity either; languages, for example, continue across state borders (cf. the example of Swedish and Norwegian used above).

The concept of mother tongue, which is very familiar to Finns, also has its roots in nationalism – a nation ‘naturally’ has one language that binds it together and also serves as one of the pillars of national identity. This often leads to an idealisation of monolingualism. The ideology of the national language has also laid the foundations for inequality between languages: the national language is often elevated to the status of official language, while other languages are forgotten or deliberately restricted. In Tanzania, for example, more than 100 languages are spoken, but Swahili, the national language, is the only language used in primary schools. In secondary schools and in higher education, the language of instruction is English, a colonial language.

Classifying languages may seem like a purely technical exercise, similar to organising materials in a library, but, as explained above, it is also based on conscious choices made at a particular time, which have helped to communicate and reinforce the idea of what languages are and what their place in the world is. As with ideologies more generally, successive generations often repeat familiar patterns without questioning them and gradually come to take them for granted. But what if, instead of segregating languages, libraries could help to cross artificial language borders? Someone looking for reading material in Swedish, for example, might have their perspective broadened and discover that they can also get by with Norwegian. And the rather modest offerings in Finnish libraries for those who need reading material in isiZulu, for example, could be a bit more extensive if the search results for isiZulu literature were to include other Nguni languages.

Lotta Aunio, Title of Docent in African Studies
Senior Univ. Lecturer in Bantu Languages
University of Helsinki

Photo: Lotta Aunio
