Thoughts about Language

by Gautham Pai

Introduction

The following is a brain dump of my study of language, especially Sanskrit. The exploration will lead us through several other languages, especially English, and also touch upon Computer Science topics like Natural Language Processing and Artificial Intelligence.

But let's start from the beginning!

The Alphabets

The story begins with the alphabet, whose letters are the smallest constituents of a written language. A language has a fixed set of letters. Letters combine to form words, and words have specific meanings. Words combine to form sentences, and we speak or write these sentences to converse with others.

Let us look at these aspects in the English language, starting with its letters.

We have 26 letters in English:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Of these the following are called vowels:

A E I O U

The rest are consonants.

It's interesting to know why we have these 26 letters in English.

  • Could there be other letters?
  • Can we drop some letters because they could be derived from other existing letters?
  • Why are they arranged the way they are arranged?
  • Why are vowels spread out in different places?

Did you know that the classical Latin alphabet, from which the modern European alphabets (including English) are derived, did not have the letter "W"? We pronounce "W" without it sounding like its name ("double-u") at all, don't we? In use, "W" is closer to other letters like "V", as in "well", "why" etc.

Swaras in Sanskrit

Let us look at the letters used in Sanskrit and the way they are arranged.

Vowels are called swaras in Sanskrit. Why are they called swaras?

Let me now introduce you to one of the heroes of Indian philosophy and language, Patañjali. Patañjali is well known for his Yoga Sutras. For people who think Yoga is all about exercises, this is a great book to refer to. Patañjali also authored the Mahābhāṣya, a great commentary on the Aṣṭādhyāyī of Pāṇini. I will introduce you to Pāṇini and his work shortly.

Patañjali says, स्वयम् राजन्ते इति स्वराः (svayaṃ rājante iti svarāḥ), which means "Swaras are those which shine by themselves." In other words, a swara is that which can be pronounced independently without the support of other sounds. This is in contrast to vyanjanas (consonants), which require the support of swaras for their pronunciation.

One thing we notice very early in the study of Sanskrit grammar is the strong emphasis on the place of pronunciation of every syllable. Language is about communicating our thoughts with sound produced using our body. If we are able to precisely define the origin of each sound in our body (e.g., throat, mouth, nose etc.), we can maintain the exact pronunciation and accents, and thus the purity of the language, for thousands of years, as long as the human body itself has not undergone major changes! The best part of learning grammar is that you can use your own body as a tool for experiments. 😃

Beginners often learn the swaras in this order:

Letter (Devanagari) | English Diacritic Representation | Example for Pronunciation
अ | a | a as in "earth"
आ | ā | a as in "father"
इ | i | i as in "it"
ई | ī | ee as in "eel"
उ | u | u as in "put"
ऊ | ū | oo as in "ooze"
ए | e | a as in "ace"
ऐ | ai | i as in "ice"
ओ | o | o as in "open"
औ | au | ou as in "ouch"

I looked for English words where the swaras are used independently but I couldn't find examples for ā, u, ṛ and ḷ.

Use of diacritics: Since English letters don't communicate the pronunciation well, I will use diacritic notation from here on to convey the pronunciation. You can use the above chart for your reference.

Beginners are often confused about the usage and pronunciation of ऋ (ṛ) and ऌ (ḷ). Their significance deserves a more detailed look, but let us park that for now and revisit it in future. You can pronounce them as "rr" and "ll" until we re-examine this.

Let us observe the rest of the swaras.

Hrasva, Deergha and Pluta Swaras

Take the first 10 and group them as follows (you can ignore the pronunciation of the last 4 for now):

  • अ (a) and आ (ā)
  • इ (i) and ई (ī)
  • उ (u) and ऊ (ū)
  • ऋ (ṛ) and ॠ (ṝ)
  • ऌ (ḷ) and ॡ (ḹ)

The second swara in each group is a "longer" form of the first swara. If you elongate the first swara of each group, you get the second one.

This is referred to as मात्रा (Matra) in Sanskrit. Matra is about the duration of pronunciation (or timing), so it is also closely associated with the study of music. Tabla players, for instance, talk about different Taalas having different matra (metre) counts.

The shorter form is called ह्रस्व (Hrasva) and the longer form is called दीर्घ (Deergha).

In terms of time, we have:

  • ह्रस्व - 1 unit of time
  • दीर्घ - 2 units of time

There are actually other matras as well. An even more elongated form is called प्लुत (Pluta), which lasts 3 units of time. It is written with a ३ after the vowel, e.g., आ३.

Plutas are used when calling someone at a distance (sambodhana).

An example that Patañjali gives is that of a rooster:

So all the above groups of swaras have a pluta form as well:

  • आ३
  • ई३
  • ऊ३
  • ॠ३
  • ॡ३
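The three durations above can be sketched in a few lines of Python. This is only an illustration of the notation used in this article: the helper names `matra_units` and `to_pluta` are made up for the example, and the table covers just the five hrasva/deergha pairs listed earlier.

```python
# Pairs of (hrasva, deergha) forms, as grouped in the list earlier in this section.
HRASVA_TO_DEERGHA = {"अ": "आ", "इ": "ई", "उ": "ऊ", "ऋ": "ॠ", "ऌ": "ॡ"}
DEERGHA = set(HRASVA_TO_DEERGHA.values())
PLUTA_MARK = "३"  # the Devanagari digit 3, written after the vowel


def matra_units(swara: str) -> int:
    """Return the duration of a swara in matras (hypothetical helper)."""
    if swara in HRASVA_TO_DEERGHA:
        return 1  # hrasva
    if swara in DEERGHA:
        return 2  # deergha
    if swara.endswith(PLUTA_MARK) and swara[:-1] in DEERGHA:
        return 3  # pluta
    raise ValueError(f"not a swara covered by this sketch: {swara!r}")


def to_pluta(swara: str) -> str:
    """Write the pluta form: the deergha letter followed by the mark ३."""
    deergha = HRASVA_TO_DEERGHA.get(swara, swara)
    if deergha not in DEERGHA:
        raise ValueError(f"cannot form a pluta from {swara!r}")
    return deergha + PLUTA_MARK


for hrasva in HRASVA_TO_DEERGHA:
    pluta = to_pluta(hrasva)
    print(hrasva, matra_units(hrasva), pluta, matra_units(pluta))
```

The loop prints each hrasva swara with its duration next to its pluta form and duration, which mirrors the list of pluta forms above.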

What about ए (e), ऐ (ai), ओ (o), and औ (au)?

Observe that in the way we pronounce these swaras, we make use of 2 other swaras.

  • ए (e) = अ (a) + इ (i)
  • ऐ (ai) = अ (a) + इ (i)
  • ओ (o) = अ (a) + उ (u)
  • औ (au) = अ (a) + उ (u)

These swaras are therefore referred to as diphthongs (i.e. two vowel sounds that are pronounced together to make one sound).

These 4 swaras have only deergha and pluta varieties; they have no hrasva form.

A First Look at Pāṇini's Aṣṭādhyāyī

Let me now introduce you to Pāṇini. Pāṇini was a great logician, a grammarian and a revered scholar. Of his various works, the Aṣṭādhyāyī is the most popular. In a world where things get outdated in a few years, here is a work that has survived at least 2,500 years! Talk of leaving a legacy!

Students of Computer Science can think of Pāṇini's Aṣṭādhyāyī as a codification of the grammar of the Sanskrit language, comparable to Backus-Naur Form (BNF). Pāṇini employs a system of rules, known as sutras, to describe the formation of words, sentences, and their meanings. Because his grammar handles the intricacies of a natural language, including semantic nuances, it is significantly more complex than BNF, which is primarily concerned with syntactic structure. Let us revisit this in future.
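For readers who have not seen BNF, here is a toy grammar for the swara notation used so far in this article, together with a small Python recognizer. To be clear, the grammar is my own illustration and is not taken from the Aṣṭādhyāyī; the rule names and the function `parse_swara` are hypothetical.

```python
from typing import Optional

# A toy BNF grammar (an assumption made for this example, not Pāṇini's rules):
#
#   <swara>   ::= <hrasva> | <deergha> | <pluta>
#   <pluta>   ::= <deergha> "३"
#   <hrasva>  ::= "अ" | "इ" | "उ" | "ऋ" | "ऌ"
#   <deergha> ::= "आ" | "ई" | "ऊ" | "ॠ" | "ॡ" | "ए" | "ऐ" | "ओ" | "औ"

HRASVA = {"अ", "इ", "उ", "ऋ", "ऌ"}
DEERGHA = {"आ", "ई", "ऊ", "ॠ", "ॡ", "ए", "ऐ", "ओ", "औ"}
PLUTA_MARK = "३"


def parse_swara(text: str) -> Optional[str]:
    """Return the name of the rule that derives `text`, or None if none does."""
    if text in HRASVA:
        return "hrasva"
    if text in DEERGHA:
        return "deergha"
    # <pluta> ::= <deergha> "३"
    if text.endswith(PLUTA_MARK) and text[:-1] in DEERGHA:
        return "pluta"
    return None


for candidate in ["अ", "ओ", "ई३", "अ३"]:
    print(candidate, parse_swara(candidate))
```

A grammar like this only says which strings are well formed. It says nothing about meaning, which is exactly where Pāṇini's system goes further.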

But for now, back to swaras.

Legend has it that Pāṇini was deeply committed to codifying the Sanskrit language into a comprehensive system, much as modern-day computer scientists might think of building a formal grammar for a natural language like English. He was in deep meditation, seeking divine guidance to accomplish this monumental task. According to the legend, Lord Shiva appeared before Pāṇini. To impart the fundamental structure of language, Shiva began to play his damaru. With each beat, the damaru produced a unique sound, representing a basic unit of the Sanskrit language. Pāṇini, with his keen intellect, grasped the profound significance of these sounds.

He transcribed them into fourteen verses, known as the Maheshwara Sutras. These sutras form the foundational alphabet of Sanskrit, encompassing both vowels and consonants. When chanted fast, they sound like a damaru.

Maheshwara Sutras

Let us look at the Maheshwara Sutras:

Devanagari | English Diacritics
अ इ उ ण् | a i u ṇ
ऋ ऌ क् | ṛ ḷ k
ए ओ ङ् | e o ṅ
ऐ औ च् | ai au c
ह य व र ट् | ha ya va ra ṭ
ल ण् | la ṇ
ञ म ङ ण न म् | ña ma ṅa ṇa na m
झ भ ञ् | jha bha ñ
घ ढ ध ष् | gha ḍha dha ṣ
ज ब ग ड द श् | ja ba ga ḍa da ś
ख फ छ ठ थ च ट त व् | kha pha cha ṭha tha ca ṭa ta v
क प य् | ka pa y
श ष स र् | śa ṣa sa r
ह ल् | ha l

Oh wow! As a Tabla player, I take a lot of inspiration from this. Imagine being able to produce sounds that represent every syllable of a spoken language! Now imagine being able to grasp those sounds and make sense of them.

Let us observe the first 4 sutras:

  • अ (a) इ (i) उ (u) ण् (ṇ) |
  • ऋ (ṛ) ऌ (ḷ) क् (k) |
  • ए (e) ओ (o) ङ् (ṅ) |
  • ऐ (ai) औ (au) च् (c)

In the Maheshwara Sutras, the last syllable in each sutra is an indicatory letter and is not to be considered literally. So the remaining syllables will be:

  • अ (a) इ (i) उ (u)
  • ऋ (ṛ) ऌ (ḷ)
  • ए (e) ओ (o)
  • ऐ (ai) औ (au)
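Dropping the indicatory letter is a purely mechanical step, so it is easy to sketch in Python. The snippet below assumes each sutra is written with spaces between its letters, as in the list above; the name `strip_marker` is made up for this example.

```python
# The first four Maheshwara Sutras, written with spaces between letters.
SUTRAS = [
    "अ इ उ ण्",
    "ऋ ऌ क्",
    "ए ओ ङ्",
    "ऐ औ च्",
]


def strip_marker(sutra: str) -> list:
    """Split a sutra into letters and drop the final indicatory letter."""
    letters = sutra.split()
    return letters[:-1]


swaras = [strip_marker(sutra) for sutra in SUTRAS]
for row in swaras:
    print(" ".join(row))
```

Each printed row corresponds to one sutra with its last letter removed, which is the list of swaras shown above.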

When describing the letters, Pāṇini has provided us with only the hrasva variations of the swaras अ (a) to ऌ (ḷ) and the deergha variations of the swaras ए (e) to औ (au). The rest can then be derived from these. Isn't that compact?
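To get a feel for this compactness, here is a rough sketch that expands the nine swaras from the sutras into the longer forms discussed in this article. It follows the article's own lists (every hrasva swara gets a deergha and a pluta form, every diphthong gets a pluta form); the names `LENGTHEN` and `expand_swaras` are assumptions made for the example.

```python
# Swaras as they appear in the first four sutras (indicatory letters removed).
BASE_HRASVA = ["अ", "इ", "उ", "ऋ", "ऌ"]   # given in their hrasva form
BASE_DEERGHA = ["ए", "ओ", "ऐ", "औ"]        # given in their deergha form

# Hrasva to deergha pairs, as grouped earlier in this article.
LENGTHEN = {"अ": "आ", "इ": "ई", "उ": "ऊ", "ऋ": "ॠ", "ऌ": "ॡ"}
PLUTA_MARK = "३"


def expand_swaras() -> list:
    """Derive hrasva, deergha and pluta forms from the nine base swaras."""
    forms = []
    for hrasva in BASE_HRASVA:
        deergha = LENGTHEN[hrasva]
        forms.extend([hrasva, deergha, deergha + PLUTA_MARK])
    for deergha in BASE_DEERGHA:
        # Diphthongs have no hrasva form, so only two forms are produced.
        forms.extend([deergha, deergha + PLUTA_MARK])
    return forms


all_forms = expand_swaras()
print(len(BASE_HRASVA) + len(BASE_DEERGHA), "base swaras ->", len(all_forms), "forms")
print(" ".join(all_forms))
```

Nine letters in the sutras are enough to recover every form listed in the earlier sections, which is the point of the compact presentation.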

(To be continued)
