Rajesh Kumar

Optimizing life, one day after the next

Natural vs Programming Languages

22 Dec 2012

The number of similarities between natural (human) and programming languages can be pretty startling. Just the other day, I was discussing the parallels between the two forms of languages and I was amazed myself at how many commonalities I could find with such considerable ease.

The truth, however, is that the number of parallels shouldn't really have come as a surprise. The two forms are still languages at a very fundamental level, not just by name. They were both essentially created to achieve one goal: to communicate. In particular, to communicate ideas and expressions. And in some cases, instructions.

Programming languages have a lot more similarities among them than meets the eye initially. This means that if you know one programming language well, it's a hell of a lot easier to learn other programming languages, at least within the same family. The first two languages are the hardest to learn. The 3rd and higher are substantially easier, and it obviously gets easier as n grows.

This is not unlike natural languages. If you know English, it is a lot easier to learn related European languages such as French and Spanish, which share the Latin alphabet and a large stock of Latin-derived vocabulary with English (even though English itself is a Germanic language rather than a Latin-based one). Unfortunately, knowing English doesn't necessarily make it easier to learn languages further afield, like Hindi, Arabic or Japanese, which differ profoundly from English and French in script, vocabulary and grammar.

My own experience with programming languages is a clear example of this concept. My first two programming languages were BASIC and PHP. After these first two, it was pretty straightforward to pick up other similar languages in the family such as C, C++, C#, MATLAB, JavaScript, Ruby, and Python. I merely had to learn what was different. However, learning new languages outside this core group, like LISP/Scheme, AMPL, SQL, Regex, and Dart, got tricky. To me, they felt like what learning Japanese today would be after 2.5 decades of English.

Fortunately, the more languages you know, the easier it becomes to pick up another one. If you know how to code in one language really well, you should be able to pick up multiple languages fairly easily, assuming you're willing to put in the effort required to practice each one. Furthermore, the more diversity you have in the list of languages you know, the easier it gets.

This is why, when good software companies see that you don't know the language they're coding their app in (Scala, for example), they simply resort to looking at how many other languages you know. They're hoping to see in your résumé that you have exposure to so many other languages, with enough diversity between them, that it'll be trivial for you to pick up a new one.

One of the big language debates among coders is whether knowing several programming languages makes it easier to learn a new natural language, and vice versa. Anecdotal evidence certainly suggests it: good coders seem to have a fair command of the English language, but not necessarily the other way around. A consensus on this hasn't been reached yet, but we certainly know that it doesn't hurt to know more languages. After all, the latitude of your ideas is arguably only as expansive as the cross-product of all the languages you know.

Languages are not to be confused with algorithms. Being good at Java doesn't mean you know how to sort a list of numbers without calling some built-in function. Algorithms are largely language-invariant, just like ideas in natural languages are. A language is just a way to express an algorithm, a set of instructions if you will, to a computer, in much the same way that we use language to express an idea or an instruction to another person.
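
To make the point concrete, here is a minimal sketch of sorting a list of numbers without any built-in sort. I've picked insertion sort and Python purely for illustration (the function name `insertion_sort` is my own); the same steps could be written almost line for line in Java, C or Ruby.

```python
def insertion_sort(numbers):
    """Return a new list with the numbers in ascending order."""
    result = list(numbers)  # work on a copy so the caller's list is untouched
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift every larger element one slot to the right.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        # Drop the current element into the gap that opened up.
        result[j + 1] = current
    return result


print(insertion_sort([5, 2, 9, 1]))  # prints [1, 2, 5, 9]
```

The algorithm is the idea; the Python is just one way of saying it.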

However, unlike with human languages, we tend not to express emotions and feelings via poetry in programming languages, simply because computers don't know how to respond to them. If it isn't an instruction to do something, the computer pretty much ignores it.

*-*-*-*

What does it mean to say you know a language well? Does it just mean you've used it a lot and are fast with it? Maybe. But not entirely. Certainly working with a language a lot will make you faster in it. But what really matters is that the extended practice has made you extremely familiar with the basic expressions in the language. Control structures, for example, are perhaps the most basic concepts in any programming language: if, then, else, while, do/while, for, foreach, goto, blocks, yields, etc. Knowing a language well means you know how to express the idea of a control structure, the idea of controlling flow in your program, well.
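
As a rough illustration, here are a few of those control structures in Python. The function names (`classify`, `countdown`, `evens`) are made up for this sketch; the point is only that each one expresses a familiar idea of controlling flow that other languages spell slightly differently.

```python
def classify(n):
    # if / elif / else: pick exactly one branch.
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


def countdown(start):
    # while: keep looping for as long as the condition holds.
    values = []
    while start > 0:
        values.append(start)
        start -= 1
    return values


def evens(limit):
    # for + yield: hand back one value at a time to the caller.
    for n in range(limit):
        if n % 2 == 0:
            yield n


print(classify(-3))     # prints negative
print(countdown(3))     # prints [3, 2, 1]
print(list(evens(7)))   # prints [0, 2, 4, 6]
```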

The syntax might be slightly different between languages, but the base expressions are pretty much the same across all of them. Likewise, in all human languages the base parts of speech are the same: noun (additionally comprising gender and number), verb, adjective, adverb, etc. From the outside, however, they might look different in different languages.

The invention of new languages is of great interest to me. Why would someone want to invent a language? Natural languages, by their very definition, arise naturally. They arise out of a need to communicate with other humans. There isn't a set of people who sit together and write down the rules of a language like they might do for a Constitution. Well, that has been tried, most famously with Esperanto, and it never came close to the universal adoption its creator hoped for.

Computer languages, on the other hand, are invented to communicate with a computer. Therefore, the power of a language is determined by the complexity and sophistication of the instructions you can give a computer. A language with advanced concepts like recursion, closures, and anonymous functions, for example, may be considered more powerful than a language without them, just like a language with pronouns and interjections is more useful than one without.
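
For readers who haven't met these three concepts, here is a small sketch of each in Python. The names (`factorial`, `make_counter`, `square`) are my own picks for illustration.

```python
def factorial(n):
    # Recursion: the function is defined in terms of itself.
    return 1 if n <= 1 else n * factorial(n - 1)


def make_counter():
    # Closure: the inner function remembers `count` after make_counter returns.
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


# Anonymous function: a function value written inline with lambda.
square = lambda x: x * x

counter = make_counter()
print(factorial(5))          # prints 120
print(counter(), counter())  # prints 1 2
print(square(4))             # prints 16
```

A language without these features can usually still get the same job done, just with more ceremony.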

The power of a language is also largely determined by how effective you are at communicating instructions in it. Today, succinct languages are generally preferred over more verbose ones because they enhance developer productivity. A language that is too succinct, however, may have trouble being adopted, since programs written in it are harder to maintain in the future. Maintenance is important since the bulk of coding done today (arguably 90% or more) is adding features to existing programs, not writing new ones. And what good is a language that isn't adopted by anyone?

It's quite a feat to balance these two tough concepts: expressiveness and succinctness. Language designers are always walking a tightrope when it comes to making important design decisions regarding the complexity of their language's feature set, while at the same time keeping it succinct and definitively unambiguous.

*-*-*-*

Curious where programming languages come from? There are quite a few sources. They might come from a CS course project in school on compilers which in turn grows into something larger. This is not unlike how Linux grew out of a personal project that Linus Torvalds started as a university student and then released as open source.

Some programmers, especially the crazy (good) ones, design languages as a hobby, partly because designing good languages is so challenging and partly because it can be very rewarding in the end. These hobbyist languages are often the best since their designers aren't under any time pressure. The designers can take as much time as they want to make the right decisions for their language. Unlike the hobbyists, some people are just paid to design languages by big companies like Google (Dart), Microsoft (C#), and Sun (Java).

When it comes to the differences between natural and programming languages, programming languages also happen to be a lot stricter and less forgiving than natural languages. This is because human languages have significant built-in redundancy that allows us to resolve ambiguity using context. Programming languages have virtually no redundancy because that's extra work for the people who write compilers for languages. English is particularly notorious for built-in redundancy. You can easily drop 30% of words (filler words like prepositions, conjunctions and articles), drop all vowels, or shuffle around all but the first and last letters of a word, and still manage to make sense of what is being said.
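
If you'd like to try this on your own sentences, here is a small Python sketch of two of those tricks: dropping vowels and shuffling the inner letters of each word. The function names are my own, and the word handling is deliberately naive (it splits on whitespace, so punctuation stays glued to its word).

```python
import random

VOWELS = set("aeiouAEIOU")


def drop_vowels(text):
    """Remove every vowel from the text, keeping everything else."""
    return "".join(ch for ch in text if ch not in VOWELS)


def scramble_word(word, rng):
    """Shuffle all but the first and last letters of a word."""
    if len(word) <= 3:
        return word  # nothing to shuffle with at most one inner letter
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text, seed=0):
    """Scramble every whitespace-separated word; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return " ".join(scramble_word(word, rng) for word in text.split())


print(drop_vowels("programming languages"))  # prints prgrmmng lnggs
print(scramble_text("natural languages have plenty of redundancy"))
```

Try feeding a Python program through either function and then running it. The interpreter is not nearly as forgiving as you are.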

The fact that programming languages are stricter should come as no surprise. Computers are very precise in the instructions they like to receive. Furthermore, computers aren't really trained to ask for clarification of an expression the way a human would when something is ambiguous. So in a lot of ways, a programming language is gauged by how easy it is for humans to make accidental mistakes (known as "bugs") when communicating in it.

Owing to these concerns, programming languages are actually designed and engineered from scratch by either one individual or a small group of highly talented individuals. The cost of screwing up is high, especially when you know your language could be used to write software that powers a traffic light, a robotic arm on the ISS, a NASA rocket, a nuclear warhead, an MRI machine, or perhaps your next Facebook game. You wouldn't want all your cows in Farmville to randomly disappear now, would you?
