RESEARCH REVEALS THE WEIRDEST LANGUAGES OF THE WORLD
Are you one of the 6,000 people in the world who speak Chalcatongo Mixtec? Congratulations! You speak the world’s weirdest language. That’s what Tyler Schnoebelen and the researchers at Idibon, a natural language processing company, found when they statistically compared 239 languages to see how similar or different they were to one another. Using the World Atlas of Language Structures, Idibon coded the languages for 21 characteristics including, for example, how subjects, objects, and verbs are ordered in a sentence, or how a language makes clear that a sentence is a question.
When Schnoebelen ran the numbers, Chalcatongo Mixtec, spoken in Oaxaca, Mexico, was the least like the majority of the world’s other languages. And it is pretty unusual: Schnoebelen describes it as a “verb-initial tonal language” that has no mechanism for demonstrating questions (so “You are alright.” and “Are you alright?” sound the exact same). It’s probably not surprising that some of the strangest languages are some of the most obscure. The second weirdest is Nenets, spoken in Siberia, followed by Choctaw, a Native American language from the central plains.
But some of the weirdest languages are widely spoken. The eighth-strangest language, Kongo, is spoken by half a million people in Central Africa. After that comes Armenian, then German. English ranks fairly high as well, coming in 33rd. There’s also no particular region of strange languages: the top 25 weirdest (pictured with red dots in the map below) are scattered across every continent. Mandarin is one of the strangest languages, while Cantonese is one of the most “normal.” And linguistic families are also no guarantee of similarity. Schnoebelen notes that while Germanic languages are all pretty weird, Romance languages run the full breadth of the strangeness spectrum, from Spanish, which falls in the Weirdness Index’s top 25, down to Portuguese, which ranked as one of the most mundane languages. Original text: Foreign Policy.
This fascinating article was published on Idibon.com last month, where you can read it in full.
METHOD AND VALUES
According to the Idibon site, the World Atlas of Language Structures (WALS) evaluates 2,676 different languages in terms of a range of language features. These features include word order, types of sounds, ways of doing negation, and a lot of other things: 192 different language features in total. So rather than take an English-centric view of the world, WALS allows us to take a worldwide view. That is, the researchers evaluate each language in terms of how unusual it is for each feature. For example, English word order is subject-verb-object (SVO). There are 1,377 languages coded for word order in WALS, and 35.5% of them have SVO word order. Meanwhile, only 8.7% of languages start with a verb (like Welsh, Hawaiian and Majang), so cross-linguistically, starting with a verb is unusual. For what it’s worth, 41.0% of the world’s languages are actually SOV order.
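The per-feature percentages above are simply the share of coded languages that have each value. Here is a minimal sketch of that calculation in Python. The function name `value_shares` and the eight-language sample are assumptions made for illustration; the sample is far too small to reproduce the WALS percentages.

```python
from collections import Counter

# Illustrative sample only: eight languages with their dominant word order.
# WALS codes 1,377 languages for this feature, so these shares will not
# match the published percentages.
SAMPLE_WORD_ORDER = {
    "English": "SVO",
    "Spanish": "SVO",
    "Mandarin": "SVO",
    "Japanese": "SOV",
    "Turkish": "SOV",
    "Hindi": "SOV",
    "Welsh": "VSO",
    "Hawaiian": "VSO",
}


def value_shares(codings):
    """Return the fraction of coded languages that have each feature value."""
    counts = Counter(codings.values())
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


shares = value_shares(SAMPLE_WORD_ORDER)
# List the rarest value first, since rare values are the "unusual" ones.
for value, share in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{value}: {share:.1%}")
```

A value with a low share, like verb-initial order in the real data, is the kind of thing that makes a language look unusual on that feature.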
Because the data in WALS are fairly sparse, the analysis was restricted to the 165 features that have at least 100 languages in them. One problem is that if you just stop there, you have a huge amount of collinearity. Part of this is just the nature of the features listed in WALS: there’s one for overall subject/object/verb order and then separate ones for object/verb and subject/verb.
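To make the two ideas in this paragraph concrete, here is a rough Python sketch. It filters features by coverage and shows why the word-order features overlap: the object/verb and subject/verb orders can be read straight off the overall order. The names `dense_features` and `pairwise_orders` and the sparse feature with its count of 60 are hypothetical; the other two counts are the ones quoted in this article.

```python
MIN_LANGUAGES = 100  # coverage threshold quoted in the text

# Coverage counts: the first two come from the article; the third feature
# and its count are hypothetical, added only to show a feature being dropped.
coverage = {
    "word_order": 1377,
    "polar_questions": 954,
    "hypothetical_sparse_feature": 60,
}


def dense_features(coverage, min_languages=MIN_LANGUAGES):
    """Keep only features coded for at least `min_languages` languages."""
    return [name for name, n in coverage.items() if n >= min_languages]


def pairwise_orders(order):
    """Derive object/verb and subject/verb order from an overall S/O/V order."""
    obj_verb = "OV" if order.index("O") < order.index("V") else "VO"
    subj_verb = "SV" if order.index("S") < order.index("V") else "VS"
    return obj_verb, subj_verb


print(dense_features(coverage))
for order in ("SVO", "SOV", "VSO"):
    print(order, pairwise_orders(order))
```

Because the two pairwise features are fully determined by the overall order, counting all three would reward or penalize a language three times for one fact about it.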
The language that is most different from the majority of all other languages in the world is a verb-initial tonal language spoken by 6,000 people in Oaxaca, Mexico, known as Chalcatongo Mixtec (aka San Miguel el Grande Mixtec). Number two is spoken in Siberia by 22,000 people: Nenets (that’s where we get the word parka from). Number three is Choctaw, spoken by about 10,000 people, mostly in Oklahoma. But here’s the rub: some of the weirdest languages in the world are ones you’ve heard of, such as German, Dutch, Norwegian, Czech, Spanish, and Mandarin. And actually English is #33 in the Language Weirdness Index.
The 25 weirdest languages of the world. In North America: Chalcatongo Mixtec, Choctaw, Mesa Grande Diegueño, Kutenai, and Zoque; in South America: Paumarí and Trumai; in Australia/Oceania: Pitjantjatjara and Lavukaleve; in Africa: Harar Oromo, Iraqw, Kongo, Mumuye, Ju|’hoan, and Khoekhoe; in Asia: Nenets, Eastern Armenian, Abkhaz, Ladakhi, and Mandarin; and in Europe: German, Dutch, Norwegian, Czech, and Spanish.
This is odd. Is this odd? One of the features that distinguishes languages is how they ask yes/no questions. The vast majority of languages have a special question particle that they tack on somewhere (like the ka at the end of a Japanese question). Of 954 languages coded for this in WALS, 584 of them have question particles. The word order switching that we do in English only happens in 1.4% of the languages. That’s 13 languages total and most of them come from Europe: German, Czech, Dutch, Swedish, Norwegian, Frisian, English, Danish, and Spanish.
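The figures in this paragraph can be checked with a few lines of Python. The variable names are just labels chosen for this sketch; the three counts are the ones quoted above.

```python
coded = 954          # languages coded for yes/no questions in WALS
with_particle = 584  # languages that use a question particle
with_inversion = 13  # languages that switch word order, as English does

particle_share = with_particle / coded
inversion_share = with_inversion / coded

print(f"question particle: {particle_share:.1%}")   # 61.2%
print(f"word-order switch: {inversion_share:.1%}")  # 1.4%
```

So question particles account for roughly three in five of the coded languages, while English-style inversion is rare enough to round to 1.4%.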
Do you think that Lithuanian, Indonesian, Turkish, Basque, and Cantonese are weird languages? No, they aren’t. They are really low on the Weirdness Index. They don’t seem typical to linguists and language learners, but for these 21 features they stick with the crowd. Notice that we get isolates (like Basque) distributed throughout levels of Weirdness. Basque is “typical” but Kutenai, another isolate, is one of the weirdest of all languages. Even more surprising is that Mandarin Chinese is in the top 25 weirdest and Cantonese is in the bottom 10. This has to do with the fact that they have different sounds: Mandarin, unlike Cantonese, has uvular continuants and has some limits on “velar nasals”.
At the very bottom of the Weirdness Index there are two languages you’ve heard of and three you may not have. Hungarian, normally renowned as a linguistic oddball, comes out as totally typical on these dimensions. Chamorro (a language of Guam spoken by 95,000 people), Ainu (nearly extinct, with just a handful of speakers left in Japan), and Purépecha (55,000 speakers, mostly in Mexico) are all very normal. But the most typical, least deviant language of them all, with a Weirdness Index of only 0.087, is Hindi, which has only a single weird feature. Part of this is to say that some of the languages you take for granted as being normal (like English, Spanish, or German) consistently do things differently than most of the other languages in the world.
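The article does not spell out how the Weirdness Index is calculated, so the Python sketch below is only a toy scoring scheme, not Idibon’s formula. It assumes a language’s score is the average, over its coded features, of one minus the share of languages that have the same value. The function names are hypothetical, the shares are the ones quoted in this article, and the results will not match the published index values.

```python
def rarity(share):
    """Rarity of a feature value: 1 minus the share of languages that have it."""
    return 1.0 - share


def toy_weirdness(shares):
    """Average rarity over a language's coded features (assumed scoring rule)."""
    if not shares:
        raise ValueError("language has no coded features")
    return sum(rarity(s) for s in shares) / len(shares)


# Shares quoted in the article.
SVO = 0.355
SOV = 0.410
QUESTION_INVERSION = 13 / 954
QUESTION_PARTICLE = 584 / 954

# A language with English-like choices versus one that follows the majority.
english_like = toy_weirdness([SVO, QUESTION_INVERSION])
sov_particle = toy_weirdness([SOV, QUESTION_PARTICLE])

print(f"SVO + inversion: {english_like:.3f}")
print(f"SOV + particle:  {sov_particle:.3f}")
```

Even with only two features, the language that asks questions by switching word order scores as noticeably more unusual than one that uses SOV order and a question particle.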
THE 10 WEIRDEST LANGUAGES:
For those who are curious, here are Idibon’s 10 weirdest languages. Here is the full list, with the 21 weirdness features and all of the languages that had values for at least one of them (don’t trust those values, of course): http://idibon.com/wp-content/uploads/2013/06/Weirdness_index_values_full_list.xlsx.
1. Mixtec (Chalcatongo)
2. Nenets
3. Choctaw
4. Diegueño (Mesa Grande)
5. Oromo (Harar)
6. Kutenai
7. Iraqw
8. Kongo
9. Armenian (Eastern)
10. German
CONCLUSION: YOU’RE WEIRD!
Despite all of this, English still ranks as highly unusual (it comes in as #33 with an index value of 0.756). So, my friend, if you can read this, it means that you are weird too. But if you speak one of the 10 languages below, it seems that you’re speaking a perfectly normal language:
230 Basque
231 Bororo
232 Quechua (Imbabura)
233 Usan
234 Cantonese
235 Hungarian
236 Chamorro
237 Ainu
238 Purépecha
239 Hindi
Text and Map from: Idibon.com