
> "English spelling has a reputation. And it’s not a good one." - never have I ever agreed with anything more

Quick reminder that writing != language. Even the highest fidelity writing systems are lossy encoding systems. In fact, the more phonologically accurate a writing system is to its language, the more it obscures the history of its words, especially words borrowed from other languages.

So from the perspective of someone interested in etymology, English writing's tendency to preserve old and foreign spellings is a good thing.



Plus, a more phonetic writing system is also problematic for dialectal variation. I pronounce marry/Mary/merry identically, as well as bag/beg, but other dialects distinguish them. I don't think the written standard would benefit from spelling them identically. That's relevant for everyday use, not just upsetting etymology enthusiasts.

Of course it also depends on how conservative the language is. Finnish orthography is practically IPA, and yet Finnish is a freaking time capsule for borrowed words like Proto-Germanic *kuningaz and *wīsaz, which became king and wise in English, but kuningas and viisas in modern Finnish. So you can have both phonemic writing and etymological transparency if your phonology doesn't change much.


That is indeed a problem with English, but even then it is possible to come up with a morpho-somewhat-phonemic spelling that would be far more consistent than modern English - because the bar set by the standard orthography is really that low.

And OTOH even modern English spelling often doesn't distinguish differences that are there in most dialects (e.g. "bear" vs "near"), so this isn't even a new problem. Realistically I suspect there's some "minimal reasonable set" of phonemes that need to be distinguished to reflect the most prominently distinct pairs in all major dialects, even if some subtle dialectal distinctions might not be reflected in spelling.


Many Indian languages are written in scripts that mirror what is spoken. Silent letters don't exist and pronunciations that don't match the spelling are very rare. This does not preclude the existence of rich dialects and accents.

This increases the complexity of learning to write the language -- 56 letters in the alphabet, and each combination of consonant+vowel and consonant+consonant takes on a different letter form instead of just being a string of independent letters like in English.

But reading / pronunciation is straightforward. (No we don't have spelling bees :) )


Indian languages, yes, but the story is more complicated with languages that use Indic scripts.

Tibetan, Mon-Burmese and Thai scripts, as an example, all derive from the Brahmi script (through a long and sometimes winding ancestry), but none of them reflects the modern pronunciation, hence the mind-numbing transcription systems.

The Tibetan and Burmese scripts are particularly notorious for codifying an archaic pronunciation of their respective languages that has been frozen in time for centuries. It is a treasure trove for linguists, who have got a time machine for free, but I don't think the same can be said for modern speakers of both languages.


> Many Indian languages are written in scripts that mirror what is spoken. Silent letters don't exist and pronunciations that don't match the spelling are very rare.

I don't think that's true. From schwa deletion in the northern Indian languages (https://en.wikipedia.org/wiki/Schwa_deletion_in_Indo-Aryan_l...) to the extreme divergence between the standard formal and spoken forms of the southern languages (https://en.wikipedia.org/wiki/Diglossia), it's a stretch to say the scripts mirror what is spoken.

It's just that if you are a native speaker/reader, you are so fluent that you unconsciously auto-correct those inconsistencies - just like in English.

Even in the formal registers of each spoken Indian language, which should in theory be more systematically consistent with their scripts, there are inconsistencies in the spelling/pronunciation of loan-words from both foreign and other Indian languages (e.g. aspirates in South India and retroflex approximants in northern India, and any number of inconsistent renderings of English words in Indic script).


Very interesting and informative - thanks for sharing.


Phonemic spelling does not require a syllabary, though. Several European languages are also written "as spoken" using the Latin alphabet, usually with a few extra digraphs or letter variants. Or you can make the syllabary itself compose regularly, like in Hangul.

Indian languages are generally rich in phonemes though. My mind boggles at the notion of [n] [ɳ] [ɲ] [ŋ] all being distinct. I mean, I can reproduce each one of them on its own, but doing that in rapid speech, and worse yet, recognizing the same in others' speech...


> My mind boggles at the notion of [n] [ɳ] [ɲ] [ŋ] all being distinct.

They are phonetically distinct, but not phonemically distinct, which is to say that in most places they occur, they aren't used to distinguish words or meanings.

In particular, the velar nasal "ङ" or "ng" always appears adjacent to velar sounds (k/kh/g/gh), and similarly the palatal nasal "ञ" always appears adjacent to palatal sounds (c/ch/j/jh), both as allophones of the nasal phonemes "m" (bilabial) and "n" (alveo-dental), basically just like we speak in English under the exact same conditions (like the nasal in the word "English"!).
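
As a rough illustration of the allophony described above, here is a toy Python sketch. The consonant table, the function name `surface_nasal`, and the choice to leave the nasal unchanged before any other sound are all assumptions made for illustration, not a description of any particular language's phonology.

```python
# Toy model of nasal place assimilation (hypothetical and heavily simplified).
# Consonants use the romanization from the paragraph above.
PLACE_OF = {
    "k": "velar", "kh": "velar", "g": "velar", "gh": "velar",
    "c": "palatal", "ch": "palatal", "j": "palatal", "jh": "palatal",
}

# Surface nasal (IPA) chosen for each assimilating place of articulation.
NASAL_FOR_PLACE = {"velar": "ŋ", "palatal": "ɲ"}


def surface_nasal(underlying: str, next_sound: str) -> str:
    """Return the nasal pronounced before next_sound in this toy model."""
    place = PLACE_OF.get(next_sound)  # None for vowels and other consonants
    return NASAL_FOR_PLACE.get(place, underlying)
```

In this model the velar and palatal nasals are fully determined by the sound that follows, so writing them with separate letters adds no information a reader could not already predict.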

You perceive a difference between Indic languages and English because the Latin script doesn't distinguish nasals for velar and palatal points of articulation - it only distinguishes bilabial (m) and alveolar (n), whereas Indic scripts do distinguish those, even though the extra letters offer no additional information.

The one nasal sound that is often contrastive in many Indian languages is the retroflex nasal "ण" (ṇ). That's the one that's easy to confuse in speech if you are not a native speaker, so it's the only one you need to pay extra attention to when learning.


I don't actually perceive a difference. For that matter, my native language doesn't have [ŋ] at all (not even before velars), so it's actually tricky for me to distinguish it in English as well.

But, as far as I know, the different nasals are phonemic in some languages of India. Which ones depends on the language, but I do remember seeing at least one in which all four of these were distinct.


> but I do remember seeing at least one in which all four of these were distinct.

None of the major Indian languages I'm familiar with, from either the Indo-Aryan or Dravidian language families, has 4 nasal phonemes.

In the Indo-Aryan languages, the convergence of the various nasals is so complete that they are all often represented with a single "dot" diacritic character when they occur at word junctions.

I'd be open to hearing examples of Indian languages that have 4 nasal phonemes, though.


It was Kannada, a coworker's language. Per Wikipedia it has five nasals, each with its own glyph:

m (ಮ) n (ನ) ɳ (ಣ) ɲ (ಞ) ŋ (ಙ)


> Per Wikipedia it has five nasals, each with its own glyph: m (ಮ) n (ನ) ɳ (ಣ) ɲ (ಞ) ŋ (ಙ)

There are 5 nasal glyphs, but like in other Indian languages, 2 (velar and palatal) are allophones of the others, leaving only 3 actual phonemes. Indian scripts are often overspecified, and not every glyph represents a phoneme.


How does that play out for languages that use characters that are pictorial?

E.g. Egyptian hieroglyphs, or Asian characters (esp. Korean, which has a relatively young alphabet that IIRC is phoneme based, or Chinese, which has a very old set that is used across multiple languages, e.g. Mandarin/Cantonese/etc.)


It plays out perfectly. E.g. Chinese is one of the least phonological scripts around, and this is precisely why old texts in it are more interpretable.

Korean Hangul is not ideographic (which I think is what you meant by pictorial?). It's a morphophonemic alphabet that just happens to organize the basic phonemic units into larger graphemes representing whole syllables - but in a completely predictable way. And it is another example of this playing out: the original Hangul was entirely phonemic, but over time pronunciation diverged from spelling, and today it's morphophonemic, and even then not perfectly so. So they preserved the history at the cost of some mismatch between the spelling and the sound.
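
One way to see the "completely predictable" composition is in how Unicode encodes precomposed Hangul syllables: the code point is computed arithmetically from the positions of the initial consonant, the vowel, and the optional final consonant. A minimal Python sketch follows; the function name `compose` and the use of standalone (compatibility) jamo as lookup keys are my own choices for illustration.

```python
# Jamo listed in the order Unicode uses for its syllable arithmetic.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 vowels
# 28 finals: "no final" comes first, then 27 consonants and clusters.
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

SYLLABLE_BASE = 0xAC00  # code point of the first syllable block, 가


def compose(lead: str, vowel: str, tail: str = "") -> str:
    """Build one precomposed syllable block from its component jamo."""
    l = LEADS.index(lead)
    v = VOWELS.index(vowel)
    t = TAILS.index(tail)
    return chr(SYLLABLE_BASE + (l * len(VOWELS) + v) * len(TAILS) + t)


# Spell out the word "Hangul" itself, one syllable block at a time.
word = compose("ㅎ", "ㅏ", "ㄴ") + compose("ㄱ", "ㅡ", "ㄹ")
```

Note that this only covers how the letters stack into blocks; it says nothing about the later divergence between spelling and pronunciation.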


> How does that play out for languages that use characters that are pictorial?

Chinese pictorial writing completely obfuscates the historical state of the spoken language, to the extent that in order to reconstruct older phases of spoken Chinese, scholars have had to inspect old Chinese loan-words in foreign languages that do preserve the old phonetic structure.

An example of this is the discovery that Chinese tones developed from earlier final consonants, which were lost in Mandarin, but are preserved in Cantonese, Japanese and Korean. E.g. compare Mandarin "guó" with the early borrowing into Japanese, "koku", both meaning "country".


That is very interesting, and along the lines of what I was wondering. Thanks



