Country of the Blind: 2016

The Alphabet That Will Save a People From Disappearing

Interesting article at the Atlantic on the creation of a new alphabet for the Fulani language.

It is marred by a slight instance of cultural chauvinism. I snorted my milk at the following sentence: "But unlike Arabic, whose short vowels are written as diacritical marks above and below letters, the script assigned its five vowels proper letters." Arabic vowels may be different from Latin vowels; that hardly makes them "improper."

I also learned a useful marketing idea for those creating a new alphabet: Choose a catchy word with no repeating letters; make those letters the first of your alphabet; then take that same word to be the name of your alphabet (just as "alphabet"="Alpha"+"Beta", the first two letters of the Greek alphabet). Such is the case with "Adlam", the new alphabet described here.

How Half Of America Lost Its F**king Mind

The best analysis I've seen of the 2016 election, from David Wong.

Starting up a New Language: Some Case Studies

Photo by Ryan Buterbaugh

How best to tackle a new language depends on many factors:

1. How difficult do you expect it to be?

2. Is a new script involved?

3. What's your immediate goal? Your ultimate goal?

4. What tools are available?

5. How much time do you have available?

Herewith some examples of how I chose to attack some new languages.

Swahili. (First of all, my answers to the questions above: 1: easy. 2: no. 3: just want to make incremental progress. 4: fair number of courses and books available. 5: almost none.)

The approach: I use the Language/30 Swahili course and an old edition of Teach Yourself Swahili, which gives a bare-bones presentation of grammar and vocabulary. I use Audacity to clip audio sentences from the Language/30 course and put the sentences with text into Anki flashcards. I create additional cards with words and sentences from the Teach Yourself book. The Teach Yourself cards have no audio, unless and until I find a good source for ad-hoc Swahili audio.

But Swahili pronunciation is pretty easy. The Language/30 samples should provide ample pronunciation practice, as well as providing a storehouse of useful sentences "in the bank." So far, at the rate of one new card per day (one new word or sentence every two or three days) it takes less than a minute a day, plus occasional prep time to create new cards.

Thai. (1: moderate to difficult. 2: yes. 3: just want to make incremental progress. 4: fair number of courses and books available. 5: very little.)

The approach. First I designed some mnemonics to help me memorize the Thai alphabet. I spent about a minute a day for some months just practicing the letters. I made Anki cards from the first few reading lessons of the Thai Pimsleur course, mostly just nonsense syllables, but it helped me get a foothold on reading. I continued with the Teach Yourself Thai book (with accompanying CDs--some form of audio companion is a huge advantage). Sample sentences from the book go onto flashcards, while I download audio for individual words from Google Translate. (Note: I don't do this for Swahili because Google Translate's Swahili voice is a weird robot which I have no desire to emulate.) I also picked up Stu Jay Raj's unusual book Cracking Thai Fundamentals and started reading through it and making flashcards as necessary (it has less memory-intensive content). And by dint of some effort, I devised a streamlined formulation of the rules for determining tones from the Thai script.

Generally speaking, for each word, phrase, or sentence I have as many as four cards:

(1 "English to Thai") Input: English. Output: Thai audio.
(2 "Reading") Input: Thai script. Output: English and incidentally Thai audio.
(3 "Listening") Input: Thai audio. Output: English and incidentally Thai script.
(4 "Spelling") Input: English and Thai audio. Output: Thai script.

In line with a principle that good flash cards should expect a single response, the term "incidentally" here indicates that the card presents the information as part of the response but I need not reproduce it as a "correct" response.

Only a few cards are of type 4. I am ramping up gradually to writing Thai. I use conditional compilation for Anki to generate these cards only when I specify.

On the other hand, I like the daily exposure to reading Thai.

I currently spend about four minutes a day working through my Thai Anki deck, adding new cards from the Teach Yourself book, Cracking Thai Fundamentals, or occasionally the Pimsleur reading lessons as necessary.
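The four card types listed above can be sketched as plain data. This is only an illustration, not how Anki stores notes or templates: the `Note` class, the `make_spelling` flag, and the audio file name are all hypothetical. Only the graded answer is returned for each card; the "incidental" fields are left out.

```python
from dataclasses import dataclass

@dataclass
class Note:
    english: str
    thai_script: str
    thai_audio: str              # hypothetical audio file name
    make_spelling: bool = False  # opt-in flag for the rarer "Spelling" card

def cards_for(note):
    """Return (card type, prompt fields, graded answer) for one note."""
    cards = [
        ("English to Thai", [note.english], note.thai_audio),
        ("Reading", [note.thai_script], note.english),
        ("Listening", [note.thai_audio], note.english),
    ]
    # The "Spelling" card is generated only when explicitly requested.
    if note.make_spelling:
        cards.append(
            ("Spelling", [note.english, note.thai_audio], note.thai_script)
        )
    return cards

note = Note("hello", "สวัสดี", "hello.mp3")
for card_type, prompt, answer in cards_for(note):
    print(card_type, prompt, "->", answer)
```

Each card expects exactly one response, in line with the single-response principle described above.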

Starting up Khmer and Burmese was very similar, the main difference being a relative paucity of available resources.

Sumerian. (1: moderate. 2: yes. 3: finish Hayes book [see below]. 4: chiefly books. 5: very little [see a trend here?])

Sumerian is a dead language, which entails both advantages and disadvantages.

Advantages:

(1) You need not be concerned about acquiring a perfect accent. For most dead languages, scholars are happy to inform you about the pronunciation with considerable detail and subtlety, but I'm always a little skeptical that they know quite as much as they think they do. Whether or not they really know what they're talking about, there's no harm in making things easy on yourself. This does not mean pronunciation can be ignored. I have yet to come across a writing system in which sound does not play some partial role at least. In the case of Sumerian, the writing is a combination of ideographic and phonetic symbols. Many words and phrases have alternate spellings. Having a crude idea of the pronunciation makes it much easier to recognize these.

(2) Likewise, you may choose to (or be forced to) downplay skills of writing and listening in favor of reading. Focusing on just one of the four standard language skills certainly simplifies matters.

For Sumerian, I depended on a single resource: John Hayes' book A Manual of Sumerian Grammar and Texts. Incidentally, I see the hardcover edition of the book is now going for $4000, which makes me feel strangely wealthy. The book provides many samples of Sumerian texts, with vocabulary lists and extensive discussions of each.

Sumerian is written in a cuneiform script, which my computer does not support. Certainly the computer would not reproduce the variety of sizes and shapes of the historical texts. Here Anki's ability to handle images helps greatly. By scanning a page and clipping out samples of text, I can reproduce any word or phrase from the book.

Generally I use two Anki cards for each item of Sumerian text:

(1 "Reading") Input: Sumerian text. Output: the transliteration (phonetic reading) of the text.
(2 "English") Input: Sumerian text and transliteration. Output: the English translation.

I also use Anki Cloze cards for grammar rules, historical facts, etc.

This approach lets me work through at a steady (if slow) pace, while committing to heart the memorization-related content and reading some phrases of text (while reviewing Anki cards) every day.

"Nude" and "Flesh" Are Not Colors

Today is paint day. 🎨 #keepingbusyformysanity

A photo posted by Chyrstyn Mariah Fentroy (@chyrstynmariah) on Sep 1, 2016 at 8:35am PDT

In the "institutional/pervasive racism" category for today, note the photo above, courtesy of Chyrstyn Mariah Fentroy, who, based on the evidence of the photo above: (1) is a ballet dancer, and (2) has a richly-hued complexion. The story told by the photo is that apparently it is impossible to buy ballet slippers in any color other than off-white. If your natural skin shade is something other than off-white, then you are doomed to hand-paint every new pair of ballet slippers that you buy. (Apparently ballet slippers that contrast with your skin are taboo.)

I sympathize with all ballet dancers who have to deal with this. It is something of a burden to have to do this all the time. More important, the situation sends a message that you're the wrong color to be doing this. Sure, you can shrug it off, but at some point everyone experiences shrug fatigue.

I learned of this particular issue thanks to a Huffington Post article by Katherine Brooks. Ms. Brooks is a serious contender for the Lack of Self-Awareness Awards. Note how the article becomes incomprehensible halfway through thanks to the profligate use of the term "flesh tone." What does this mean? Is it the pale peach one finds in a box of crayons? I literally could not follow the sense of the article once the term "flesh tone" is bandied about with reference to a diverse cast of characters.

Why are people still talking about "flesh tone"? I learned back in 1975 that this was uncool:

Marilyn Goes to Poland!

Marek P. has translated the Marilyn Method for memorizing the pronunciation of Chinese characters into Polish for his site Zyskiwanie Przewagi.

More Arabic Mnemonics

This is a companion to my earlier post on mnemonics for Arabic roots. An interesting feature of Arabic is that most words can be decomposed into a root and a pattern. The root consists strictly of consonants (usually three), while the pattern consists of vowels plus possible additional consonants.

In English we can see something of how this works by considering the words sing, sang, sung, song, singer. We could think of these as composed of an underlying root s-ng overlaid on various patterns 1i2, 1a2, 1u2, 1o2, 1i2er.

We could combine some of these patterns with a different root dr-nk to yield new words: drink, drank, drunk, drinker (but no dronk, unfortunately).

To repeat my earlier example, in Arabic, there is a root ك ت ب (k-t-b) which has to do generally with writing. Some of the words formed from this root by fitting it into different patterns would be:

كَتَبَ (kataba = k-t-b × 1a2a3a) "he wrote"
كِتَاب (kitaab = k-t-b × 1i2aa3) "book"
مَكْتَبَة (maktaba = k-t-b × ma12a3a) "library"
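The root × pattern notation used above is mechanical enough to sketch in a few lines. This is a toy illustration on the transliterations only; the function name `apply_pattern` is made up for the example, and it assumes the root is written with hyphens between its consonants.

```python
def apply_pattern(root, pattern):
    """Replace the digits 1, 2, 3 in `pattern` with the root's consonants."""
    consonants = root.split("-")
    return "".join(
        consonants[int(ch) - 1] if ch.isdigit() else ch
        for ch in pattern
    )

print(apply_pattern("k-t-b", "1a2a3a"))   # kataba
print(apply_pattern("k-t-b", "1i2aa3"))   # kitaab
print(apply_pattern("k-t-b", "ma12a3a"))  # maktaba
print(apply_pattern("s-ng", "1a2"))       # sang
```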

Most (not all) Arabic roots consist of exactly three consonants. The earlier post described a fairly complicated method for encoding these three consonants in an English word or phrase. This is not so simple because Arabic distinguishes many consonant sounds which English doesn't.

To be honest, the previous method is a sledgehammer, really important only for the most stubborn cases. Many roots (like the aforementioned k-t-b) can stick in the mind with modest effort, especially if you remember just one of the derived words (such as "book").

In putting this method into practice, I realized that the puzzle needed another piece, an idea I borrowed from Heisig's method for Chinese characters. This extra piece is an English name for the root, analogous to Heisig's keywords. In the case of Arabic, I don't call this name a "keyword" but rather a "syndrome." For example, the syndrome for the aforementioned k-t-b root is "write."

One-by-one I peruse my Arabic dictionary (which groups together words sharing the same root), and do my best to choose an English syndrome which comes closest to capturing the spectrum of meanings associated with the root. I don't do this for every Arabic word I encounter, but only for those that I find particularly hard to remember.

For example, one pair of words that gave me trouble was:

طابِق (Taabiq), plural طَوابِق (Tawaabiq): a floor or story of a building, versus:

طَبَق (Tabaq), plural أطباق ('aTbaaq): a dish or course of a meal.

Both share the root طبق (T-b-q), but I could see scant connection in the meanings. But perusing the various words associated with this root, it seems most are related to the idea of one thing on top of another. The floors of a building are arranged this way, as is the lid for a dish. The idea of a "dish" of a meal seems to be derived from the latter. I therefore chose the word "superpose" as the syndrome for the T-b-q root.

Here's a list of some of the syndromes I have selected thus far, along with a sample word for each.

dissociate	هجر (*h-j-r*)	هاجَرَ haajara (migrated)
disjoin	فصل (*f-S-l*)	فَصْل faSl (to dismiss, fire)
highborn	شرف (*sh-r-f*)	يُشْرِف yushrif (supervises)
hone	حدد (*H-d-d*)	حَدِيد Hadiid (iron)
fan out	نشر (*n-sh-r*)	نَشَرَ nashara (published)
confer	خول (*kh-w-l*)	أَخْوال 'akhwaal (maternal uncles)

This leaves the issue of the pattern. How do I remember that طابِق (Taabiq) is a building story and طَبَق (Tabaq) is a dish rather than vice-versa? (This particular pair was a real issue for me.) These words have the same root but different patterns. The Arabic grammarians had the clever idea of using the root فعل (f-`-l) "do" as a "neutral" root. They then described a given pattern by applying it to this "neutral" root. For طابِق (Taabiq), the pattern of vowels is (_aa_i_). For طَبَق (Tabaq), the pattern of vowels is (_a_a_). The Arabic terms for these patterns would be فَاعِل (faa`il) and فَعَل (fa`al), respectively.

For the most part I handle this with a very simple device (also highly adaptable to many other situations). For a given pattern, I choose just one example of the pattern to use as a mnemonic hook for the pattern itself. For example consider the two patterns just mentioned: _aa_i_ and _a_a_. The former also appears in the word هاتِف haatif "telephone" and the latter in the word بَصَل baSal "onions". I use these words to represent their associated patterns. Thus the word طابِق (Taabiq) is broken down into superpose × telephone (root × pattern) while طَبَق (Tabaq) is broken down into superpose × onions. And my trouble in keeping this straight is resolved by creating whatever mental images serve to associate telephone with story of a building and onions with dish.

In choosing a keyword to represent a given pattern, I try to choose something concrete, visualizable, and as distinct as possible from other keywords.

Here's a list of patterns and their associated keywords. Both this and the previous list are works in progress. I add words as I find I need them.

عَباءة `abaa'ah wool cloak	فَعَالَة *_a_aa_ah*
مُدُن mudun towns	فُعُل *_u_u_*
صَحْن SaHn plate	فَعْل *_a__*
فُلُوس fuluus money	فُعُول *_u_uu_*
بِدَل bidal suits	فِعَل *_i_a_*
أَرْجُل 'arjul legs	أَفْعُل *’a__u_*
مَسْرَح masraH theater	مَفْعَل *ma__a_*
بَصَل baSal onion	فَعَل *_a_a_*
قُفَّاز quffaaz gloves	فُعَّال *_u_2aa_* (2 indicates a doubled consonant)
رِسالة risaalah letter	فِعَالَة *_i_aa_ah*
سِجْن sijn jail	فِعْل *_i__*
خَلِيج khaliij gulf	فَعِيل *_a_ii_*
أَنْهَار 'anhaar rivers	أَفْعَال *’a__aa_*
سَتائِر sataa'ir curtains	فَعَائِل *_a_aa’i_*
رِياح riyaaH winds	فِعَال *_i_aa_*
قُفْل qufl lock	فُعْل *_u__*
مَنْزِل manzil house	مَفْعِل *ma__i_*
فَواكِه fawaakih fruits	فَواعِل *_awaa_i_*
عَرُوس `aruus bride	فَعُول *_a_uu_*
هاتِف haatif telephone	فَاعِل *_aa_i_*

A couple more examples: طَوابِق (Tawaabiq), plural of طابِق (Taabiq), breaks down into superpose × fruits (T-b-q × _awaa_i_), whereas أطباق ('aTbaaq), plural of طَبَق (Tabaq), breaks down into superpose × rivers (T-b-q × 'a__aa_).
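As a rough sketch, the breakdown of the four T-b-q words can be reproduced on the transliterations. The syndrome and keywords come from the lists above; the function names `fill` and `mnemonic` are invented for this example, the hamza is written as a plain apostrophe, and patterns with a doubled consonant (the "2" notation) are not handled.

```python
# Syndrome and pattern keywords taken from the lists in this post.
SYNDROMES = {"T-b-q": "superpose"}
PATTERN_KEYWORDS = {
    "_aa_i_": "telephone",
    "_a_a_": "onions",
    "_awaa_i_": "fruits",
    "'a__aa_": "rivers",
}

def fill(root, pattern):
    """Drop the root's consonants, in order, into the pattern's "_" slots."""
    consonants = root.split("-")
    if pattern.count("_") != len(consonants):
        raise ValueError("pattern slots and root consonants do not match")
    out = []
    next_consonant = 0
    for ch in pattern:
        if ch == "_":
            out.append(consonants[next_consonant])
            next_consonant += 1
        else:
            out.append(ch)
    return "".join(out)

def mnemonic(root, pattern):
    """Return the 'syndrome × keyword' pair used as the memory hook."""
    return f"{SYNDROMES[root]} × {PATTERN_KEYWORDS[pattern]}"

for pattern in PATTERN_KEYWORDS:
    print(fill("T-b-q", pattern), "=", mnemonic("T-b-q", pattern))
```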

The Arabic verb forms get special treatment. Each verb form is a coordinated set of patterns covering the various conjugations and derived words of a verb. The forms are numbered from I through X. Thus for a Form-IV verb, you know not only that the past-tense pattern is 'a__a_a, but also that the present-tense pattern is yu__i_. More information is available in an Arabic grammar book or various places on the Internet.

For the verb forms, rather than the example method I use for other patterns, I chose a code word for each form from II through X:

Form II	ninja
Form III	maiko, or mikado
Form IV	ramen
Form V	anime
Form VI	shinkansen
Form VII	Godzilla
Form VIII	futon
Form IX	pagoda
Form X	sushi

I was guided by the Major system in choosing the code words, but that is not really important. Almost any set of visualizable code words works just as well once committed to memory. I also chose words with Japanese connotations. You might wonder why. Why not choose words with Arabic connotations—sultan rather than sushi, for example? The reason is that sultan may well turn out to be a specific vocabulary item to be memorized. Using the same word as a code word creates a minor possibility for confusion.

You will notice also that I have two different code words for Form III. This is to distinguish the two versions of Form III that have different patterns for the "verbal noun". Maiko is used for those verbs with a verbal noun of pattern مُفَاعَلة (mu_aa_a_ah). Mikado is used for those verbs with a verbal noun of pattern فِعَال (_i_aa_).

Form I presents a more complicated situation. Any of the three Arabic vowels a, i, u may appear in the past-tense form, and any of the three may also appear in the present-tense form (although not all combinations appear). The Form I code words are chosen to describe these vowels as well:

tattoo: past a, non-past u
taxi: past a, non-past i
tatami: past a, non-past a
titan: past i, non-past a
tutu: past u, non-past u

(BTW in researching this post I learned that there are actually fifteen verb forms, but forms XI to XV are extremely rare. The system is easy to extend in any case.)

The Irony/Sincerity Gap Strikes Again

As previously noted here and here. Compare the American and Japanese trailers for Disney's upcoming Moana:

Note how the American trailer emphasizes humor while the Japanese trailer plays up sentiment.

Spelling Thai Tones, Simplified

Image by DALL-E

(Revised for additional clarity 23 December 2022. This includes changes to the names of the tone markers.)

In my extremely leisurely study of Thai, I have reached the point of wanting to learn the rules for expressing the tones in writing. Thai being a tonal language, each Thai syllable takes one of five possible tones. The written language does describe the tones unambiguously, according to arcane and seemingly sadistic rules.

(I indulge my curmudgeonly side here. I well appreciate that an English word like “fraught”—questions like what it means, why it is spelled that way, and what the equivalent present-tense verb is, for example—must be just as frustrating to the foreign-language student.)

In Thai, a given syllable’s tone is affected by several factors (all to be explained later on):

1. The "consonant class": there are three of these.

2. Any tone marker found on the syllable: there are four of these ่, ้, ๊, ๋. Or five if you want to consider "no marker" as an additional marker.

3. The type of syllable: there are three of these.

And the output is a tone: there are five of these.

So this is a process with 3×3×5 = 45 possible input combinations and five possible outputs. In the worst of all possible universes, we would have to memorize what the tone is for each of the 45 possible input combinations. The Wikipedia article on Thai script (which I used as my reference) summarizes things in a table with seven rows and three columns, so there are only 7×3 = 21 individual cases to memorize. Oh great. The Wikipedia article uses a table, a diagram, and a flowchart to explain it, and I still think it’s pretty complicated.

After about a day of staring at the table rightway-round, upside-down, and inside-out, I think I’ve managed to pull out the essence. It comes down to just seven short rules, five rules for unmarked syllables and two for marked syllables.

The traditional terminology (as used by Wikipedia) is confusing. Consider this:

The three consonant classes are "high", "mid", and "low".

The tone markers are "high", "falling", "low" and "rising".

The syllable types are "live", "dead long", and "dead short".

And the syllable tones are "high", "falling", "mid", "low", "rising".

So when you see "high" you don't know if this is a consonant class, a tone marker, or a syllable tone. Recipe for confusion. My first step was to replace this terminology with something both more vivid and non-redundant. In our new system, the terminology runs as follows:

1. Colors represent consonant tone classes. "Red" replaces "high", "black" replaces "mid", and "blue" replaces "low". This is in line with my system for memorizing the Thai consonants.

2. Verbs of motion represent syllable tones. "Flying" replaces "high", "falling" replaces "falling". (Hey look, it's the same! Also likewise for "rising".) "Walking" replaces "mid" and "crawling" replaces "low".

3. Animals represent syllable types. "Live" syllables become "lions". Dead syllables become "dogs". We have "long dogs" and "short dogs". Fuller explanation comes below.

4. And conveyances (so to speak) represent the tone markers. For example the "wings" tone marker is associated with the "flying" syllable tone.

Tones for unmarked syllables

Here are the rules for unmarked syllables:

Lions walk, but….
Red lions rise.
Dogs crawl, but….
Long blue dogs fall.
And short blue dogs fly.

Perhaps some explanation is in order…

I won’t go into much detail on the syllable tones as such; see this Wikipedia diagram for a graphic representation, or this nice video from Benny Lewis. As names go, these are pretty good, each being a rough description of the corresponding tone contour. Keep in mind, for syllable tones:

mid=walking
low=crawling
high=flying

Now to the “colors.” Thai consonants come in three categories, the main function of which appears to be giving clues as to tones. For example, we could think of ข and ค as two different versions of “K”, which impart different tones (not always the same) to the syllables they head. For example, ขา is pronounced something like “kah” with a rising tone and คา is pronounced exactly the same, except with a mid tone. Again, the “color” of the consonant is merely one of several factors determining the tone of the syllable.

The traditional names for these three categories are “high”, “mid”, and “low.” (Note that in the Wikipedia table, these are the headings of the three columns.)

As mentioned before, these are in fact the worst possible names for the three categories. First of all, as “high”, “mid” and “low” are already used as names for three of the five tones, describing consonants by the same terms is a recipe for confusion. The exception would be if a “high” consonant always gave a syllable the “high” tone, for example, but such is not the case. Check the “high” column of the table again. Note that the “high” tone never occurs with a “high” consonant. Similarly for “low” tones and “low” consonants.

That's why I decided to drop the “high”, “mid”, “low” terminology and use colors instead. My system for learning the Thai alphabet uses vowel sounds to help remember the consonant class. "Red" consonants are given names with vowels E and I, as in "rEd". Blue consonants are given names with vowels O and U, as in "blUe". And black consonants are given names with the vowel A, as in “blAck”. (It just happens that the “red” consonant column of the Wikipedia table is shaded red, and the “blue” column is shaded blue. Huh, fate.) For the record:

The blue consonants are: งณนมญยรลฬวคฅฆชฌฑฒทธพภฟซฮ
The red consonants are: ขฃฉฐถผฝศษส
The black consonants are: กจดฎฏตบปอ
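For anyone who wants to check a letter's color mechanically, the three lists above translate directly into a lookup table. This is just a sketch: the names `COLOR_OF` and `consonant_color` are invented here, and any letter that does not appear in one of the three lists raises an error rather than being guessed at.

```python
# Consonant lists copied from this post; the color names are the post's own.
BLUE = "งณนมญยรลฬวคฅฆชฌฑฒทธพภฟซฮ"
RED = "ขฃฉฐถผฝศษส"
BLACK = "กจดฎฏตบปอ"

# Build a single letter -> color table from the three lists.
COLOR_OF = {}
for color, letters in (("blue", BLUE), ("red", RED), ("black", BLACK)):
    for letter in letters:
        COLOR_OF[letter] = color

def consonant_color(letter):
    """Return "red", "black", or "blue" for a listed Thai consonant."""
    try:
        return COLOR_OF[letter]
    except KeyError:
        raise ValueError(f"{letter!r} is not in any of the three lists") from None

print(consonant_color("ข"))  # red
print(consonant_color("ค"))  # blue
print(consonant_color("ก"))  # black
```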

And finally, the tone is affected by syllable type. “Lions” (live syllables) are distinguished from “dogs” (dead syllables). The former end in a vowel or a “sonorant” (like M, N, etc.). The latter end in a “plosive” (like “K”, “T”, etc.). In short, if you can imagine singing the syllable, stretching it out indefinitely (like “Liiiiioooonnnnn...”) then it’s a lion syllable. If not (like “Dooooog”: once you hit the “g” you are done) then it’s a dog syllable.

We need to distinguish "long" and "short" dogs. This depends on the vowel of the syllable: a long vowel yields a long syllable, and a short vowel yields a short syllable. My system for learning the Thai vowels will show which vowels are long and which are short.

This is all the background needed for our five rules. To summarize:

Colors represent consonant classes.
Verbs (of motion) represent tones.
Animals represent syllable types.

Now let’s revisit our five rules for unmarked syllables.

I. Lions walk, but…. In other words, a live syllable gets a mid tone, with the exception that…

II. Red lions rise. A live syllable with a high-class consonant gets a rising tone.

III. Dogs crawl, but.… A dead syllable gets a low tone, with the exception that…

IV. Long blue dogs fall. A long dead syllable with a low-class consonant gets a falling tone.

V. And short blue dogs fly. A short dead syllable with a low-class consonant gets a high tone.

That’s it. These five rules encompass all the information in the top three rows of the Wikipedia table. It can’t really get better than this, because you need at least one rule for each tone.
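The five rules for unmarked syllables fit in a short function. This is a sketch of the rules exactly as stated above, nothing more: the function name, argument names, and string labels are made up for the example, and it returns the motion-verb name of the tone, with a small table to translate back to the traditional name.

```python
# Motion-verb tone names from this post mapped to the traditional names.
TRADITIONAL = {
    "walking": "mid",
    "crawling": "low",
    "flying": "high",
    "falling": "falling",
    "rising": "rising",
}

def unmarked_tone(color, animal, length=None):
    """Tone of a syllable with no tone marker.

    color:  "red", "black", or "blue" (consonant class)
    animal: "lion" (live syllable) or "dog" (dead syllable)
    length: "long" or "short"; only matters for blue dogs
    """
    if animal == "lion":
        # Rules I and II: lions walk, but red lions rise.
        return "rising" if color == "red" else "walking"
    if animal != "dog":
        raise ValueError("animal must be 'lion' or 'dog'")
    if color == "blue":
        # Rules IV and V: long blue dogs fall, short blue dogs fly.
        if length == "long":
            return "falling"
        if length == "short":
            return "flying"
        raise ValueError("blue dogs need length 'long' or 'short'")
    # Rule III: all other dogs crawl.
    return "crawling"

tone = unmarked_tone("red", "lion")
print(tone, "=", TRADITIONAL[tone])  # rising = rising
```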

Tones for marked syllables.

Who invented the Thai script? It would have been so easy just to let a syllable’s tone be specified by the tone mark (conveyance). And we could dispense with almost half the Thai alphabet. Oh, well….

We give each of the tone marks a name based on the tone it describes in most cases:

    ่   Knees (goes with "crawling")
    ้   Parachute (goes with "falling")
    ๊   Wings (goes with "flying")
    ๋   Rocket (goes with "rising")

We then just need rules to handle the exceptions. Looking at the bottom four rows of the Wikipedia table, we see that syllable type (lion versus dog) is irrelevant. We also see that red and black syllables are always the same, except for the blank areas (which represent situations that never occur, so we need not worry about them). The only exceptions concern blue syllables. We use “bluejay” to represent such a syllable: dead or alive, but starting with a blue consonant.

Just two rules for two exceptions:

VI. Kneeling bluejays fall. (In other words, a syllable with a blue consonant and marked with ่ (Knees) gets a falling tone.)

VII. Parachuting bluejays fly. (In other words, a syllable with a blue consonant and marked with ้ (Parachute) gets a high tone.)
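The marked-syllable rules can be sketched the same way: each marker has a default tone, and the two bluejay rules override it. The names below are invented for the example. Combinations the Wikipedia table leaves blank (Wings or Rocket on a red or blue consonant) simply fall through to the marker's default here, which is an assumption of this sketch, since those cases are said never to occur.

```python
# Thai tone marks (written as Unicode escapes) mapped to this post's nicknames.
MARK_NAMES = {
    "\u0e48": "knees",      # mai ek
    "\u0e49": "parachute",  # mai tho
    "\u0e4a": "wings",      # mai tri
    "\u0e4b": "rocket",     # mai chattawa
}

# The tone each marker describes in most cases.
DEFAULT_TONE = {
    "knees": "crawling",
    "parachute": "falling",
    "wings": "flying",
    "rocket": "rising",
}

# Rules VI and VII: the only exceptions, both for blue consonants.
BLUEJAY_EXCEPTIONS = {
    "knees": "falling",    # kneeling bluejays fall
    "parachute": "flying",  # parachuting bluejays fly
}

def marked_tone(color, marker):
    """Tone of a syllable carrying a tone marker (given by nickname)."""
    if color == "blue" and marker in BLUEJAY_EXCEPTIONS:
        return BLUEJAY_EXCEPTIONS[marker]
    return DEFAULT_TONE[marker]

print(marked_tone("black", "knees"))                # crawling
print(marked_tone("blue", "knees"))                 # falling
print(marked_tone("blue", MARK_NAMES["\u0e49"]))    # flying
```

Syllable type (lion versus dog) does not appear as an argument because, as noted above, it is irrelevant once a marker is present.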

Did I say seven rules?

It turns out there are two loopholes described in the comments following the Wikipedia table. Both concern a consonant changing its color under certain conditions.

In the first case any of the consonants งญนมวยรล, which are normally blue, change to red when they follow an unadorned red or black consonant or a silent letter ห (H). These particular blue consonants have no red equivalents (for example, ญ and น are both blue “N”, but the Thai alphabet has no red “N”), so you can think of this as a way of improvising a red consonant.

In the second case, the letter ย, which is normally blue, changes to black when it is prefixed by อ. This rule is limited to four words which start with the sequence อย: อยาก, อย่า, อย่าง, and อยู่. It turns out all four words come out with low tones, the first because “(Black) dogs crawl”; the other three because they carry the “Knees” marker.

See also:

Learn the Thai Alphabet in One Hour

...And Learn the Thai Vowels
