Saturday, December 6, 2008

etymology of word apparently doesn't help

I was reading the Toronto Star a week ago.
The author described a young man with an "illusive" smile.
Apparently this was quite fetching.
I suspect the author meant "elusive".
Illusive is clearly related to illusion in its origin.
Elusive, to elude.

Folks that learn language primarily by sound and not sight will make these errors in English.

I think I busted some other major newspaper for confusing "elicit" with "illicit". This is even less forgivable. Elicit is a verb and illicit is an adjective. A simple understanding of word order should have prevented this. BBC online this week had "it's" and "its" confused.

I have advice to aspiring journalists. Visit some web sites. Google a few terms such as "common mistakes English".
D: Punctuation humor.
The exclamation mark is "... a screamer, a gasper, a startler or (sorry) a dog's cock".
Poor writers are prone to using triple exclamation marks for effect. !!!
No good editor would ever accommodate 3 of these. ... ... [=

Friday, October 31, 2008

an example of the need for simpler rules: the apostrophe

D: I have the seventh sense. I see dead punctuation everywhere. Nobody else can see it. And the punctuation doesn't know it's dead. <:

I never quite figured out why most folks find the apostrophe so difficult. I mastered it in primary school. Yet many of my liberal arts grad friends have yet to figure it out.

Punctuation is not set in stone. What is considered appropriate usage has varied by time and place. There are still disputes even among the most learned of us.
Having said that, consider the following passage:

  • My sister's friend's investments (the investments belonging to a friend of my sister)
  • My sister's friends' investments (the investments belonging to several friends of my sister)
  • My sisters' friend's investments (the investments belonging to a friend of several of my sisters)
  • My sisters' friends' investments (the investments belonging to several friends of several of my sisters)
I will review the most common errors.

1) "dog's" This could mean the possessive "the dog's bone" but never the plural.
2) "it's" This can only mean a short form of "it is" or "it has". Perhaps "'tis" was clearer.
I think I know why people screw that up. My, your, his, its. BUT Henry's, Mary's ...
They know sometimes there is an apostrophe. The English system is clear as mud.

Aside - the US rule suggests using an apostrophe for the plural of an acronym. For example, RRSP's. In Canada we don't (I corrected the Royal Bank for the GIC's).
I much prefer our system. It allows more information. For example, how does one clearly indicate the possessive of GIC? It resembles plural with the US rule.

D: what to do?
Well ... if I ever get around to completing Hioxian, I plan to allow an optional overt indicator of what role a piece of punctuation is playing.
I read "Eats, Shoots & Leaves: The Zero Tolerance Approach to Punctuation".
It was a fun read. Each punctuation symbol typically had only a handful of uses. This allows me to use Hioxian with a "subset usage indicator" mark on each symbol.

Friday, October 10, 2008

Dion in interview with CTV encounters problems with disambiguation.

Sitting for a taped interview with Steve Murphy, the anchor for CTV Halifax, Mr. Dion was asked: "If you were prime minister now, what would you have done about the economy and this crisis that Mr. Harper hasn't done?"

Mr. Dion replied: "If I had been prime minister 2½ years ago?"

Mr. Murphy replied: "If you were the prime minister right now."


Mr. Murphy repeated the question again. Mr. Dion asked: "If I was prime minister starting when? Today?"

D: Sure, Dion could have handled his response better.
But the question was ambiguous.
Verb tense for hypothetical situations is tricky!

For present unreal events, we put the verb in the condition clause one step back — into the past:

  • If the Bulls won another championship, Roberto would drive into Chicago for the celebration.
  • I wish I had tickets.
  • If they were available anywhere, I would pay any price for them.
  • If he were a good friend, he would buy them for me....

For past unreal events — things that didn't happen, but we can imagine — we put the verb in the condition clause a further step back — into the past perfect:

  • If the Pacers had won, Aunt Glad would have been rich.
  • If she had bet that much money on the Bulls, she and Uncle Chester could have retired.
  • I wish I had lived in Los Angeles when the Lakers had Magic Johnson.
  • If I had known you were coming, I would have baked a cake.
D: very confusing to an ESLer! (English as Second Language)

D: I assume Dion understood the rule that generally verb tenses must agree.
Typically, if one begins in past tense then one stays in past tense.
Easy, right? [=

Tuesday, September 23, 2008

Bad English (or: when I read your lyrics)

This is my pitch for received grammar and syntax lessons.
Yes, yes, subcultures have viable and legitimate ways to express concepts.
Having said that, a single standard prevents groups from remaining isolated solitudes.

Mandarin and Cantonese are largely mutually incomprehensible.
Apparently their written forms are much more similar.

Without proper central instruction in English, kids are forced to learn from pop culture.
Sure, there is the occasional TV show writer who uses "whom" correctly. Sadly, it is usually even when the educational background of the character does not warrant it (Bones...).

And then... there are lyrics.
I understand one must sometimes shoehorn a word to fit the beat of a song.
However, often a simple rephrasing would have sufficed.
Haha! A site by pedants, for pedants. And I could have been the president! <:
Justin Timberlake's, "What goes around"
The Lyrics:
When you cheated girl, my heart bleeded girl.
It should be my heart "bled."

Gwen Stefani's, "Rich Girl"
The Lyrics:
If I was a rich girl.
That's wrong. It's supposed to be "If I WERE a rich girl".
Submitted by: G. Mag
D: Gwen can do whatever she wants... mmmm.

Kid Rock's, "All Summer Long"
The Lyrics:
"We didn't have no internet"
Double negative - yes he did have no internet, it was 1989!
D: I hear this at work. English has so many places to place a negative indicator.

The Killers', "When You Were Young"
The Lyrics:
"every once and a while"
it should be "every once IN a while"

Nickelback's, "Believe it Or Not"
The Lyrics:
Believe it or not everyone have things that they hide
everyone HAS things that they hide. It does that the entire song and annoys me SOOO much!!
D: this resembles confusion over whether a noun is collective or not.
Seagulls would be plural. A flock of seagulls is not.
(Note the weakness in my own writing - I tend to shift verb tenses too much!)

Savage Garden's, "Truly, Madly, Deeply"
The Lyrics:
I want to lay like this forever
Until the sky falls down on me
It's LIE, not LAY!
D: here is how I remember this one. Lay is something I do to somebody else, LOL!

English quirk of the day:

advice vs advise | accept vs except | affect vs effect | a lot/alot/allot
all ready vs already | all right vs alright | alone vs lonely
altogether vs all together | any vs some | apart vs a part
And the list goes on and on and on and...
P.S.: vs., not vs or v.s..

Monday, September 22, 2008

economical use of pixels in fonts (pic) (pic)

D: the third web site shows how we have been spoiled with post-SVGA very high resolution monitors. We assume we can indefinitely scale a TrueType font, and the pixel count will support it. But the pixel grid places a floor on how small a legible symbol can be.
I show above examples of a 3x3 font that is *just* legible.
Early computer fonts were only 5x8 - see Atari stuff.
Then we went to 8x12.
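To put rough numbers on those grid sizes, here is a minimal sketch that counts how many on/off patterns a grid can hold and renders a couple of 3x3 glyphs as text. The glyph shapes and the names `GLYPHS_3X3`, `pattern_count` and `render` are made up for illustration; they are not taken from the fonts shown on the linked sites.

```python
# Hypothetical 3x3 glyphs, one string per pixel row ('#' = lit, '.' = dark).
# These shapes are invented for illustration, not copied from any real font.
GLYPHS_3X3 = {
    "T": ("###", ".#.", ".#."),
    "L": ("#..", "#..", "###"),
}

def pattern_count(width: int, height: int) -> int:
    """Number of distinct on/off patterns a width x height pixel grid can hold."""
    return 2 ** (width * height)

def render(text: str, glyphs: dict = GLYPHS_3X3) -> str:
    """Render text row by row, with one blank column between glyphs."""
    height = len(next(iter(glyphs.values())))
    rows = []
    for r in range(height):
        rows.append(" ".join(glyphs[ch][r] for ch in text))
    return "\n".join(rows)

if __name__ == "__main__":
    print(pattern_count(3, 3))  # 512 possible patterns in a 3x3 cell
    print(render("TL"))
```

A 3x3 cell has only 512 possible patterns, and most of them are not distinguishable shapes, which is one way to see why such a font is only *just* legible.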

Because my Hioxian system uses an alphanumeric display layout, it doesn't work below a certain pixel resolution. In my case, I think 5x8 was *just* sufficient. It would be incredibly chunky.
At the size of a computer monitor pixel, the details would blur together.
Still, it is interesting to ponder just how minimal and spartan a font can be and still function.

Aside: I was talking to buddy Rick H., a computing prof at WLU, about 3D displays. I started thinking about that movie "Contact" with Jodie Foster. OK, I started thinking about Foster first, LOL! Anyway, I read the book too - I thought the movie was more concise and focused. The alien font is difficult for the protagonists to decipher. They finally realize that aliens use a 3D layout vs. 2D like ours. I have to assume they have sweet 3D displays using circular polarization, like our latest ZScreen technology. It would seem that those aliens are related to that freaky mutant of our world, the Mantis Shrimp. It is the only living thing we know that can see and meaningfully use circularly polarized light.
I thought it would be fun to turn that alien alphabet part of that movie into real 3D.
I admit I am intrigued by the potential of using 3D info to increase data density in a font.
The high end circular polarized 3D tech is only now coming onto the market, and a decent display runs 5000bux. But we could be on the very cusp of a dramatic change in the way we present computer information. It is pretty cool to think of that!

Friday, September 19, 2008

computer translation interlingua, problem with natural languages

"So far all approaches to solve the multilingual component have run into serious difficulties...Other approaches set out to use one language (almost always English) as a pivot. Again the results were held back, this time because the use of a natural language as an interlanguage occasions ambiguity. As the researchers note, early attempts at using a natural reference language to build machine translation systems go back 20 years ago, and the results were no good."

D: the con-lang (controlled natural language) approach has shown some promise.
Attempto is a good example, and one I've studied.

Attempto Controlled English (ACE) is a controlled natural language, i.e. a rich subset of standard English designed to serve as specification and knowledge representation language. ACE allows users to express professional texts precisely, and in the terms of their respective application domain. As any language, ACE must be learned to be used competently, but this amounts to learning the differences between ACE and full English, formulated as a small set of ACE construction and interpretation rules. Once written, ACE texts can be read and understood by anybody.

ACE appears perfectly natural, but — being a controlled subset of English — is in fact a formal language. ACE texts are computer-processable and can be unambiguously translated into discourse representation structures, a syntactic variant of first-order logic.

D: this removes some of the impressiveness of Lojban, which makes a similar claim (wiki source).

Lojban (pronounced [ˈloʒban]) is a constructed, syntactically unambiguous human language based on predicate logic. Its predecessor is Loglan, the original logical language by James Cooke Brown.

D: finally, as an aside, I'll note A++, which is a learning tool to understand computer programming. One deals in the most rudimentary concepts, and composite concepts are clearly indicated as such.

A++ stands for abstraction plus reference plus synthesis which is used as a name for the minimalistic programming language that is built on ARS.

ARS is an abstraction from the Lambda Calculus, taking its three basic operations, and giving them a more general meaning, thus providing a foundation for the three major programming paradigms: functional programming, object-oriented programming and imperative programming.

ARS Based Programming is used as a name for programming which consists mainly of applying patterns derived from ARS to programming in any language.

D: the book is online.
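As a rough illustration of building composite concepts out of the most rudimentary ones, here is a sketch of Church numerals written with nothing but function definition, naming and function application. This is plain lambda calculus expressed in Python, not A++ syntax, and lining it up with the ARS terms is a loose reading rather than something the quoted description spells out.

```python
# Church numerals: the number n is represented as "apply f n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n) -> int:
    """Convert a Church numeral to a Python int by counting applications."""
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
print(to_int(add(two)(three)))  # 5
```

Every composite here (`two`, `three`, `add`) is visibly assembled from the smaller pieces above it, which is the flavour of the learning approach described.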

Thursday, September 18, 2008

humour of english word boundaries, phrasing

  1. Who Represents is where you can find the name of the agent that represents any celebrity. Their Web site is

  2. Experts Exchange is a knowledge base where programmers can exchange advice and views at

  3. Looking for a pen? Look no further than Pen Island at

  4. Need a therapist? Try Therapist Finder at

  5. There's the Italian Power Generator company,

  6. And don't forget the Mole Station Native Nursery in New South Wales,

  7. If you're looking for IP computer software, there's always

  8. The First Cumming Methodist Church Web site is

  9. And the designers at Speed of Art await you at their wacky Web site,

  • Grandmother of eight makes hole in one
  • Deaf mute gets new hearing in killing
  • Police begin campaign to run down jaywalkers
  • House passes gas tax onto senate
  • Stiff opposition expected to casketless funeral plan
  • Two convicts evade noose, jury hung
  • William Kelly was fed secretary
  • Milk drinkers are turning to powder
  • Safety experts say school bus passengers should be belted
  • Quarter of a million Chinese live on water
  • Farmer bill dies in house
  • Iraqi head seeks arms
D: similar Esperanto ambiguous words...

Esperanto Meaning A Meaning B
"a purchase" "a contemptible little thing"
"to alternate" "to sneeze at"
"avarice" "a group of grandfathers"
"a diet" "a minor deity"
"age of dignity" "a swim in a dike"
"an exterior" "a former world"
"an accomplishment" "a group of elves"
"a daughter" "dirty linen"
"a galley" "a drop of bile"
"a colleague" "a big neck"
"a pumpkin" "a city of cakes"
"lavendery" "in need of cleaning"
"an oxeye daisy" "someone licking"
"a casserole" "a sea-tale"
"a modulation" "a fashionable guy"
"a ream of paper" "a papal mistake"
"a person" "a sounding-out"
"pretend" "needing to be ready"
"speed" "a turnip-sprout"
"regular" "aristocratic"
"re-seeing" "child of a daydream"
"a sardine" "a Sardinian woman"
"sensitive" "without theme"
"sugar" "a drop of juice"
"urine" "an aurochs cow"

Then there is the English ambiguity in statements such as the one that follows, due to imprecise word ordering.

"Men don't like to talk about their relationships with each other."
as opposed to
"Men don't like to talk to each other about their relationships."

D: I'd like to contrast this with the (soon to be released!) Decimese.
Syllable construction: for most vocabulary, exempting function words, consonant-vowel (CV).
Increasing levels of detail: consonant1-(vowel), consonant2-(vowel) etc. (C1VC2V...)
Verbal and written shorthand (this is 'informal slang' for when the full word is understood from context apparent in a situation, or from introducing the word earlier. Much like we might say "George" and then use "he" for the rest of the conversation.)
Function words have 'free-hanging end vowels'. That is to say, they use a variable form of vowels to indicate they are not embedded within the middle of a standard vocabulary word.
Similarly, voiced/voiceless consonant pairs denote the start or middle of a word.
Finally the termination of the word is a nasal consonant 'cap'. N, NG, or M.
Ergo standard vocabulary has the form CVCV....CV(nasal).
Word order indicates noun, verb and adjective/adverb.
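The CVCV....CV(nasal) shape can be sketched as a pattern check plus a splitter that cuts an unspaced stream at each nasal cap. The sound inventories below (`ONSETS`, `VOWELS`) and the sample words are assumptions for illustration only, since the actual Decimese sounds are not listed here; the voiced/voiceless and vowel-form cues are left out to keep it short.

```python
import re

# Assumed inventories, for illustration only; the real Decimese sounds are not given.
ONSETS = "ptkbdszfvlrwy"  # hypothetical non-nasal consonants ('g' omitted so 'ng' stays unambiguous)
VOWELS = "aeiou"          # hypothetical vowels
WORD = re.compile(rf"(?:[{ONSETS}][{VOWELS}])+(?:ng|n|m)")

def is_word(s: str) -> bool:
    """True if s has the shape CVCV...CV(nasal)."""
    return WORD.fullmatch(s) is not None

def split_stream(stream: str) -> list[str]:
    """Cut an unspaced stream into words at each nasal cap.

    Raises ValueError if some part of the stream does not fit the shape.
    """
    words, pos = [], 0
    while pos < len(stream):
        m = WORD.match(stream, pos)
        if m is None:
            raise ValueError(f"no valid word at position {pos}")
        words.append(m.group())
        pos = m.end()
    return words

if __name__ == "__main__":
    print(split_stream("takangpilumson"))  # ['takang', 'pilum', 'son']
```

Because nasals are reserved for the end of a word in this sketch, the stream has exactly one valid split, which is the point of the nasal 'cap'.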

I'd like to share a funny personal anecdote. A few years ago I experimented with online dating.
I agreed to meet this local goth girl. The night before she sends me a link to her website.
She was a pagan and her site was called Freak's Haven.
However she did not use capitals or punctuation or spacing.
So what I saw was freakshaven.
It could be "freak's haven".
Or... freak ... shaven?!
I had no idea what I was getting into, LOL!
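The same ambiguity is easy to reproduce mechanically. Below is a minimal sketch that lists every way an unspaced string can be cut into dictionary words; the four-word `LEXICON` is a toy assumption, and a real word list would produce even more readings.

```python
# Toy lexicon, assumed for illustration; a real one would be far larger.
LEXICON = {"freak", "freaks", "haven", "shaven"}

def segmentations(text: str, lexicon: set = LEXICON) -> list[str]:
    """Return every way to split text into lexicon words."""
    if not text:
        return [""]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Try every valid split of whatever follows this word.
            for tail in segmentations(text[end:], lexicon):
                results.append(head if not tail else head + " " + tail)
    return results

if __name__ == "__main__":
    print(segmentations("freakshaven"))  # ['freak shaven', 'freaks haven']
```

Without capitals, spacing or punctuation, both readings are equally valid as far as the spelling is concerned.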

What would that have looked like in Decimese?
First of all the word order would have indicated either
1) shaven freak or
2) freak who shaves/did shave etc.
1) adjective- noun or
2) subject- verb.

I expect there will still be plays on words.
Using a shortened slang term normally reserved for another word can still be used for humour, for insults, for double entendres.
So subcultures will still be able to define themselves from mainstream culture by not using received-truncation forms taught in school.
E.g. Instead of C1VC8V(nasal) for a certain word, they could resort to C3VC8V(nasal).

Trivia: 95% of languages use either post or preposition in clauses, despite other theoretically available options.
I may end up treating prefix/suffix word modifiers as belonging to the same grammatical class in this fashion.
It is unusual to effectively allocate such concepts to essentially 'closed class function word' categories, but for many generic concepts, it might be desirable.
For example, quasi- para- pseudo- demi-...

I am still pondering embedding clause/phrase heads in a word.
I *could* use consonant clusters, limited to C plus L/R or W/Y (with the LRWY + vowel being a permutation generator), but this would preclude using those 2 pairs in consonant clusters for 2 simple binary states such as is/isn't and such.

Friday, August 29, 2008

brain studies and an optimized language

"(fMRI) to see which parts of the brain were active when volunteers memorized pairs of words such as "motor/bear" or "liver/tree." In this experiment, the volunteers either learned the pairs as separate words that could be fitted into a sentence, or as a new compound word, for example "motorbear," defined as a motorized stuffed toy."

D: other studies:
D: you gotta explain the details of referent-objects and whatnot to kids.
I have 2 nieces, an infant and a toddler, so I follow this stuff for my sister.

"This research proves the existence of a universal neurological basis for dyslexia. It also highlights the impact that the complexity of orthography can have on reading proficiency of dyslexics and therefore the severity of the disease and the ease of diagnosis. This means that in the Italian population there may be hidden cases of dyslexia. On the other hand, otherwise mild cases of dyslexia may appear far worse in irregular orthographies like that of English or French."

In English, there are 1,120 ways of representing 40 sounds (phonemes) using different letter combinations (graphemes).
D: !!!
D: loud background noise impairs language acquisition.

The researchers discovered that the representations guide when and how the statistical computations are carried out by listeners. Specifically, the study shows that consonants serve mainly to distinguish among words, whereas vowels tend to carry grammatical information. According to researchers, listeners are sensitive to this difference.

D: a summary of brain studies might result in the following principles of design:
1) SOV word order.
2) vowels for grammar, but consonants for vocabulary
3) a simple and regular spelling orthography.

Is anyone aware of a language that uses 2)? It sounds complicated to design.

I gotta make a working sample of that Hioxian phonemic alphabet.
Every time I see the subdivided bike symbol in the bike lane spray-painted on, or the alphanumeric display on the vending machine at work, I think of it.
A result of a phonemic vs. phonetic system, and by that I mean showing multiple ways to articulate the same sound, is that there unfortunately will be multiple ways to show some phonemes. As a matter of convention, one could default to the most common one.
After all, does it matter if one places the tongue tip on the top teeth or the front palate?
(Neat website, pic is from it.)

Wednesday, August 6, 2008

UN year of the language

Here's the first chunk of the press release:

The General Assembly this afternoon, recognizing that genuine multilingualism promotes unity in diversity and international understanding, proclaimed 2008 the International Year of Languages. ...The Assembly, also recognizing that the United Nations pursues multilingualism as a means of promoting, protecting and preserving diversity of languages and cultures globally, emphasized the paramount importance of the equality of the Organization's six official languages (Arabic, Chinese, English, French, Russian and Spanish).... Further, the Assembly emphasized the importance of making appropriate use of all the official languages in all the activities of the Department of Public Information, with the aim of eliminating the disparity between the use of English and the use of the five other official languages.

D: notice anything? Yep - designed languages have dropped off the radar.
This can be summed up as a proposal to
1) learn more major languages
2) at least preserve minority ones.
The rhetoric is that of preserving species biodiversity.

My desire to discuss a 2045 world language for the UN is against the current trend.
Esperanto has been out of fashion for a century, at least in gov't circles.

My next 2 entries will be:
1) a review of Lojban
2) a critique of Ygyde.

Monday, April 21, 2008

artlangs (artistic) used in fictional works

D: today we'll look at Tolkien and his fantasy languages from Lord of the Rings (LoTR). Plus, Klingon from Star Trek.

Tolkien was a huge language designer.
"...artistic languages usually have irregular grammar systems, much like natural languages. Many are designed within the context of fictional worlds, such as J. R. R. Tolkien's Middle-earth and Mark Rosenfelder's Almea. "

"Parallel to Tolkien's professional work as a philologist, and sometimes overshadowing this work, to the effect that his academic output remained rather thin, was his affection for the construction of artificial languages. The best developed of these are Quenya and Sindarin, the etymological connection between which formed the core of much of Tolkien's legendarium. Language and grammar for Tolkien was a matter of aesthetics and euphony, and Quenya in particular was designed from "phonaesthetic" considerations; it was intended as an "Elvenlatin", and was phonologically based on Latin, with ingredients from Finnish, Welsh, English, and Greek"

D: this artlanger did not like auxlangs at all!
"Tolkien considered languages inseparable from the mythology associated with them, and he consequently took a dim view of auxiliary languages: in 1930 a congress of Esperantists were told as much by him, in his lecture A Secret Vice, "Your language construction will breed a mythology", but by 1956 he had concluded that "Volapük, Esperanto, Ido, Novial, &c, &c, are dead, far deader than ancient unused languages, because their authors never invented any Esperanto legends"

D: what little success Esperanto had was due to its association with the internationalist, and then the world peace, movements.

Klingon is similar to Elvish in why it is successful.

"A small number of people, mostly dedicated Star Trek fans or language aficionados, can converse in Klingon. Its vocabulary, heavily centered on Star Trek or 'Klingon' concepts such as "spacecraft" or "warfare", can sometimes make it cumbersome for everyday use — for instance, while there are words for "transporter ionizer unit" (jolvoy') or "bridge (of a ship)" (meH), there is currently no word for "bridge (that you drive over)". Nonetheless, mundane conversations are common among skilled speakers."

D: note the "small number of people" vastly exceeds the number that speak Esperanto!
Again, we see the power of inspiring the imagination of viewers, and showing language as part of the rich backdrop of some fascinating culture.

D: I plan to showcase my languages in similar fashion. VERSE (google dinosnider666), a futuristic quasi-creole with a tone system, is just part of a sea nomad subculture. I try to imagine a scenario where the pressures would exist to bring such a language into being based on the utility it provides.
Decimese works better with a completely isolated culture, perhaps a scifi new planet colony. Auxlangs suffer from a conundrum. They would be useful- if only everyone already knew them! But nobody wants to be an 'early adopter', for fear of being only able to talk to themselves.

Attaching to some ideological movement has some merit for an auxlang, essentially adopting some surrogate subculture to call its own.
Lojban and "The Brights" and/or hard atheists might be an example of this.
D: I'll cover Lojban tomorrow. It deserves its own entry.

The Brights (self-named).
D: at best they are religious naturalists. At worst, they simply place their faith in science.
D: they just accept that humans will feel such religious sentiments as joy, a sense of wonder and awe and mystery, but direct them towards the natural universe.


Sunday, April 20, 2008

novial. rigid word order in lieu of other indicators. stress.

"Novial was first introduced in Jespersen's book An International Language in 1928. It was updated in his dictionary, Novial Lexike, published two years later and further modifications were proposed in the 1930s, but the language became dormant with Jespersen's death in 1943. In the 1990s, with the revival of interest in artificial languages brought on by the Internet, many people rediscovered Novial."

D: yet another example of a founder who functions as "charismatic leadership". With their demise, ardour is dampened and the language falls by the wayside as a historical footnote.

"Note that in Novial the Nominative and Accusative pronouns are the same.

The standard word order is, as in English, subject-verb-object. Therefore, the object need not be marked to distinguish it from the subject: E.g:

  • me observa vu – "I observe you"
  • vu observa me – "you observe me"
D: note how many aspects of language are not needed if rigid word order takes care of it.

"The personal possessive adjectives are formed from the pronouns by adding -n or after a consonant -en. This is in fact the genitive (possessive) of the pronoun so men means both "my" and "mine" ("of me"): E.g:

  • "My dog" = Men Hunde
  • "The dog is mine" = Li Hunde es men

Possession may also be expressed with the pronoun de: de me, de vu, and so on."

D: I like the idea of an optional explicit indicator though.

Many English-speakers say "You and me" instead of "You and I". This rule confuses us.


All adjectives end in -i, but this may be dropped if it is easy enough to pronounce and no confusion will be caused. Adjectives precede the noun qualified. Adjectives do not agree with the noun but may be given noun endings if there is no noun present to receive them.

Adverbs

An adjective is converted to a corresponding adverb by adding -m after the -i ending of the adjective."

D: English has word order SVO - subject verb object.

With details, adjective-noun verb-adverb adjective-noun. There are exceptions in common parlance. Star Trek's "to boldly go where no man has gone before..." technically ought to be "to go boldly". However, the -ly makes it clear enough. Alternatively, very rigid word order in the form of adjective/adverb-noun, adjective/adverb-verb ... is also clear. My Decimese will count on this. I write it sSvVoO for (adjective to noun/subject) subject (adverb to verb) et al. I can use the same indicator for adjectives and adverbs, since word order makes clear which it is.

Decimese standard vocabulary items (other than closed class function words and certain special core concepts) terminate in a Mandarin-esque style in -N, -M and -NG. These are simple enough for Cantonese speakers. If adjectives and adverbs terminate in, say, -N then only one possible interpretation is possible.

I.e. word terminations -N -M -N -NG... MUST be adjective-subject adverb-verb...
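A minimal sketch of that idea: one shared modifier ending, with the following head word deciding whether the modifier is an adjective or an adverb. The mapping of -M to nouns and -NG to verbs is an assumption read off the single -N -M -N -NG example above, not a confirmed rule, and the sample words are invented.

```python
# Assumed mapping, read off the single -N -M -N -NG example; not a confirmed rule.
HEAD_BY_ENDING = {"m": "noun", "ng": "verb"}
MODIFIER_NAME = {"noun": "adjective", "verb": "adverb"}

def ending(word: str) -> str:
    """Return the nasal cap of a word: 'ng', 'n' or 'm'."""
    for cap in ("ng", "n", "m"):
        if word.endswith(cap):
            return cap
    raise ValueError(f"{word!r} has no nasal ending")

def tag_roles(words: list[str]) -> list[tuple[str, str]]:
    """Label each word; a modifier (-n) takes its name from the next head word."""
    tagged = []
    for i, word in enumerate(words):
        cap = ending(word)
        if cap in HEAD_BY_ENDING:
            tagged.append((word, HEAD_BY_ENDING[cap]))
            continue
        # Modifier: look ahead for the head it attaches to.
        head = next((HEAD_BY_ENDING[ending(w)] for w in words[i + 1:]
                     if ending(w) in HEAD_BY_ENDING), None)
        if head is None:
            raise ValueError(f"modifier {word!r} has no following head")
        tagged.append((word, MODIFIER_NAME[head]))
    return tagged

if __name__ == "__main__":
    # Invented words with the endings -N -M -N -NG.
    print(tag_roles(["tan", "kam", "pin", "sang"]))
```

For the invented sentence this yields adjective, noun, adverb, verb, with the same -N ending serving both modifier roles.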

Note how very clear the word boundaries will be. In fact, I have possibly FOUR indicators of word boundaries at any one time:

1) nasal consonant ending,

2) voiced or voiceless consonants by word start or mid-word position,

3) long or short vowels by same, and

4) completely predictable syllable stress.

To be frank, I was not completely sure which method to use. There is a trend in languages to stress the first syllable if heavily suffixing or the penultimate (second last) syllable if heavily prefixing. But what is a heavily taxonomic system? It really is not either. I will likely stress the syllable with the nasal consonant word-ending. Stress is actually a fairly flexible concept. Is it pitch, loudness, or duration? All 3? 2? Which 2? Or just 1?

Anyone that has ever heard New Zealand English will realize how much this can throw a listener off.

Predictable stress is a good quality of Finnish, and part of what makes it so easy to learn.

English has certain tendencies. PUP-py. KIT-ty. But then: gi-RAFFE. Lotsa kids think "RAF" is the first syllable by over-applying this rule. Contrast CON-trast and con-TRAST. The noun form tends to stress the first syllable, the verb form the second.
D: Ygyde, while not as simple as Finnish, is a whole lot more regular than English.
"If the last letter (phoneme) of Ygyde word is a or i, the last syllable is stressed. If the last letter of Ygyde word is neither a nor i, the syllable preceding the last syllable is stressed (stressed antepenultima). Stressed syllables are underlined: ooo, oooo, ooooo, oooooo, ooooooo. The stress helps distinguish similar words, for example, iby (right) and ibi (along)."
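The quoted rule is regular enough to fit in a few lines. In this sketch the caller supplies the syllables already split, since how Ygyde divides words is not covered here, and the split of iby and ibi into two syllables is an assumption. The quote calls the second case "antepenultima" but describes the syllable just before the last, so the code follows the description. The function names are hypothetical.

```python
def stressed_syllable(syllables: list[str]) -> int:
    """Index of the stressed syllable under the quoted rule.

    The syllable split is supplied by the caller (an assumption of this sketch).
    """
    if not syllables:
        raise ValueError("need at least one syllable")
    word = "".join(syllables)
    if word[-1] in "ai" or len(syllables) == 1:
        return len(syllables) - 1  # last syllable
    return len(syllables) - 2      # syllable preceding the last

def mark_stress(syllables: list[str]) -> str:
    """Write the word with the stressed syllable in capitals."""
    k = stressed_syllable(syllables)
    return "-".join(s.upper() if i == k else s for i, s in enumerate(syllables))

if __name__ == "__main__":
    print(mark_stress(["i", "by"]))  # I-by
    print(mark_stress(["i", "bi"]))  # i-BI
```

The two outputs show how stress alone separates the otherwise similar pair from the quote.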

D: but then their alternative pronunciation, in itself a good idea, opens a can of worms.
"Long Ygyde syllables are made of three letters. They are called long syllables. Different speakers of Long Ygyde can use different phonemes in the same place of the same word. For example, vowel a and vowel e are interchangeable. Those who cannot pronounce a can pronounce e and vice versa. Long Ygyde's phonemes are divided into 11 groups; 3 vowels: first vowel is a or e, second vowel is u or o, third vowel is i or y, and 8 consonants: first consonant is b or p, second consonant is d or t, third consonant is g or k, fourth consonant is w or f, fifth consonant is z or s, sixth consonant is j or c, seventh consonant is m or n, and eighth consonant is l or r. Letter r may be pronounced like r in the word car and in many other ways. The remaining letters are pronounced like the Standard Ygyde letters. The total number of the long syllables is the same as the total number of the standard syllables. The Long Ygyde is a spoken only language of those who cannot pronounce some phonemes of the Standard Ygyde."

"The following table explains how to translate vowels and syllables from Standard Ygyde to Long Ygyde. If the compound word has any Long Ygyde syllables, its last syllable is stressed to distinguish the word from Short Ygyde words."
D: I confess that I am quite intimidated by this.

English quirk of the day: syllable stress rules.

1 Stress on first syllable

Most 2-syllable nouns: PRESent, EXport, CHIna, TAble
Most 2-syllable adjectives: PRESent, SLENder, CLEVer, HAPpy

2 Stress on last syllable

Most 2-syllable verbs: to preSENT, to exPORT, to deCIDE, to beGIN
D: I was not even aware of how any particular suffix can indicate advanced stress rules!


“The limits of my language mean the limits of my world”

Ludwig Wittgenstein

Thursday, April 17, 2008

euroclones. a case for them. interlingua.

"The language Occidental, later Interlingue, is a planned language created by the Baltogerman naval officer and teacher Edgar de Wahl and published in 1922.

Occidental is devised with great care to ensure that many of its derived word forms reflect the similar forms common to a number of Western European languages. This was done through application of de Wahl's rule which is actually a small set of rules for converting verb infinitives into derived nouns and adjectives. The result is a language relatively easy to understand at first sight for individuals acquainted with several Western European languages. Coupled with a simplified grammar, this made Occidental exceptionally popular in Europe during the 15 years before World War II, and it is believed that it was at its height the fourth most popular planned language, after Volapük, Esperanto and perhaps Ido in order of appearance."

D: a language minimaxed for Europeans will fail outside that region.
There is a lesson here. One cannot design a language based on prioritizing certain principles, then somehow expect it to surpass those limitations.
For example, Loglan/Lojban was designed to test the Sapir-Whorf (not Worf, trekkie!) hypothesis. It has far too many phonemes, particularly diphthongs, to be global.
Enthusiastic followers have pointed out the robust first-order predicate logic possible with it. Somehow they fail to see the illogic of their position that this makes it a suitable world language.

However, a euroclone language, loosely derived from Greek and Latin but simplified, does have a possible application. It could serve as a regional Euro interlang. Inter(national?) language.
You see, the EU has 23 going on 25 official languages. Each speech or document must then be copied into the 22 other languages. The math for this is the same as for the Birthday Paradox.
D: 23 has 253 possible pairs.
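The arithmetic behind that figure is the usual "n choose 2" count, and it can be checked in a few lines. The function names are illustrative. Counting each direction of translation separately is an extra assumption about how the workload might be tallied, not something the EU figures state.

```python
from math import comb


def unordered_pairs(languages: int) -> int:
    """Number of distinct language pairs: n * (n - 1) / 2."""
    return comb(languages, 2)


def translation_directions(languages: int) -> int:
    """Pairs counted once per direction (A to B and B to A): n * (n - 1)."""
    return languages * (languages - 1)


for n in (23, 25, 30):
    print(n, unordered_pairs(n), translation_directions(n))
# 23 253 506
# 25 300 600
# 30 435 870
```

The growth is quadratic, which is why a couple of extra official languages adds dozens of new pairs.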

The European Commission is seeking to make us all speak in Brusselsese by donating millions of its documents to translation software developers.

The commission described the donation of its "collection of about one million sentences and their high quality translations in 22 of the 23 official EU languages" as a step further in its "efforts to foster multilingualism as a key part of European unity in diversity".

D: I think the number of pairs is more like 400 with around 30 languages, and so on.
I am of the opinion that a language should be optimized for computer translation also.
I will cover various con-lang attempts tomorrow.

D: however, the machine translation ignores the need for real-time live communication.
There are some nifty gadgets to help US troops with Arabic in the Middle East.
D: however, I would not want to rely much on a machine for face-to-face conversation.
After all, machines need batteries and won't be present or working at all times.

A language designed with human and computer translation in mind could occupy both niches.

Oddly enough (not), Decimese has dual modes:
1) Use of CV syllables embedded in the word with LRWY to start, or
2) Use of free-standing word particles of same motif, but obviously the alternative vowel form.
The main problem with this approach is that 5 short and 5 long vowels far exceed the phonemes of much of the world. English is very vowel rich - just ask somebody that speaks Spanish.
I recruit H to allow variant pronunciation.

English quirk of the day: multiple ways to make the same sound.

E/ e (me), ee (feet), ea (leap), y (baby)

QOTD: Language shapes the way we think, and determines what we can think about.

Benjamin Lee Whorf (American linguist noted for his hypotheses regarding the relation of language to thinking and cognition and for his studies of Hebrew and Hebrew ideas, 1897-1941)

Monday, April 14, 2008

rise of the isolating aux-lang
"Professor Lancelot Hogben devised Interglossa while fire-watching on the roof of Aberdeen University during a war.[1] He was inspired to remove all inflections from Interglossa by the publication of Latino Sine Flexione by Peano in 1905 but thought that the list of vocabulary was too extensive to be of much use as an IAL. For this reason he made Interglossa's vocabulary much more basic. A draft of Interglossa was originally published by Hogben (by the publishing company Pelican Books in London) in 1943 as "Interglossa: A draft of an auxiliary for a democratic world order, being an attempt to apply semantic principles to language design". Hogben listed 880 classical words and roots that he believed would suffice for basic conversation."

D: that Latino Sine Flexione.
"Though Peano removed the inflections of Latin from nouns and adjectives, he did not entirely remove grammatical gender, permitting the option of a feminine ending for occupations. The gender of animals is immutable. All forms of nouns end with a vowel and are taken from the ablative case, but as this was not listed in most Latin dictionaries, he gave the rule for its derivation from the genitive case. The plural is not required when not necessary, such as when a number has been specified, the plural can be read from the context, and so on. Verbs have few inflections of conjugation; tenses and moods are instead indicated by verb adjuncts. The result is a change to a positional language."

D: much of the world cannot use infixes adeptly. That means using isolated words instead of adding elements to existing ones.
Lang26 doctrine agrees.
"[3] As for the grammar, we should look to the IAL's priorities. To begin with, the IAL will mainly be used for essential international communication. It will be a true auxiliary language - mostly limited to and focused upon practical necessities. As such, its grammar might well be initially based on the pidgin or Interglossa (original Glosa) model - strict word-order, three tenses and no inflections. The opening phase of the IAL might also be regarded as a global pidgin in terms of its chiefly mundane concerns, and like these utilitarian tongues, which are designed for real-time situations where context provides physical subjects and objects and most of the action, it will require hardly any grammar."

D: An aux-lang must decide whether it is to be a first or second language.
Will it be learned in childhood, when complexity is not so hard, or in adulthood when it is hard?
"Synthetic Grammar

Greek, Latin, Arabic and French - major IALs up until recent times - have grammars which employ affixes rather than fixed word order, i.e. they are synthetic rather than analytic. Synthetic grammar is more complex, and can be impenetrable, but it does have the ability to reduce speech and text-length - since affixed words effectively contain a phrase or clause within themselves.

The decline of these great languages as IALs is related to the spread of universal education and literacy. In days when education was highly selective, an ability to cope with classical languages and synthetic grammar was par for the course. The organised movement to reform English spelling accompanied the advent of mass education for much the same reason (LANGO Chapter 9)."

D: a handful of talented academics can learn a synthetic language as adults. Scribes historically, and linguists/classics majors more recently. But for a language intended for the masses...

"Analytic Grammar

Analytic grammar facilitates the laboriously learnt second-language, painfully acquired in isolation or small groups, much more than the mother-tongue absorbed amid the varied life of a speech community; the analytic sentence parses itself for the benefit of the busy or discouraged student. Another important consideration is that those with a synthetic mother-tongue can easily understand analytic grammar, but not vice-versa."

D: an interesting solution is a language that begins as a mass-learned second or third language, an analytic aux-lang. But one that can become or also has a mode that is synthetic.

My VERSE is much like that. Somebody who grew up with it could use tone for supplemental detail. However, if talking to somebody not so adept, the tones can be 'unpacked' into the original word particles used to denote those details. If fluent, they can optionally use one or the other, or both. What they choose would be based on the demand for brevity, redundancy (clarity) and simplicity.

That reminds me of a sign I once saw in a mechanic's garage. You can have 2 out of 3 - fast and cheap but not good. Good and fast but not cheap ... It was funny.

Well, language is much like that. We choose design priorities and must sacrifice other elements to get there.

An example of a fictitious language that chooses fast as the design priority is Heinlein's Speedtalk.

"Speedtalk is an idea for a new language put forth by Robert A. Heinlein in his novella, Gulf (1949). Speedtalk was defined as an entirely logic-based language which, in the course of the fictitious work, served its purpose as an intriguing plot device. The basic concept was that the conlang would utilize a complex syntax with a minimal vocabulary and a phonemically extensive alphabet (including such letters as œ, ħ, ø, and ʉ), and it was therefore considered extremely efficient. In one example (the only one given), a single word meant "The far horizons draw no nearer."

Many of these ideas have been incorporated into the Ithkuil language."
"In the 1940's Robert A. Heinlein wrote a science fiction story named Gulf, which described the exploits of a society of supermen who used a language named Speedtalk. The premise, as Heinlein described it, was that every word in the language consisted of only a single phoneme, and thus each sentence would be only as long as a single English word. Heinlein argued that people who spoke such a language would be able to think more quickly as well, by virtue of the fact that their thoughts would all be in Speedtalk. As a result, they would be able to squeeze centuries of experience into a few decades of calendar time and would experience a longevity of the mind, if not of the body"

D: one can see echoes of the taxonomic languages like Ro, with their emphasis on a 1:1 ratio between phoneme and lexeme.
English occasionally has only a minimal phoneme pair difference to denote different meaning. The prefix in REview and PREview comes to mind.

D: my VERSE does not use rising/falling pitch or gemination. Gemination is varying phoneme duration to denote meaning. Finnish uses this for both vowels and consonants. Japanese uses it for vowels only. I knew an ESL teacher who worked in Japan. She was not even aware that Japanese had vowel gemination!
I find Speedtalk has a bias. The emphasis on data density per SYLLABLE ignores data density per unit of TIME. Mandarin has various advanced rules to shorten the duration of its lengthy rising/falling tones. I find that a pitch register system, where the tone is held steadily for each syllable, allows fast syllables. The data density gains of a rising/falling tone and geminating language are largely illusory. Per second, the data density and relative simplicity of a pitch register system wins out, IMHO.

I saw some show about a crisis on board some passenger jumbo jet. The human factors analysis indicated that during certain staccato exchanges in English, the data rate was about 1 per second.

In written form, the data rate was about 2/3 per letter. Some digraphs/trigraphs and silent letters reduce the efficiency.

Imagine a data rate for VERSE, with 24 quarter pitch notes available, an order of magnitude higher than English! And a writing system - Hioxian - that has a perfect 1:1 ratio, as well as certain truncations for consonant clusters and using variants of the number naming convention in Decimese for a syllabary.

English quirk of the day. Just one verb tense reviewed.

QOTD: "An international auxiliary language should serve as a broad base for every type of international understanding, which means, of course, in the last analysis, for every type of expression of the human spirit which is of more than local interest, which in turn can be restated so as to include any and all human interests. "

The Function of an International Auxiliary Language

Friday, April 11, 2008

solresol and tone in language

D: this was just a novelty language, but was so distinctive it is worth noting.

"Solresol is an international auxiliary language designed by Jean Francois Sudre in 1827. He published his major book on it, Langue musicale universelle, in 1866, though he had already been publicizing it for some years. Solresol enjoyed a brief spell of popularity, reaching its pinnacle with Boleslas Gajewski's 1902 publication of Grammaire du Solresol.


Solresol words are made up of only seven different syllables. These syllables can be represented in a number of different ways — as musical notes of different pitch, as spoken syllables (based on solfege, a way of identifying musical notes), with colours, symbols, hand gestures etc. Thus, theoretically Solresol communication can be done through speaking, singing, flags of different color, etc. — even painting."

"The teaching of sign languages to the deaf mute was forbidden between 1880 and 1991 in France, contributing to Solresol's descent into obscurity."
D: being adopted or opposed by governments can affect a language's success. I imagine that applies to NGOs too.

D: Solresol uses only pitch. Natural languages may use pitch to supplement or modify basic words.
However, some are more complex. "In Yoruba there are three pitches (high, low, and middle) and the meaning of a word is determined by the pitch on the vowels. For example, the word "owo" in Yoruba could mean "broom", "hand", or "respect" depending on how the vowels are pitched. Also, "you" (singular) in Yoruba is o in a middle pitch, while the word for "he, she, it" is o in a high pitch. Change of pitch is used in some African languages (such as Luo) for grammatical purposes, such as marking past tense."

D: a language using a tone register resembles musical notes.
Mandarin is more complex.
"Syllables consist maximally of an initial consonant, a glide, a vowel, a final, and tone. Not every syllable that is possible according to this rule actually exists in Mandarin, as there are rules prohibiting certain phonemes from appearing with others, and in practice there are only a few hundred distinct syllables."

D: these rules leave Mandarin with only about 400 syllables, compared to the c. 12,000 in English. Using tone effectively prevents a huge number of homophones.
Everyone knows the example from Mandarin with the four "ma" words that mean horse/mother/hemp and scold.
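A back-of-the-envelope sketch shows why tone matters for homophones. The 400-syllable figure and the four tones come from the text above. The morpheme count is a purely hypothetical round number chosen for illustration, and the product is only an upper bound, since not every syllable occurs in every tone.

```python
BASE_SYLLABLES = 400  # approximate figure from the text
TONES = 4             # the four tones of the "ma" example

# HYPOTHETICAL vocabulary size, chosen only to make the arithmetic concrete.
HYPOTHETICAL_MORPHEMES = 4_000


def tonal_upper_bound(base_syllables: int, tones: int) -> int:
    """Most tonal syllables possible if every syllable took every tone."""
    return base_syllables * tones


def average_homophones(morphemes: int, distinct_syllables: int) -> float:
    """Average number of one-syllable morphemes sharing each syllable."""
    return morphemes / distinct_syllables


with_tone = tonal_upper_bound(BASE_SYLLABLES, TONES)
print(with_tone)                                                   # 1600
print(average_homophones(HYPOTHETICAL_MORPHEMES, BASE_SYLLABLES))  # 10.0
print(average_homophones(HYPOTHETICAL_MORPHEMES, with_tone))       # 2.5
```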

D: Mandarin uses rising and falling tones, and has complex rules for how they interact, called tone sandhi. I can do the basic tones but that is it.

English only uses tone a bit in intonation. It might show a syllable stress. One raises the pitch of the last word in a question. But that is it.

My VERSE project is being used in a sci-fi story I'm writing on a futuristic sea-nomad subculture. I throw many different linguistic backgrounds together: world refugees. The pressures of a pidgin/creole are present, though without the particular features of one imperial to one colonial language. Instead of just generating vocabulary with tone, I propose methodically modifying the meaning of a grammatical element.
Take, for example, the following sentence.
"Dog bite boy."
A dog? Those dogs? Some dog? Did bite? Is biting? Will? Has been?
The Western musical octave contains 7 whole-pitch notes and 12 semitones (half steps).
The Arabic system includes 24 quarter tones.
Using 7 whole notes, each with 4 possible quarter pitch notes, I indicate these nuances via tone.
This language could be very brief yet detailed. Used with old-style English auxiliary verbs, articles and such, it could function as redundant agreement for clarity. A nice bonus is that this language is backwards compatible, like a computer operating system, with less tonal or non-tonal pidgin English versions.
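To make the 7 by 4 scheme concrete, here is a minimal sketch of the tone slots and of "unpacking" a toned verb back into a pidgin-style particle. The slot counts come from the text. The particular mapping of quarter steps to tenses is hypothetical and is not the actual VERSE assignment.

```python
from math import log2

WHOLE_NOTES = 7    # whole notes per octave, as in the text
QUARTER_STEPS = 4  # quarter-pitch variants per whole note, as in the text

# Every (whole note, quarter step) combination is one distinguishable tone slot.
TONE_SLOTS = [
    (note, step)
    for note in range(WHOLE_NOTES)
    for step in range(QUARTER_STEPS)
]


def bits_per_syllable(slot_count: int) -> float:
    """Extra information a tone slot could carry on top of the syllable itself."""
    return log2(slot_count)


# HYPOTHETICAL assignment of quarter steps to the tense questions raised above.
TENSE_PARTICLE = {0: "did", 1: "is", 2: "will", 3: "has been"}


def unpack(verb: str, quarter_step: int) -> str:
    """Spell out a toned verb as particle plus bare verb, pidgin style."""
    return f"{TENSE_PARTICLE[quarter_step]} {verb}"


print(len(TONE_SLOTS))                               # 28
print(round(bits_per_syllable(len(TONE_SLOTS)), 2))  # 4.81
print("dog", unpack("bite", 2), "boy")               # dog will bite boy
```

A fluent speaker would use the tone alone. A listener who cannot hear the tone gets the same detail from the unpacked particle.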

Learning pitch at an early age as part of a language, plus music lessons, leads to an amazing prevalence of speakers with perfect pitch!
"While we may never know the definitive answer, new research from the University of California, San Diego has found a strong link between speaking a tone language – such as Mandarin – and having perfect pitch, the ability once thought to be the rare province of super-talented musicians.

The first large-scale, direct-test study to be conducted on perfect pitch, led by psychology professor Diana Deutsch of UC San Diego, has found that native tone language speakers are almost nine times more likely to have the ability."


"Perfect, or absolute, pitch is the ability to name or produce a musical note of particular pitch without the benefit of a reference note. The visual equivalent is calling a red apple "red." While most people do this effortlessly, without, for example, having to compare a red to a green apple, perfect pitch is extremely rare in the U.S. and Europe, with an estimated prevalence in the general population of less than one in 10,000."
D: babies are born with perfect absolute pitch, but not relative perfect pitch.
D: this is a cute monthly review of various languages.
Note how many figure prominently in works of fiction.
I read a book by the Futurology Society. It noted how many authors inspired many more readers with fictional accounts than with non-fiction. There is a lesson to be learned here.

D: English quirk of the day. A poem.

"A moth is not a moth in mother,
Nor both in bother, broth in brother,
And here is not a match for there
Nor dear and fear for bear and pear,
And then there's dose and rose and lose -
Just look them up - and goose and choose,
And cork and work and card and ward,
And font and front and word and sword,
And do and go and thwart and cart -
Come, come, I've hardly made a start!
A dreadful language? Man alive!
I'd mastered it when I was five!"

QOTD: "As a matter of fact, a national language which spreads beyond its own confines very quickly loses much of its original richness of content and is in no better case than a constructed language. "
Edward Sapir

Thursday, April 10, 2008

Wilkins and Ro - taxonomic types

D: I will now touch upon a few aborted language attempts. Nobody really learned these, so they are mere historical curiosities. Ro is inspired by Wilkins.

"He divided the universe in forty categories or classes, these being further subdivided into differences, which was then subdivided into species. He assigned to each class a monosyllable of two letters; to each difference, a consonant; to each species, a vowel. For example: de, which means an element; deb, the first of the elements, fire; deba, a part of the element fire, a flame. In a similar language invented by Letellier (1850) a means animal; ab, mammal; abo, carnivore; aboj, feline; aboje, cat; abi, herbivore; abiv, horse; etc. In the language of Bonifacio Sotos Ochando (1845) imaba means building; imaca, harem; imafe, hospital...

The words of the analytical language created by John Wilkins are not mere arbitrary symbols; each letter in them has a meaning, like those from the Holy Writ had for the Cabbalists. Mauthner points out that children would be able to learn this language without knowing it be artificial; afterwards, at school, they would discover it being an universal code and a secret encyclopaedia."

D: the language had problems, though. Some words don't fit well into the categories.

"Let us consider the eighth category, the category of stones. Wilkins divides them into common (silica, gravel, schist), modics (marble, amber, coral), precious (pearl, opal), transparent (amethyst, sapphire) and insolubles (chalk, arsenic). Almost as surprising as the eighth, is the ninth category. This one reveals to us that metals can be imperfect (cinnabar, mercury), artificial (bronze, brass), recremental (filings, rust) and natural (gold, tin, copper). Beauty belongs to the sixteenth category; it is a living brood fish, an oblong one."

D: I was inspired enough by this last observation about children that I started playing with words that children could say at various ages that would later also form the basis for such a language. See my Childese effort.

D: Ro is a modern variation of Wilkins' effort.
Like Solresol, Ro is an a priori philosophical language, with a vocabulary derived not from natural languages but from a classification structure. The sense of a word is indicated by its initial letters; for instance, in Ro, bo- is the category of "sense-affecting matter", and color words (falling under this category) begin with bofo-: bofoc means "red", bofod means "orange" and bofof means "yellow".

D: a problem is the very similarity that makes such words easy to learn. For example, a cucumber and pumpkin are both vegetables. If they vary by only one phoneme, context is not of much help to tell them apart. I guess this is a case of choosing your poison. Easier to learn or easier to understand spoken, once learned. This is a constant theme in language design. Often you get something but you lose something. For example, removing agreement between grammatical elements makes speaking a language easier. No more "I am" but "you are". However, this very aspect of English serves to give a listener not one but two chances to catch the intended pronoun-verb "to be" combo. I introduce optional tonal agreement in VERSE.

Ro Design principles

All of the language is stretched across the alphabet. Pronouns begin with A and mathematical words begin with Z, living things with L and M: mu are animals, mul are birds, mulca are swimming birds, mulcam is a duck.
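The way a Ro word carries its own classification can be sketched as a prefix lookup. The four entries are the ones quoted above. The dictionary and function name are illustrative, not part of any Ro reference.

```python
# Only the prefixes given in the quoted passage; a real lexicon would be far larger.
RO_CATEGORIES = {
    "mu": "animals",
    "mul": "birds",
    "mulca": "swimming birds",
    "mulcam": "duck",
}


def classification_path(word: str) -> list[str]:
    """List the known categories named by successive prefixes, broadest first."""
    return [
        RO_CATEGORIES[word[:i]]
        for i in range(1, len(word) + 1)
        if word[:i] in RO_CATEGORIES
    ]


print(classification_path("mulcam"))
# ['animals', 'birds', 'swimming birds', 'duck']
```

Each added letter narrows the category, so a learner who knows "mul" can guess that any word starting with it names a bird.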

D: My Decimese effort is taxonomic. I use syllables vs. single phonemes, something that has not been tried to date. I suffer great problems with brevity due to this. I needed to introduce various tricks for "spoken shorthand" to address this. I make the CV syllable the basis for each lexeme, rather than C or V. Typically, taxonomic languages use alternating CVCVC or VCVC... to do so.

D: Ygyde is another example of this. I will touch upon that language in its own entry later.
Suffice to say, it shares uncanny similarities with Decimese. We vary in details, though, and the devil is in the details. Anybody who reads the lang53 article and the UPSID phoneme data will inevitably arrive at similar designs when attempting a taxonomic basis.
"Names of letters and scientific constants are 2 letters long. Names of variables are 4 letters long compound words. Proper nouns are 6 letters long compound words except for names of people and some geographic names, which are 8 letters long. Names of complex chemicals and proteins are proper nouns made of two words. Precise biological names of species are made of three words."
"All other words are either 5 or 7 letters long. They are compound words coined by combining a vowel prefix with two or three morphemes. Examples of the Ygyde compound words:
aniga (corrupt) = a (adjective) + ni (secret) + ga (money)"
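The quoted word-building rule (a vowel prefix plus two or three two-letter morphemes) can be sketched as a small parser. The tables below hold only the three pieces given in the example. The function name and the error handling are assumptions for illustration.

```python
# Only the pieces that appear in the quoted example.
PREFIXES = {"a": "adjective"}
MORPHEMES = {"ni": "secret", "ga": "money"}


def parse_compound(word: str) -> tuple[str, list[str]]:
    """Split a 5- or 7-letter compound into its prefix meaning and morpheme meanings."""
    if len(word) not in (5, 7):
        raise ValueError("ordinary compound words are 5 or 7 letters long")
    prefix, rest = word[0], word[1:]
    # The remainder is read as consecutive two-letter morphemes.
    parts = [rest[i:i + 2] for i in range(0, len(rest), 2)]
    return PREFIXES[prefix], [MORPHEMES[p] for p in parts]


print(parse_compound("aniga"))  # ('adjective', ['secret', 'money'])
```

Fixed-width pieces are what make the split unambiguous: no dictionary lookup is needed to find the morpheme boundaries.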

D: taxonomic languages typically suffer from a lack of brevity. They also rely on each phoneme/letter so much that minor errors in hearing and typing will completely skew the meaning.
Typically, this lack of redundancy is offset by using a smaller, more distinct set of phonemes for clarity. This in turn reduces brevity, since words must get longer. Alternatively, if they keep brevity there may not be enough base categories to ensure enough nuance in the basic vocabulary.
This has certainly vexed me greatly.
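The fragility is easy to measure: count how many word pairs differ in a single letter. The sketch below uses the three Ro color words quoted earlier. The function names are illustrative, and letters stand in for phonemes here.

```python
from itertools import combinations

# Ro color words quoted earlier in this post.
RO_COLORS = {"bofoc": "red", "bofod": "orange", "bofof": "yellow"}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def confusable_pairs(lexicon, max_distance: int = 1):
    """Pairs of words that one mishearing or typo could turn into each other."""
    return [
        (a, b)
        for a, b in combinations(sorted(lexicon), 2)
        if len(a) == len(b) and hamming(a, b) <= max_distance
    ]


print(confusable_pairs(RO_COLORS))
# [('bofoc', 'bofod'), ('bofoc', 'bofof'), ('bofod', 'bofof')]
```

Every pair in this small set is one letter apart, and all three words fit the same contexts, which is exactly the case where context cannot rescue the listener.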

Perhaps the most interesting variant of this is AUI:
" Probably the most bizarre artificial "universal" language of recent times is aUI (pronounced "a-OO-ee"), the "Language of Space." aUI, meaning "space-spirit-sound" or "space-language," and advertised as the "Pentecostal Logos of Love and Peace," was launched on Planet Earth in the 1960's by John W. Weilgart, an Austrian-born Iowa psychiatrist who claimed to have learned the language as a young boy from a little green elf-like humanoid from outer space. The little green spaceman told Weilgart that aUI was the literally universal language used by intelligent beings on all planets throughout the Cosmos. aUI, according to Weilgart, is a perfectly logical and rational language, and learning aUI can actually cure a person of irrational thinking patterns. "
D: quite the character!

Tomorrow, I will look at Solresol, a musical pitch-only language, then segue into various strategies to use tone in more conventional languages.

English quirk of the day:
"I before E except after C" ... not!
beige, cleidoic, codeine, conscience, deify, deity, deign,
dreidel, eider, eight, either, feign, feint, feisty,
foreign, forfeit, freight, gleization, gneiss, greige,
greisen, heifer, heigh-ho, height, heinous, heir, heist,
leitmotiv, neigh, neighbor, neither, peignoir, prescient,
rein, science, seiche, seidel, seine, seismic, seize, sheik,
society, sovereign, surfeit, teiid, veil, vein, weight,
weir, weird
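A quick check confirms that every word in the list breaks one half of the rhyme or the other. This is a simple spelling test under the assumption that "ei" not preceded by "c" breaks "I before E", and "cie" breaks "except after C". The function names are illustrative.

```python
import re

# Word list copied from the post above.
WORDS = [w.strip() for w in """
beige, cleidoic, codeine, conscience, deify, deity, deign,
dreidel, eider, eight, either, feign, feint, feisty,
foreign, forfeit, freight, gleization, gneiss, greige,
greisen, heifer, heigh-ho, height, heinous, heir, heist,
leitmotiv, neigh, neighbor, neither, peignoir, prescient,
rein, science, seiche, seidel, seine, seismic, seize, sheik,
society, sovereign, surfeit, teiid, veil, vein, weight,
weir, weird
""".replace("\n", " ").split(",")]


def breaks_i_before_e(word: str) -> bool:
    """True if the word contains 'ei' that does not follow a 'c'."""
    return re.search(r"(?<!c)ei", word) is not None


def breaks_except_after_c(word: str) -> bool:
    """True if the word contains 'ie' directly after a 'c'."""
    return "cie" in word


ei_words = [w for w in WORDS if breaks_i_before_e(w)]
cie_words = [w for w in WORDS if breaks_except_after_c(w)]

print(len(WORDS), len(ei_words), len(cie_words))  # 50 46 4
print(cie_words)  # ['conscience', 'prescient', 'science', 'society']
```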

QOTD: "Both French and Latin are involved with nationalistic and religious implications which could not be entirely shaken off, and so, while they seemed for a long time to have solved the international language problem up to a certain point, they did not really do so in spirit."
Edward Sapir