Saturday, December 26, 2009

comparison english/chinese diphthong phonotactics. hioxian...

Standard English diphthongs
Columns: RP (British), Australian, American (split into GA and Canadian)
low [əʊ̯] [əʉ̯] [oʊ̯]
loud [aʊ̯] [æɔ̯] [aʊ̯] [aʊ̯]
lout [əʊ̯]1
lied [aɪ̯] [ɑe̯] [aɪ̯]
light [əɪ̯]1
lane [eɪ̯] [æɪ̯] [eɪ̯]
loin [ɔɪ̯] [oɪ̯] [ɔɪ̯]
loon [uː] [ʉː] [ʊu̯]4
lean [iː] [ɪi̯]4 [ɪi̯]4
leer [ɪə̯] [ɪə̯] [ɪɚ̯]3
lair [ɛə̯]2 [eː]2 [ɛɚ]3
lure [ʊə̯]2 [ʊə̯] [ʊɚ̯]3

Chinese (Mandarin).

Just lemme cut 'n paste here. Macs- I hate 'em.
Silvia is rebuilding my PC so I can load up the font design software again.
I get by with a little help from my friends...

D: OK so diphthongs- dual vowels- with phonetic meaning and lexemic content are important for Decimese.

Keep in mind that generating our vocabulary from consonant-vowel (CV) syllables, rather than from single consonants or vowels, makes brevity difficult.
Most taxonomic languages suffer from being unclear when heard. If only one sound stands between 2 different meanings, then any background noise can be a problem.
E.g. if roboba is banana, while robobe is orange, then the rest of the word that is not different is of no help.
My CV-syllable basis is a compromise to retain the benefit of 'Ro' while minimizing the shortcomings.
Witness: if robocabam is banana, and robocatim is orange, then we have a much better chance of hearing the difference.
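A rough Python sketch of the point above, nothing official. It counts how many positions differ between two words of the same length. Treating each letter as one sound is an assumption, and the function name `differing_positions` is made up for this example.

```python
def differing_positions(word_a: str, word_b: str) -> int:
    """Count positions where two equal-length words differ (Hamming distance)."""
    if len(word_a) != len(word_b):
        raise ValueError("words must be the same length")
    return sum(1 for a, b in zip(word_a, word_b) if a != b)


# Single-sound contrast: only the final vowel separates the two words.
print(differing_positions("roboba", "robobe"))        # 1
# CV-syllable contrast: a whole syllable (two sounds) separates them.
print(differing_positions("robocabam", "robocatim"))  # 2
```

Two differing sounds instead of one gives a listener a second chance to catch the difference in a noisy room.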
Metronyms seem to be a theme in Decimese.
I studied the top 1000 words of English - the most common ones- this month.
A taxonomic philosophical language like Ro fares pretty well with them.
I found I was trapped in my English language while trying to suss out a few concepts behind most words.
I kept digging up synonyms, or other common words. I can see 'thinking outside the box' will prove hard.
Things like 'kinship', 'social status/organization hierarchy', social prestige and such kept repeating as concepts.

Special syllables.
Anything that starts with an H can be used in this fashion. Hi, Ha, He-...
Similarly, mid-word, embedded inside a word, -hi-, -ha- ... can also denote 'special' meanings.
It's hard to explain with Decimese, since the premise is different than other languages.
Instead of starting at, say, a list of pronouns, I attempt to build them up from one iteration earlier.
I.e. primitive math, geometry (and therefore space) concepts.
MELTS- math, ethics, logic, time, space.
Lots of languages claim they have some special design basis or emphasis. How many deliver?
Math concepts - the space/time - really DO form the foundation behind my closed-class words.

HIOXian. Insight.

I knew that English meaningfully uses voice/voiceless consonant pairs. E.g. P, B.
Chinese instead uses aspiration. P, Ph (breathy). They just use 'B' to stand in for Ph.
I just assumed this meant I'd need to constantly denote voiced/voiceless pairs and standard/heavily aspirated ones.
I was wrong. It means - and this is just in reference to Decimese - that I DON'T need to bother with either!
The HIOXian characters I was making were overly complicated to hand-write in cursive script.
Too many swirls and flourishes. It took too long.
Look at the basis for Decimese words. CV-CV-CV... (end with nasal consonant) - for standard lexicon.
A little more: The very first -word-initial- consonant *can* be voiceless, whereas the internal ones are not.
E.g. bapan (B-P pair).
Keep in mind that word boundaries will still be clear even without this. It is just one additional aid.
Enough of the world cannot use voiced/voiceless consonant pairs that to differentiate between them is a non-starter.
So now the HIOXian cursive script becomes much simpler.
I imagine that the computer font could also neglect both the English voiced/voiceless and Mandarin standard/aspirated dual aspects.
It would be less phonetic and accurate, strictly speaking. But would make for a more spartan and minimalist appearance.
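The word shape described above (CV-CV-CV... closed by a nasal consonant) can be checked mechanically. Here is a minimal Python sketch. The vowel, consonant, and nasal sets are assumptions for illustration only, since the real Decimese inventory is not settled here, and `is_standard_word` is a made-up name.

```python
import re

VOWELS = "aeiou"                      # assumed vowel set
CONSONANTS = "bcdfghjklmnpqrstvwxyz"  # assumed: every other Roman letter
NASALS = "mn"                         # assumed word-final nasals

# One or more CV syllables, closed by a single nasal consonant.
WORD_SHAPE = re.compile(rf"(?:[{CONSONANTS}][{VOWELS}])+[{NASALS}]")


def is_standard_word(word: str) -> bool:
    """True if the word is CV-CV-...-CV plus a final nasal."""
    return WORD_SHAPE.fullmatch(word.lower()) is not None


print(is_standard_word("bapan"))      # True: ba-pa + n
print(is_standard_word("robocabam"))  # True: ro-bo-ca-ba + m
print(is_standard_word("roboba"))     # False: no closing nasal
print(is_standard_word("elephant"))   # False: starts with a vowel
```

Because every standard word ends in a nasal and every syllable starts with a consonant, the pattern also shows why word boundaries stay clear without any voicing cue.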

I was having an additional problem with HIOXian - the opposite, in fact.
A character that is TOO minimal tended to have a series of disconnected parts. Worse, the letters could be on either the right OR left hand side of the character space. This is NOT a big deal with COURIER font (regular even spacing)- though a series of such unfortunate characters side-by-side could still cause confusion.
I found myself looking at a much earlier proposed design I had explored.
Let us look at the labial aspect - lips.
Trivia: labia is already plural. Labium is singular. Our bilateral human bodies favour the existence of a 'dual' plural option.
Decimese will be able to capture this nuance with its single/dual/plural system. Decimese will be looting various natural languages to increase the power and scope of its closed class function words.
Ergo, Decimese can express:
1) labium. one lip.
2) set of labium - labia.
3) labia-ez? Presumably multiple sets of the same kind. Or one human's grand total.
I put wayy too much thought into um language LOL. What can I say - I'm a cunning linguist! <:
Cuz the HIOXian letter is really nothing more than a human head in cross-section, only slightly stylized so that it can almost claim to still be pictographic (a picture) instead of ideographic (an idea).
OK. so imagine this human head looking to the left. The lips are obviously the left-most anatomical part that is involved in speech. Logically, the teeth would be next.
When the lips are engaged with each other, for example in the bilabial nasal 'M' sound, or the initial part of the plosive 'P', we need to show the two vertical left-side 'meaning segments' (bars) in contact.
Here's where it gets interesting. The HIOXian letter, being a static and unchanging figure like any letter (unless we dress up a font with video animation) will have difficulty expressing a change of anatomical positions.
For example, M shows no change in bilabial lip position. Whereas the plosive P obviously has the lips part in the second part of the movement. The air flow bar segments indicate that the P is a plosive movement, with a sudden forceful exhalation.
However we see that the M and P sounds require different later expressions of lip position compared to each other.
I initially considered indicating when lips and teeth were retracted-not in contact. I could also do this with the tongue.
Anyway, instead of denoting retracted, the plosive P sound can use this second position of 'retracted' to denote the second half of the movement.
Note: I struggled with wide/narrow tongue positions. I realized this week that I can express this as an air-flow aspect, with the implication that the tongue is behaving in said fashion.
So the HIOXian figure (think old calculator display) has two bar segments on each facing.
The left/top is top lip/ retracted after movement. By extension, the right/top would be dentition - the top teeth - retracted after movement.
The top 1/2 of relatively vertical bar segments are reserved to indicate tongue position of contact on the mouth top.
(Lip) (Teeth) alveolar ridge - hard palate - soft palate.
I also struggled with this. Do I need to show adjacent physical contact between body parts?
I.e. if the fricative 'F' is top teeth-bottom lip, then should those bar segments be in contact?
I eventually ruled this out cuz I need to free up some segments for additional meanings.
Ergo F would show the lower lip - left/low/vertical and upper teeth- mid/high/vertical bar segments.
There is no change in position during a fricative. Air resistance is denoted via the leftmost bar segments.
Back to the tongue.
I eventually realized that I really DON'T need to denote tongue contact point very specifically.
Only certain combinations are possible. For example, nobody will use the rear raised part of the tongue to make contact with the teeth. I suppose this IS possible. Certainly, an L sound can be had via multiple positions. Different phone, yes, but same phoneme - same meaning. This is much like how aspiration is unused in English, while remaining present as a physical side effect of bioarticulation.

It is not phonemic in English. In English we say that the aspirated [pʰ] of 'pill' and the unaspirated [p] of 'spill' are allophones.

D: whereas Mandarin Chinese DOES use the P/P(aspirated) pair for phonemic and hence lexical functionality.

Ok so back to the tongue.
So far I think I'm reserving the bottom half of the fairly vertical /|\ internal bar segments for the tongue.
It may be that I will need them to denote details about air flow. If so, I just use one - | - to denote the tongue and let the user figure out the rest.
A theme that is emerging about the HIOXian phonetic, ideographic, anatomical letter system is how optional many details can be. Want a fancy cursive? Indicate all optional details.
In a rush to short-hand yer notes in class while the prof drones on? Less detail.

An interesting side effect of HIOXian is the ability to mechanically compare two figures, or a series of them in a word, and guess how the sounds will deform. This would be based on both sound sequence and syllable stress. The rules could be applied without the user even understanding anything about co-articulation or any other aspect of linguistics.
Take, for example, the word "elephant"
e- le- phant. e-le-fant.
E! (primary stress) le (no stress) fant (secondary stress).
We can expect that the second syllable -le- will become truncated and deformed.
The 'eh' sound in colloquial speech becomes instead the 'schwa' sound from 'the' - think a really brief and weak 'uh'.
The diacritic in HIOXian - strictly optional BTW - would indicate syllable stress.
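The "elephant" example can be turned into a toy rule in Python. This is only a sketch under big assumptions: the syllable split and stress labels are supplied by hand, and the single rule (unstressed vowel becomes schwa) is a simplification, not a full model of English reduction. The name `reduce_unstressed` is made up.

```python
SCHWA = "ə"
VOWELS = set("aeiou")


def reduce_unstressed(syllables):
    """Replace the vowels of unstressed syllables with a schwa.

    `syllables` is a list of (text, stress) pairs, where stress is
    "primary", "secondary", or "none".
    """
    result = []
    for text, stress in syllables:
        if stress == "none":
            text = "".join(SCHWA if ch in VOWELS else ch for ch in text)
        result.append(text)
    return "-".join(result)


elephant = [("e", "primary"), ("le", "none"), ("fant", "secondary")]
print(reduce_unstressed(elephant))  # e-lə-fant
```

The stress labels play the role of the optional diacritic: once they are written down, the rule needs no other linguistic knowledge.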

Origin of HIOXian - it was for a sci-fi short story I am writing. Basically about an after-the-holocaust 'Rosetta Stone'.

The Rosetta Stone is an Ancient Egyptian artifact which was instrumental in advancing modern understanding of Egyptian hieroglyphic writing. The stone is a Ptolemaic era stele with carved text made up of three translations of a single passage: two in Egyptian language scripts (hieroglyphic and Demotic) and one in classical Greek. It was created in 196 BC, discovered by the French in 1799 at Rosetta and contributed greatly to the deciphering of the principles of hieroglyph writing in 1822 by the British scientist Thomas Young and the French scholar Jean-François Champollion. Comparative translation of the stone assisted in understanding many previously undecipherable examples of hieroglyphic writing.

D: how do you get illiterate primitive savages to self-teach how to read???
With a picture of a human head in cross-section. And so HIOXian was born. That was a few years ago.
I'll post the story.

HIOXian diacritic and English syllable stress system.
There are various ways to denote syllable stress. Obviously a mono-syllable language has no need of it- Vietnamese?
English happens to use 3 parts:
1) pitch
2) duration
3) volume.
Normally, writing short-hand suggests a simple mark for primary and secondary stress would suffice.
Because in English this is linked to duration.
The diacritic is UNDER a consonant and OVER a vowel.
An option to speed up reading speed.
CV-CV is then very clear.
Anyway, a syllable MUST have a vowel. While a consonant is optional.
In Decimese, this is NOT true. No syllable is smaller than CV-.
Ergo, in Decimese we have the option to denote syllable stress on ... the CONSONANT.
Weird huh.
Why would we want to do that?
Tonal languages.
Lexical-tone, to be precise. Such as Mandarin Chinese.
Roughly speaking, a high or low or mid pitch changes word meaning.
mà má mǎ mā - scold, hemp, horse, mother.
We then have the option later to explore this aspect on the vowel.

Anyway, so on a Decimese consonant, the diacritic (subcritic versus supercritic) would appear as:

| /|\ |
- -

Nice artwork, huh? <:

OK so the simplest thing to do would be to simply use the central post marked to show stress.
But then for large enough words in, say, English, we could not show both secondary and primary stress.
So we use - -. That way, we show timing- duration.
In a language like French, it does not change duration as part of syllable stress.
The additional vertical bars can be used to express *some* pitch and volume information, with 2 levels.
|/ \|
The central bar, well, again it is a nice 'cheat' to short-hand stress info. Primarily for cursive writing to keep it brief.

Suprasegmental meaning - for example, interrogative.
We raise the pitch on the last word in English to show a question.
The presence of 2 levels of pitch means - wait for it - we can start showing prosody data if we wish to also.

So we have a rough outline, in summary of
1) the main square HIOXian character which
a) contains initial anatomical position
b) shows change/movement.
c) shows air resistance and qualities
With a diacritic that shows
a) volume
b) duration
c) pitch.
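The outline above reads like a small data structure, so here is one way it might be written down in Python. Everything about the encoding is an assumption: the field names, the free-text descriptions, and the 0/1/2 scale (0 for unmarked, 1 and 2 for the two marked levels) are placeholders for illustration, not a finished scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Diacritic:
    # Assumed scale for all three: 0 = unmarked, 1 and 2 = the two marked levels.
    volume: int = 0
    duration: int = 0
    pitch: int = 0


@dataclass
class Glyph:
    initial_position: str  # where the speech organs start, e.g. "lips closed"
    movement: str          # what changes, or "" if nothing does
    airflow: str           # air resistance and quality, e.g. "plosive"
    diacritic: Diacritic = field(default_factory=Diacritic)


# M: lips together, no change in lip position.
m = Glyph("lips closed", "", "nasal")
# P: lips together, then they part with a sudden burst of air.
p = Glyph("lips closed", "lips part", "plosive")
# A stressed syllable could carry a marked diacritic on the same glyph.
p_stressed = Glyph("lips closed", "lips part", "plosive",
                   Diacritic(volume=2, duration=2, pitch=1))

print(m.initial_position == p.initial_position)  # True
print(m.movement == p.movement)                  # False
```

M and P share a starting position and differ only in movement and airflow, which matches the comparison of the two sounds earlier in this entry.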

My room is gonna be a mess. Make lotsa sheets of HIOXian blank characters, explore schemes.
Try to map 50-60 phonemes onto them. !
I am primarily interested, in descending order, in
1) Decimese
2) English
3) Chinese
being mapped well onto the characters.
4) IPA is a consideration, I'd like to be able to express more nuance.
I've finally accepted that I cannot supplant International Phonetic Alphabet in the core character plus diacritic.
I'd hafta use other variations of the HIOXian shape to do so.
For example, the left OR right half of the HIOXian letter plus sub/super diacritic.
OR: one of three tiers, if we treat the HIOXian character space as a stack of 3 diacritics.
But that is for later. I have wayy too much work as it is.
Practically speaking, my first use will be taking notes myself in HIOXian minimalist cursive writing.
That means 2) English is in fact what I should focus on for now.
Only by using my system will I get a feel for its strengths and limitations. Then I can maybe refine it.
One day, maybe it will be ready for primetime.
One day... year.... decade.... lifetime?

Wednesday, December 23, 2009

blog entry 99 5/8. how to get computer to 'meet us in middle' for translating

The Dutch language has many combinations of words whose features cannot be explained by simply looking at the qualities of the individual words. The meaning of 'missing the boat', for instance, isn't always the same as 'being too late to catch the boat'. This type of word combination doesn't pose problems to people, but linguistic computer systems, such as speech recognition software or programmes preparing automatic summaries, just don't recognise these expressions.

Grégoire prepared a list of about 5000 unpredictable word combinations. She divided them up into different classes on the basis of their structure. She looked at the rules of singular and plural; for example you can't 'take to your heel', just 'take to your heels', and 'take to those heels' doesn't work either. Grouping together various classes of word combinations can minimise the amount of manual work to incorporate the list into a computer system and it means that the list can be used for many different systems.
D: notably, this idiom is also difficult for new immigrants and other ESL students.

So any strategy to aid the computer will also aid humans.
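As a toy illustration of the frozen-form idea in the quoted passage, here is a tiny Python lookup. The two-entry lexicon and the name `idiom_meaning` are hypothetical, and a real system like the one described would group idioms into structural classes rather than store bare strings.

```python
# Hypothetical mini-lexicon: each idiom is stored in its one frozen form.
FIXED_IDIOMS = {
    "take to your heels": "run away",
    "miss the boat": "lose an opportunity",
}


def idiom_meaning(phrase: str):
    """Return the figurative meaning if the phrase matches a frozen form, else None."""
    return FIXED_IDIOMS.get(phrase.lower().strip())


print(idiom_meaning("take to your heels"))   # run away
print(idiom_meaning("take to your heel"))    # None: singular is not the idiom
print(idiom_meaning("take to those heels"))  # None: wrong determiner
```

The same table a computer needs is roughly what an ESL learner has to memorize: which forms are frozen, and what they mean as a unit.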

A most difficult thing to do is to refrain from using idiom.
I find this nearly impossible, in practice.
English has already been etched into my mind, and this playful and cultural phrasing is part of it.
I think speaking less English than I know may be as difficult as learning more.

Idiom: an expression, word, or phrase that has figurative meaning — its implication comprehended only through common use; whereas the literal definition of the idiom, itself, does not communicate its meaning as a figurative usage.

ace in the hole CAN, UK, USA A hidden advantage or resource kept in reserve until needed
Achilles' heel Global A person's weak spot
across the board Global Applies to everyone or everything
against the grain...

Many idiomatic expressions are based upon conceptual metaphors such as "time as a substance", "time as a path", "love as war", and "up is more"; the metaphor is essential, not the idioms.

D: thus my interest in Decimese having overt optional explicit indicators of context.

English: spatial: forward.
Time: forward.
Wouldn't a time/space indicator be nice.

Out of interest, I developed a geometrical representation of pronouns yesterday.
Pretty simple. It just uses a circle. Inside it, and we have first person. One dot in the centre and we have I.
More, off-centre but inside, and we have we. And so on.
Laying bare such concepts sans picture but using word-phoneme-lexemes highlights this aspect hidden in English.
D: sweet - many chapters from a book on idiom called "Metaphors We Live By".
I'll hafta read that.

I've been reading over Toki Pona recently. That means the good language.
It is by translator Sonja Kisa in Toronto. It received some media coverage.
She's quite the character!
Anyway, TP only has um 130 words.
Then it relies heavily on compounding to express nuance.
Because these compound nouns are defined in detail as standard, that means learning a whole lotta
multiple-word lexemes after the initial 130 words.
In some respects this resembles Ogden's Basic English. It had a basic vocabulary of 850 words.
The key concept here is that of metronym (linguists usually call this a hypernym).
A word that captures a whole class of words would qualify. "Thing" or "item", e.g.

So in many respects, we are simply delaying the need to memorize vocabulary.

I'm really enjoying TP.
It explores just how minimal a language can be, and still function.
It shops around in natural language for 'simplifications' in grammar- then uses ALL of them.

The phoneme inventory is extremely well thought out, being nearly universal.
The only thing she could do that remains is to reduce vowel sounds from 5 to 3 - AUI.

Now I am sooo bad at languages that I am still finding learning TP hard.
My roomie is learning it too. He is a natural-language polyglot, and has guffawed at some of the simple parts.
But we are gonna practice speaking it in our household.
I am using 'cheats' to learn the vocabulary.
For example, NASA means, among other things, 'crazy'.
How did I remember it? You'd hafta be CRAZY to wanna go in space. NASA goes into space.
NA - NAry
SA - SAne.
Many other words, I recall with naughty memory aids. Sex is always more memorable, since it is taboo.
Sex should be used as often as possible. <:

PIPI and LILI suggest use of pidgin reduplication, in these cases to indicate small.
PIPI - insect. LILI- little.

There are alotta tongue-in-cheek in-jokes in the word names, I think.
Ike - as in Nixon- I think J said it means to lie haha.
"I am NOT a crook!"

I've read some Tao. I hadda keep rereading Tsu to understand him, so didn't get that far yet.
Toki Pona is supposed to express Tao philosophy.

Maybe it is having a good impact on me.
I forgave a coworker some minor, old, and ultimately un-memorable slight from years ago.
Doesn't seem much, but it is a start.
I have found recently that I have only been hurting myself with my grudges. Maybe it's time to let them go.
My roomie told me yesterday that "forgiveness is a selfish act" ... I think he's right.

Happy Holidays!

Sunday, December 20, 2009

blog entry 99 3/4. importance of overt social cues, context.

I had considered staying home tonight. Apparently I should have.
But there I was. I needed to pick up (haha. pick up.) my paycheque from the night club where I work.
What I WANTED to do was compare the phonotactics of Mandarin Chinese and Standard English.
This is a precursor to my vocabulary generation in Decimese.

Unfortunately what I did do was go to pick up my pay cheque.
I decided to stay and dance. I had some beer, alternating them with water.
I'm not actually drunk. The amazing part is that the social gaffe would have occurred even if I was sober.
I cannot multitask. I cannot grasp literal and metaphorical interpretations of statements simultaneously.

Aside: I was talking to my dear friend Michelle, of 14 years of acquaintance.
I noted that a man's brain, transferred to a female clone, would fit after a number of decades once it shrunk sufficiently.
She pointed out that observation could be considered very offensive.
Of course, we both knew that women have a similar number of neurons but they are more densely packed.
I didn't mean any offence. She didn't take any. She's known me too long.

But tonight: wow, just wow. Epic social fail. I couldn't make this stuff up. The stuff of LEGEND.

Background information: setting: Me. Two female coworkers. Ex of my last GF. I cannot think straight around him. I get, well, petty and vindictive. I had done so well with invoking Tao and Buddhist mental states the whole night.
Then - just for a minute- I lost it.
I'll make him jealous, I said. Dance with 2 coworkers, both attractive females - we'd been doing that all night.
But that wasn't enough for me. I wanted to make him very jealous.
I'm too polite to suddenly lift up and swing around a woman without asking first.
So I asked them.
Herein lies the problem. I didn't say "lift up".
I said "pick up".

Like I said - I cannot multitask. And would have done the same sober.

Let us examine the multiple word lexemes of "lift up" and "pick up".

Lift up - (Webster's Dictionary) - 1. To move in a direction opposite to that of gravitation

That is also synonymous with "pick up" ... well, sometimes. (Palm smack on forehead here).

Pick up - alternative definition - 4 a : to enter informally into conversation or companionship with (a previously unknown person) .

Um, seriously didn't occur to me.
So - wait for it - I suggested this to both of them in sequence. Wasn't sure why they looked so scandalized.
Then. I got it.
Not literal/physical "pick up". PICK UP. OMG. Wow. Just wow.
I seriously didn't understand that initially.

Decimese: I've been hashing out schemes to indicate spatial, then temporal, then metaphorical overt indications in my aux-lang. Sadly, my native tongue of English is much more subtle and imprecise. And contextually driven.

Yup. That's right. Instead of 'can I spin the two of you around for a minute'?
Yup. Wow.

So I'm sitting here, feeling like a major ASS.
One word. Pick. Not lift...

My language Decimese will have indicators (like word particles) for various metaphorical meanings.
Social cues will be one of them.

It gets better.
The staff X-mas party is tomorrow.
And if I don't attend now, I'll be a chickenshit.
So I'm going.
But it feels like facing a firing squad.

I think I have sold everyone on optional overt word particle social cue indicators.

Merry X-mas.


Friday, December 18, 2009

Entry 99 1/2. Sapir's essay, parallels of English and Esperanto. I give up on Esperanto...


We may begin with simplicity. It is true that English is not as complex in its formal structure as is German or Latin, but this does not dispose of the matter. The fact that a beginner in English has not many paradigms to learn gives him a feeling of absence of difficulty, but he soon learns to his cost that this is only a feeling, that in sober fact the very absence of explicit guide-posts to structure leads him into all sorts of quandaries
Anyone who takes the trouble to examine these examples carefully will soon see that behind a superficial appearance of simplicity there is concealed a perfect hornet's nest of bizarre and arbitrary usages. To those of us who speak English from the earliest years of our childhood these difficulties do not readily appear. To one who comes to English from a language which possesses a totally different structure such facts as these are disconcerting.
The precise disentanglement of all these relations and the obtaining of anything like assurance in the use of the words is a task of no small difficulty. Where, then, is the simplicity with which we started? It is obviously a phantom. The English-speaking person covers up the difficulty for himself by speaking vaguely of idioms. The real point is that behind the vagaries of idiomatic usage there are perfectly clear-cut logical relations which are only weakly brought out in the overt form of English. The simplicity of English in its formal aspect is, therefore, really a pseudo-simplicity or a masked complexity.
D: Does this look familiar? Claims of simplicity beyond what can be defended?

D: so the same arguments are used to similar effect by those who
1) like English as the standard
2) wish to reject English as the standard world language.
But do the advocates of 2) fare much better?

I consistently grind to a halt in the middle of my 'basic' Esp-o book.
I've realized it is just too hard for me.
See the entry on Turkish infixes- complexity is NOT the problem.
Multiple uses for the same trick have given me great trouble.
Vocabulary items have.
Endless infixes and forced agreement have.

English IS the superior choice for a world language, insofar as its basic grammar IS more simple.
AND it is. Cuz it doesn't have Latinate grammatical infixes.

Here is a random list of things that irked me the last time I attempted to learn Esperanto:
1) Collegio but paĝo. G and J respectively.
Now I'm sure my bias as an Anglophone will be here.
But the only word I know that sounds like the former is colleague. Not college, not others.
Yet paĝo retained the sound of 'page'... why?
2) Precipe is not related to precipice, but means principally. Why the loss of a critical N?
3) Akvo but lago. No rhyme or reason to voiced/voiceless consonant pairs mid-word.
4) no overt identification of whether verbs are transitive or not. Done to self, or other.
How is this different than Sapir's criticism of English?
5) coverto - false friend. Not covert. Cover, as in letter envelope. So much for natural language being 'easier'.
6) renkontas. Doesn't mean re-enknontas. The sloppy syllable rules mean endless spoofing.
Is that the core word? Is it an infix? Who the hell knows.
Likely this is true of many many others too.
7) mal- means opposite. BUT the word for opposite is - wait for it - contrau. Didn't use the prefix as a word root.
8) there are 3 ways to say member-of-nation. Yeah, that's easy.

But the real nail in the coffin was -n.
It can mean object.
It can mean movement towards a position.
e.g. La kato saltas sur la tablon.
- the cat jumps on the table -to.
- the cat jumps on-to the table.
So what did I hafta do there?
Keep the whole sentence in my head, in working memory.
Access WHICH meaning of -n it is. Complete. Wow, just wow.
Wait - it gets BETTER.
Adverbs showing place, movement towards, ditto.
Li iris hejmen.
He went home -to.
He went to home, or homewards.
For directional - supren.
... and this has what to do with direct object -n?
The funny thing is that using other nasals at the word's end could have cleared this up.
Hejmem. Hejmeng.

10) also, how do I say "lingvo"? Lin-g-vo. Well that is downright euphonious!
Not sure why ng never made the cut.

Ultimately, the problem, IMHO, is a confusion over verbal versus written language.
So busy trying to make words look the same on paper that the pronunciation is lost.
A side effect of slavish adherence to the Roman Alphabet.
11) to review, even common words have diacritics. OK for a few rare or imported ones, but common ones?! leading to...
12) Zipf's Law. Common words should be short.
There - tie. Here - Near/there - ĉi tie (3 syllables!).
Multisyllable common words. Diacritics on common words.
Zamenhof just didn't do his homework!

It's funny that he was aiming at accessible.
To avoid accusations re: cultural neutrality, he didn't want to commit to any particular word order.
That meant latinate infixes to denote grammatical function, and fluid word order.
But THAT was as culturally loaded as word order!

On that note, we could have skipped all those diacritics if we had limited the phonemes to 26- the number of Roman Alphabet letters. Instead, we get this mess that much of the world cannot say.
Though Europe can.

It's a Euro-interlang posing as a world interlang.
Ultimately, it is just European bigotry in the guise of cultural neutrality.
A wolf in sheep's clothing.

I'm done learning Esperanto- or failing to. I am not finding it much easier than French, to be frank (though not Frank).
I couldn't learn French to save my life either.

As soon as we use the fireplace, I'm tossing the Esperanto book in it.
Before I waste any more of my life on the infernal thing!!!
Update- I've read over the list of the most common 1000 English words. There are relatively few repeating patterns and concepts. On track for end of year summary of core concepts for Decimese.
Oh yeah- my system has no more than 26 sounds, maps 1:1 on to the Roman Alphabet- and does so without diacritics!

Saturday, December 5, 2009

just how many "speak Esperanto"?

D: from Wiki.

Native speakers
Main article: Native Esperanto speakers
Ethnologue relates estimates that there are 200 to 2000 native Esperanto speakers (denaskuloj), who have learned the language from birth from their Esperanto-speaking parents.[1] This usually happens when Esperanto is the chief or only common language in an international family, but sometimes occurs in a family of devoted Esperantists.

Every year, 1,500–3,000 Esperanto speakers meet for the World Congress of Esperanto (Universala Kongreso de Esperanto).[43]


the length of study time it takes Francophone high school students to obtain comparable 'standard' levels in Esperanto, English, German, and Italian.[24] The results were:
2000 hours studying German =
1500 hours studying English =
1000 hours studying Italian =
150 hours studying Esperanto.

Finnish linguist Jouko Lindstedt, an expert on native-born Esperanto speakers, presented the following scheme[39] to show the overall proportions of language capabilities within the Esperanto community:
1,000 have Esperanto as their native language.
10,000 speak it fluently.
100,000 can use it actively.
1,000,000 understand a large amount passively.
10,000,000 have studied it to some extent at some time.

D: I wonder if that holds up as a rule of thumb for various other languages.
The whole magnitude-pyramid.
D: I also wonder how other aux-langs stack up for time required to speak them.

I imagine ease-of-learning cannot be an afterthought in language design.
'Tis either a design priority from the beginning or does not happen.

I wonder if anybody has ever thought to compare various aux-langs in this regard.
Aside: A magnitude is a 10x shift.
The decibel scale means a doubling of sound power for roughly every 3 dB.
A math-based naming convention could encapsulate any variant of such schemes.
For example, a tripling every 4 units of increase. Dunno why one would wish to, but one could.
I imagine a robust fraction naming system would be pivotal.
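The "multiply by some factor every so many units" idea fits in one formula: ratio = factor ** (units / step). A minimal Python sketch follows, with made-up function names. For decibels it uses the standard definition for power (a factor of 10 every 10 dB), under which 3 dB is only approximately a doubling.

```python
import math


def ratio_for_units(units: float, factor: float, step: float) -> float:
    """Ratio reached after `units`, when every `step` units multiplies by `factor`."""
    return factor ** (units / step)


def units_for_ratio(ratio: float, factor: float, step: float) -> float:
    """Inverse: how many units of increase correspond to a given ratio."""
    return step * math.log(ratio, factor)


# Orders of magnitude: x10 per 1 unit. Four levels up the pyramid is x10,000.
print(ratio_for_units(4, 10, 1))             # 10000.0
# Decibels (power): x10 per 10 dB, so 3 dB is close to, not exactly, a doubling.
print(round(ratio_for_units(3, 10, 10), 3))  # 1.995
print(round(units_for_ratio(2, 10, 10), 2))  # 3.01
# The odd scheme from the text: a tripling every 4 units.
print(ratio_for_units(8, 3, 4))              # 9.0
```

Any naming convention for such scales only has to encode two numbers, the factor and the step, which is where a robust fraction naming system would earn its keep.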

Well, so the question I pose is how does one promote one's language?
Most auxlangs never develop a significant following. Heck, I imagine many designers cannot themselves speak their own language fluently, LOL.
To summarize the success of earlier attempts:
1) expert professional backing
2) find a niche demographic that emotionally identifies with some purported theme of the language
3) right time, right place
4) an enthusiastic and motivated founder, often a highly charismatic speaker and writer.
5) gain support of various government (? and NGO ?) organizations.

Beyond that, I'd say the Internet is the key to promoting actual language acquisition.
Language lessons made by educators who understand how to teach language are key.
I am a literacy tutor of the Laubach school myself.

I wonder about use of a vanity press.
D: nice breakdown of costs.

I am still intrigued by the prospect of using Magnetic Poetry.

There are PDF-digital only sites that sell products.
I personally ordered Cyberpunk 2020 RPG from one. Called drive-thru I think.
No paper- no publishing and S&H costs.
D: DIY fridge magnet word set.
Maybe I'll get motivated enough this winter.
Esperanto really lends itself to this.
There are online virtual Magnetic Poetry sets.
I tried to contact the designer, but failed.
Any Espo-ist out there about to make a custom Esperanto version?
Cuz I'd wanna host something like that!

My roomie J is a talented linguist, a real polyglot.
He said he'd learn Esp-o just so I have somebody to practise with.
He speaks masterful English, tutors in French, gets by in German, and dabbles in Russian and Japanese, and some Chinese.
I'd be curious what his thoughts on Esp-o will be.

Aside: Zipf's Law.
Mod. Greek ("KEH"); Esperanto ("KIE")
Hmm. We say and - VcC.
Greeks say keh- CV. Esperanto says CVv. The small "v" denotes a compound vowel sound, albeit brief.
Hmm. I would have expected Zipf's Law to suggest that more languages would have very simple VC or CV forms of 'and'.
Like French- et.

From the point of view of Decimese, reducing 'and' to some math plus-variant concept is a good starting point.
Then we can 'trick out' the word with various consonant clusters 'n vowel diphthongs for nuance.

OK. So I have only one entry left this year. It'll be an overview of the closed-class English words, reduced to their component meanings. Together with a proposal of how to rebuild these meanings in Decimese.
It'll be my first vocabulary items.

LOL and finally I'll have something concrete enough to be criticized. I'll hafta learn to defend my own language.
So far I've been all offense and no defense. It is easy to criticize, but much more difficult to produce quality work...

Friday, December 4, 2009

Esperanto - years later, still sucking.


D: I chopped down my Basic Esperanto book. I lost the chapter headings and page numbers in the process.
It doesn't fit anywhere. It is this annoying mid-sized paperback.
I found I kept leaving it at home. It wouldn't fit in any pockets.
This seems to be the most common how-to Espo book.
Yet another oversight a century later that helped to doom the language.

Mandatory Esperanto rant:
Kiu- interrogative for who.
Then I notice Kiuj - plural interrogative for who.
OMG. Every detail that English lacks is another nail in the coffin of an auxiliary language.
My brain hurts just thinking about it!

Then I noticed the basic vocabulary items of boy, son, and brother.
Boy- knabo (kuh-nah being such a common sound, you kuh-nah-oh what I mean? [=)
Son- filo. As in filial.
Brother- frato. As in fraternal.
These are 3 separate vocabulary items that must be memorized separately.
Let's analyze the lexical content of each word.
Boy - a male child from birth to adulthood
Son - a male offspring, especially of human beings
Brother - a male who has the same parents as another or one parent in common with another
D: a quick aside. I took a feminist soc class at Windsor U.
I wrote a paper on um reproductive technologies, I think.
I used the terms male and female throughout it in lieu of boy/girl and man/woman.
It turns out that turned the Prof bonkers! Oops, LOL.

OK, so boy is the more basic of the terms. Brother and son (as well as father, I suppose) all denote boy/male in terms of a relative relationship.
Kinship terms are some of the most primordial language terms, ubiquitous in every culture.
Odd that Mr. Zamenhof, aka the Big Z (zzz...) didn't infix this particular to death, like he seemed to do with everything else.

Here is where the approach of Decimese would work very well. Time and biology are innate to it in a way not possible as an after-thought.
MELTS - math, ethics, logic, time, space.
At the very heart of the language design, giving it direction and emphases unique to Decimese.
Once we have defined boy as human/male/ time-less, some, then
we can develop variants for kinship that show relative relationships.

A summary of Decimese design principles.

D: I summarized this for an e-mail to my sister.
I hope she has the good sense to delete these rants I write after too much coffee. [=

For those of you who are not familiar with my blog, I hope to make both a new world language and writing system.

The hypothetical reason is
- United Nations will be 100 in 2045
- League of Nations nearly adopted Esperanto
- but didn't
- maybe (???) there will be some enthusiasm for a world language by then
- English is hard for much of the world to learn
- English dominance is waning
- The Mandarin Chinese language is waxing - rising
- Mandarin Chinese is difficult for, well, most of the world
- a logical solution now and then is a compromise language easy for both English and Chinese speakers.

... Decimese!

Language alphabet stuff will cost a few hundred. Need texts to figure out the damn font editor program.
Silvia is rebuilding my PC on the weekend so I can start playing with it at least.
My letters should be first cuz IF (if) the Language X Institute (check 'em out) adopts it as their proposed standard for a world language, then I have skilled and influential professional backers.
There is a nice documentary out there called the History of English- enjoyable.
After lunch with Fotini, I'll complete the first installment of Decimese. I.e. the closed class function words.
The stages are
1) study all repeating themes in basic vocabulary words. E.g. I, you, he, it - singular vs plural. Et al.
2) then reassemble them - just use chart at first to show all variables. E.g. They - out/far (space terms), plural (math term), pronoun indicator (grammatical).
3) then condense using a compression cypher. I.e. use of consonant clusters and vowel diphthongs to repack into something about the same size as a standard pronoun.
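
The three stages above can be sketched roughly as follows. Everything in it is hypothetical: the feature analysis of the pronouns, the choice of consonants for distance and the choice of vowels for number are stand-ins, not actual Decimese.

```python
# Stage 1: break each pronoun into its repeating themes (made-up analysis).
PRONOUNS = {
    "I":    ("near", "singular"),
    "we":   ("near", "plural"),
    "you":  ("mid",  "singular"),
    "he":   ("far",  "singular"),
    "they": ("far",  "plural"),
}

# Stage 3: a made-up compression cypher, one sound per variable.
DISTANCE_TO_CONSONANT = {"near": "m", "mid": "t", "far": "k"}
NUMBER_TO_VOWEL = {"singular": "a", "plural": "i"}

def compress(pronoun):
    """Pack a pronoun's (distance, number) features into one CV syllable."""
    distance, number = PRONOUNS[pronoun]
    return DISTANCE_TO_CONSONANT[distance] + NUMBER_TO_VOWEL[number]

# Stage 2: the chart of all variables, with the packed form alongside.
for word, (distance, number) in PRONOUNS.items():
    print(f"{word:5} {distance:5} {number:9} {compress(word)}")
```

The packed forms stay the size of an ordinary pronoun, and each one can still be unpacked back into its variables.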
Zipf's law - rule really - says common words will also be smaller, more brief.
It is right annoying when they are not.
If I invest the time and energy in keeping the 'mortar words' small, then the size of the 'brick' words- lexicon - will not matter so much. I intend to use a large number of basic concepts, and then heavily use compounding, complete with some compressed prepositions to indicate relationships.
I don't believe in smaller bricks than that. Too few basic vocabulary items, and meanings are too vague and ambiguous.
Sure, more single-vocabulary (complex syllable structure) items make for brevity.
But it results in more lexicon being brute-force memorized. I for one have a poor memory.
For example, the PIE (Proto-Indo-European) protolanguage has a term for marriage. It is give-heart.
Marry-age. Fusion, agglutination - it has become less obvious. Bad example I guess.
I consider use of complex syllable structures to generate more core vocabulary to be a trap, a dead end.
Painting into a corner. If we reserve consonant clusters/vowel diphthongs for some trans-word generic meanings, then we can apply a handful of rules repeatedly to compact words.
I am trying to mitigate the weakness of my hybrid taxonomic language approach.
Part taxonomic, part word-compounding for vocabulary.
I have no dogmatic adherence to either system. Fork or spoon? Spork! <:

By backing up one additional step beyond other aux-langers, my language design should be innovative, interesting and perhaps more flexible.
Other than a few common examples, I have no desire to generate much standard vocabulary.
I wish to use a Lojban/Linux approach. Let a community of online enthusiasts devote their energy.
So long as they are guided by the language design principles, it should work out OK. ...

D: plus the HIOXian letter system. Not really an 'alphabet' per se.
Optimized for Decimese's phonemes, but it can be adapted to anything the IPA can express. And should do so much more clearly and methodically! Without the need for multiple letters the Americans use with the Roman Alphabet in lieu of IPA's symbols.
Optimized for computer monitors and good visual clarity. A 1:1 letter/phoneme relationship. Starting fresh with a new letter system, to avoid confusion with existing sound/letters.
IPA contains many 'false cognates' of a sort.

OK here is my promise: I'll summarize all variables in the 300 or so words in English's closed class function words by year's end. This is my 97th blog. I'll make that this year's final- 100th - blog entry. There.

A summary of Decimese design principles.

Thursday, November 19, 2009

D: I saw this New Scientist article today on superstring particles.
The naming convention is erratic.

The counterpart to an electron is a selectron.
BUT the counterpart to a photon is ... a photino.

Applying a regular rule would have resulted in either
1) selectron and ? sphoton.
2) electino and photino.

Then there is the neutralino, which despite the name is not the neutron's counterpart but a mix of the neutral -ino particles.
Following that name's pattern, a photon would have a counterpart called a ... photalino?

My point is that a single renaming system for counterpart particles is clean and neat and simpler.
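
To make the two candidate rules concrete, here is a small sketch. The function names are mine, and the second rule assumes the counterpart is formed by swapping a final -on for -ino, which gives "electrino" rather than the "electino" spelling used above.

```python
def s_prefix(particle):
    """Rule 1: put an s- in front of the ordinary particle name."""
    return "s" + particle

def ino_suffix(particle):
    """Rule 2: swap a final -on for -ino (or just append -ino)."""
    stem = particle[:-2] if particle.endswith("on") else particle
    return stem + "ino"

for particle in ("electron", "photon"):
    print(particle, s_prefix(particle), ino_suffix(particle))
```

Either rule on its own names every counterpart predictably. The mess comes from mixing the two.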

The problem is in part the laissez-faire syllable options.
Photon. CVCVC.
Electron. VCVCCCVC.
Decimese tends to shoehorn words into a handful of predictable syllable formats.
This makes the business of regular naming convention more obvious.
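
The CV skeletons above can be worked out mechanically. A minimal sketch, assuming plain English spelling with a, e, i, o, u as the vowels and a short, made-up list of digraphs that each count as one consonant:

```python
VOWELS = set("aeiou")
DIGRAPHS = ("ph", "th", "sh", "ch", "ng")  # assumed: two letters, one consonant

def cv_pattern(word):
    """Return the consonant/vowel skeleton of a word, e.g. CVCVC."""
    word = word.lower()
    pattern = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            pattern.append("C")
            i += 2
        else:
            pattern.append("V" if word[i] in VOWELS else "C")
            i += 1
    return "".join(pattern)

for word in ("photon", "electron"):
    print(word, cv_pattern(word))
```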

Mandatory comment on Esperanto. <:

This lack of careful planning of syllable forms leads to no end of problems.

Di- root for deity.
Diet- root for diet.
-et- infix for tiny.
-o ending for noun.
Dieto. Diet? Or tiny deity?

Congrats- homophones. In some ways worse, it is also spelled the same way.
So more confusing than, say, which/witch. Or there/they're/their.
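
The collision can be shown by brute force. This sketch uses only the handful of morphemes mentioned in this post (a toy lexicon, not real Esperanto coverage) and lists every way a word splits into root, optional affixes and ending.

```python
ROOTS = {"di": "deity", "diet": "diet"}
AFFIXES = {"et": "tiny", "in": "feminine"}
ENDINGS = {"o": "noun"}

def parses(word):
    """Return every root + affixes + ending split of `word`, sorted."""
    results = []
    for root in ROOTS:
        if not word.startswith(root):
            continue
        # Each stack entry: (unparsed tail, morphemes found so far).
        stack = [(word[len(root):], [root])]
        while stack:
            tail, parts = stack.pop()
            if tail in ENDINGS:
                results.append(parts + [tail])
            for affix in AFFIXES:
                if tail.startswith(affix):
                    stack.append((tail[len(affix):], parts + [affix]))
    return sorted(results)

for split in parses("dieto"):
    print("-".join(split))
```

Two splits come back for dieto, and nothing in the word itself says which one was meant.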

Vowel gemination: diino. di (deity) -in- (feminine) -o (noun).
Dee- ee - no.
Implication of what is effectively vowel gemination:
Since syllable duration carries lexical meaning via vowel gemination, it cannot also convey syllable stress.

Within a word, stress is on the penultimate syllable, with each vowel defining a syllabic nucleus: familio [fa.mi.ˈli.o] "family".

D: so diino ought to be pronounced, with a break between V and V (identical) as
dee -eeeee- no.
Here is where it gets fun.
Before considering syllable stress, the word is already
dee -ee -no.
What do you get in reality, allowing for colloquial speech?
Good luck with that one.
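
The penultimate-stress rule quoted above is easy to apply mechanically. A minimal sketch, assuming every one of a, e, i, o, u is its own syllable nucleus and ignoring the glides j and ŭ:

```python
VOWELS = set("aeiou")

def mark_stress(word):
    """Uppercase the penultimate vowel, treating each vowel as a syllable."""
    positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(positions) < 2:
        return word
    i = positions[-2]
    return word[:i] + word[i].upper() + word[i + 1:]

for word in ("familio", "diino"):
    print(word, mark_stress(word))
```

For diino the stress lands on the second of the two identical vowels, which is exactly the spot where length is already doing a different job.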

Esperanto was trapped the moment it attempted to use word-building derived from typical European syllable forms, followed by Latinate infixing.
Possible alternatives include Lojban's approach.
Syllable types used in cmavo, gismu, and lujvo:

Type    Sample  Word        Factors         Syllables

CV      di      bridi       17 x 5          85
Cy      ky      dikyjvo     17              17
CCV     bri     bridi       48 x 5          240
CCVV    grai    xagrai      48 x 4          192
CVC     gug     gugde       17 x 5 x 17     1445
CCVC    ckel    mickelcre   48 x 5 x 17     4080
.V      .e      .e          5               5
.VV     .ai     .ai         4               4
.iV     .ia     .ia         5               5
.uV     .ui     .ui         5               5
.y.     .y.     .y.         1               1
'V      'i      la'i        5               5
'VC     'ir     po'irpoi    5 x 17          85

Total: 6169 syllables.

The following syllable types occur only in le'avla (borrowings) and names:

CVVC    raig    raigbu      17 x 4 x 17     1156
CCVVC   kreig   kreig.      48 x 4 x 17     3264
'VV     'ai     ta'aino     4               4
'VVC    'ais    ni'ais.     4 x 17          68
CR      dr      gugdrnede   17 x 3          41
iy      (reserved)          1               1
uy      (reserved)          1               1

Grand total: 10704 syllables.
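
A quick check of the first table's arithmetic, using only the factors printed in it (17 single consonants, 5 vowels, 48 initial consonant pairs, 4 vowel pairs). The second table is left out: its CR row shows 17 x 3 but 41 syllables, so some restriction not shown here must be in play.

```python
C, V, CC, VV = 17, 5, 48, 4  # factors as given in the table

counts = {
    "CV": C * V,
    "Cy": C,
    "CCV": CC * V,
    "CCVV": CC * VV,
    "CVC": C * V * C,
    "CCVC": CC * V * C,
    ".V": V,
    ".VV": VV,
    ".iV": V,
    ".uV": V,
    ".y.": 1,
    "'V": V,
    "'VC": V * C,
}

total = sum(counts.values())
print(total)
```

A fixed menu of syllable shapes is what makes a count like this possible at all. Try doing the same for Esperanto.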

So the English speaker

Sunday, November 15, 2009

phonetic alphabet not well suited to English

"Just sound it out."
How many times have you heard well intentioned English teachers say that?

Well, there is a reason.
Those 40+ English sounds are not only hobbled by the 26-letter Roman alphabet.
Those 40 sounds are hard to distinguish.
The endless ah eh aw aae variants.

Not like Italian- 5 vowels.
Or Japanese - Japanese has the following phonemes: 5 vowels /i, e, a, o, u/, 16 consonants /

So HIOXian won't magically solve English spelling conventions, or make it more highly intuitive.
English is itself too complex for that.

Monday, November 9, 2009

babies born knowing their native tongue a bit. vocabulary design. esperanto! <:

In the latest study, Kathleen Wermke of the University of Würzburg in Germany and colleagues claim that the womb may also be where we learn to produce our first sounds. Wermke's team found that French newborns produced "rising" or low to high, contours of sound, whereas German newborns preferentially produced "falling", or high to low, contours.

The researchers write that these are "consistent with the intonation patterns observed in both of these languages", and conclude that these differences are in place so soon after birth that they must have been learned prenatally.

D : well how about that.

My roomie and I just watched the first half of a documentary show called "The History of English". (Punctuation placement intended.)
This show portrays a language that very nearly never became what it is today.
It served certain political ends to get used, often in support of local rulers.
It absorbed vocabulary from many other tongues.
However, the basic structure remained stable. This basic structure is reasonably easy to learn.
Again, suprasegmental aspects and function words remained highly resilient.

During the Renaissance, English generated vast amounts of new vocabulary from Greek and Latin sources. This resembled the method of Esperanto in passing.
Some words continue to be used, while others fell out of favour.
This leads to arbitrary incidents where the obvious opposite prefix cannot be used to generate the opposite word from the word core.
As much as one can praise English, a neat and concise core vocabulary is not such an aspect. It was described as nuanced and flexible and extensive.
I feel sorry for a new immigrant ESL student who gets to rote-memorize huge lexicon lists. I somehow doubt they use the word nuanced...

The Anglo-Saxon Germanic attempt to create a 'pure' English that was true to its roots is telling. Many new words were coined that involved compound words.
This reminded me a bit of PIE (Proto-Indo-European).
I think the PIE word for marriage literally means "give-heart".
Such an approach can complement thoughtful infix use for nuance, while retaining a fairly small core vocabulary to learn.
As Esperanto and Lojban have found out, a language needs a pretty extensive basic vocabulary to compound to generate more vocabulary.
If there are too few, the newly generated compound words are sufficiently vague to still require constant interpretation and clarification.
This is also true of infixes.

Esperanto lets you invent your own vocabulary

You can combine words, prefixes, and suffixes as you speak to make new words. In English, you can't just stick "un-" in front of the word "recommend" (unrecommend? disrecommend?). In Esperanto, if the opposite of a word ― a noun, a verb, an adjective ― makes sense, go ahead! Malrekomendi is perfectly good Esperanto. You want to really, really malrecommend something? Malrekomendegi! Every Esperanto speaker will get your point.
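
The word-building described here is plain concatenation, which a few lines can show. The helper name and its parameters are made up for illustration; the morphemes are the ones from the quoted example (mal-, rekomend-, -eg-, -i).

```python
def build(root, prefixes=(), suffixes=(), ending="i"):
    """Glue prefixes, a root, suffixes and a grammatical ending together."""
    return "".join(prefixes) + root + "".join(suffixes) + ending

print(build("rekomend"))
print(build("rekomend", prefixes=["mal"]))
print(build("rekomend", prefixes=["mal"], suffixes=["eg"]))
```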
Esperanto has a recognizable vocabulary

You may have recognized several of the Esperanto words above, or seen related words in English (bona -> bonus, alta -> altitude, feliĉa -> felicitous). About 70% of Esperanto vocabulary is directly or indirectly derived from Latin roots, many of which also appear in English. Another major chunk is from Germanic roots (hundo, hound or dog). So you'll understand a good part of the words with little trouble.
D: Latin/Germanic- claiming particular ease for English speakers on one hand, then claiming some mythical international character on the other...

D: the English-to-Esperanto dictionary contains c. 17,000 items.


There are about 5000 official roots in Esperanto. You can find a hard-to-navigate list of all the official roots here:

D: but the extensive PIV dictionary has about 15,000 items.

This is an appropriate place for me to say a few words about the material for the dictionary. Much earlier, when I had examined and rejected every non-essential from the grammar, I had desired to exercise the principles of economy in respect of the word-material also. Thinking that it was a matter of indifference what form any particular word took, so long as it was agreed that it should express a given idea, I simply invented words, taking care only that they should be as short as possible, and did not contain an unnecessary number of letters. Instead of using "interparoli" (to converse), a word of eleven letters, why should we not express the idea just as well by some word of two letters, say, "pa"? So I simply wrote the shortest and most easily pronounced mathematical series of conjoined letters, to each factor of which series I gave a certain meaning (e.g., a, ab, ac, ad, ba, ca, da . . .; e, eb, ec . . .; be, ce . . .; aba, aca . . . etc.).

But I immediately rejected this notion, for my own personal experiments proved that these invented words were very difficult to learn, and even more so to remember. I came to the conclusion that the material for the dictionary must be Romance-Teutonic, altered only so far as regularity and other important requirements of language demanded. Standing upon this ground, I soon observed that the present languages possessed an immense supply of words already international, with which all the nations had a prior acquaintance, and which formed a veritable treasure house for the future international language--and, of course, I utilised this treasure.

D: Zamenhof on his basic language design philosophy.
He rejects philosophical language tenets from the get-go.

There was much to lop, alter, correct, and radically to transform. Words and forms, principles and postulates, jostled with and opposed each other, whereas in theory, taken separately and not subjected to extended tests, they had appeared to me perfectly good. Such things, for instance, as the indeterminate preposition 'je,' the elastic verb 'meti,' the neutral termination 'aŭ,' etc, possibly would never have entered into my head if I had proceeded only upon theory. Some forms which had appeared to possess a wealth of advantage proved in practice to be nothing but useless ballast, and on this account I discarded several unnecessary suffixes.

D: but here is the price to pay for starting with such a hodgepodge...

This problem I considered for a long while. At last the so-called secret alphabets, which do not necessitate any prior knowledge of them, and enable any person not in the secret to understand all that is written if you but transmit the key, gave me an idea. I arranged my language after the fashion of such a key, inserting not only the entire dictionary but also the whole grammar in the form of its separate elements. This key, entirely self-contained and alphabetically arranged, enabled anyone of any nationality to understand without further ado a letter written in Esperanto.

D: a concept I explore in HIOXian anatomical-based stylized pictograms - ideograms.
D: Basic English by Ogden and derivatives claim a core vocabulary of about 1000 words.
Allowing for multi-word lexemes and generated compound words, this likely approaches 10,000. One can pay now (early) or later.
Fewer core vocabulary items, but more unclear and individually learned MWLs?
Any MWL that cannot be deconstructed and understood from its original components amounts to another basic vocabulary item that must be learned.
They often amount to idiom, which must be learned as 'part of the culture'.

I remain convinced that careful first design principles for both closed and open class words, with rules for both, could yield an unprecedentedly concise-but-flexible quality in a vocabulary.

Reduplication is only marginally used in Esperanto. It has an intensivizing effect similar to that of the suffix -eg-. The common examples are plenplena (chock-full), from plena (full), finfine (finally, at last), from fina (final), and fojfoje (once in a while), from foje (once, sometimes). So far, reduplication has only been used with monosyllabic roots that don't require an epenthetic vowel when compounded.

D: You see a lot of reduplication in creoles.
It is often a way to differentiate two similar sounding words.

In addition to the root words and the rules for combining them, a learner of Esperanto must learn some idiomatic compounds that are not entirely straightforward. For example, eldoni, literally "to give out", means "to publish"; a vortaro, literally "a compilation of words", means "a glossary" or "a dictionary"; and necesejo, literally "a place for necessities", is a toilet. Almost all of these compounds, however, are modeled after equivalent compounds in native European languages: eldoni after the German herausgeben, and vortaro from the Russian словарь slovar'.

D: leading to the typical scenario of a tourist trying to find a toilet and desperately needing to pee, while trying to convey one's idea, LOL.

I guess that is no more ridiculous than folks asking for a bathroom - when there is no bath!
D: in conclusion, a language designer should not shy away from a sufficiently extensive core vocabulary in the name of conciseness.
A language does need to be able to express concepts likely to be encountered in life.
At the same time, duplication and word redundancy ought to be minimized.
I suppose some overlap is preferable to gaps in the ability to express concepts.
Multiple ways to say the same thing will likely crop up from time to time.
D: here I quote The Satanic Bible, er Ranto. <:

E2: Clarity

These affixes are often baffling. In <cigaredujo>, "cigarette box", <-uj-> means "(bulk) container". But it also occurs in <Svedujo>, "Sweden" (not "Swedish ghetto") and <pomujo>, "apple tree" (not "apple barrel").

D: how is that for clear? <:
The trouble is that Zamenhof emphasized largely the syllable and not the single phoneme.
Because of his Latinate word-generating approach, with all the limitations of a natural language, this was bound to happen.
Some various possible translation errors for the above example might be:
- apple box for apple tree
- swede box for Sweden.
- cigarette tree
- Swede... tree?
<: LOL!
There is no clear system to sort out a word segment that is part of the root/stem versus part of an affix, either suffix or prefix.

D: to me, the easiest way to avoid this quagmire is to have certain syllable formats dedicated to root/stem, while others clearly only modify the basic meaning.
Simply using word particles versus infixes also helps, and is of more use to a language speaker unused to much agglutination.

overfussy distinctions - all mean "to marry".
D: his criticism about clockwork methodology to generate vocabulary would also apply to a philosophical language, though there is likely less overlap and redundancy.
Nonetheless, some very odd and often useless hypothetical words could be spoken.

Zamenhof was if anything overzealous in this department, stuffing his "basic" wordlists with trivial distinctions such as "a kiss" versus "a noisy kiss", and so on; who asked for these?
F3: Simplicity

This is the inverse problem, overlooked by Zamenhof. Language learners want to be able to start communicating with as little rote learning of vocabulary as possible.

D: "a noisy kiss" seems to be an ideal time to use compounding versus coining a new core term. Like I said, "Z" was haphazard.
G5: Elegance

Shoehorning words into this system can mangle them horribly.

= "by marriage, bilious, repentant, ancient"

D: figuring out the origin and etymological meaning of such truncations is impossible.
I think this aspect can be best understood as a reaction to the very brief Volapük language.

"coffee" (near-globally ) becomes , etc.

D: My Japanese friend Hiroshi says something more like co-hay.
So I am not sure how international kafo is.
D: A problem with Esperanto, given the lack of planning to carefully use certain syllable formats for certain language aspects, is not 'wordiness' but lengthiness.
By that I mean quite a few vowels, and a long time to say the word.
Hund- (dog root). Hundo- dog.
The temptation to tack on a whole new syllable for each nuance (grammatical element, verb tense) ensures this.
A latinate word generating system all but ensures this.
To some degree, any natural language basis would.
To be fair, hundo means dog, hundoj (hun-doy) means dogs. Hundojn is dogs as the object.
So after the initial -o syllable, additional nuance does NOT add syllables.
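
A tiny check of that claim, using the standard spellings hundo, hundoj and hundojn. It assumes a syllable count is just the number of vowel letters, with j treated as a glide rather than a vowel.

```python
VOWELS = set("aeiou")

def syllable_count(word):
    """Count syllables as vowel letters; j is a glide, not a vowel."""
    return sum(ch in VOWELS for ch in word.lower())

for form in ("hundo", "hundoj", "hundojn"):
    print(form, syllable_count(form))
```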

Philosophical and/or taxonomic IALs have problems.
Not THOSE problems though...

Tuesday, October 13, 2009

difference between English and Chinese dyslexia

In contrast, the new findings show that developmental dyslexia in Chinese is really two disorders: a visuospatial deficit and a phonological disorder combined.

Siok and her colleague Li Hai Tan say the difference can be traced to the characteristics of the two languages. "In English, the alphabetic letters that form visual words are pronounceable, so access to the pronunciation of English words is made possible by using letter-to-sound conversion rules," Siok said. "Written Chinese maps graphic forms—i.e., characters—onto meanings; Chinese characters possess a number of intricate strokes packed into a square configuration, and their pronunciations must be memorized by rote. This characteristic suggests that a fine-grained visuospatial analysis must be performed by the visual system in order to activate the characters' phonological and semantic information.

With respect to HIOXian:

HIOXian lacks "intricate strokes" and "fine-grained visuospatial" detail.
Any particular phoneme/phonete symbol is roughly comparable to the visual complexity of the Roman Alphabet.

I suppose the proposed scheme to colour-code stacked consonant clusters and vowels could encounter learning disabilities. However, that system is strictly optional.
Hmm, guess I hafta pick the colours for that with colour blindness in mind.
Enough males have it that it needs to be treated with the same seriousness as dyslexia.

To reiterate, HIOXian lacks any visual component for meaning smaller than a dash: -.
This means it can be read by ageing folk with visual problems, it can scan and copy easily without too much degradation, it is forgiving of errors.

I was looking at Visible Speech by Bell, which has many very fine swirls and flourishes that convey essential meaning. They would not display well on a computer monitor. They also are not easy to write quickly and neatly.
From the point of view of a typesetter, Visible Speech is handy, since like the Cree syllabary, the symbols are recycled by rotating by increments of 90 degrees.
This is not a concern of mine. I am concerned with computer typing on one hand, and cursive writing on the other.
Visible Speech also illustrates the struggle between incorporating elements into the core symbols versus using various after-the-fact diacritics, accents and even specialized additional characters. IPA also struggled with this design issue.

I'd like to point out that even without the up/down spacing of consonants/vowels, the symbols would still be clear to a literate end user. Don't quote me just yet, but I don't think any vowel symbol will correspond with a consonant symbol. In fact, I mean to plan ahead enough that there is no such confusion.
This makes the up/down spacing simply an additional redundant way to convey information.
For English, it would serve to facilitate learning HIOXian as well as to increase reading speed.
English has enough phonemes that they will not map onto HIOXian without use of accents to convey additional information. This should not be an issue for Decimese.
To avoid the need for diacritics for Decimese, I will need to incorporate such features as plosives, fricatives, glides and semi-liquids into the core character.
I think I could live with LRWY and H requiring diacritics.

Wednesday, October 7, 2009

thoughts on Hioxian

I have a coupla weeks off here, so have revisited the Hioxian letter system.

I had hoped to avoid use of diacritics as much as possible.
But the consonants likely require it.
Not for Decimese, with its very limited selection of phonemes.
But for English and others, yes.

I have a Mac right now and don't know how to save pics.

I have been playing with cursive writing with the Hioxian system.
I'd like to take my notes in it next fall, when I return to university for my thesis.

First of all, English requires that I flesh out the consonant diacritic system for the 40+ sounds in English, subject to particular dialect.
The vowel is the core of any syllable.
The diacritic for English vowels ought to show syllable stress.
This includes a combination of volume and duration in English.
However, it does not necessarily in other languages.

Plus for VERSE English, I wanted to also denote tone pitch. For Chinese too.

So we have an overlap of standards.
The colour-code system could work for a computer.
I.e. RGB - red green blue. If denoting both, say, duration and pitch on a vowel, perhaps it would include overlapped red and green on a bar segment. So it'd be a third colour derived from those two. Alternatively, a blue-yellow-red basis.
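
A sketch of the overlap idea on a screen, assuming additive RGB mixing. The assignment of red to duration and green to pitch is only a placeholder, not a settled part of the scheme.

```python
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

# Placeholder assignment of one primary colour per feature.
FEATURE_COLOUR = {"duration": RED, "pitch": GREEN}

def overlap(*colours):
    """Additive mix of RGB colours, capping each channel at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colours))

def segment_colour(features):
    """Colour of a bar segment that carries all the given features."""
    return overlap(*(FEATURE_COLOUR[name] for name in features))

print(segment_colour(["duration"]))
print(segment_colour(["duration", "pitch"]))
```

Red over green gives yellow on a monitor, so the "third colour" is still unambiguous as long as each feature keeps its own primary.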

In computer font, the spacing is perfectly regular, like courier font.
This is important to indicate edge boundaries for characters.
I had pondered a more abstract system that always possesses the central lower 'trunk' or root bar segment, and assumed a front head facing for meaning.
But it resulted in some very arbitrary left/right meaning segments.
And so it ceased to be particularly intuitive.

It is generally an easy task to determine which way to read the hieroglyphs even if the meaning is not understood. Hieroglyphs with a definite front and back (for example, a human or animal) will generally face the beginning of the sentence, as well as being oriented in the same direction as any large human or divine figure in the associated art work. (However, in some instances, they will be reoriented out of respect to face any important personage, such as a king or deity.)

As an example, if a tableau contains a picture of a man seated and facing right, then all hieroglyphs written in text above or behind the man, and having a definite front and back, would be oriented to the right as well. The actual hieroglyphs would be read from right-to-left because these images almost always face the beginning of the sentence. (Text written in front of the man might very well be oriented to the left, facing the man out of respect.)

In cursive HIOXian, the rounded slant of characters could help show boundaries.
Since the actual shapes of phonemes are fairly limited, seeing boundaries should be fairly clear with this aid.
What I'd REALLY like to do is portray anatomical speech mechanisms in such a way that the character is always whole and continuous, without breaks.
This may be possible, but will take much WORK.

I cannot believe it but geocities is shutting down.
For that reason, I'll just cut 'n paste this website here.


Hi. Many fine attempts at a more phonetic alphabet have been made. For typesetting on obsolete printing presses, Bell's Visible Speech is excellent. It uses Cree syllabary-style of figures rotating 90, 180 and 270 degrees. It shows symmetrical pairs, like voiced/unvoiced very well.
Ygyde's alphabet bears a similarity to various short-hand writing schemes. It is fairly intuitive, and is well suited to handwriting.

However, I wished to make a system even more transparent, as well as one optimized for computer displays. After a coupla years of fooling around with ideas, I have tentatively settled on Hioxian. The name is a reference to "HIOX", or the 16-segment alphanumeric font of calculator and early display fame.
A computer font requires particular qualities to be optimal. Thin lines and decorative flourishes are to be avoided. Bars should be medium thick for visual clarity. The only angles should be 45 degrees, so the pixels appear smooth. Leaning and curved characters are to be avoided. These requirements made me select the HIOX character, but not angled, and square vs. rectangular. The remaining space thus freed up is then reserved for an optional diacritic of similar motif. In a pinch, the use of an old calculator display with the decimal indicating vowels as opposed to consonants can work.

Reading speed is increased by methodically using a mechanism that the Roman alphabet uses only sporadically and not at all methodically. Some letters are higher such as klh. Some are lower, such as gjp. Most are centered like aeiou. Wouldn't a system where character height always means something be nice? This would both increase comprehension and reading speed. In my case, vowels occupy the bottom 2/3 square of a rectangular space, and consonants the top 2/3.
Small particles resembling periods and commas are to be avoided, since they are difficult to read in small font, for those with vision problems, and do not scan/copy legibly.

Making a system more intuitive than Visible Speech was not easy. I fooled around with various systems. For the longest time I planned to use the 4 central segments that form a plus-sign shape as abstract phoneme-type indicators. However, I now wish to use a stylized pictographic system, or rather an ideographic one, since it is highly stylized. Picture the human head from the side, facing left. For consonants, the top/bottom lips and teeth are left and center respectively. The nasal cavity is along the top. Air flow through the mouth would be horizontal and along the bottom. Voice via the voice box would be bottom/right vertical. Vowels and consonants can indicate points of contact and tongue position via various diagonals.
A supplemental system for punctuation and numbers uses a similar motif. The punctuation characters are indicated with the left and right halves of letter and diacritic. The numbers and related concepts are indicated by using the top and bottom 1/3 diacritics sans a middle section. I may manage a 'visual arithmetic' system like Octomatics. A diacritic contains 5 (fairly) vertical and 2 horizontal segments apiece. Marking left to right for 1-5 and right to left for 6-10, we can indicate 1-10 and 1-10 in a single character space. Varying the horizontal segments could denote such concepts as whole number/fraction, square/root (powers), and metric prefixes for large and small.

Interestingly, for my spinoff project from Deafese called Mathese, or Decimese, I could map simple C-V form syllables onto the number system. This requires a limit of 10 possible consonants or vowels in any particular position. Or one of 20 consonant pairs, since only 1 of the pair is appropriate with Decimese's rules. Use of a binary counting system resembling Octomatics could possibly condense data even more.
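
A minimal sketch of that C-V-to-number mapping, assuming ten choices in the consonant slot and ten in the vowel slot, so that each syllable carries two decimal digits. The inventories below are placeholders made up for illustration; the real Decimese phonemes are not settled here.

```python
# Hypothetical inventories: only the "ten per position" limit comes from the post.
CONSONANTS = "bdgvzjlrwy"  # index 0-9 = digit carried by the consonant slot
VOWELS = ["a", "e", "i", "o", "u", "aa", "ee", "ii", "oo", "uu"]  # index 0-9


def number_to_syllables(n: int) -> str:
    """Spell a non-negative integer as CV syllables, two digits per syllable."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # pad so the digits pair up evenly
    pairs = zip(digits[0::2], digits[1::2])
    return "".join(CONSONANTS[int(c)] + VOWELS[int(v)] for c, v in pairs)


print(number_to_syllables(42))    # zi
print(number_to_syllables(1234))  # divu
```

Two digits per syllable is just one possible packing; a binary scheme along the lines of Octomatics would need a different table.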

We are left with a diacritic above vowels and below consonants respectively. I had intended to reserve the vowel diacritic for tone in the pitch register system of VERSE. In Decimese, consonant clusters are severely restricted to C + L, R, W or Y + vowel. Recruiting the diacritic to indicate an LRWY consonant cluster leaves space to spare. English allows S + consonant plus LRWY in some cases. I had enough space left for, say, STR-. The word "strengths" is the ultimate test of capturing English consonant clusters. Notably, NG and TH are effectively digraphs. Conversely, X and Q both represent compound consonants, in these cases KS and KW respectively. A similar system could be used for adjacent vowels and diphthongs, since English lacks a tone system other than intonation in general.

I hope to explore using rainbow-style colour-coding to indicate multiple consonants or vowels stacked on one another for compactness. For example, red could denote the first consonant. Blue could denote the second, but this could be purple where the 2 figures overlap in certain segments. And so on. In theory, the word "strengths" could be denoted as follows:
1) English: CCCVCCCCC
2) English/phonetically: CCCVCCC
3) using stacking (or my diacritic system): CVC. [=
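
The three lines above can be checked mechanically. This is only a sketch: the phoneme list is one common transcription of "strengths" that I am assuming, and `stack` is a made-up helper that treats every run of consonants as a single stacked character.

```python
from itertools import groupby

# Vowel symbols for this example only: English vowel letters plus the one
# IPA vowel used in the assumed transcription below.
VOWELS = set("aeiou") | {"ɛ"}


def cv_skeleton(units):
    """Map each letter or phoneme to C (consonant) or V (vowel)."""
    return "".join("V" if u in VOWELS else "C" for u in units)


def stack(skeleton):
    """Collapse each run of identical symbols into one stacked character."""
    return "".join(kind for kind, _run in groupby(skeleton))


spelling = list("strengths")
phonemes = ["s", "t", "r", "ɛ", "ŋ", "θ", "s"]  # assumed transcription

print(cv_skeleton(spelling))         # CCCVCCCCC
print(cv_skeleton(phonemes))         # CCCVCCC
print(stack(cv_skeleton(phonemes)))  # CVC
```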

Obviously, I need to hash out the entire system phoneme by phoneme. I then need to produce a complete downloadable font. I tested some sample Hioxian characters on my friends. They were guessing the sound before I could finish explaining it! <:
If things go well, perhaps I can design a system like IPA, but one that is easily understandable by the majority of normal people who are not linguists!

March '08.
Haha, hard to believe I was ever that ambitious.
Now I'd just be happy to successfully map the Decimese phonemes onto HIOXian!

Aside: I rewatched the Movie "Contact".
I remain intrigued by the potential offered with polarized displays to show more information embedded in otherwise flat 2D letters.
I suppose we could suggest syllable stress in a true 2D display simply by varying font size too.
I like the idea of additional information being optionally invisible in 2D.

My ultimate goal with HIOXian remains to supplant the Roman alphabet.
Aim high!
However, if it only serves as a complement to the Decimese language with its very limited phonemes and syllable choices then I will be content.

Monday, August 31, 2009

twitter kids coin new words

As youngsters spend more and more time chatting to their friends online, so they have tended to express themselves in language that mirrors everyday speech more closely.
Some words are contractions of expressions that have themselves only just become established online, most notably 'noob', for those who cannot be bothered to say – or more probably to write – 'newbie'.

D: I guess the word for Avalon means apple too but originally meant all fruit in Celtic.
I'm watching a TV series on the history of English and its evolution.
It is the structure and not the vocabulary that carried it through.
Cultural dominance can be indicated by metronyms.
The French contributed fruit, which displaced avalon. Avalon was demoted to a niche.

Friday, August 28, 2009

XML and embedding info in computer images

"Such systems have typically required end users to use a manually developed ontology – a lexicon of predefined concepts used to assign machine-readable semantic meaning to information – and then train the software to correctly annotate different images.

The ImageNotion system strips away much of that complexity for the end user, combining semantic annotation with a variety of other technologies, from text mining and object recognition to face detection and face identification, in order to permit many more images to be accurately annotated with little or no user intervention."


D: this bears some passing resemblance to Attempto/ACE, in that it attempts to restrict English language usage.
Leading to the usual quagmire of dozens, if not hundreds, of popular natural languages.
See the entry on EU costs for translation. And the UN - OMG!

Thursday, August 13, 2009

what grade level are political speeches written at?

Other speeches delivered during the convention ranged from grade levels of 10.5 to 6.4:

* 10.5 (Al Gore),
* 10.3 (Bill Clinton),
* 8.7 (Hillary Clinton)
* 8.0 (Michelle Obama)
* 6.4 (Joe Biden)

Two of the most important and popular speeches in American history, The Gettysburg Address by Abraham Lincoln and Martin Luther King’s I Have a Dream speech also registered at ninth grade levels (9.1 and 8.8 respectively).
D: so one needs to speak at the level of basic literacy, not functional literacy.

So what are all those kids learning after that for?
They can watch TV. <:

The Laubach teaching method has a system of 5 levels.
Level 2 seems to be the inability to smoothly pronounce polysyllabic words without breaking them down.

Frankly, this works well in Decimese with its basic CV- construction.

Core concepts of one syllable words from ancient languages.

D: I might be in a position to take weekdays off next month.
I'd use that to finish my Hioxian letter system and to develop the tier 1 of Decimese.
Tier 1 is strictly closed class function words.
Doesn't sound very hard, until one realizes I don't just make a clone of English.
Or any other attempt.
I attempt to incorporate the composite nature of many closed class words.
I harp mostly on pronouns, since they are the most obvious example.
As always, the whole language is derived from MELTS - math, ethics, logic, time, space.

I'm reading "The God Delusion" right now.
I'd like a language that has optional overt indicators of literal/metaphor/spiritual/ a religious "truth".
Imagine if the ancients had that.
We'd know if we are looking at a parable or anecdote, for example.
History or myth.

brain treats living things discretely from non-living

For unknown reasons, the human brain distinctly separates the handling of images of living things from images of non-living things, processing each image type in a different area of the brain. For years, many scientists have assumed the brain segregated visual information in this manner to optimize processing the images themselves, but new research shows that even in people who have been blind since birth the brain still separates the concepts of living and non-living objects.

D: I'd like to have indicators for
1) living and
2) human.

The above article might explain why pets are called he/she and not it.
I pointed out to a woman I know that I refer to pets as it.
Whether the dog is male or female does not matter to me.
A bitch or a cur is the same in how I interact with it.
The woman I spoke to has named her car, and it is masculine, LOL.

I found out the hard way that parents don't like their infant or toddler called it.
But they have no discernible gender, so I don't see what else I would call ... it.

I propose for Decimese
1) default to "it" - third person, singular
2) option to indicate living
3) option to denote masculine/feminine, with neuter default.
4) option for human (perhaps sentient, to allow for AIs/aliens/transgenics? <;)
So "he" would be third person, singular, neuter with human, living, and masculine denoters.
He/dog would be same but without human.
He/car would be same but without human, living.
The same car could take the default "it" pronoun.
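
The proposal can be sketched as a record with a fixed base and optional markers. The class and field names are my own labels for illustration, not Decimese forms, and the marker order is arbitrary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pronoun:
    """Third-person singular 'it' by default; every other detail is optional."""
    person: int = 3
    plural: bool = False
    living: bool = False
    gender: Optional[str] = None  # None means the neuter default
    human: bool = False

    def markers(self) -> List[str]:
        # The first two entries are the shared root, so "it" stays visible
        # inside "he" and "she".
        out = [f"person{self.person}", "plural" if self.plural else "singular"]
        if self.living:
            out.append("living")
        if self.gender:
            out.append(self.gender)
        if self.human:
            out.append("human")
        return out


IT = Pronoun()
HE = Pronoun(living=True, gender="masculine", human=True)
HE_DOG = Pronoun(living=True, gender="masculine")
HE_CAR = Pronoun(gender="masculine")

for p in (IT, HE, HE_DOG, HE_CAR):
    print(p.markers())
```

Dropping a marker never changes the root, which is the point: the same car can be `HE_CAR` or plain `IT`.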

Unlike it/he/she, I want the root/stem of "it" to be clear in he or she.
I also want more potential detail with plural forms.
It/ it(s). He(s). Et al.
"They" doesn't indicate gender or living or anything else.

The key theme is optional overt indicators of details.

Tuesday, August 11, 2009

engrish signs, funny, informative


Sunday, August 9, 2009

idiom - clear and not

But then, the background for "pulling someone's leg" isn't exactly clear as day either. By idioms that make sense, I mean those rooted in an observable phenomenon. "He's barking up the wrong tree." In other words, he's looking in the wrong place for what he wants. The phrase is rooted in the observable (or at least imaginable) phenomenon of a dog confused about where that pesky cat or squirrel escaped to.

Presumably all idioms start out this way, but some have become disconnected. An idiom, or idiomatic expression, cannot be puzzled out by breaking it down into parts and defining them individually. An idiom works as a whole: to kick the bucket meaning "to die," for instance.

"Idiom" comes from the Greek word idios, meaning "one's own." An idiomatic expression is one that "we understand among ourselves," even if it baffles foreigners, such as Miguel, a student of translation in Spain, who a few months ago posted a query online, "[W]hat is in the bucket and why would anyone kick a bucket in the first place[?]"

D: the punctuation fail is from page A2 of this weekend's Globe and Mail.

"Pranknet used untraceable Skype accounts to route calls creating a shield of anonymity."

D: to route calls COMMA! Gah.

Page A3 last weekend, then A2 this weekend.
I wonder if they'll drop the ball on the front page next weekend...
All for lack of a proof-reader.
I'm for hire for a 100 bucks, by the way. <:

Monday, August 3, 2009

study of zipf's law

" The law of brevity, proposed by the American philologist George K. Zipf, along with others, shows that the most frequently-used words are the shortest ones.

...when dolphins move on the surface of the water they tend to perform the most simple movements, in the same way that humans tend to use words made up of less letters when they are speaking or writing, in so-called "linguistic economy".

The research study includes the case of Oscar Wilde's novel The Picture of Dorian Gray. The most-used word is the three-letter article "the", while other larger ones, such as "responsibilities" are hardly found at all."


D: "A" and "I" are in the top 10 most common words and are only one letter.
We do not see a two-syllable word until #61 for "people".
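
A quick way to eyeball the law of brevity on any text is to rank words by frequency and compare lengths. This is a rough sketch: the sample sentence is a toy I made up, not data from the study, and the function names are hypothetical.

```python
import re
from collections import Counter


def words_by_frequency(text):
    """Return (word, count, length) rows, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    rows = ((w, n, len(w)) for w, n in counts.items())
    return sorted(rows, key=lambda row: (-row[1], row[0]))


def mean_length(rows):
    """Average word length over a list of (word, count, length) rows."""
    return sum(length for _w, _n, length in rows) / len(rows)


sample = "the cat and the dog and the responsibilities"  # toy text
rows = words_by_frequency(sample)
top, rest = rows[:2], rows[2:]

print(rows[0])  # ('the', 3, 3)
print(mean_length(top), mean_length(rest))
```

On a real corpus the top-ranked rows should come out shorter on average than the tail, if the law holds for that text.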

This pressure for brevity is problematic for Decimese.
My approach to core vocabulary closed-class function words is pretty wordy - at least until certain consonant cluster data-compression approaches are used.

Take, for example, the pronoun "I".
Single, first person, pronoun.
There are 3 pieces of information there.
When we contrast he, she and it, we also find masculine/feminine and human/not categories.
Essentially, we now have FIVE discrete facts.
Add a pair-plural category, and we now have SIX.
Being able to include human/not and masc./fem. aspects to all the pronouns might be nice - particularly if we ever see sentient AIs, LOL.

How to compress such a complex approach into one syllable?
Basic word form: CV (consonant then vowel) + ending.
Consonant clusters L/R, W/Y. Plus limited vowel diphthongs.


Initially, I had hoped to have ten vowel sounds.
Then I realized that was hopeless for an international language.
I was forced to reduce this to five - pretty much as per Esperanto.
I still like the idea that 'long vowels' may be optional for advanced speakers, while word particles or additional syllables can stand in within the same role.

Here is a proposed basic design for a Decimese pronoun.
1) single/ plural (math concept) - plus 'pair' dual plural concept.
2) masc./fem. (one can use this with animals, or even objects if desired)
3) in/out (space concept, or math object manifold concept)
4) near/far (space concept)
5) human/not (one can humanize something if desired)
Some interesting options result from this approach.
One could say I BUT
a) not human
b) masculine (I, man! <:)

Core syllable: CV (plus ending consonant or 'cap syllable')
C plus LRWY plus V plus vowel diphthong plus ending.

There would be pressure to minimize detail, once the subject has been described.
You / plural, or they, would likely become you.
She becomes it.
We becomes I.

The H-sound serves all manner of special functions, creating special-duty syllables in the form H plus vowel.
This was borrowed from Ygyde.

Closed class function words could use reserved syllable and word forms.
CV is the most common word form.
For vocabulary needs, I stick to CV plus (nasal consonant ending).
This means that any CV word is by definition a function word.
The objection would initially seem to be a loss of clear word boundaries.

I meant.
Me meant.

But in Decimese this is not true.
Nasal consonants always imply a word-final consonant position.
Yes, this is a shameless attempt to cater to the Chinese.
The word-initial consonant is clearly differentiated from word-middle position by the voiced/voiceless distinction.
(Chinese could use heavy aspiration in lieu of this.)
E.g. P/B pair. Bam. Not pam. Bapam. But not pabam.
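
Here is a rough sketch of how those boundary rules could be checked by machine. Only the P/B pair comes from the example above; the T/D and K/G pairs, the two nasals and the five vowels are my assumptions, and the function names are made up.

```python
import re

VOICED = "bdg"      # word-initial only (D and G are assumed alongside B)
VOICELESS = "ptk"   # word-middle only (T and K are assumed alongside P)
NASALS = "mn"       # word-final only
VOWELS = "aeiou"

# Voiced C + V, then any number of voiceless C + V, then an optional nasal.
# A bare CV with no nasal ending is a function word.
WORD = re.compile(
    rf"[{VOICED}][{VOWELS}](?:[{VOICELESS}][{VOWELS}])*[{NASALS}]?"
)


def is_valid_word(word):
    """True if the word obeys the initial/middle/final consonant rules."""
    return WORD.fullmatch(word) is not None


def segment(stream):
    """Split an unspaced stream into words at each voiced consonant."""
    pieces = re.findall(rf"[{VOICED}][^{VOICED}]*", stream)
    if "".join(pieces) != stream or not all(is_valid_word(p) for p in pieces):
        raise ValueError(f"cannot segment {stream!r}")
    return pieces


print(is_valid_word("bam"), is_valid_word("pam"))      # True False
print(is_valid_word("bapam"), is_valid_word("pabam"))  # True False
print(segment("babapam"))                              # ['ba', 'bapam']
```

Because a voiced consonant can only open a word, the stream splits without needing spaces, which answers the word-boundary objection.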
Note: this IS culturally biased.
English speakers, I think, tend to devoice word-middle-position consonants.
Whereas the French do the opposite.
So yup, once again we have a cultural bias.
I know.
English is king, and Chinese is the rising star.
French was fighting a losing battle against Esperanto to maintain its pre-eminence in diplomacy a century ago.

Back to pronoun construction.
5 possible word-initial consonants. 5 vowels.
25 possible permutations.
4 possible consonant clusters.
??? possibly 5 possible vowel diphthongs.
OK, let's narrow this down.
Only ONE word-initial consonant designated for pronouns.

1 x 5 x 5... 25.
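
To see the 25 forms spelled out, here is a small enumeration. It reads the second "5" as the cluster slot (no cluster, or one of L, R, W, Y), which is my interpretation; the pronoun consonant "d" is a placeholder.

```python
from itertools import product

PRONOUN_INITIAL = "d"               # hypothetical reserved pronoun consonant
CLUSTERS = ["", "l", "r", "w", "y"]  # no cluster, or one of LRWY
VOWELS = "aeiou"

forms = [PRONOUN_INITIAL + c + v for c, v in product(CLUSTERS, VOWELS)]

print(len(forms))  # 25
print(forms[:5])   # ['da', 'de', 'di', 'do', 'du']
```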
Because LRWY consonant clusters are mutually exclusive without additional syllables, we would want various mutually exclusive states described.
At first blush, I think single/plural, plural/dual and collective noun might be a good default.
With about 5 additional vowel diphthongs, once again, we want mutually exclusive.
Something we *could* do, though it sounds complex, is allow some vowel diphthongs and/or consonant clusters to denote compound conditions.
E.g. single AND masculine. Plural and feminine.
HE. And THEY (plural of she).

Ceqli is willing to sacrifice some brevity for clarity.
Go. I.
Zi. You.
Gozi. ... We.

Some languages have we-but-not-you or we-but-not-he designations.
Again, a variant prime number system could work.
(see much earlier entry).

If I just wanted to map English pronouns then my job becomes much easier.
The he/she/it difference does a sloppy job of identifying the subject.
Some dogs get called she.
Some cars get called she.
Only third-person gets a gender identifier.
It makes this optional.
And plural hides it again.
Sticking to spatial/math concepts, we then have strictly optional gender and human indicators.
Suddenly, calling a bitch (female dog) 'she' or 'it' ceases to be sloppy.
A category for living/not and adult/not is also useful.
Dog. Bitch. Cur. Puppy/ dog. Plural/not.

At public speeches, the speaker will often use the term 'ladies and gentlemen'.
Plural-human-adult-honorific, males same.
Six syllables.
Plural-human-adult-honorific, second-person (out, near).

Note the implications for vocabulary building of this word-particle approach.
Dog, bitch, cur. One word, plus some endlessly recycled core concepts.
A pair of jeans. A whole bunch of jeans. Dual plural, plural.
A murder of crows, a herd of cattle (which is not clearly derived from cow).
Plural crow. Plural cow.
Steer, stallion, et al.
Just masculine, adult.
Hmm. Adult / not denotes living inherently.
Dog, puppy. Cat, kitten.
Person, people. Human indicator.
She/ dog. A bitch, living indicator.
She/ car. Female, no living or human indicator.

Well that's enough for now.

Saturday, August 1, 2009

insight, incorporating English punctuation into Decimese. Globe and Mail punctuation Fail!

My pal Roberto is back from Europe.

In the course of conversation, he mentioned Portuguese has two verbs for temporary and permanent states.
His name IS Roberto.
His job *is (for now)* data entry.

Anyway, the French/ Esperanto method to express possession is informative.
My French sux ass.
Le gateau de la femme.
Le gateau du (de le) homme.
Esperanto uses a similar mechanism without the masculine condensed approach.
What if... you used a consonant cluster in a designated part of the word to denote this invisible punctuation trick in English?
The cake of the woman.
The cake of the man.
The cake of the women/men.
The woman's cake.

Globe and Mail this Saturday, page A3.
Can YOU find the punctuation fail?
That spoiled my morning.

Yup, you guessed it - comma retardation.
Second only to apostrophe retardation.
So what happened here?
One does need 2 commas IF one is indicating a description that delineates only a sole subject.
The dog, which lived in the doghouse on the hill, ...
THIS is not an example of this.
Please please please, somebody fire this journalist.
Then find the editors that let it slip through.

Along with the largely lackluster writing.
The Focus section made up for it, I guess.
Here's a hint.
If there should be a pause in spoken English, then there should likely be some punctuation there in written form.

Tuesday, July 28, 2009

Ontario U students failing language test more. how 'n why

But Barrett can't explain some of the problems.

"Students say, 'How could I have failed? I got 93 in Grade 12 English!' "It's extremely puzzling to me," Barrett said.

High schools all have different standards, she said. Unlike the United States or the United Kingdom, Canada doesn’t have standardized university entrance exams for high-school students.

In addition, the university level doesn't have a formal relationship with high school teachers in Canada, so there's no simple way to communicate the skills that are necessary for success in university.

The most common mistakes made by English-speaking students are punctuation errors, she said. Students often don't know how to use a colon, or an apostrophe.

"Possessives are a nightmare," said Barrett.

Ironically, these problems are less of an issue for students learning English as a second language. Their grammar is not so bad, but they don't always have a "feel" for everyday quirks of the language.

For example, they might not understand why you can say, "I flew to London on a plane," but you wouldn't say, "I drove to Toronto on a car," Barrett said.

One of the biggest problems for foreign students is lack of practice in speaking English.
D: 5% worse than 5 years ago.

I could not have entered university today with my high school marks.
Articles like this vindicate my suspicion that high school marks are being inflated.

I have advice for U students.
When you get an essay back, read over the marking.
See what grammar and syntax errors you made.
I could have improved faster if I had done so...

Friday, July 24, 2009

creepy xray of speech, animated

D: it looks terribly complex, doesn't it?
Consider that nerve signals arrive at different times to various speech parts.
The parts have different musculature and inertia.
It boggles the mind that we can speak!

Friday, July 17, 2009

on how to categorize african click-sounds. IPA

"We wanted to classify clicks in the same way we classify other consonants," said Miller, who was a visiting faculty member at the University of British Columbia during the 2008-2009 academic year. "We think we've been pretty successful in doing that."

N|uu is severely endangered with fewer than 10 remaining speakers, all of whom are more than 60 years of age. Linguists are working diligently to document the unique aspects of this language before it disappears.

D: see the related article list.
Science Daily is excellent for related articles.
I can spend hours just clicking related links - and have.


D: here are opera translations in IPA.
This demonstrates how precise a phoneme system can be.

D: a sample.

Now if I can just finish HIOXian, LOL!
Darn prepaid credit card still won't work after 2 months of hassle.
(For the font editor software textbook.)
I should have just bought a prepaid from 7/11.
BMO is driving me NUTS.

As of 2008, there are 107 distinct letters, 52 diacritics, and four prosody marks in the IPA proper.

D: So about 160 'letters' (107 + 52 + 4 = 163).
HIOXian should manage that, but much more methodically.
And more clearly on a computer monitor, or for scanning and copying.

Thursday, July 9, 2009

plain language on medical consent forms

The Toolkit is based on plain language—a communication style centered on the audience's needs and abilities. Researchers can see how to use plain language in study materials through the Toolkit's many concrete examples, including an alternative word list. Here's a brief excerpt:

* Instead of Abdomen, try Stomach, tummy, belly
* Instead of Abrasion, try Scrape, scratch
* Instead of Absorb, try Take in fluids, soak up
* Instead of Abstain from, try Don't, don't use, don't have, go without
* Instead of Accomplish, try Carry out, do
* Instead of Accrue, try Add, build up, collect, gather
D: sometimes the plain language is either imprecise or requires multiword lexemes.
But sometimes it works just fine.
English suffers from massively redundant vocabulary.

See my entry on sofas, couches, and chesterfields.

I once needed to explain "celibate" as "can't get none" in one of my factory placements, LOL!

Tuesday, July 7, 2009

monkeys know grammar

BBC today: monkeys recognize bad grammar.

D: I work in a blue-collar setting.
There are certain errors that the 'unwashed masses' make in English.

1) double negatives.
"I ain't done nothing wrong."
2) strong/weak verbs
"I seen him."
3) in a few cases, confusing parts of speech.
I.e. these/ those/ them/ their.

D: I saw the high school scores for literacy.
I think only 15% were failing some benchmark test in Ontario.
In English, 5-10% would be expected to fail without intervention due to learning disabilities.