Monday, August 31, 2009

twitter kids coin new words

As youngsters spend more and more time chatting to their friends online, so they have tended to express themselves in language that mirrors everyday speech more closely.
Some words are contractions of expressions that have themselves only just become established online, most notably 'noob', for those who cannot be bothered to say – or more probably to write – 'newbie'.

D: I guess the word for Avalon means apple too but originally meant all fruit in Celtic.
I'm watching a TV series on the history of English and its evolution.
It is the structure and not the vocabulary that carried it through.
Cultural dominance can be indicated by metronyms.
The French contributed fruit, which displaced avalon. Avalon was demoted to a niche.

Friday, August 28, 2009

XML and embedding info in computer images

"Such systems have typically required end users to use a manually developed ontology – a lexicon of predefined concepts used to assign machine-readable semantic meaning to information – and then train the software to correctly annotate different images.

The ImageNotion system strips away much of that complexity for the end user, combining semantic annotation with a variety of other technologies, from text mining and object recognition to face detection and face identification, in order to permit many more images to be accurately annotated with little or no user intervention."


D: this bears some passing resemblance to Attempto/ACE, in that it attempts to restrict English language usage.
Leading to the usual quagmire of dozens, if not hundreds, of popular natural languages.
See the entry on EU costs for translation. And the UN - OMG!

Thursday, August 13, 2009

what grade level are political speeches written at?

Other speeches delivered during the convention ranged from grade levels of 10.5 to 6.4:

* 10.5 (Al Gore),
* 10.3 (Bill Clinton),
* 8.7 (Hillary Clinton)
* 8.0 (Michelle Obama)
* 6.4 (Joe Biden)

Two of the most important and popular speeches in American history, The Gettysburg Address by Abraham Lincoln and Martin Luther King’s I Have a Dream speech also registered at ninth grade levels (9.1 and 8.8 respectively).
D: so one needs to speak at the level of basic literacy, not functional literacy.
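
The article excerpt does not say which readability formula produced these scores. One widely used formula is the Flesch-Kincaid grade level, so here is a minimal sketch assuming that formula; the word, sentence and syllable counts in the example are made up.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    # Longer sentences and more syllables per word both push the grade up.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical speech excerpt: 100 words, 10 sentences, 150 syllables.
grade = flesch_kincaid_grade(100, 10, 150)
print(round(grade, 2))
```

Short sentences built from short words are what keep a speech in the 6 to 10 range.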

So what are all those kids learning after that for?
They can watch TV. <:

The Laubach teaching method has a system of 5 levels.
Level 2 seems to be the inability to smoothly pronounce polysyllabic words without breaking them down.

Frankly, this works well in Decimese with its basic CV- construction.

Core concepts of one syllable words from ancient languages.

D: I might be in a position to take weekdays off next month.
I'd use that to finish my Hioxian letter system and to develop the tier 1 of Decimese.
Tier 1 is strictly closed class function words.
Doesn't sound very hard, until one realizes I don't just make a clone of English.
Or any other attempt.
I attempt to incorporate the composite nature of many closed class words.
I harp mostly on pronouns, since they are the most obvious example.
As always, the whole language is derived from MELTS - math, ethics, logic, time, space.

I'm reading "The God Delusion" right now.
I'd like a language that has optional overt indicators of literal/metaphor/spiritual/ a religious "truth".
Imagine if the ancients had that.
We'd know if we are looking at a parable or anecdote, for example.
History or myth.

brain treats living things discretely from non-living

For unknown reasons, the human brain distinctly separates the handling of images of living things from images of non-living things, processing each image type in a different area of the brain. For years, many scientists have assumed the brain segregated visual information in this manner to optimize processing the images themselves, but new research shows that even in people who have been blind since birth the brain still separates the concepts of living and non-living objects.

D: I'd like to have indicators for
1) living and
2) human.

The above article might explain why pets are called he/she and not it.
I pointed out to a woman I know that I refer to pets as it.
Whether the dog is male or female does not matter to me.
A bitch or a cur is the same in how I interact with it.
The woman I spoke to has named her car, and it is masculine, LOL.

I found out the hard way that parents don't like their infant or toddler called it.
But they have no discernible gender, so I don't see what else I would call ... it.

I propose for Decimese
1) default to "it" - third person, singular
2) option to indicate living
3) option to denote masculine/feminine, with neuter default.
4) option for human (perhaps sentient, to allow for AIs/aliens/transgenics? <;)
So "he" would be third person, singular, neuter with human, living, and masculine denoters.
He/dog would be same but without human.
He/car would be same but without human, living.
The same car could take the default "it" pronoun.
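
A rough sketch of the proposal above, treating "it" as the default and each detail as an optional marker. The class name, field names and the "+"-joined gloss format are invented for illustration; they are not Decimese word forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThirdPerson:
    """Third person singular 'it' plus optional overt markers."""
    living: bool = False
    human: bool = False
    gender: Optional[str] = None  # "masc", "fem", or None for the neuter default

    def gloss(self) -> str:
        parts = ["it"]  # the root stays visible in every form
        if self.living:
            parts.append("living")
        if self.human:
            parts.append("human")
        if self.gender:
            parts.append(self.gender)
        return "+".join(parts)


he_person = ThirdPerson(living=True, human=True, gender="masc")
he_dog = ThirdPerson(living=True, gender="masc")
he_car = ThirdPerson(gender="masc")
plain_it = ThirdPerson()

for p in (he_person, he_dog, he_car, plain_it):
    print(p.gloss())
```

Dropping a marker never changes the root, so the same car can be glossed with or without a gender.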

Unlike it/he/she, I want the root/stem of "it" to be clear in he or she.
I also want more potential detail with plural forms.
It/ it(s). He(s). Et al.
"They" doesn't indicate gender or living or anything else.

The key theme is optional overt indicators of details.

Tuesday, August 11, 2009

engrish signs, funny, informative


Sunday, August 9, 2009

idiom - clear and not

But then, the background for "pulling someone's leg" isn't exactly clear as day either. By idioms that make sense, I mean those rooted in an observable phenomenon. "He's barking up the wrong tree." In other words, he's looking in the wrong place for what he wants. The phrase is rooted in the observable (or at least imaginable) phenomenon of a dog confused about where that pesky cat or squirrel escaped to.

Presumably all idioms start out this way, but some have become disconnected. An idiom, or idiomatic expression, cannot be puzzled out by breaking it down into parts and defining them individually. An idiom works as a whole: to kick the bucket meaning "to die," for instance.

"Idiom" comes from the Greek word idios, meaning "one's own." An idiomatic expression is one that "we understand among ourselves," even if it baffles foreigners, such as Miguel, a student of translation in Spain, who a few months ago posted a query online, "[W]hat is in the bucket and why would anyone kick a bucket in the first place[?]"

D: the punctuation fail is from page A2 of this weekend's Globe and Mail.

"Pranknet used untraceable Skype accounts to route calls creating a shield of anonymity."

D: to route calls COMMA! Gah.

Page A3 last weekend, then A2 this weekend.
I wonder if they'll drop the ball on the front page next weekend...
All for lack of a proof-reader.
I'm for hire for 100 bucks, by the way. <:

Monday, August 3, 2009

study of Zipf's law

" The law of brevity, proposed by the American philologist George K. Zipf, along with others, shows that the most frequently-used words are the shortest ones.

...when dolphins move on the surface of the water they tend to perform the most simple movements, in the same way that humans tend to use words made up of less letters when they are speaking or writing, in so-called "linguistic economy".

The research study includes the case of Oscar Wilde's novel The Picture of Dorian Gray. The most-used word is the three-letter article "the", while other larger ones, such as "responsibilities" are hardly found at all."


D: "A" and "I" are in the top 10 most common words and are only one letter.
We do not see a two-syllable word until #61 for "people".
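
The brevity claim is easy to poke at with a word count. A minimal sketch follows; the sample sentence is made up and far too small to prove anything, and the helper names are only for illustration.

```python
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def top_words(text, n=5):
    """Return (word, count, length) for the n most frequent words."""
    return [(w, c, len(w)) for w, c in word_counts(text).most_common(n)]


def mean_lengths(text, min_count=2):
    """Average word length of frequent words versus rare words."""
    counts = word_counts(text)
    frequent = [len(w) for w, c in counts.items() if c >= min_count]
    rare = [len(w) for w, c in counts.items() if c < min_count]
    avg = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return avg(frequent), avg(rare)


sample = "the cat saw the dog and the dog saw a responsibility"
print(top_words(sample, 1))
print(mean_lengths(sample))
```

Run on a whole novel instead of one sentence, the gap between the two averages is the thing to look at.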

This pressure for brevity is problematic for Decimese.
My approach to core vocabulary closed-class function words is pretty wordy - at least until certain consonant cluster data-compression approaches are used.

Take, for example, the pronoun "I".
Single, first person, pronoun.
There are 3 pieces of information there.
When we contrast he, she and it, we also find masculine/feminine and human/not categories.
Essentially, we now have FIVE discrete facts.
Add a pair-plural category, and we now have SIX.
Being able to include human/not and masc./fem. aspects to all the pronouns might be nice - particularly if we ever see sentient AIs, LOL.

How to compress such a complex approach into one syllable?
Basic word form: CV (consonant then vowel) + ending.
Consonant clusters L/R, W/Y. Plus limited vowel diphthongs.


Initially, I had hoped to have ten vowel sounds.
Then I realized that was hopeless for an international language.
I was forced to reduce this to five - pretty much as per Esperanto.
I still like the idea that 'long vowels' may be optional for advanced speakers, while word particles or additional syllables can stand in within the same role.

Here is a proposed basic design for a Decimese pronoun.
1) single/ plural (math concept) - plus 'pair' dual plural concept.
2) masc./fem. (one can use this with animals, or even objects if desired)
3) in/out (space concept, or math object manifold concept)
4) near/far (space concept)
5) human/not (one can humanize something if desired)
Some interesting options result from this approach.
One could say I BUT
a) not human
b) masculine (I, man! <:)
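
A sketch of that design as a feature bundle. It assumes in/out separates the speaker from everyone else and near/far separates second person from third, which matches the "second-person (out, near)" gloss used further down this entry. All names here are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pronoun:
    number: str = "single"        # "single", "pair", or "plural"
    inside: bool = False          # in/out: True means the speaker (assumed reading)
    near: bool = False            # near/far: only matters when inside is False
    gender: Optional[str] = None  # "masc", "fem", or None when left unstated
    human: Optional[bool] = None  # True, False, or None when left unstated

    def person(self) -> int:
        """Map the spatial features onto English first/second/third person."""
        if self.inside:
            return 1
        return 2 if self.near else 3

    def markers(self) -> List[str]:
        """List only the optional details that were overtly stated."""
        out = []
        if self.gender:
            out.append(self.gender)
        if self.human is not None:
            out.append("human" if self.human else "not human")
        return out


i_not_human = Pronoun(inside=True, human=False)  # "I", but not human
i_man = Pronoun(inside=True, gender="masc")      # "I, man!"
you_all = Pronoun(number="plural", near=True)

print(i_not_human.person(), i_not_human.markers())
print(i_man.person(), i_man.markers())
print(you_all.person(), you_all.number)
```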

Core syllable: CV (plus ending consonant or 'cap syllable')
C plus LRWY plus V plus vowel diphthong plus ending.

There would be pressure to minimize detail, once the subject has been described.
You / plural, or they, would likely become you.
She becomes it.
We becomes I.

The H-sound serves all manner of special functions, creating special-duty syllables in the form H plus vowel.
This was borrowed from Ygyde.

Closed class function words could use reserved syllable and word forms.
CV is the most common word form.
For vocabulary needs, I stick to CV plus (nasal consonant ending).
This means that any CV word is by definition a function word.
The objection would initially seem to be a loss of clear word boundaries.

I meant.
Me meant.

But in Decimese this is not true.
Nasal consonants always imply a word-final consonant position.
Yes, this is a shameless attempt to cater to the Chinese.
The word-initial consonant is clearly differentiated from word-middle position by the voiced/voiceless distinction.
(Chinese could use heavy aspiration in lieu of this.)
E.g. P/B pair. Bam. Not pam. Bapam. But not pabam.
Note: this IS culturally biased.
English speakers, I think, tend to devoice word-middle-position consonants.
Whereas the French do the opposite.
So yup, once again we have a cultural bias.
I know.
English is king, and Chinese is the rising star.
French was fighting a losing battle against Esperanto to maintain its pre-eminence in diplomacy a century ago.
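
Going back to the word-boundary rules above (a nasal closes a word, a voiced consonant opens one, a voiceless consonant sits in the middle), here is a rough sketch of a splitter for an unbroken stream of sounds. The letter sets are placeholders, not the actual Decimese inventory, and aspiration is not modeled.

```python
VOICED = set("bdgvz")  # assumed word-initial consonants
NASALS = set("mn")     # assumed word-final consonants


def split_words(stream: str) -> list:
    """Split a run of sounds into words using voicing and nasal cues."""
    words = []
    current = ""
    for ch in stream:
        # A voiced consonant can only start a word, so close the open one first.
        if ch in VOICED and current:
            words.append(current)
            current = ""
        current += ch
        # A nasal can only end a word.
        if ch in NASALS:
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


print(split_words("bapambam"))
print(split_words("babam"))
```

The second call shows a bare CV function word ("ba") separating cleanly from the vocabulary word that follows it.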

Back to pronoun construction.
5 possible word-initial consonants. 5 vowels.
25 possible permutations.
4 possible consonant clusters.
??? possibly 5 possible vowel diphthongs.
OK, let's narrow this down.
Only ONE word-initial consonant designated for pronouns.

1 x 5 x 5... 25.
Because LRWY consonant clusters are mutually exclusive without additional syllables, we would want various mutually exclusive states described.
At first blush, I think single/plural, plural/dual and collective noun might be a good default.
With about 5 additional vowel diphthongs, once again, we want mutually exclusive.
Something we *could* do, though it sounds complex, is allow some vowel diphthongs and/or consonant clusters to denote compound conditions.
E.g. single AND masculine. Plural and feminine.
HE. And THEY (plural of she).
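
The counting above can be checked by brute force. This sketch reads "1 x 5 x 5" as one reserved consonant, five cluster options (none, or one of L/R/W/Y) and five vowels; that reading is an assumption, and so is the choice of "d" as the pronoun consonant.

```python
from itertools import product

PRONOUN_CONSONANT = "d"              # hypothetical reserved consonant
CLUSTERS = ["", "l", "r", "w", "y"]  # no cluster, or one of L/R/W/Y
VOWELS = ["a", "e", "i", "o", "u"]   # the five-vowel inventory

# One consonant x five cluster options x five vowels.
forms = [PRONOUN_CONSONANT + c + v for c, v in product(CLUSTERS, VOWELS)]

print(len(forms))
print(forms[:6])
```

Each cluster slot holds only one value at a time, which is why the states assigned to it need to be mutually exclusive.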

Ceqli is willing to sacrifice some brevity for clarity.
Go. I.
Zi. You.
Gozi. ... We.

Some languages have we-but-not-you or we-but-not-he designations.
Again, a variant prime number system could work.
(see much earlier entry).

If I just wanted to map English pronouns then my job becomes much easier.
The he/she/it difference does a sloppy job of identifying the subject.
Some dogs get called she.
Some cars get called she.
Only third-person gets a gender identifier.
It makes this optional.
And plural hides it again.
Sticking to spatial/math concepts, we then have strictly optional gender and human indicators.
Suddenly, calling a bitch (female dog) 'she' or 'it' ceases to be sloppy.
A category for living/not and adult/not is also useful.
Dog. Bitch. Cur. Puppy/ dog. Plural/not.

At public speeches, the speaker will often use the term 'ladies and gentlemen'.
Plural-human-adult-honorific, males same.
Six syllables.
Plural-human-adult-honorific, second-person (out, near).

Note the implications for vocabulary building of this word-particle approach.
Dog, bitch, cur. One word, plus some endlessly recycled core concepts.
A pair of jeans. A whole bunch of jeans. Dual plural, plural.
A murder of crows, a herd of cattle (which is not clearly derived from cow).
Plural crow. Plural cow.
Steer, stallion, et al.
Just masculine, adult.
Hmm. Adult / not denotes living inherently.
Dog, puppy. Cat, kitten.
Person, people. Human indicator.
She/ dog. A bitch, living indicator.
She/ car. Female, no living or human indicator.

Well that's enough for now.

Saturday, August 1, 2009

insight, incorporating English punctuation into Decimese. Globe and Mail punctuation Fail!

My pal Roberto is back from Europe.

In the course of conversation, he mentioned Portuguese has two verbs for temporary and permanent states.
His name IS Roberto.
His job *is (for now)* data entry.

Anyway, the French/ Esperanto method to express possession is informative.
My French sux ass.
Le gateau de la femme.
Le gateau du (de le) homme.
Esperanto uses a similar mechanism without the masculine condensed approach.
What if... you used a consonant cluster in a designated part of the word to denote this invisible punctuation trick in English?
The cake of the woman.
The cake of the man.
The cake of the women/men.
The woman's cake.

Globe and Mail this Saturday, page A3.
Can YOU find the punctuation fail?
That spoiled my morning.

Yup, you guessed it - comma retardation.
Second only to apostrophe retardation.
So what happened here?
One does need 2 commas IF one is indicating a description that delineates only a sole subject.
The dog, which lived in the doghouse on the hill, ...
THIS is not an example of this.
Please please please, somebody fire this journalist.
Then find the editors that let it slip through.

Along with the largely lackluster writing.
The Focus section made up for it, I guess.
Here's a hint.
If there should be a pause in spoken English, then there should likely be some punctuation there in written form.