Monday, November 9, 2009

babies born knowing their native tongue a bit. vocabulary design. esperanto! <:

http://www.newscientist.com/blogs/shortsharpscience/2009/11/actung-baby-german-babies-say.html

In the latest study, Kathleen Wermke of the University of Würzburg in Germany and colleagues claim that the womb may also be where we learn to produce our first sounds. Wermke's team found that French newborns produced "rising", or low to high, contours of sound, whereas German newborns preferentially produced "falling", or high to low, contours.

The researchers write that these are "consistent with the intonation patterns observed in both of these languages", and conclude that these differences are in place so soon after birth that they must have been learned prenatally.

------
D : well how about that.

My roomie and I just watched the first half of a documentary show called "The History of English". (Punctuation placement intended.)
This show portrays a language that very nearly never became what it is today.
It served certain political ends to get used, often in support of local rulers.
It absorbed vocabulary from many other tongues.
However, the basic structure remained stable. This basic structure is reasonably easy to learn.
Again, suprasegmental aspects and function words remained highly resilient.

During the Renaissance, English generated vast amounts of new vocabulary from Greek and Latin sources. This resembled the method of Esperanto in passing.
Some words continue to be used, while others fell out of favour.
This leads to arbitrary incidents where the obvious opposite prefix cannot be used to generate the opposite word from the word core.
As much as one can praise English, a neat and concise core vocabulary is not among its virtues. The show described it as nuanced, flexible and extensive.
I feel sorry for a new immigrant ESL student who gets to rote-memorize huge lexicon lists. I somehow doubt they use the word nuanced...

The Anglo-Saxon Germanic attempt to create a 'pure' English that was true to its roots is telling. Many new words were coined that involved compound words.
This reminded me a bit of PIE (Proto-Indo-European).
I think the PIE word for marriage literally means "give-heart".
Such an approach can complement thoughtful infix use for nuance, while retaining a fairly small core vocabulary to learn.
As Esperanto and Lojban have found out, a language needs a pretty extensive basic vocabulary to compound to generate more vocabulary.
If there are too few, the newly generated compound words are sufficiently vague to still require constant interpretation and clarification.
This is also true of infixes.

http://esperanto-usa.org/node/77

Esperanto lets you invent your own vocabulary

You can combine words, prefixes, and suffixes as you speak to make new words. In English, you can't just stick "un-" in front of the word "recommend" (unrecommend? disrecommend?). In Esperanto, if the opposite of a word ― a noun, a verb, an adjective ― makes sense, go ahead! Malrekomendi is perfectly good Esperanto. You want to really, really malrecommend something? Malrekomendegi! Every Esperanto speaker will get your point.
Esperanto has a recognizable vocabulary

You may have recognized several of the Esperanto words above, or seen related words in English (bona -> bonus, alta -> altitude, feliĉa -> felicitous). About 70% of Esperanto vocabulary is directly or indirectly derived from Latin roots, many of which also appear in English. Another major chunk is from Germanic roots (hundo, hound or dog). So you'll understand a good part of the words with little trouble.
------------
D: Latin/Germanic - claiming particular ease for English speakers on the one hand, then claiming some mythical international character on the other...
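
The word-building in the quoted passage (mal- for the opposite, -eg- for the intensifier) can be sketched in a few lines of Python. The helper names `opposite` and `intensify` are my own, and the code only handles regular words ending in a single grammatical vowel, so treat it as a toy illustration and not a full Esperanto morphology.

```python
# Toy model of Esperanto affixation: prefix mal- (opposite), suffix -eg- (intensifier).
# Assumption: every input word ends in one of the grammatical endings in ENDINGS.

ENDINGS = ("o", "a", "e", "i")  # noun, adjective, adverb, infinitive


def split_ending(word):
    """Split a word into (stem, grammatical ending)."""
    if len(word) > 1 and word.endswith(ENDINGS):
        return word[:-1], word[-1]
    raise ValueError(f"no recognised ending on {word!r}")


def opposite(word):
    """Prefix mal- to form the opposite."""
    return "mal" + word


def intensify(word):
    """Insert -eg- between the stem and the grammatical ending."""
    stem, ending = split_ending(word)
    return stem + "eg" + ending


print(opposite("rekomendi"))             # malrekomendi
print(intensify(opposite("rekomendi")))  # malrekomendegi
print(opposite("bona"))                  # malbona
```

The rules compose mechanically, which is exactly the selling point. Whether the resulting word is *understood* the way the speaker intended is a separate question.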

http://www.freelang.net/dictionary/dic-lists.php

D: the English-to-Esperanto contains c. 17,000 items.

----
http://answers.yahoo.com/question/index?qid=20090928115959AAM6Hh7

There are about 5000 official roots in Esperanto. You can find a hard-to-navigate list of all the official roots here: http://www.akueck.de/oleo.zip

D: but the extensive PIV dictionary has about 15,000 items.

-----
This is an appropriate place for me to say a few words about the material for the dictionary. Much earlier, when I had examined and rejected every non-essential from the grammar, I had desired to exercise the principles of economy in respect of the word-material also. Thinking that it was a matter of indifference what form any particular word took, so long as it was agreed that it should express a given idea, I simply invented words, taking care only that they should be as short as possible, and did not contain an unnecessary number of letters. Instead of using "interparoli" (to converse), a word of eleven letters, why should we not express the idea just as well by some word of two letters, say, "pa"? So I simply wrote the shortest and most easily pronounced mathematical series of conjoined letters, to each factor of which series I gave a certain meaning (e.g., a, ab, ac, ad, ba, ca, da . . .; e, eb, ec . . .; be, ce . . .; aba, aca . . . etc.).

But I immediately rejected this notion, for my own personal experiments proved that these invented words were very difficult to learn, and even more so to remember. I came to the conclusion that the material for the dictionary must be Romance-Teutonic, altered only so far as regularity and other important requirements of language demanded. Standing upon this ground, I soon observed that the present languages possessed an immense supply of words already international, with which all the nations had a prior acquaintance, and which formed a veritable treasure house for the future international language--and, of course, I utilised this treasure.

D: Zamenhof on his basic language design philosophy.
He rejects philosophical language tenets from the get-go.
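
For a feel of the "mathematical series of conjoined letters" Zamenhof describes, here is a rough Python sketch. The pattern order (V, VC, CV, VCV) is my guess from the examples he lists, the three-consonant alphabet is truncated for display, and `capacity` simply counts every possible letter string, pronounceable or not.

```python
# Sketch of a "shortest possible words" series, guessed from the quoted examples
# (a, ab, ac, ad, ba, ca, da ...; aba, aca ...). Not Zamenhof's actual list.


def series_for_vowel(vowel, consonants):
    """Return the V, VC, CV and VCV forms built around one vowel."""
    forms = [vowel]
    forms += [vowel + c for c in consonants]          # ab, ac, ad
    forms += [c + vowel for c in consonants]          # ba, ca, da
    forms += [vowel + c + vowel for c in consonants]  # aba, aca, ada
    return forms


def capacity(alphabet_size, max_length):
    """Count all letter strings of length 1..max_length."""
    return sum(alphabet_size ** k for k in range(1, max_length + 1))


print(series_for_vowel("a", "bcd"))
# ['a', 'ab', 'ac', 'ad', 'ba', 'ca', 'da', 'aba', 'aca', 'ada']
print(capacity(28, 3))  # 22764, using the 28 letters of the Esperanto alphabet
```

So three letters or fewer would already cover more strings than the dictionary sizes mentioned above. The scheme fails on memorability, not on capacity, which is the reason Zamenhof gives for dropping it.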

There was much to lop, alter, correct, and radically to transform. Words and forms, principles and postulates, jostled with and opposed each other, whereas in theory, taken separately and not subjected to extended tests, they had appeared to me perfectly good. Such things, for instance, as the indeterminate preposition 'je,' the elastic verb 'meti,' the neutral termination 'aŭ,' etc, possibly would never have entered into my head if I had proceeded only upon theory. Some forms which had appeared to possess a wealth of advantage proved in practice to be nothing but useless ballast, and on this account I discarded several unnecessary suffixes.

D: but here is the price to pay for starting with such a hodgepodge...

------
This problem I considered for a long while. At last the so-called secret alphabets, which do not necessitate any prior knowledge of them, and enable any person not in the secret to understand all that is written if you but transmit the key, gave me an idea. I arranged my language after the fashion of such a key, inserting not only the entire dictionary but also the whole grammar in the form of its separate elements. This key, entirely self-contained and alphabetically arranged, enabled anyone of any nationality to understand without further ado a letter written in Esperanto.

D: a concept I explore in HIOXian anatomical-based stylized pictograms - ideograms.
-----
D: Basic English by Ogden and derivatives claim a core vocabulary of about 1000 words.
Allowing for multi-word lexemes and generated compound words, this likely approaches 10,000. One can pay now (early) or later.
Fewer core vocabulary items, but more unclear and individually learned MWLs?
Any MWL that cannot be deconstructed and understood from its original components amounts to another basic vocabulary item that must be learned.
They often amount to idiom, which must be learned as 'part of the culture'.
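
The bookkeeping behind that point can be sketched in Python. The word list and the transparent/opaque flags below are made-up illustrations (my own judgment calls, not Ogden's data); the only claim is the counting rule: every opaque MWL counts as one more item to memorize.

```python
# Hypothetical example: "effective" vocabulary = core words + opaque multi-word lexemes.
# The transparency flags are illustrative guesses, not measured data.


def learning_burden(core_words, multiword_lexemes):
    """Count core words plus every MWL that cannot be understood from its parts."""
    opaque = [mwl for mwl, transparent in multiword_lexemes.items() if not transparent]
    return len(core_words) + len(opaque)


core = {"give", "up", "put", "go", "in", "with"}
mwls = {
    "go in": True,         # meaning follows from the parts
    "give up": False,      # "surrender" is not recoverable from give + up
    "put up with": False,  # "tolerate" is idiomatic
}

print(learning_burden(core, mwls))  # 8
```

Six core words on paper, eight items to learn in practice. Scale that ratio up and a "1000 word" language stops looking so small.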

I remain convinced that careful first design principles for both closed and open class words, with rules for both, could yield an unprecedentedly concise-but-flexible quality in a vocabulary.
-----
Reduplication

Reduplication is only marginally used in Esperanto. It has an intensivizing effect similar to that of the suffix -eg-. The common examples are plenplena (chock-full), from plena (full), finfine (finally, at last), from fina (final), and fojfoje (once in a while), from foje (once, sometimes). So far, reduplication has only been used with monosyllabic roots that don't require an epenthetic vowel when compounded.

D: You see a lot of reduplication in creoles.
It is often a way to differentiate two similar sounding words.
------
Idioms

In addition to the root words and the rules for combining them, a learner of Esperanto must learn some idiomatic compounds that are not entirely straightforward. For example, eldoni, literally "to give out", means "to publish"; a vortaro, literally "a compilation of words", means "a glossary" or "a dictionary"; and necesejo, literally "a place for necessities", is a toilet. Almost all of these compounds, however, are modeled after equivalent compounds in native European languages: eldoni after the German herausgeben, and vortaro from the Russian словарь slovar'.

D: leading to the typical scenario of a tourist trying to find a toilet and desperately needing to pee, while trying to convey one's idea, LOL.

I guess that is no more ridiculous than folks asking for a bathroom - when there is no bath!
-----
D: in conclusion, a language designer should not shy away from a sufficiently extensive core vocabulary in the name of conciseness.
A language does need to be able to express concepts likely to be encountered in life.
At the same time, duplication and word redundancy ought to be minimized.
I suppose some overlap is preferable to gaps in the ability to express concepts.
Multiple ways to say the same thing will likely crop up from time to time.
----
D: here I quote The Satanic Bible, er Ranto. <:

E2: Clarity

These affixes are often baffling. In <cigaredujo>, "cigarette box", <-uj-> means "(bulk) container". But it also occurs in <Svedujo>, "Sweden" (not "Swedish ghetto") and <pomujo>, "apple tree" (not "apple barrel").

D: how is that for clear? <:
The trouble is that Zamenhof emphasized largely the syllable and not the single phoneme.
Because of his Latinate word-generating approach, with all the limitations of a natural language, this was bound to happen.
Some various possible translation errors for the above example might be:
- apple box for apple tree
- swede box for Sweden.
- cigarette tree
- Swede... tree?
<: LOL!
There is no clear system to sort out a word segment that is part of the root/stem versus part of an affix, either suffix or prefix.
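
The ambiguity can be made concrete with a small Python sketch. It assumes the three standard Esperanto words that match the quoted glosses (cigaredujo, Svedujo, pomujo) and takes the three senses of -uj- from the quote; the tiny root table and the function name `readings` are my own.

```python
# Enumerate every gloss a learner could build for root + -uj- + -o,
# given that -uj- has three attested senses. Root table is a toy sample.

UJ_SENSES = ("container", "country", "tree")
ROOTS = {"cigared": "cigarette", "sved": "Swede", "pom": "apple"}


def readings(word):
    """Return all candidate glosses for a word of the form root + 'ujo'."""
    lowered = word.lower()
    if not lowered.endswith("ujo"):
        return []
    root = lowered[:-3]
    if root not in ROOTS:
        return []
    return [f"{ROOTS[root]} {sense}" for sense in UJ_SENSES]


for w in ("cigaredujo", "Svedujo", "pomujo"):
    print(w, readings(w))
# cigaredujo ['cigarette container', 'cigarette country', 'cigarette tree']
# Svedujo ['Swede container', 'Swede country', 'Swede tree']
# pomujo ['apple container', 'apple country', 'apple tree']
```

Nothing in the word form itself picks the right reading out of the three. Only prior knowledge does, which means each of these words is really a separate vocabulary item.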

D: to me, the easiest way to avoid this quagmire is to have certain syllable formats dedicated to root/stem, while others clearly only modify the basic meaning.
Simply using word particles versus affixes also helps, and is of more use to a language speaker unused to much agglutination.
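
A rough Python sketch of what dedicated syllable formats could buy. The shape rules here (root = CVC, modifier = VC, ending = V) are purely hypothetical, chosen only so that "pomujo" happens to fit; they are not Esperanto's rules. The point is that when shapes are reserved, a word can be cut into its pieces without a dictionary.

```python
import re

# Hypothetical shape rules for a designed language (NOT Esperanto's actual rules):
#   root     = consonant + vowel + consonant  (CVC)
#   modifier = vowel + consonant              (VC), zero or more
#   ending   = one vowel                      (V)
C = "[bcdfghjklmnprstvz]"
V = "[aeiou]"
WORD = re.compile(f"^({C}{V}{C})((?:{V}{C})*)({V})$")


def segment(word):
    """Split a word into (root, [modifiers], ending), or None if it breaks the rules."""
    match = WORD.match(word)
    if match is None:
        return None
    root, mods, ending = match.groups()
    modifiers = [mods[i:i + 2] for i in range(0, len(mods), 2)]
    return root, modifiers, ending


print(segment("pomujo"))  # ('pom', ['uj'], 'o')
print(segment("pomo"))    # ('pom', [], 'o')
print(segment("hundo"))   # None, because "hund" is not a CVC root
```

Note that real Esperanto roots like hund- already break the toy rule, which is the complaint: the language was not laid out with this kind of segmentation in mind.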

overfussy distinctions - all mean "to marry".
D: his criticism about clockwork methodology to generate vocabulary would also apply to a philosophical language, though there is likely less overlap and redundancy.
Nonetheless, some very odd and often useless hypothetical words could be spoken.

----
Zamenhof was if anything overzealous in this department, stuffing his "basic" wordlists with trivial distinctions such as "a kiss" versus "a noisy kiss", and so on; who asked for these?
F3: Simplicity

This is the inverse problem, overlooked by Zamenhof. Language learners want to be able to start communicating with as little rote learning of vocabulary as possible.

D: "a noisy kiss" seems to be an ideal time to use compounding versus coining a new core term. Like I said, "Z" was haphazard.
------
G5: Elegance

Shoehorning words into this system can mangle them horribly.

<-a>
= "by marriage, bilious, repentant, ancient"

D: figuring out the origin and etymological meaning of such truncations is impossible.
I think this aspect can be best understood as a reaction to the very brief Volapük language.

----
"coffee" (near-globally ) becomes , etc.

D: My Japanese friend Hiroshi says something more like co-hay.
So I am not sure how international kafo is.
----
D: A problem with Esperanto, given the lack of planning to carefully use certain syllable formats for certain language aspects, is not 'wordiness' but lengthiness.
By that I mean quite a few vowels, and a long time to say the word.
Hund- (dog root). Hundo- dog.
The temptation to tack on a whole new syllable for each nuance (grammatical element, verb tense) ensures this.
A latinate word generating system all but ensures this.
To some degree, any natural language basis would.
To be fair, hundo means dog, hundoj (pronounced "hundoy") means dogs. Hundojn is dogs as the object of the sentence.
So after the initial -o syllable, additional nuance does NOT add syllables.
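
A quick Python check of that claim. It counts syllables as the number of vowel letters, on the assumption that j acts as a glide and not as a syllable nucleus; "hundeto" (little dog) is added as a contrast case where a full suffix does cost a syllable.

```python
# Rough syllable counter for Esperanto: one syllable per vowel letter.
# Assumption: j (and ŭ) are glides and never form a syllable on their own.

VOWELS = set("aeiou")


def syllables(word):
    """Count vowel letters as a stand-in for syllables."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


for w in ("hundo", "hundoj", "hundojn", "hundeto"):
    print(w, syllables(w))
# hundo 2
# hundoj 2
# hundojn 2
# hundeto 3
```

The plural and object endings ride along for free, while derivational suffixes like -et- each add a syllable. The lengthiness comes from the second group.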

Philosophical and/or taxonomic IALs have problems.
Not THOSE problems though...
