Sunday, January 31, 2010

evolution of punctuation and spacing. new aux-langs don't need them.

This evolution has 3 aspects:
1) spacing between words, and other elements such as paragraphs
2) punctuation that has an impact on how the passage is spoken out loud, like quotation marks for direct quotes
3) grammatical silent punctuation that aids silent reading.

D: I am borrowing heavily from Manguel's "A History of Reading".

"Because books were mainly read out loud, the letters that composed them did not need to be separated into separate unities, but were strung together in continuous sentences."
...
"The ancient writing on scrolls - which neither separated words nor made a distinction between lower-case and upper-case letters, nor used punctuation - served the purposes of someone used to reading aloud..."

"Augustine, like Cicero before him, would have had to practise a text before reading it aloud..."

D: In the 4thC AD, Servius criticized Donat for reading a phrase in Virgil's Aeneid as "ex Ilio" rather than "exilio" - "from Troy" versus "exile".

"The separation of letters into words developed very gradually ... the ancient scribes were so familiar with the conventions that they apparently needed hardly any visual aids..."

"In order to help those whose reading skills were poor, the (early Christian monks) in the scriptorium made use of a writing method .... which divided the text into lines of sense, a primitive form of punctuation..."
Saint Jerome noticed this and appreciated how it rendered a passage more clear.

Punctuation was critical to the development of SILENT reading. Saint Isaac and Isidore make reference to this.

By the 8thC AD, a combo of dots and dashes duplicated our comma, period, and semi-colon.
By the 9th, silent reading was common enough for monks to place a physical SPACE between words.
And so words and also grammatical parts of speech began to be portrayed as visually discrete.

---
D: to this day, silent grammatical punctuation remains misunderstood by the masses.
I long ago gave up trying to explain the difference between your and you're.
Or -s and -'s.

D: even sans punctuation, the timing of a passage can be fairly clearly indicated.
If a space is a word boundary, then 2 could be a comma, and 3 a sentence terminator.
The dog ran. The dog ran then ran back. .... ran back It had a bone
This is almost like the layout of notes on sheet music for timing.
English does not portray timing WITHIN a word, although our syllable stress system does use timing.
I'm convinced that expressing word pronunciation timing as sheet music would let us predict word distortion in rapid and clipped contemporaneous speech. The larger a word, the more we hafta compress it to fit into one whole note of time. So the syllables become increasingly simplified: consonants are dropped, more co-articulation occurs, and more vowels are turned into schwa or dropped entirely.
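
The space-counting idea can be sketched in a few lines of Python. This is only a rough illustration, assuming the mapping described here (1 space = word boundary, 2 = comma, 3 = sentence terminator). The function name spaces_to_punctuation is made up for the example, and treating runs longer than 3 spaces as a terminator is an extra assumption.

```python
import re


def spaces_to_punctuation(text: str) -> str:
    """Turn runs of spaces into conventional punctuation.

    Assumed mapping: 1 space = word boundary, 2 spaces = comma,
    3 or more spaces = sentence terminator (the "or more" is an assumption).
    """
    def replace_run(match: re.Match) -> str:
        run_length = len(match.group(0))
        if run_length == 1:
            return " "   # plain word boundary
        if run_length == 2:
            return ", "  # short pause
        return ". "      # long pause: end of sentence

    return re.sub(r" +", replace_run, text.strip())


# Two spaces after "ran", three spaces before "It".
print(spaces_to_punctuation("The dog ran  then ran back   It had a bone"))
```

The sketch does not add a final terminator or fix capitalization; it only shows that the pause lengths alone carry enough information to recover the comma and the period.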

----
Modern aux-langs.

Loglan and Lojban have self-isolating words. The syllable formats are restricted enough that no spaces between words are necessary. Spaces amount to 'training wheels' for new speakers.

Ceqli also uses this.

This morpheme-shape effectuates "Self-segregating morphology" (SSM), that is, in any string of Ceqli, it is possible to tell where all the morphemes begin and end.

tofelindweltogrinsadom

A beginning consonant signifies the beginning of a morpheme. A vowel followed by a consonant signifies the end of a morpheme. So the sentence is clearly broken down into:

to felin dwel to grin sa dom
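
The breakdown above can be reproduced with a small segmenter. This is a hedged sketch, not Ceqli's actual grammar: to get "felin" and "grin" out in one piece, it assumes that the sonorants l, m, n, r, w, y may continue a morpheme, and that any other consonant following a vowel or sonorant opens a new one. That consonant split is inferred from this single example, and the function name segment is made up.

```python
VOWELS = set("aeiou")
# Assumption: these consonants can sit inside a morpheme without starting a new one.
SONORANTS = set("lmnrwy")


def segment(stream: str) -> list[str]:
    """Split an unspaced string into morphemes.

    A new morpheme starts at every non-sonorant consonant that follows
    a vowel or a sonorant.
    """
    morphemes = []
    current = ""
    for ch in stream:
        is_opening_consonant = ch not in VOWELS and ch not in SONORANTS
        if is_opening_consonant and current and (
            current[-1] in VOWELS or current[-1] in SONORANTS
        ):
            morphemes.append(current)  # close the morpheme in progress
            current = ""
        current += ch
    if current:
        morphemes.append(current)
    return morphemes


print(" ".join(segment("tofelindweltogrinsadom")))
```

No dictionary lookup is involved: the sound shapes alone decide where the boundaries fall, which is the whole point of self-segregating morphology.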

D: Decimese is clearly of the latter tradition. Basing a syllable format on Mandarin accomplishes most of this already.
D: hmm, Ceqli uses both voiced and voiceless pairs. That will be tricky for Mandarin speakers.
A definite weighting in favour of English.

Decimese, by contrast, uses voicing as an additional element to denote word boundaries.
E.g. the P/B pair: ...pa... necessarily indicates either a word particle or the first syllable in a multisyllable word,
whereas ...ba... necessarily indicates a mid-word position in a multisyllable word.
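
A minimal sketch of how a listener (or a program) could use that voicing cue. Everything specific here is an assumption for illustration: the syllables are invented, not real Decimese vocabulary, the pairs are limited to P/B, T/D, F/V and K/G, and the function name split_words is made up.

```python
# Voiceless onset = particle or first syllable of a word; voiced onset = mid-word.
VOICELESS = set("ptfk")
VOICED = set("bdvg")


def split_words(syllables: list[str]) -> list[str]:
    """Group a stream of syllables into words using onset voicing only."""
    words: list[list[str]] = []
    for syllable in syllables:
        onset = syllable[0]
        if onset in VOICED and words:
            words[-1].append(syllable)   # voiced onset: continue the current word
        elif onset in VOICELESS or onset in VOICED:
            words.append([syllable])     # voiceless onset (or very first syllable): new word
        else:
            raise ValueError(f"onset not covered by this sketch: {syllable!r}")
    return ["".join(word) for word in words]


# Invented syllable stream, purely to show the grouping.
print(split_words(["pa", "ba", "ta", "ka", "ga", "da"]))
```

The grouping needs no pauses and no vocabulary knowledge, only the voiced/voiceless distinction on each syllable's first consonant.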

I like as many ways to clearly denote word boundaries as possible.
Until a speaker can parse word boundaries, they are unable to suss out the vocabulary items.

A listener listens for certain patterns in "phoneme sequence constraints".
For example, English speakers do not begin a word with "ng". They associate NG with a mid-word or word-final consonant.

I am designing Decimese to be unprecedentedly clear regarding word boundaries.
This should mean that a learner will rapidly be able to tease apart individual words, even from an unbroken stream of phonemes in colloquial speech.
This is terribly important in a world language!

http://www.sciencedaily.com/releases/2009/05/090506093952.htm
D: kids should learn to read prior to 10. OK. I don't personally see the point of requiring kids in pre-Kindergarten to know the alphabet. I guess parents can use that to impress the relatives. The rat race - keeping up with the Joneses!

----
This century's aux-langs began to experiment with careful control of syllable formats.
This, in turn, rendered silently read passages clear even without spacing between words.
In this respect we have reproduced the specialized expertise of ancient scribes and monks, but without the skill element needed.

-----
Lessons, insights applied to Decimese.

In English, the # 2 is sometimes used instead of typing out two. But it is also used in lieu of to and too.
This usually results in clear meaning, since the # 2 would be out of place in the sentence.
I used 2 like Lego. I like 2 dance.
I hate sardines. I h8 sardines.
Use of English #s in this fashion is piecemeal. The irregular names and complex format preclude more common usage.
For example, 3 ("three") is unlikely to fill in as short-hand for any other syllable.
These same observations apply equally to letter names. B - bee, be. C - see, sea. D - dee, maybe the de- prefix?
W - double... yoo? As you can see, the letter names in English are as sporadic as the number names.

Decimese's simple letter and # naming conventions assure that their syllable format will appear often.
For example, we can plan to use the various classes of letters in various specific types of words and concepts.
Pairs: PB TD FV KG et al. LRWY. H, M NG N.
We can easily assign the first 6-7 pairs to common prepositions, for example.
The other consonants LRWYH could be assigned to conjunctions and Boolean logic operations.
And so on.
We also have 2 sets to work with. The # name and the letter name.
Visemese example: #s 2-9 ba cha da...
Variants for letters 2-9 could have been be che de...

With planning and careful thought at the language design stage (and only then!), we can make a system that allows us to heavily abbreviate our typing.
A BS English reformed example might be as follows.
Take the consonant in each #. One, two, three, four. Then add the vowel A.
1 2 3 4 ... na ta tha fa...
Do something similar with the alphabet. Continue the early consonants' theme of adding a long E sound.
BCDFGH... bee see dee fee gee (Get)...
This is somewhat like a syllabary system of sorts.
I suppose consonant clusters could be indicated with the appropriate # or letter followed by the consonant.
E.g. 3 .. tha. 3r.. THRA. And so on.
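
A quick sketch of that reading rule, using only the four digits spelled out above (na ta tha fa). The function name spell is made up, and reading any trailing letters as part of the consonant cluster is an assumption based on the 3r example.

```python
# Consonant taken from each English number name: one, two, three, four.
DIGIT_CONSONANT = {"1": "n", "2": "t", "3": "th", "4": "f"}


def spell(token: str) -> str:
    """Read a digit, optionally followed by cluster consonants, as one syllable."""
    digit, cluster = token[0], token[1:]
    # Digit consonant + any extra cluster consonants + the shared vowel A.
    return DIGIT_CONSONANT[digit] + cluster + "a"


for token in ["1", "2", "3", "4", "3r"]:
    print(token, "->", spell(token))
```

The typed form stays one or two characters long while the spoken form is always a full syllable, which is where the abbreviation gain comes from.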

An obvious application in Decimese to indicate plural would be to attach the # or letter (sans syllable) to the beginning of a noun. Borrowing from English: man, 3man. 3 ... plural ... man (men). This is much like Chinese 'xie' for 'some'.
Variants could indicate concepts such as none, single, few, some, ... all.
However we risk dual-meaning homophones when spoken if I do not take care during the initial design stage.
Considerations such as this are part of why I'm so unwilling to commit to a poorly planned early format.

2 comments:

Dino Snider said...

Capital letters COULD have their own distinct pronounced names.
Visemese: lower-case letters: be(beh), cheh deh...
Capital letters: perhaps bi(bih), chih dih...
We could pretend that we have a 'capital NUMBER' concept. This could stand in for variables, for example. E.g. 2a + 3b=5c.
I suppose 'variable 1 - a' is synonymous in meaning with the 2nd letter variant.
Perhaps in 2a, 'a' would be '3rd letter form'.
E.g. B, C, D... bo(boh) choh doh....
A useful and powerful system!
It also saves me the grief of using a more elaborate syllable form, or an additional syllable, to denote "variable".
I'm sure this concept could be extended more.
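
A small sketch of the scheme as it stands so far: one consonant per position, with the vowel picking the form. Only positions 2-4 (b, ch, d) appear in the post, so the table stops there. The form labels "number", "lower", "capital" and "third" are names made up for this example, and the function name form_name is hypothetical.

```python
# Consonant for each position, as far as the examples above go.
POSITION_CONSONANT = {2: "b", 3: "ch", 4: "d"}

# The vowel selects the form: ba = number, be = lower-case letter,
# bi = capital letter, bo = 3rd letter form (e.g. a variable).
FORM_VOWEL = {"number": "a", "lower": "e", "capital": "i", "third": "o"}


def form_name(position: int, form: str) -> str:
    """Build the spoken name for a given position and form."""
    return POSITION_CONSONANT[position] + FORM_VOWEL[form]


for form in FORM_VOWEL:
    print(form, [form_name(position, form) for position in POSITION_CONSONANT])
```

Because the consonant carries the position and the vowel carries the form, each new form costs one vowel rather than an extra syllable.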