Friday, September 19, 2008

computer translation interlingua, problem with natural languages

http://www.sciencedaily.com/releases/2008/09/080918091618.htm

"So far all approaches to solve the multilingual component have run into serious difficulties...Other approaches set out to use one language (almost always English) as a pivot. Again the results were held back, this time because the use of a natural language as an interlanguage occasions ambiguity. As the researchers note, early attempts at using a natural reference language to build machine translation systems go back 20 years ago, and the results were no good."

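D: a toy sketch of why a natural-language pivot occasions ambiguity. The word lists are hand-picked assumptions for illustration (French "banque" and "rive", English "bank", German "Bank" and "Ufer"), and the function name translate_via_pivot is hypothetical. It is not taken from the article or from any real translation system.

```python
# Toy illustration only: these tiny dictionaries are assumptions,
# not data from any real machine translation system.

# French -> English (the pivot). Two distinct French words collapse into one.
FR_TO_EN = {
    "banque": "bank",  # financial institution
    "rive": "bank",    # edge of a river
}

# English -> German. The pivot word alone cannot select a single target.
EN_TO_DE = {
    "bank": ["Bank", "Ufer"],  # financial institution, edge of a river
}


def translate_via_pivot(french_word):
    """Return every German candidate reachable through the English pivot."""
    pivot = FR_TO_EN[french_word]
    return EN_TO_DE[pivot]


for word in ("banque", "rive"):
    print(word, "->", translate_via_pivot(word))
```

Both French words come back with the same two German candidates, because the distinction was lost at the English step. An unambiguous interlingua would need two separate pivot entries here.
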
D: the controlled natural language approach has shown some promise.
Attempto is a good example, and one I've studied.

http://attempto.ifi.uzh.ch/site/

Attempto Controlled English (ACE) is a controlled natural language, i.e. a rich subset of standard English designed to serve as specification and knowledge representation language. ACE allows users to express professional texts precisely, and in the terms of their respective application domain. As any language, ACE must be learned to be used competently, but this amounts to learning the differences between ACE and full English, formulated as a small set of ACE construction and interpretation rules. Once written, ACE texts can be read and understood by anybody.

ACE appears perfectly natural, but — being a controlled subset of English — is in fact a formal language. ACE texts are computer-processable and can be unambiguously translated into discourse representation structures, a syntactic variant of first-order logic.
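
D: a minimal sketch of the idea that a restricted subset of English can be mapped mechanically onto logic. This is not the Attempto parser and not real ACE: the two sentence patterns, the function name to_logic and the output notation (plain first-order formulas rather than discourse representation structures) are all assumptions made for illustration.

```python
import re

# Hypothetical toy grammar with two sentence patterns. Real ACE is far richer.
PATTERNS = [
    # "Every man is a human."  ->  forall X (man(X) -> human(X))
    (re.compile(r"^Every (\w+) is an? (\w+)\.$"),
     lambda m: f"forall X ({m.group(1)}(X) -> {m.group(2)}(X))"),
    # "John is a man."  ->  man(john)
    (re.compile(r"^([A-Z]\w*) is an? (\w+)\.$"),
     lambda m: f"{m.group(2)}({m.group(1).lower()})"),
]


def to_logic(sentence):
    """Translate one sentence of the toy subset, or reject it outright."""
    for pattern, build in PATTERNS:
        match = pattern.match(sentence)
        if match:
            return build(match)
    raise ValueError(f"not in the controlled subset: {sentence!r}")


print(to_logic("Every man is a human."))
print(to_logic("John is a man."))
```

Each accepted sentence has exactly one reading, and anything outside the subset is rejected rather than guessed at. That trade of coverage for unambiguity is the point of a controlled language.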

D: this makes Lojban somewhat less impressive, since Lojban makes a similar claim of unambiguity (wiki source).

Lojban (pronounced [ˈloʒban]) is a constructed, syntactically unambiguous human language based on predicate logic. Its predecessor is Loglan, the original logical language by James Cooke Brown.

D: finally, as an aside, I'll note A++, a learning tool for understanding computer programming. It deals only in the most rudimentary concepts, and composite concepts are clearly indicated as such.

A++ stands for abstraction plus reference plus synthesis which is used as a name for the minimalistic programming language that is built on ARS.

ARS is an abstraction from the Lambda Calculus, taking its three basic operations, and giving them a more general meaning, thus providing a foundation for the three major programming paradigms: functional programming, object-oriented programming and imperative programming.

ARS Based Programming is used as a name for programming which consists mainly of applying patterns derived from ARS to programming in any language.
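
D: a rough sketch of the three ARS operations, written in Python rather than in A++ itself. The mapping is an assumption for illustration: abstraction as a lambda expression, reference as binding a name, synthesis as function application. The Church-style encodings and the helper to_int are standard lambda calculus exercises, not code from the A++ book.

```python
# Abstraction: a lambda expression gives a computation a parameter.
# Reference:   binding the result to a name lets later code refer to it.
# Synthesis:   applying an abstraction to arguments combines the pieces.

# Church-style booleans: a boolean selects one of two arguments.
true = lambda a: lambda b: a
false = lambda a: lambda b: b

# A conditional built from nothing but application.
if_then_else = lambda cond: lambda then: lambda other: cond(then)(other)

# Church numerals: the numeral n applies f to x exactly n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))


def to_int(n):
    """Convert a Church numeral to a Python int, for inspection only."""
    return n(lambda k: k + 1)(0)


two = succ(succ(zero))
three = succ(two)

print(to_int(add(two)(three)))            # prints 5
print(if_then_else(true)("yes")("no"))    # prints yes
```

Everything composite here (numbers, addition, the conditional) is visibly assembled from the three primitives, which matches the teaching goal described above.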


D: the book is online.

http://www.aplusplus.net/bookonl/index.html

