{Go to, let us go down, and there confound
their language, that they may not understand one another’s speech. So the LORD
scattered them abroad from thence, upon the face of all the earth: and they
left off to build the City. Therefore is the name of it called Babel,
because the LORD did there confound the language of all the earth: and from
thence did the LORD scatter them abroad upon the face of all the earth.
(Genesis, chapter 11: 7 to 9)}
So God did, as the Bible tells it.
The longing for a universal language has been a dream of mankind
since antiquity, as the Biblical story of Babel shows. Throughout human history,
many languages (such as Greek, Latin, Arabic, or English) have claimed to be a
universal language on the strength of political or economic supremacy, but only
for a limited period (hundreds of years) and mainly in the area that their
political power could reach. Nonetheless, a few languages have acted as
trans-national and trans-racial literary languages for millennia, such as the
Chinese written language in China, Vietnam, Korea, and Japan. However, there are
at least two difficulties for any natural language to become a true universal language.
- No natural language is easy. Fewer than 15% of people
truly master their mother tongue to a scholastic level. In general,
learning another natural language as a second language is
about 10 times harder than learning the mother tongue. Thus, even if we
all accepted politically that one particular natural language (such as
English) is the lingua franca, the illiteracy rate for this
language would still be higher than 85% worldwide.
- Just as all the de facto world languages owe
their status to historical political supremacy, the suggestion of a given
natural language as a universal language has strong political implications,
and the major world powers will never agree to such a proposal. Thus,
the best hope for a universal language, if one is possible at all, lies in
choosing an insignificant language or a constructed one, such as Esperanto.
The above analysis shows that every lingua franca, past or
present, is the result of political power, not a true universal
language in the linguistic sense.
Given these realities, a universal language, if there is one,
must be:
- a second language for all people, and
- a constructed language.
Then, we must answer the following questions.
1. Can a constructed language have the same scope
as a natural language?
2. Can a small set of root words (human-readable,
not machine codes) be found to encode the entire vocabulary of a
natural language?
3. What is the minimum number of root words
needed for such an encoding?
First things first: can question 1 be
answered, at least in principle? The answer is a big Yes.
Every kind of encryption constructs a
new language from a natural language. The simplest encryption for English
moves the first letter of every word to the end of that word. This newly encrypted
vocabulary is, of course, a constructed language and is identical to the
old language in scope. Thus, finding a set of symbols to encode all English
words is possible in principle.
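The letter-moving cipher described above can be sketched in a few lines of Python. This is only an illustration: the function names (`encode_word`, `decode_word`, `encode_text`, `decode_text`) are hypothetical, and splitting the text on single spaces is a simplifying assumption. What it shows is that the encoding can be undone exactly, so the encrypted vocabulary matches the original one word for word.

```python
def encode_word(word: str) -> str:
    """Move the first letter of a word to its end."""
    return word[1:] + word[:1]


def decode_word(word: str) -> str:
    """Undo encode_word by moving the last letter back to the front."""
    return word[-1:] + word[:-1]


def encode_text(text: str) -> str:
    """Encode every space-separated word (assumption: single spaces only)."""
    return " ".join(encode_word(w) for w in text.split(" "))


def decode_text(text: str) -> str:
    """Decode every space-separated word."""
    return " ".join(decode_word(w) for w in text.split(" "))


sample = "universal language"
encoded = encode_text(sample)
print(encoded)               # niversalu anguagel
print(decode_text(encoded))  # universal language
```

Because every encoded word decodes back to exactly one original word, the cipher changes the surface form of the vocabulary without adding or losing any word.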
However, this encrypted new English
language yields zero gain for linguistics. Thus, the key point is
question 2. Can we find an axiomatic set, with a finite number of members and
rules, that can regenerate a natural language in its entirety and can
be read easily by humans (not machines)?
This book tries to show that a PreBabel
universal language is, indeed, a reality.
In this preface, I will go over the history of the development of this PreBabel
discovery.
In the early 1990s, computer scientists were searching for a universal
computer language that could run on all computers regardless of their
underlying architectures. The solution was Java with the Java
virtual machine, developed by Sun Microsystems.
At that time, my reaction was: can we also
construct a universal natural language?
I immediately came up with some criteria for this
universal (natural) language (the u-language), as follows:
1. The theoretical definition -- a universal language (u-language) must be
able to "re-produce" every natural language in existence. Here, the term
"re-produce" does not mean translation. It means that the entire language
system (vocabulary and grammar) of a selected language can be re-written
with the PreBabel codes, the vocabulary of the u-language. In fact, this
selected language (such as English, Japanese, etc.) must be 100% isomorphic
to a subset of this u-language. If such a u-language can be constructed,
then a true automatic language translation machine can be built.
2. The practical constraints -- if a u-language is too difficult for an
average person (not a machine) to learn, it will become a dead language
right after its birth. The rule of thumb is that it must not be more
difficult than any natural language learned as a second language. In fact,
the design criterion should be that it is 10 times easier to learn than any
natural language learned as a second language. Yet it is difficult to know
what the term "10 times" means, so we should give it a quantified criterion:
it must be learnable in 100 days (about 300 hours in total) when a person
(12 years or older) spends 3 hours a day of good (no playing around) study.
3. The attributes --
- It is a second language for the speakers of many natural languages. That
is, no particular natural language is a prerequisite for learning this
u-language. A u-language must be learnable without any particular natural
language as its language environment. It must be learned as a body of
knowledge (such as chemistry or arithmetic), not as a living habit.
- It has to be a mute or silent language (at the beginning) in order for it
to carry all natural spoken languages as its dialects.
- Of course, any word token can always carry a sound. However, the
pronunciation of the u-language word tokens should evolve with the
community that uses them. Then, the spoken form of the u-language will
become a true universal spoken language.
With the above criteria, I proved two laws, along with a
theorem (in 1997):
PB Law 1: Encoded with a closed set of root words (the PreBabel
root set), any arbitrary vocabulary-type language will be organized into a
logically linked linear chain.
PB theorem 0: If a closed set of root words can encode one
natural language, it can encode ALL natural languages.
Note: a
closed set means that the parts (radicals) of all the vocabulary of a language
contain no symbol outside the given (finite) root word set.
PB Law 2: When every natural
language is encoded with a universal set of root words, a true Universal
Language emerges.
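The notion of a closed set in the note above can be sketched as a simple check. The sketch below is not the PreBabel root set: the roots and the toy vocabulary are hypothetical English fragments chosen only for illustration, and the function names `is_closed` and `outside_symbols` are likewise assumptions.

```python
def outside_symbols(vocabulary: dict[str, list[str]], roots: set[str]) -> set[str]:
    """Return every part used by the vocabulary that is not in the root set."""
    return {
        part
        for parts in vocabulary.values()
        for part in parts
        if part not in roots
    }


def is_closed(vocabulary: dict[str, list[str]], roots: set[str]) -> bool:
    """A root set is closed for a vocabulary if no word needs an outside part."""
    return not outside_symbols(vocabulary, roots)


# Hypothetical toy data: each word is listed with the parts that compose it.
roots = {"un", "re", "do", "able"}
vocabulary = {
    "undo": ["un", "do"],
    "redo": ["re", "do"],
    "doable": ["do", "able"],
}
print(is_closed(vocabulary, roots))  # True

# Adding a word that needs a part outside the root set breaks closure.
vocabulary["undoing"] = ["un", "do", "ing"]
print(is_closed(vocabulary, roots))        # False
print(outside_symbols(vocabulary, roots))  # {'ing'}
```

The check only verifies closure for a given decomposition of each word; how the words are decomposed into parts is taken as an input here.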
With these two laws, I immediately concluded
that I was unable to construct such a universal natural language, for three
reasons:
1. Although English has only a finite number of
word tokens (letters and root words), it obviously cannot meet the
above criteria.
2. I had no idea how to construct a closed set of
codes (root words or radicals) to encode a (any) natural language.
3. Even if I tried to invent a universal code
set, it would be a nightmare for me to prove or test that the set of codes
does, indeed, encode a (any) natural language in its entirety.
For these three reasons, I did not think
that searching for a universal (natural) language was a worthwhile project.
In 2001,
I was at a party where an old man (about 70 years old) talked about the
evils of the simplified Chinese writing system. At that time, I had not learned
anything about the simplified system and was in no position to make any
comment. Furthermore, I had not used (read or written) the traditional Chinese
writing system for 30 years by then; that is, I could not even write a simple
Chinese sentence without wondering how to write this or that word (even the
mother tongue can be forgotten). Coming home from the party, I asked my father
(a professor of Chinese Literature of Taiwan Central University) about these
evils of the simplified system. He gave me two books {康熙字典 (Kangxi Dictionary) and 說文解字 (Shuowen Jiezi)}
and said: study these two books and you will know the answer.
Both are
dictionaries. Read dictionaries? Yes, I did.
康熙字典 (Kangxi Dictionary) is organized by 部首
(radicals) but describes each word in terms of its pronunciation. In
Chinese, each word has several different pronunciations (heteronyms). For a
given word, when it is pronounced X, it means A; when it is pronounced Y, it
means B; and so on.
So, 康熙字典 is all about a
word's pronunciations, which determine its meanings and its usages.
As a dictionary,
there is no right or wrong issue for 康熙字典.
Note: while
homographs/heteronyms are exceptions in English, they are 100% the case in
Chinese. That is, each and every Chinese word is a homograph/heteronym.
On the other hand, 說文解字 (Shuowen Jiezi)
is all about the STRUCTURE (the composition of radicals and parts) of the words,
based on a set of 540 radicals. That is, the meaning of a word derives from
those radicals. The sound of the word is given without any theoretical
explanation. It describes 六書 (the six
ways of constructing Chinese words):
象形 (pictograph) · 指事 (pointing) · 會意 (sense
determinators) · 形聲 (phonetic loan) · 轉注
(synonymize) · 假借 (borrowing). Yet 90% of the words
(about 9,000) in the book are classified as 象形. Thus,
historically, the Chinese writing system has been described as a pictographic system.
Obviously, the
Chinese character system is described along two completely different pathways.
From this inconsistency, I developed the “New Chinese Etymology”, with three
results:
One, all Chinese written words (about
60,000 now) can be constructed from a finite set of 220 root words.
Two, the meaning of each and every Chinese
written word can be read out from its face (by decoding its component radicals).
Three, the sound (pronunciation) of each
and every Chinese written word can be read out from its face too.
With the above
findings, I published {Chinese word Roots and Grammar; US copyright registered on May
5, 2006, TX 6-514-465}. This book was written in Chinese.
On January 16,
2008, I published {Chinese Etymology; US TX 6-917-909}. This book is a
textbook (in English) for foreigners (such as Americans) to learn Chinese via
this new system.
On May 24, 2012,
I published {Chinese Etymology Workbook One; US TX 7-539-827}. This is a
workbook for the above textbook.
It took me three
years (from 2002 to 2005) to read the two dictionaries. It took me another three
years (from 2005 to 2008) to write two books (one in Chinese and one in English) on this
new Chinese Etymology. In those years, I worked on Chinese Etymology every day
without thinking about anything else.
One day in September 2008, I made a statement: the
entire Chinese written language (one of the natural languages) can be encoded
with a finite set of radicals. Then lightning struck: what
about my u-language laws of 1997?
I had now found a closed set of codes that can encode the
entire Chinese written language; that is, by my PB Law 1 and theorem 0, this
set should be able to encode all natural languages.
In addition to constructing a u-language via my u-language
theorem (1997) plus the new Chinese etymology (encoding the entire Chinese
language), I developed a u-language theoretically via the Martian Language
Thesis (MLT) -- any human language can always establish communication
with Martian or Martian-like languages. Thus, the Martian Language Thesis
is the first principle of linguistics. It encompasses the following
attributes.
Permanent
confinement -- no language (Martian or otherwise) can escape from it.
Infinite
flexibility -- it can encompass any kind of language structure.
This MLT is based on the following two principles:
Universal principle I -- all languages (human or Martian) share the
identical metalanguage.
Universal principle II -- all language
structures are subsets of a universal language structure.
What is the metalanguage, then?
The metalanguage consists of four parts:
One:
the universal laws (physics, math, etc.) continent: all universal
events are described by the universal laws.
Two: the universal consciousness
(meaning) continent: human consciousness views the universal laws
in an identical way, getting the identical MEANING for all universal
laws.
Three:
there is a Grand Canyon between these two continents.
Four: human natural languages are
different symbol systems for connecting these two universal continents.
Thus, the universal language must
encompass the following three attributes:
A.
Forming the words --- a finite number of symbols forms an unlimited number of
words, while the meaning and the pronunciation of each word can be read out from
its face.
B.
Unique meaning of each word --- every word carries a “unique” meaning, not
multiple meanings.
C.
Universal grammar --- a grammar that is the mother of all grammars.
To address these issues, I published a new website {http://www.prebabel.info/ } in June 2009. On October 12, 2010,
I published {Linguistics Manifesto --- Universal Language & The super
Unified Linguistic Theory; US TX 7-290-840}. The issue of the two continents
is briefly discussed in Chapter Twelve of that book. For the details of the
universal grammar, I published {The Great
Vindications; US copyright TX 7-667-010, January 23, 2013}.
The key emphasis of this book is the issue
of the perfect language. That is, is the
u-language also the PERFECT language?
What is the perfect language?
A perfect language should have three attributes:
One, it has only a finite number of tokens
for constructing an unlimited number of words (vocabulary).
Two,
the pronunciation of a word (character) can be read out from its
face.
Three, the meaning of a word (character)
can be read out from its face.
Of course, a perfect language might not be a universal
language. Although the universal language issue was addressed in detail in my
previous two books, I will nonetheless return to it
again and again in this book.
English scores 220 points out of a maximum of 300: 100
for ‘one’, having an alphabet of only 26 letters; 100 for ‘two’, as almost every
word can be pronounced from its face; 20 for ‘three’, as only words with
roots/prefixes/suffixes allow their meaning to be guessed.
On the other hand, I will show that the Chinese written language
is THE perfect natural language, scoring the full 300 points.
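The scoring above is simple arithmetic and can be sketched as follows. The English scores are the ones quoted in the text; the 100-point ceiling per attribute is an assumption inferred from the stated maximum of 300 across three attributes, and the names `ATTRIBUTES`, `MAX_PER_ATTRIBUTE`, and `total_score` are hypothetical.

```python
# Assumption: each of the three attributes is scored from 0 to 100,
# inferred from the stated maximum of 300 points.
MAX_PER_ATTRIBUTE = 100
ATTRIBUTES = ("finite tokens", "pronunciation from face", "meaning from face")

# Scores for English as quoted in the text.
english = {
    "finite tokens": 100,
    "pronunciation from face": 100,
    "meaning from face": 20,
}


def total_score(scores: dict[str, int]) -> int:
    """Sum the three attribute scores after checking that each is in range."""
    for attribute in ATTRIBUTES:
        if not 0 <= scores[attribute] <= MAX_PER_ATTRIBUTE:
            raise ValueError(f"score out of range for {attribute!r}")
    return sum(scores[attribute] for attribute in ATTRIBUTES)


maximum = MAX_PER_ATTRIBUTE * len(ATTRIBUTES)
print(total_score(english), "out of", maximum)  # 220 out of 300
```

Under this reading, a language reaches 300 points only when it scores the full 100 on all three attributes.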
That is, I will show three linguistic results:
One, the Chinese
written language can be encoded with a closed set of radicals (roots).
Two, with my
u-language theorem of 1997 plus the Martian Language Thesis, I have constructed a
u-language.
Three, I have
defined what ‘perfect’ should mean.
Now, going back to the issue of the ‘Simplified Chinese system’
which got me started: I discovered that the simplified system was created
because the original (traditional) Chinese
written language was viewed as the worst language in the world, as dog turd, by
those May 4th movement scholars who pushed for abandoning the
traditional Chinese written language; see the video {https://www.youtube.com/watch?v=HjbmAlWe_Ig } and
Chapter One.
I then further discovered that the Chinese government issued a
language law in April 2006, prohibiting the use of any other form
(especially the traditional form) of the Chinese writing system, and planned to
abandon even the simplified system by 2016 while going 100% with
Romanization (Pinyin). Yet, following my publication of {Chinese Etymology}, also
in 2006, China abandoned her Romanization plan on August 30, 2017;
see the news article {统编教材9月启用 拼音晚学一个月 (Unified textbooks go into use in September; Pinyin taught one month later), http://www.xinhuanet.com//local/2017-08/29/c_1121559170.htm } and https://www.linkedin.com/pulse/amen-victory-entire-chinese-people-jeh-tween-gong/ . That is, I have saved the Chinese writing
system single-handedly. These points are addressed in detail in Chapter One of this
book.
Superficially,
this book discusses the details of Chinese etymology, but that is not the
point. The key points of this book are the proofs of the reality of the universal
language and of the perfect language.
In fact, you (the readers) need not know a single Chinese character in order
to comprehend this book, as all those Chinese characters can be viewed as a set of
Lego pieces. The key points of the book are the principles, the laws, and the
theorems for organizing those Lego pieces. It is these
principles/laws/theorems that bring the universal language to life. This
book just uses Chinese etymology as one example to show those
principles/laws/theorems.
Of course, this
book can be very helpful for anyone who is interested in learning Chinese
linguistics via this new Chinese etymology. However, the base of this new
Chinese etymology (220 word roots and 300 sound modules) is not provided in its
entirety in this book. If you (the readers) want to learn the Chinese writing
system via this new Chinese etymology, you must use the textbook {Chinese
Etymology; US TX 6-917-909}.
This book is, in
fact, a thread that sews together all my previous books on the following issues:
One,
the theory of universal language.
Two, the definition of perfect language.
Three, the actual construction of the u-language
and the proof of a perfect language.
Four, the greatest historical event of saving
the perfect language of humanity from disastrous destruction.
From Chapter One
to Chapter Twelve, I use Chinese etymology as one example to demonstrate the
theory of universal language and to provide one real example of a perfect
language. Chapter Thirteen, however, recaps the entire set of PreBabel
principles and laws and also provides a real
model for a PreBabel language.
Thus, this book is for linguists to witness the evidence of a
PERFECT language system and of the reality of the universal language.
In addition to
this book, you (the readers) are encouraged to read the following books.
One, Linguistics Manifesto --- Universal Language
& The super Unified Linguistic Theory; Written in English, US copyright TX 7-290-840.
Two, The Great Vindications; Written in
English and Chinese, US copyright TX
7-667-010.
Three, Chinese Etymology; written in English, US TX
6-917-909.
Four, Bible of China Studies & new
Political Science; Written in English, US
copyright TX 8-685-690.
Five, 中文的字根與文法: 天馬行空的漢語 (Chinese word roots and Grammar); written
in Chinese, US copyright TX 6-514-465
Some information about
those books is available in the Appendix of this book.