Criteria for Core Vocabulary Selection


In the phonology section we considered a restricted sound inventory that would present the least difficulty to the greatest number of peoples worldwide. Fortunately, many common words in diverse languages fulfil this condition already.

Other criteria concern the choice of the words themselves. In a previous webpage I addressed this subject as follows:

"Synonyms and near-synonyms present difficulties in most languages - how much more in an IAL which, initially admitting all words from all sources, would be inundated by hundreds or thousands of synonyms and near-synonyms - not to mention umpteen million words! Happily the problem is much less daunting than it might appear at first sight, due to a number of mitigating factors, including the following:

(1) When most words from most languages can be rendered into the same orthography, most of the world's words will become available to the discriminating speaker or writer, who will then be able to choose the best synonym for a particular purpose - for no reason except its sound. In this way the ideal word might emerge - as it has in the past. (Ultimately, for the sake of simplicity there should be no synonyms within the IAL.)

(2) A suitable word already existing in a living language should always be chosen in preference to a neologism. The latter might be more logical, etymologically speaking, but only the test of time proves euphony.

(3) The extant original form of a word should be used rather than transliterated versions in other languages.

(4) Justice demands that the IAL's vocabulary be selected from all languages. In fact this is not a limitation, since things and ideas tend to originate in different countries - and often the best of them in small nations, within minority tongues. The other side of this coin is the requirement to maximise phonetic range and depth, so as to minimise the number of homographs in an orthographic script.

(5) It might happen that a word chosen for the IAL eventually failed: perhaps because most people disliked its sound, or its historical associations. However, synonyms would continue to exist in the remaining mother-tongues for centuries, so replacing a word in the IAL should not be too difficult.

(6) The globalisation of commodities and ideas is not taking place wordlessly. Thus the same processes that have raised one synonym above others within national tongues have begun to work internationally. In this way the best words for the IAL might appear.

(7) Where synonyms of equivalent pedigree exist, it is probably better to choose the older word, or, where that cannot be established with certainty, the shorter. In many cases the shorter word, or - more exactly - the word requiring less effort to articulate, will be the older word (compare Zipf's law of abbreviation, which ties brevity to frequent use).

(8) Whereas the IAL is unlikely to borrow Chinese characters for its script, it might adopt the Chinese system of word-formation - as imitated by progressive constructed languages.

(9) Where it is impossible to choose between alternative words, and a compromise word has failed, it may be necessary to return to first principles. Did Plato's Cratylus identify one of these? In that dialogue Socrates points out that rho is a sign of motion, found in words such as "tremor, tremble, strike, crush, bruise and whirl", because it is linked to the physical activity of pronunciation. In his account the tongue was "most agitated and least at rest in the pronunciation of this letter", and therefore it was originally used to express motion. Aspirated phonemes, which require an expenditure of breath, likewise find themselves in windy, tempestuous words such as "shivering, seething, shock and shaking". Lambda, with its liquid smoothness produced by the slipping of the tongue, is found in words like "slip, level, floor, flood, sleek" (when combined with another syllable it denotes easy but repetitive motion as in "handle, swivel, anvil, paddle"); gamma, in which the tongue is detained, combines with lambda to express the notion of stickiness, as in "glue, glutinous, glucose"."


This is not a rigorous account by any means, but it gives a flavour of what is required. A set of criteria that is fuller in some respects is given by the Unish project on its "Lexicon" page:


Principles for Selecting Unish Words


 1. Commonness


 1.1 Commonness in Quantity

The word that is used in several languages or borrowed into them is given priority in selection.


academy = akademi (in 13 languages)

guitar = gitar (in 13 languages)

tea =  cha (in 14 languages)

gas =  gas (in 13 languages)

humor =  humor (in 14 languages)

film =  film (in 11 languages)


1.2 Commonness in Official Languages

 The word that is officially used in international communities is given priority in selection.

 Example 1:  gold, oro, ouro, or, chrusos, aurum, zoloto, gum, kin, jin, dhahab, sonaa, emas ->  gold

 Example 2:  century, jarcento, aionas, centum anni, jahrhundert, siglo, seculo, secolo, siecle, stoletie, segi, seiki, shiji, qarn, sataabdii, abad -> senturi


 1.3 Commonness in Daily Life

 The word that is familiar to people in everyday life due to its appearance in signboards or brand names is given priority in selection.

Example 1:   stop ~ stop          fire ~ fire 

Example 2:   brown ~ braun         river ~  rio 


2. Shortness


2.1 Shortest Word

The shortest word is selected.

Example 1:   brain, cerbo, mualo, cerebrum, gehirn, cerebro, cervello, cerveau, mozg, nwe, noo, nao, dimaagh, mastisk, otak ->  nao

 Example 2:   acquire, akiras, apokto, adquirere, erwerben, adquirir, acquisire, acquerir, priobretat, odda, eru, dedao, yaHaraz, arjit karng, memperoleh ->  eru


2.2 Shortest Spelling

In the case of the words that have the same origin, the most simple spelling is selected.

Example 1:  perfect, perfekta, perfectus, perfekt, perfecto, perfeito, perfetto, parfait ->  perfet

 Example 2:  certain, certa, certus, certo ->  sert


2.3 Shortness by Clipping

 If meaning is sufficiently conveyed by a stem, other parts are omitted.

Example:  equipment, ekipazo, aksoplismos, arma, ausrutattung, equipo, equipamento, equipaggiamento, equipement, oborudovanie, jangbi, soobi, zhuangbei, muiddat, upaskar, kelengkapan -> eqip


3. Distinctness

Even a short word is avoided if its pronunciation is too simple and it can be confused with other words.

Example 1:  goose, ansero, chena, anser, gans, ganso, oca, oie, gus, gowi, gachoo, e, wizza, hans, angsa ->  gus

 Example 2:  degree, grado, moira, gradus, grad, grado, grau, degre, stepen, jondo, teedo, chengu, daraga, darjah ->  degri
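
The interplay of Shortness (2.1) and Distinctness (3) can be sketched in a few lines of Python. This is only an illustration, not part of the Unish page: the function name `shortest_candidates`, the use of letter count as a stand-in for length, and the `min_len` threshold of 3 are all assumptions made here, since the source states no numeric cut-off.

```python
def shortest_candidates(words, min_len=1):
    """Return every candidate whose length is the smallest one >= min_len.

    Letter count is used as a rough proxy for word length (an assumption).
    """
    eligible = [w for w in words if len(w) >= min_len]
    if not eligible:
        return []
    best = min(len(w) for w in eligible)
    return [w for w in eligible if len(w) == best]


# Candidate list copied from the "goose" example above.
GOOSE = ["goose", "ansero", "chena", "anser", "gans", "ganso", "oca", "oie",
         "gus", "gowi", "gachoo", "e", "wizza", "hans", "angsa"]

print(shortest_candidates(GOOSE))             # ['e']
print(shortest_candidates(GOOSE, min_len=3))  # ['oca', 'oie', 'gus']
```

With no lower bound the one-letter form "e" wins, which is exactly what the Distinctness principle is meant to avoid. With the assumed floor of three letters, three candidates tie, so a further criterion (perhaps syllable shape) is still needed to arrive at "gus".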


4. Phonological Simplicity

 If possible, the structure of 'CVCV', which is easy to pronounce, is selected.

 Example 1:  appear, aperas, parousiazomai, apparere, erscheinen, aparecer, apparire, apparaitre, natanada, arawareru, chuxian, dhaHar, upsthit, menjelma ->  apar

Example 2:  atmosphere, atmosfero, atmosphaira, aer, atmosphare, atmosfera, atmosphere, degi, taiki, daqi, aljaww, vaataavaran, suasana ->  degi
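
One way to check a candidate against the 'CVCV' preference is to reduce it to a consonant/vowel skeleton. The sketch below is an illustration only: the helper names `cv_skeleton` and `alternates` are invented here, and it assumes that the five letters a, e, i, o, u stand for vowels and every other letter for a consonant, which ignores digraphs and the vowel uses of y and w.

```python
VOWELS = set("aeiou")  # assumption: purely orthographic vowels


def cv_skeleton(word):
    """Map each letter to 'V' or 'C', e.g. 'degi' -> 'CVCV'."""
    return "".join("V" if ch in VOWELS else "C"
                   for ch in word.lower() if ch.isalpha())


def alternates(word):
    """True if no two consonants or two vowels stand next to each other."""
    skeleton = cv_skeleton(word)
    return all(a != b for a, b in zip(skeleton, skeleton[1:]))


for w in ["degi", "apar", "atmosphere"]:
    print(w, cv_skeleton(w), alternates(w))
# degi CVCV True
# apar VCVC True
# atmosphere VCCVCCCVCV False
```

On this rough measure both selected forms alternate strictly between consonants and vowels, while "atmosphere" contains consonant clusters.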


5. Cultural Priority

 A word is selected from the language with which the word is most associated culturally in meaning.

Example:   tea =  cha       rose =  roz 


 6. Compounding

  A new word is created through active compounding.

 Example 1:   dictionary ->  motbuk (mot 'word'  + buk 'book') 

Example 2:   Monday ->  lunadi (luna 'moon' + di 'day' )

Example 3:   first ->  unme (un 'one'  + me 'ordinal ending' )


7. Diversity

 The degree of familiarity is enhanced by selecting words from diverse origins as much as possible.

Example 1:     nose =  noz      neck =  mog       cheek =  jia       heel =  eri

Example 2:   elephant =  elefant     bird =  tori     sheep =  yang       goose =  gus 


 Transformation of Characters


 1. Consonant Deletion

CC ->  C : Unnecessary duplication of consonants is simplified

 Example 1:           bell ->  bel          roll ->  rol         address ->  adres

Example 2:      check -> chek


2. Vowel Deletion


 2.1 VV ->  V

 The cluster of vowels is simplified as much as possible.

Example:    enough ->  enuf       heavy ->  hevi       guest ->  gest         heart ->  hart


2.2 -VCVC# ->  -VCC# (cf. nasal/liquid sound)

 When the final consonant in the structure of '-VCVC' is a nasal or liquid sound, the vowel which is pronounced as [schwa] is omitted.

 Example:    even ->  evn     bottle ->  botl


2.3 -V# ->  -# (cf. word-final position)

 A word-final inflectional ending or silent vowel is omitted.

 Example 1:    sure ->  sur      advice ->  advis

Example 2:     tower, turo, turris, turm, torre, tour ->  tor

adventure, aventuro, abenteuer, aventura, avventura, aventure ->  aventur


 3. Semi-Vowel Deletion


3.1 ow which is pronounced as [ow] ->  o

Example:    slow -> slo      snow ->  sno          yellow ->  yelo


3.2 iy which is pronounced as [iy] ->  i


 4. Systematic Change


 4.1 [ks] ->  x

Example:        access ->  axes              success ->  suxes


  4.2 [qu] ->  q

Example:        quick ->  qik           quarter ->  qart


 4.3 c which is pronounced as [k] ->  k

Example:       class ->  klas           card ->  kard             car ->  kar


 4.4 c which is pronounced as [s] ->  s

Example:             city ->  siti               century ->  senturi


 4.5 g which is pronounced as [j] ->  j

Example:       orange ->  oranj         charge ->  charj            energy ->  enerj


4.6 y ->  i/ai

 Example 1:   system, sistemo, sistema, systeme, sisutem, shisutemu, sistem ->  sistem

Example 2:       every ->  evri         employ ->  emploi

Example 3:       dry ->  drai          fly ->  flai
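
Several of the rules above (1 and 4.1 to 4.4, plus the simplest case of 4.6) can be approximated as ordered spelling substitutions. The following Python sketch is an illustration under stated assumptions, not the Unish procedure itself: the rules in the source are defined by pronunciation, whereas this version guesses pronunciation from spelling alone (for instance, c before e, i or y is taken to be [s]). The names `RULES` and `respell` are invented for the example.

```python
import re

# Ordered (pattern, replacement) pairs; the order matters.
RULES = [
    (r"cc(?=[eiy])", "x"),               # 4.1 [ks] -> x   (access -> axess)
    (r"qu", "q"),                        # 4.2 qu -> q     (quick -> qick)
    (r"ck", "k"),                        # 1.  ck -> k     (check -> chek)
    (r"c(?=[eiy])", "s"),                # 4.4 soft c -> s (city -> sity)
    (r"c(?!h)", "k"),                    # 4.3 hard c -> k, leaving 'ch' alone
    (r"([b-df-hj-np-tv-xz])\1", r"\1"),  # 1.  doubled consonant -> single
    (r"y$", "i"),                        # 4.6 final y -> i (the [ai] cases are not handled)
]


def respell(word):
    """Apply the substitutions in order to a lower-cased word."""
    result = word.lower()
    for pattern, replacement in RULES:
        result = re.sub(pattern, replacement, result)
    return result


for w in ["bell", "address", "check", "access", "quick", "class", "city", "century"]:
    print(w, "->", respell(w))
```

This reproduces the listed forms bel, adres, chek, axes, qik, klas, siti and senturi. It cannot reproduce forms that depend on actual pronunciation or on clipping, such as dry -> drai, every -> evri or quarter -> qart.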


5. Change for Shortness

 When the pronunciation of a word is widespread and the word tends to be shortened, the spelling of the word is changed according to its pronunciation.

Example:    business =  biz         introduce =  intro         advertise =  ad


 Word Templates


 1. Personal pronouns

I = me   you (sg) = de    he/she/it = le     we = mes      you (pl) = des     they = les


 2. Interrogative pronouns

what = wat      who = wu         which = wich        why = we        where = wer            

when = wen       how = wi


3. Determiners

all = al      every = evri         each = ge          any = eni           many = meni

 more = mor        most = most        some = som         no = no          

other = odro       another = unodro         that = dat          this = dis


4. Conjunctives


and = e      but = but      or = or       so = so          if = if         as = az      

because = koz     since = sins      although = aldo, daga      while = wail


 5. Cardinal numbers/ ordinal numbers (Number + me)


one = un       first = unme

two = du       second = dume

three = tri       third = trime

four = sa       fourth = same

five = wu       fifth = wume

six = ses       sixth = sesme

seven = sep       seventh = sepme

eight = ot       eighth = otme

nine = nov       ninth = novme

ten = ten       tenth = tenme

hundred = cent       hundredth = centme

thousand = mil       thousandth = milme
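
The "Number + me" pattern in the table is regular enough to express as a one-line function. The sketch below is illustrative only; the names `CARDINALS` and `ordinal` are invented here, and the word list is copied from the table above.

```python
# Cardinal numerals as listed in the table above.
CARDINALS = {
    1: "un", 2: "du", 3: "tri", 4: "sa", 5: "wu", 6: "ses",
    7: "sep", 8: "ot", 9: "nov", 10: "ten", 100: "cent", 1000: "mil",
}


def ordinal(cardinal):
    """Form the ordinal by appending the ending 'me' to the cardinal."""
    return cardinal + "me"


for value, word in CARDINALS.items():
    print(value, word, ordinal(word))
```

Every ordinal in the table, from unme to milme, follows from its cardinal in this way, with no irregular forms such as English "first" or "second".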

