Criteria for Core Vocabulary Selection
In the phonology section we considered a phonetic limitation such as would present the minimum difficulty for the maximum number of peoples worldwide. Fortunately, many common words in diverse languages fulfil this condition already. Other criteria concern the choice of the words themselves. In a previous webpage I addressed this subject as follows:
"Synonyms and near-synonyms present difficulties in most languages - how much more in an IAL which, initially admitting all words from all sources, would be inundated by hundreds or thousands of synonyms and near-synonyms - not to mention umpteen million words! Happily the problem is much less daunting than it might appear at first sight, due to a number of mitigating factors, including the following:
(1) When most words from most languages can be rendered into the same orthography, most of the world's words will become available to the discriminating speaker or writer, who will then be able to choose the best synonym for a particular purpose - for no reason except its sound. In this way the ideal word might emerge - as it has in the past. (Ultimately, for the sake of simplicity there should be no synonyms within the IAL.)
(2) A suitable word already existing in a living language should always be chosen in preference to a neologism. The latter might be more logical, etymologically speaking, but only the test of time proves euphony.
(3) The extant original form of a word should be used rather than transliterated versions in other languages.
(4) Justice demands that the IAL's vocabulary be selected from all languages. In fact this is not a limitation, since things and ideas tend to originate in different countries - and often the best of them in small nations, within minority tongues. The other side of this coin is the requirement to maximise phonetic range and depth, so as to minimise the number of homographs in an orthographic script.
(5) It might happen that a word chosen for the IAL eventually failed: perhaps because most people disliked its sound, or its historical associations. However, synonyms would continue to exist in the remaining mother-tongues for centuries, so replacing a word in the IAL should not be too difficult.
(6) The globalisation of commodities and ideas is not taking place wordlessly. Thus the same processes that have raised one synonym above others within national tongues have begun to work internationally. In this way the best words for the IAL might appear.
(7) Where synonyms of equivalent pedigree exist, it is probably better to choose the older word, or, where that cannot be established with certainty, the shorter. In many cases the shorter word, or - more exactly - the word requiring less effort to articulate, will be the older word (Zipf's Law).
(8) Whereas the IAL is unlikely to borrow Chinese characters for its script, it might adopt the Chinese system of word-formation - as imitated by progressive constructed languages.
(9) Where it is impossible to choose between alternative words, and a compromise word has failed, it may be necessary to return to first principles. Did Cratylus identify one of these in Plato's eponymous dialogue? He pointed out that rho is a sign of motion, found in words such as "tremor, tremble, strike, crush, bruise, crumble and whirl" because it is linked to the physical activity of pronunciation. According to Socrates, the tongue was "most agitated and least at rest in the pronunciation of this letter" and therefore it was originally used to express motion. Aspirated phonemes, requiring expenditure of breath, likewise find themselves in windy, tempestuous words such as "shivering, seething, shock and shaking". Lambda, with its liquid smoothness produced by the slipping of the tongue, is found in words like "slip, level, floor, flood, sleek" (when combined with another syllable it denotes easy but repetitive motion as in "handle, swivel, anvil, paddle"); gamma, in which the tongue is detained, combines with lambda to express the notion of stickiness, as in "glue, glutinous, glucose"."
Not a rigorous account by any means, but still a flavour of what is required. A fuller set of criteria (in some respects) is given by Unish http://www.unish.org/ on their "Lexicon" page:
Principles for Selecting Unish Words
1. Commonness
1.1 Commonness in Quantity
The word that is used in several languages or borrowed into them is given priority in selection.
Example:
academy = akademi (in 13 languages)
guitar = gitar (in 13 languages)
tea = cha (in 14 languages)
gas = gas (in 13 languages)
humor = humor (in 14 languages)
film = film (in 11 languages)
1.2 Commonness in Official Languages
The word that is officially used in international communities is given priority in selection.
Example 1: gold, oro, ouro, or, chrusos, aurum, zoloto, gum, kin, jin, dhahab, sonaa, emas -> gold
Example 2: century, jarcento, aionas, centum anni, jahrhundert, siglo, seculo, secolo, siecle, stoletie, segi, seiki, shiji, qarn, sataabdii, abad -> senturi
1.3 Commonness in Daily Life
The word that is familiar to people in everyday life due to its appearance in signboards or brand names is given priority in selection.
Example 1:
stop ~ stop
fire ~ fire
Example 2:
brown ~ braun
river ~ rio
2. Shortness
2.1 Shortest Word
The shortest word is selected.
Example 1: brain, cerbo, mualo, cerebrum, gehirn, cerebro, cervello, cerveau, mozg, nwe, noo, nao, dimaagh, mastisk, otak -> nao
Example 2: acquire, akiras, apokto, adquirere, erwerben, adquirir, acquisire, acquerir, riobretat, odda, eru, dedao, yaHaraz, arjit karng, memperoleh -> eru
2.2 Shortest Spelling
In the case of words that have the same origin, the simplest spelling is selected.
Example 1: perfect, perfekta, perfectus, perfekt, perfecto, perfeito, perfetto, parfait -> perfet
Example 2: certain, certa, certus, certo -> sert
2.3 Shortness by Clipping
If meaning is sufficiently conveyed by a stem, other parts are omitted.
Example: equipment, ekipazo, aksoplismos, arma, ausrutattung, equipo, equipamento, equipaggiamento, equipement, oborudovanie, jangbi, soobi, zhuangbei, muiddat, upaskar, kelengkapan -> eqip
3. Distinctness
Even a short word is avoided if its pronunciation is too simple and it can be confused with other words.
Example 1: goose, ansero, chena, anser, gans, ganso, oca, oie, gus, gowi, gachoo, e, wizza, hans, angsa -> gus
Example 2: degree, grado, moira, gradus, grad, grado, grau, degre, stepen, jondo, teedo, chengu, daraga, darjah -> degri
4. Phonological Simplicity
If possible, the structure of 'CVCV', which is easy to pronounce, is selected.
Example 1: appear, aperas, parousiazomai, apparere, erscheinen, aparecer, apparire, apparaitre, natanada, arawareru, chuxian, dhaHar, upsthit, menjelma -> apar
Example 2: atmosphere, atmosfero, atmosphaira, aer, atmosphare, atmosfera, atmosphere, degi, taiki, daqi, aljaww, vaataavaran, suasana -> degi
5. Cultural Priority
A word is selected from the language with which the word is most associated culturally in meaning.
Example:
tea = cha
rose = roz
6. Compounding
A new word is created through active compounding.
Example 1: dictionary -> motbuk (mot 'word' + buk 'book')
Example 2: Monday -> lunadi (luna 'moon' + di 'day')
Example 3: first -> unme (un 'one' + me 'ordinal ending')
7. Diversity
The degree of familiarity is enhanced by selecting words from diverse origins as much as possible.
Example 1:
nose = noz
neck = mog
cheek = jia
heel = eri
Example 2:
elephant = elefant
bird = tori
sheep = yang
goose = gus
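The Shortness and Distinctness principles above can be sketched in a few lines of Python. This is only an illustration and not Unish's actual procedure: the function names, the min_length threshold standing in for "too simple", and the strict consonant-vowel test (with vowels limited to a, e, i, o, u) are all assumptions made here. The candidate lists are copied from the examples above.

```python
VOWELS = set("aeiou")  # assumption: only these five letters count as vowels

# Candidate lists copied from the "acquire" and "goose" examples above.
ACQUIRE = ["acquire", "akiras", "apokto", "adquirere", "erwerben", "adquirir",
           "acquisire", "acquerir", "riobretat", "odda", "eru", "dedao",
           "yaHaraz", "arjit karng", "memperoleh"]
GOOSE = ["goose", "ansero", "chena", "anser", "gans", "ganso", "oca", "oie",
         "gus", "gowi", "gachoo", "e", "wizza", "hans", "angsa"]


def shortest_candidates(candidates, min_length=1):
    """Return every candidate tied for shortest spelling.

    min_length is a hypothetical stand-in for the Distinctness principle:
    candidates shorter than this are treated as too easily confused.
    """
    eligible = [w for w in candidates if len(w) >= min_length]
    if not eligible:
        return []
    best = min(len(w) for w in eligible)
    return [w for w in eligible if len(w) == best]


def is_cvcv(word):
    """True if the spelling strictly alternates consonant, vowel, consonant, vowel."""
    word = word.lower()
    if not word or len(word) % 2 != 0:
        return False
    # Even positions must hold consonants, odd positions vowels.
    return all((ch in VOWELS) == (i % 2 == 1) for i, ch in enumerate(word))


if __name__ == "__main__":
    print(shortest_candidates(ACQUIRE))
    print(shortest_candidates(GOOSE))
    print(shortest_candidates(GOOSE, min_length=3))
    print(is_cvcv("degi"))
```

With these lists, "acquire" yields the single candidate eru, while "goose" yields e with no threshold and a three-way tie (oca, oie, gus) with min_length=3. Shortness alone therefore does not always settle the choice, and the remaining principles have to break the tie.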
Transformation of Characters
1. Consonant Deletion
CC -> C: Unnecessary duplication of consonants is simplified.
Example 1:
bell -> bel
roll -> rol
address -> adres
Example 2:
check -> chek
2. Vowel Deletion
2.1 VV -> V
The cluster of vowels is simplified as much as possible.
Example:
enough -> enuf
heavy -> hevi
guest -> gest
heart -> hart
2.2 -VCVC# -> -VCC# (cf. nasal/liquid sound)
When the final consonant in the structure of '-VCVC' is a nasal or liquid sound, the vowel which is pronounced as [schwa] is omitted.
Example:
even -> evn
bottle -> botl
2.3 V (cf. word-final position)
A word-final inflectional ending or silent vowel is omitted.
Example 1:
sure -> sur
advice -> advis
Example 2:
tower, turo, turris, turm, torre, tour -> tor
adventure, aventuro, abenteuer, aventura, avventura, aventure -> aventur
3. Semi-Vowel Deletion
3.1 ow which is pronounced as [ow] -> o
Example:
slow -> slo
snow -> sno
yellow -> yelo
3.2 iy which is pronounced as [iy] -> i
4. Systematic Change
4.1 [ks] -> x
Example:
access -> axes
success -> suxes
4.2 [qu] -> q
Example:
quick -> qik
quarter -> qart
4.3 c which is pronounced as [k] -> k
Example:
class -> klas
card -> kard
car -> kar
4.4 c which is pronounced as [s] -> s
Example:
city -> siti
century -> senturi
4.5 g which is pronounced as [j] -> j
Example:
orange -> oranj
charge -> charj
energy -> enerj
4.6 y -> i/ai
Example 1: system, sistemo, sistema, systeme, sisutem, shisutemu, sistem -> sistem
Example 2:
every -> evri
employ -> emploi
Example 3:
dry -> drai
fly -> flai
5. Change for Shortness
When the pronunciation of a word is widespread and the word tends to be shortened, the spelling of the word is changed according to its pronunciation.
Example:
business = biz
introduce = intro
advertise = ad
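Several of the transformations above can be approximated as an ordered list of spelling substitutions. The sketch below is a rough illustration, not the Unish procedure itself: the Unish rules are stated in terms of pronunciation, whereas these regular expressions look only at English spelling (for instance, "c before e, i or y" stands in for "c pronounced as [s]"). The rule order, the patterns, and the name respell are assumptions made here. Rules that need real pronunciation data or clipping (VV -> V, y -> ai, quarter -> qart, business -> biz) are deliberately left out.

```python
import re

# Ordered (pattern, replacement) pairs; each is a spelling-based guess at a
# pronunciation-based rule from the list above. Order matters.
RULES = [
    (r"cc(?=[eiy])", "x"),               # 4.1 [ks] -> x      (access -> axess)
    (r"ck", "k"),                        # 1.  CC -> C        (check -> chek)
    (r"qu", "q"),                        # 4.2 [qu] -> q      (quick -> qik)
    (r"c(?=[eiy])", "s"),                # 4.4 soft c -> s    (city -> sity)
    (r"c(?!h)", "k"),                    # 4.3 hard c -> k, keeping "ch"
    (r"([b-df-hj-np-tv-z])\1", r"\1"),   # 1.  doubled consonant -> single
    (r"ge$", "j"),                       # 4.5 final soft g -> j
    (r"ow$", "o"),                       # 3.1 final ow -> o
    (r"(?<=[^aeiou])e$", ""),            # 2.3 final silent e dropped
    (r"y$", "i"),                        # 4.6 final y -> i
]


def respell(word):
    """Apply every rule once, in order, to a lower-cased English spelling."""
    result = word.lower()
    for pattern, replacement in RULES:
        result = re.sub(pattern, replacement, result)
    return result


if __name__ == "__main__":
    for w in ["access", "check", "quick", "class", "city", "century",
              "address", "orange", "yellow", "advice", "bottle"]:
        print(w, "->", respell(w))
```

On the examples listed above this reproduces axes, chek, qik, klas, siti, senturi, adres, oranj, yelo, advis and botl. It will not reproduce forms such as evri or drai, which depend on how the word is actually pronounced.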
Word Templates
1. Personal pronouns
I = me
you (sg) = de
he/she/it = le
we = mes
you (pl) = des
they = les
2. Interrogative pronouns
what = wat
who = wu
which = wich
why = we
where = wer
when = wen
how = wi
3. Determiners
all = al
every = evri
each = ge
any = eni
many = meni
more = mor
most = most
some = som
no = no
other = odro
another = unodro
that = dat
this = dis
4. Conjunctives
and = e
but = but
or = or
so = so
if = if
as = az
because = koz
since = sins
although = aldo, daga
while = wail
5. Cardinal numbers/ ordinal numbers (Number + me)
one = un, first = unme
two = du, second = dume
three = tri, third = trime
four = sa, fourth = same
five = wu, fifth = wume
six = ses, sixth = sesme
seven = sep, seventh = sepme
eight = ot, eighth = otme
nine = nov, ninth = novme
ten = ten, tenth = tenme
hundred = cent, hundredth = centme
thousand = mil, thousandth = milme
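The "Number + me" pattern for ordinals is regular enough to state as a one-line function. A minimal sketch, assuming only the cardinals listed above (the dictionary and the function name are illustrative, and nothing is assumed about how Unish forms numbers not shown here):

```python
# Cardinal numbers exactly as listed in the template above.
CARDINALS = {
    1: "un", 2: "du", 3: "tri", 4: "sa", 5: "wu", 6: "ses",
    7: "sep", 8: "ot", 9: "nov", 10: "ten", 100: "cent", 1000: "mil",
}


def ordinal(n):
    """Form the ordinal by appending the ending 'me' to the listed cardinal."""
    return CARDINALS[n] + "me"


if __name__ == "__main__":
    for n in sorted(CARDINALS):
        print(n, CARDINALS[n], ordinal(n))
```

A number that is not in the list raises a KeyError, since the source gives no rule for composing larger or intermediate numbers.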