Posted by Chris Muema on

In Part 1, we compared the simplicity of Latin (European origin) characters with Asian and Middle-Eastern (Arabic) characters in terms of how much effort it would take to re-decorate them.  The visual examples showed, clearly, that Latin characters have fewer ink-strokes than their Asian or Arabic counterparts, glyph-to-glyph.  However, we also looked at the meaning in a single Latin letter versus the meaning in an Asian or Arabic character.  In that framework, we illustrated that the meaning of most Latin characters is reduced to a sound or visual icon, whereas many Asian and Arabic characters denote a concept . . .  THUS, WE UNPACKED THE DIFFERENCE IN MEANING BETWEEN LATIN GLYPHS AND IDEOGRAPHIC (CHINESE-ORIGIN) GLYPHS.  It is precisely that difference in understanding that enables us to illustrate how Latin characters increase in complexity of ink-strokes when they are cobbled together to form words.


In Part 2, we also touched on the impact of the ever-developing Unicode standard: version 2.x, which is the general moniker most Latin font developers use for the standard as it stood before the ideographic expansions of Unicode 3.0 (February 2000) and Unicode 3.1 (March 2001).  We unpacked the meaning of the latter two (Unicode 3.0 and 3.1) and how they dramatically changed font standards, more than quadrupling the number of characters from the 1993 benchmark of 20,902 characters to 94,140 characters by March 2001.  Unicode 3.0 more than doubled the 20,902 international characters with the addition of 27,484 “ideographic” characters.  A year later, in March 2001, Unicode 3.1 would nearly double the Unicode 3.0 count again by adding 44,946 “ideographic” (i.e. Chinese-influenced) characters.  That is a total of 72,430 Chinese-influenced characters added in the span of about a year.
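As a quick sanity check, the arithmetic in that recap can be re-run in a few lines. This is only a sketch that takes the figures exactly as quoted above (they are not re-verified against the Unicode data files here), and the variable names are made up for illustration.

```python
# Figures as quoted in the recap above; names are illustrative only.
BASELINE_1993 = 20_902        # characters in the 1993 benchmark
ADDED_UNICODE_3_0 = 27_484    # "ideographic" characters added, Feb 2000
ADDED_UNICODE_3_1 = 44_946    # "ideographic" characters added, Mar 2001
TOTAL_MARCH_2001 = 94_140     # total characters by March 2001

# Total Chinese-influenced characters added across the two releases.
ideographic_added = ADDED_UNICODE_3_0 + ADDED_UNICODE_3_1

# How many times larger the standard became compared with 1993.
growth_factor = TOTAL_MARCH_2001 / BASELINE_1993

print(ideographic_added)         # 72430
print(round(growth_factor, 2))   # 4.5
```

A growth factor of about 4.5 is what "more than quadrupling" refers to.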


Here, in Part 3, we attempt to break down the numbers further so we can get a visual representation of the challenges posed by re-styling, in particular, Chinese, old-Korean and Japanese (CKJ) writing; Unicode's own abbreviation for the same group is CJK.  However, the CKJ block is, for practical purposes, of Chinese influence . . .  And thus the Unicode euphemism “ideographic.”


What we've learned about CKJ is that the Japanese character set is still dominated by Chinese influence, in the characters called Kanji.  In more recent times, not long after the Korean War that parted North and South, we begin to see new ideological claims made about language and, of course, writing.  Among the main claims is the use of the Hangeul script rather than Confucian writing, a debate that dates back to the mid-1400s.  For the purpose of re-decorating such writings, it is estimated there are more than 11,000 Japanese-Chinese (Kanji) characters, depending on who’s counting.  Likewise, Korean Hangeul has 11,172 characters (precomposed syllables), according to the Unicode narrative.
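The 11,172 figure can be checked two ways. Unicode's Hangul Syllables block runs from U+AC00 to U+D7A3, and each precomposed syllable combines one of 19 initial consonants, one of 21 vowels, and one of 28 final positions (27 final consonants plus "no final"). A minimal sketch, with illustrative variable names:

```python
# Hangul Syllables block boundaries in Unicode.
FIRST_SYLLABLE = 0xAC00
LAST_SYLLABLE = 0xD7A3

# Count of code points in the block, inclusive of both ends.
block_size = LAST_SYLLABLE - FIRST_SYLLABLE + 1

# Each syllable = initial consonant + vowel + optional final consonant.
INITIALS = 19
VOWELS = 21
FINALS = 28  # 27 final consonants plus the "no final" case

composed = INITIALS * VOWELS * FINALS

print(block_size)  # 11172
print(composed)    # 11172
```

Both routes give the same count, which is why a Hangeul font can in principle be built from a few dozen component shapes even though it must cover 11,172 code points.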


But here’s the dilemma.  Politically, Hangeul is considered not to be Chinese.  But that’s ideological, especially given the historical influences and, more importantly, the proximity of these cultures.  Over more than 500 years, it’s inconceivable that there were no significant cross-cultural influences on the Hangeul character set . . .  For surely, with wars having been won and wars having been lost, it is reasonable to assume that language exchange did occur.  And thus, it is also conceivable that writing exchange did occur.  In the history of war, there is a truism that repeats: "The winner's language is what the people learn."  As such, it is inconceivable that Hangeul is a pure language; such a claim lacks historical merit, and the same argument holds for Latin as well.  Language changes with time: that too is a truism, and an academic fascination.  However, for the purposes of number crunching, we separate Hangeul from old-Korean.


What’s important about the Chinese, old-Korean and Japanese characters (thus called CKJ, also called “ideographic”) is the fact that they have a long history of interaction with each other.  The combined CKJ set, at the time Unicode 3.1 was published, was 72,430 Chinese-influenced characters out of a total of 94,140 characters.  That’s 77% of the standard!


The sheer volume of CKJ ideographs is daunting for any one artist to consider, even in a re-mapping exercise.  That’s why, in Part 2, we used the “battalion of 500” metaphor, effectively giving each soldier about 140 Chinese glyphs to work on every time a new font-style is introduced.  As of 2015, even with all the technological advances, font-restyling is still very much a hands-on work of art.
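The share and workload figures above can be re-derived from the numbers already quoted. This is a rough sketch using those figures as given; the names are illustrative only.

```python
# Figures as quoted above; names are illustrative only.
CKJ_IDEOGRAPHS = 72_430     # Chinese-influenced characters in Unicode 3.1
TOTAL_CHARACTERS = 94_140   # all characters in Unicode 3.1
BATTALION_SIZE = 500        # the "battalion of 500" from Part 2

# Share of the standard taken up by the CKJ set, as a percentage.
share_percent = 100 * CKJ_IDEOGRAPHS / TOTAL_CHARACTERS

# Glyphs each "soldier" would have to restyle for one new font-style.
glyphs_per_soldier = CKJ_IDEOGRAPHS / BATTALION_SIZE

print(f"{share_percent:.0f}%")     # 77%
print(round(glyphs_per_soldier))   # 145
```

Dividing evenly gives just under 145 glyphs each, in the same ballpark as the rounded figure used in the metaphor.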


In Part 1, we also introduced the notion of culture: whether it is even appropriate, from a cultural perspective, to attempt re-styling the ideographic set.


Okay, we’ve looked at decorating fonts and a little of the ideology in fonts.  Let’s now get into some number crunching.


The English alphabet alone has 26 characters in upper-case and another 26 in lower-case.  Add the 10 numeric-keyboard characters and about 30 (give or take 3) common special keyboard characters, and you get a total of about 90 common keyboard characters, not including other special characters such as those found in mathematics, science, economics and technology (STEM/Latin-STEM).


If you add the characters of other Latin-script languages, such as French and Portuguese, the basic Latin character set approaches 240 characters, as long as Latin-STEM characters are not considered.  Before February 2000 (that is, before Unicode 3.0), the Latin set had about 260 characters; this is what the font-making community refers to as Unicode 2.x.  After adding Latin-STEM characters and those of other disciplines, such as the special characters used in science, mathematics and engineering, there were more than 1,340 Latin characters by the year 2014.


Even if you wanted to, you could never stretch the Latin set to match the ideographic CKJ character set: see the chart ahead.  The next largest character set is the Korean Hangeul.
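To put the gap in numbers, here is a small sketch using only the counts quoted in this post (the "about 30" special characters is the approximate figure given above, and the names are illustrative).

```python
# Common keyboard characters: upper-case, lower-case, digits, ~30 specials.
KEYBOARD = 26 + 26 + 10 + 30

# Counts as quoted in this post.
LATIN_2014 = 1_340   # Latin characters including Latin-STEM, by 2014
HANGEUL = 11_172     # Korean Hangeul characters
CKJ = 72_430         # Chinese-influenced ("ideographic") characters

print(KEYBOARD)                         # 92
print(round(CKJ / LATIN_2014, 1))       # 54.1
print(round(HANGEUL / LATIN_2014, 1))   # 8.3
```

On these figures, the CKJ set is roughly 54 times the size of even the extended Latin set, and Hangeul is roughly 8 times its size.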


And then there is the rest of the world: Ethiopic, Gujarati, Hindi, Arabic, Russian, Greek, Armenian, Hebrew and so on.  The numbers can get lost in detail.  So what we’ve done is to compare the numbers on a pie-chart, as shown below.


AND YET, the writing of about 4 billion people (that’s four billion of the world’s seven billion) is based on the Latin character set, including Europe, non-Arabic Africa, North America, South America, Australia, India (as a national language), Vietnam, Indonesia, the Philippines, et al.


AND YET, over 90% (that’s the low side) of computer programming languages, games and internet content are written entirely in Latin character sets: the 1%.  Visit the app-download stores and you’ll see why.  One of the largest app-download stores in the world will accept only English descriptions.  Even the internet standards themselves are Latin-based.


Below is a geographic infographic of the world’s writing systems.




We know that the first dynasty of Egypt (Dynasty Zero) was led by an African called King Scorpion.  We know this because the writing survived more than 5,000 years.  The Egyptian dynasties span over 3,000 years and again, we know this because writing is a memory preserver!


If you take that reasoning to its logical end, then the odds of your legacy being remembered 5,000 years from now improve dramatically when you use the world’s most dominant character set: Latin!
