Introduction...

Chennai, Tamil Nadu, India
Literature, travel, people, music, food, friendship, society, art - these are my interests. I am a friend to everyone who finds joy in sharing. To contact me: muthu.gvmuthu@gmail.com / 9894238404

Tuesday, May 10, 2005

Unicode - An Introduction...

Dear Friends..

Good Evening.. This is a small introduction to Unicode fonts and the technologies behind them. It may be helpful for those who are new to the Unicode system.

What is Unicode...?

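Unicode is a universal character encoding standard (not a font encoding), designed to cover all the world's languages. Its code space runs from U+0000 to U+10FFFF, about 1.1 million code points; the first 65,536 of them form the Basic Multilingual Plane, which holds most living scripts. Many scripts, including the Indic ones, are given a 128-slot block each, while large scripts such as Chinese get far more.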

Now, Tamil encoding. Tamil is a language where, in addition to the basic vowels (uyir) and consonants (mei), many of the compounded (uyirmei) characters have their own unique glyph forms. Popular Tamil font encoding schemes like TAB, TSCII, and TAM are glyph-based: many of these uyirmeis with distinct glyph forms are directly encoded in the scheme. Thus uyirmeis like ku and pU each get their own slot.

Unicode, on the contrary, encodes only the basic uyir and mei characters plus a set of modifiers (vowel signs) that represent an uyir/mei pair appearing as a combination (uyirmei). A Unicode file stores textual information solely at this "character" level; it does not care about the actual form of the glyphs. Rendering the glyphs that correspond to the stored characters is left to software.
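To make the "character level" idea concrete, here is a minimal Python sketch that lists what is actually stored for the uyirmeis ku and pU. The helper name `describe` is my own, chosen for illustration; the code points and names come from the Unicode character database that ships with Python.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) for each stored character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Each uyirmei is stored as a consonant followed by a vowel sign.
ku = "\u0B95\u0BC1"   # ku: letter ka + vowel sign u
puu = "\u0BAA\u0BC2"  # pU: letter pa + vowel sign uu

for label, text in (("ku", ku), ("pU", puu)):
    print(label, text, describe(text))
```

Each uyirmei appears as one shape on screen but is two characters in the file; joining them into one glyph is the renderer's job.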

OK.. How is Tamil encoded in Unicode? All Indic scripts are allocated 128 slots each. The assignment of characters to specific slots within each block is based on the ISCII (Indian Standard Code for Information Interchange) scheme, which uses Devanagari as the basic reference script. Thus the corresponding vowels, consonants, and modifiers of each Indic script appear at the same position within their own blocks. For example, the Tamil block (U+0B80 onwards) and the Telugu block (U+0C00 onwards) are adjacent, so "ka" of Tamil (U+0B95) and "ka" of Telugu (U+0C15) are exactly 128 slots apart, which greatly facilitates programming.
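The shared layout means a character can be moved from one Indic block to another with simple arithmetic. The following is only a rough sketch: the function name `shift_block` and the constants are my own, and the mapping is naive, because not every slot is assigned in every script (Tamil, for instance, leaves empty many slots that Telugu uses).

```python
TAMIL_BASE = 0x0B80   # first code point of the Tamil block
TELUGU_BASE = 0x0C00  # first code point of the Telugu block
BLOCK_SIZE = 128      # slots per Indic block

def shift_block(text, src_base, dst_base):
    """Move every character of the source block to the same slot in the target block."""
    out = []
    for ch in text:
        offset = ord(ch) - src_base
        if 0 <= offset < BLOCK_SIZE:
            out.append(chr(dst_base + offset))  # same slot, different block
        else:
            out.append(ch)                      # leave other characters alone
    return "".join(out)

tamil_ku = "\u0B95\u0BC1"  # Tamil ka + vowel sign u
telugu_ku = shift_block(tamil_ku, TAMIL_BASE, TELUGU_BASE)
print(tamil_ku, "->", telugu_ku)
```

Going from Tamil to Telugu is the safe direction for this trick; the reverse can land on unassigned Tamil slots, so a real converter would need a lookup table for those cases.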

Fine.. Now the concepts are OK.. What is the technology? As I already stated, a Unicode text file holds only characters. The unique glyph forms of the uyirmeis are stored separately, in the font, and are "rendered" on the screen by software when a Unicode-based text file is displayed.

The process of picking up these unique glyph forms of uyirmeis stored in the font and rendering them on the screen is called "glyph substitution"; the font table that drives it is called GSUB, and a companion table, GPOS, handles glyph positioning. These tables belong to a font technology called "OpenType", an extension of TrueType that is used with Unicode. Different platforms and operating systems use different font-rendering engines to handle these Unicode OpenType fonts and their GSUB and GPOS tables.

To use Unicode Tamil text, you need a Unicode OpenType font that includes the Tamil block (yes, many Unicode fonts carry only a few languages) and also the font-rendering engine (a DLL) of the respective platform.

Fine..Now a Question.. What Operating System do I need to use Tamil Unicode?

On the Windows platform, only Windows 2000 and Windows XP come with the required .dll file (the Uniscribe engine, usp10.dll) to handle Tamil characters. Windows ME and 98 are "Unicode-intelligent", but they do not have the specific .dll support required for Tamil. So on those systems Unicode Tamil text is rendered in a "linear" fashion, exactly as stored in the character-based scheme, without glyph substitution. Latha, Arial Unicode MS, and Code2000 are some of the Unicode fonts that carry the Tamil block.
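To see what "linear" rendering would show, it helps to look at the stored order of characters. This small Python sketch (variable names are mine) prints the stored sequence for ke and ko. The vowel sign e is drawn to the left of the consonant on screen, yet it is stored after it, and the vowel sign o is canonically equivalent to two parts that sit on either side of the consonant.

```python
import unicodedata

ke = "\u0B95\u0BC6"  # ke: letter ka + vowel sign e
ko = "\u0B95\u0BCA"  # ko: letter ka + vowel sign o

# Stored (logical) order: the consonant always comes first.
ke_names = [unicodedata.name(ch) for ch in ke]
print(ke_names)

# Canonical decomposition (NFD) splits vowel sign o into e + aa.
ko_parts = [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", ko)]
print(ko_parts)
```

A system without Tamil shaping support simply draws these characters one after another in stored order, which is why the vowel sign ends up on the wrong side of the consonant.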

Apple uses a different font-rendering engine called ATSUI, which is driven mainly by Apple's own AAT font tables rather than the OpenType GSUB and GPOS tables. Though Mac OS 9.x and X fully support Devanagari, Gujarati, and Gurmukhi, their ATSUI does not support Tamil.

The Tamil Linux group has developed the necessary tools to enable Unicode Tamil on the Linux platform.

I see.. Now, one more question.. What application software do I need for Tamil Unicode in Windows?

Even if you use Windows 2000/XP, you need "compatible" application software to handle Tamil Unicode. MS Office 2000 appeared before the Windows 2000 release and hence displays Unicode Tamil text in linear fashion even when used on Windows 2000! So you need to use the more recent Office XP package with Windows 2000. An alternative is to use a simple text editor like Notepad or WordPad with Tavultesoft Keyman (package name: E-Kalappai20b), which can be downloaded from the www.tavultesoft.com/keyman website. This Keyman software supports Unicode, normal English, and TSCII Anjal input.

OK.. Let us see more in the next article.....

Good Night..
G.Muthukumar

2 comments:

Anonymous said...

hi,
can u guide me for COM and COM+ developments

regards
sindhu

Anonymous said...

hi,
its good after a long time to recollect all those andal songs.. and (important) to know the meanings.. its really good.
All the best. Continue the same..
bye
sindhu