In this age of Information Technology (IT), with the entire globe integrated into a web-linked village and knowledge as the sole differentiator, the development of convivial access technology has gained prime importance. Especially for India, with its diverse, multilingual heritage and culture, the Internet is expected to play a dominant role in integrating almost all aspects of social and economic endeavor.
To ensure that India and Indian languages have their rightful place in this fast-evolving scenario, CDAC GIST undertook a study of the various RFCs related to Internationalized Domain Names (IDN) and their applicability to Indian languages. The pilot study was conducted under the aegis of the DIT.
The aim was twofold:
- Primarily, to ensure that a user can create and access URLs in his or her own language, and
- Closely associated with the above, to ensure that the average user, who is often unaware of spoofing, phishing or pharming, does not suffer financial loss or identity loss. An average user pays little attention to a seemingly trivial URL string until he or she inadvertently follows a wrong one and gets robbed. To such a user, वीत्त.भारत and वीत.भारत look the same. By clicking on the wrong URL, he or she could be phished or pharmed.
To prevent such spoofing and phishing attacks in the Indian-language scenario, CDAC GIST carried out a careful study of several scripts, especially in the area of homographs: characters and character combinations that look alike but are in fact different, as in the example given above.
The research focuses on domain names in Indian languages such as Hindi, Gujarati, Oriya, Urdu and Gurumukhi, and includes the following:
NamePrep and StringPrep profiles - RFC 3491 and RFC 3454
Punycode: Bootstring encoding - RFC 3492
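As a rough illustration of what the Punycode step does, the sketch below converts an Indian-language domain name into its ASCII-compatible form, label by label, using the `punycode` codec from Python's standard library. The helper names `to_ace_label`, `from_ace_label` and `to_ace_domain` are made up for this example, and the NamePrep step is deliberately skipped, so this should not be read as a complete IDNA implementation or as the code used in the study.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def to_ace_label(label: str) -> str:
    """Encode one Unicode label; pure-ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    # The stdlib 'punycode' codec implements the Bootstring encoding only.
    # Assumption: the label has already been through NamePrep.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Decode one label back to Unicode if it carries the ACE prefix."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def to_ace_domain(domain: str) -> str:
    """Apply the label encoding to each dot-separated part of a domain."""
    return ".".join(to_ace_label(part) for part in domain.split("."))


if __name__ == "__main__":
    name = "वीत्त.भारत"
    print(name, "->", to_ace_domain(name))
```

Each label is encoded separately because the dot separators must remain visible to the DNS; only the non-ASCII labels receive the `xn--` prefix.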
Variant tables, accompanied by normalization tables, were prepared to help reduce the vulnerability of IDNs to homograph attacks.
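One way such tables could be applied is sketched below: each label is normalized, every listed variant sequence is replaced by a single representative, and two labels are flagged when they differ as strings but collapse to the same result. The single table entry is hypothetical and chosen only to reproduce the वीत्त / वीत example; it is not taken from the CDAC GIST tables, and the names `VARIANTS`, `skeleton` and `is_confusable` are assumptions made for this illustration.

```python
import unicodedata

# Hypothetical variant entry, for illustration only (not from the real tables).
# Key: a character sequence; value: the representative it is folded into.
VARIANTS = {
    "\u0924\u094D\u0924": "\u0924",  # TA + VIRAMA + TA (conjunct) -> TA
}


def skeleton(label: str) -> str:
    """Normalize a label to NFC, then fold every listed variant sequence."""
    text = unicodedata.normalize("NFC", label)
    # Replace longer sequences first so shorter keys cannot pre-empt them.
    for seq in sorted(VARIANTS, key=len, reverse=True):
        text = text.replace(seq, VARIANTS[seq])
    return text


def is_confusable(a: str, b: str) -> bool:
    """True when two distinct labels fold to the same skeleton."""
    return a != b and skeleton(a) == skeleton(b)


if __name__ == "__main__":
    print(is_confusable("वीत्त", "वीत"))
```

A registry could run a check of this kind at registration time and refuse, or bundle together, a new label whose skeleton matches that of an existing one.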
A single all-script font (along the lines of the Sakal Bharati font) has also been developed. In this font, special care is taken with characters that differ in code point but look similar, so that they can be told apart visually in a browser's address bar.
Easy-to-use floating keyboards that enforce the IDN restrictions are also being developed for entering URLs in the Registry software for each language.