Wednesday, June 23, 2010

BUILDING A MULTILINGUAL WEBSITE FOR INDIAN LANGUAGES


Gita Supersite is a multilingual website that supports 10 Indian scripts: Devanagari, Oriya, Bengali, Punjabi, Gujarati, Telugu, Tamil, Assamese, Kannada and Malayalam. All ten are derived from the Brahmi script. They share a common phonetic structure, and this property is used both for text storage and for transliterating the original Devanagari text into the other scripts. The site contains texts related to Srimad BhagwadGita and other famous Indian heritage books such as the Ramayana, the Brahmasutra and the Upanishads.

The Eighth Schedule of the Indian Constitution originally listed 14 officially recognized languages (the list has since grown to 22). Apart from the Perso-Arabic scripts, the 10 scripts used for Indian languages evolved from the ancient Brahmi script and share a common phonetic structure, which makes a common character set possible. Because of this shared structure, the website's Indic content is stored in ISCII format.

Indian Script Code for Information Interchange (ISCII) is a character-based encoding that defines the common phonetic character set described above. It was adopted by the Bureau of Indian Standards (BIS) in 1991 as a standard for information interchange in Indian languages. It is a single representation for all the Indian scripts, with codes assigned in the upper ASCII region (160 - 255) for the aksharas of the language. The encoding also assigns codes for vowel extensions called matras and includes special characters (such as visarg and halant) that specify how a consonant in a syllable should be rendered. A syllable can take from one byte to as many as 10 bytes, which makes ISCII a multi-byte representation.
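To make the multi-byte point concrete, the sketch below writes syllables as ISCII byte sequences. The byte values are my reading of the ISCII-91 Devanagari table (KA = 0xB3, SSA = 0xD6, vowel sign I = 0xDB, halant = 0xE8) and should be checked against the standard before use. The class and method names are hypothetical, not part of the site's code.

```java
final class IsciiSyllables {
    // Assumed ISCII-91 code values; verify against IS 13194:1991.
    static final int KA = 0xB3;      // consonant "ka"
    static final int SSA = 0xD6;     // consonant "ssa"
    static final int MATRA_I = 0xDB; // vowel sign "i"
    static final int HALANT = 0xE8;  // joins a consonant to the next one

    /** Packs ISCII code values into the byte sequence that would be stored. */
    static byte[] syllable(int... codes) {
        byte[] out = new byte[codes.length];
        for (int i = 0; i < codes.length; i++) {
            out[i] = (byte) codes[i];
        }
        return out;
    }

    /** Formats bytes as upper-case hex pairs separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }
}
```

Under these assumptions, syllable(KA) is a one-byte syllable, while syllable(KA, HALANT, SSA, MATRA_I) is a four-byte syllable ("kshi") built from two consonants, a halant and a matra.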

The most important point about ISCII is that its codes are independent of fonts. A given ISCII text may be displayed in many different fonts for the same script by mapping the ISCII codes to the glyphs of a matching font. In the current version of the site, content is rendered in the web browser using C-DAC (Centre for Development of Advanced Computing) technologies: GIST, ISFOC and TTF fonts.


The data needs to be represented in a universal format that is supported by current Web technologies. The Unicode Standard is a universal character encoding scheme for representing characters as integers. It defines a consistent way of encoding multilingual text that enables the exchange of text data internationally and creates the foundation for developing global software.

The chosen format should be displayable with fonts that can be installed on all platforms. Unicode text is displayed using OpenType (OTF) fonts, which are intended to be cross-platform and can be used on Mac OS, Windows and UNIX systems.

ISCII is compatible with Unicode. The Unicode blocks for the Indian scripts were modelled on the ISCII layout, so the characters appear in largely the same order, although the numeric code values differ. Code that maps characters from ISCII to Unicode can therefore be written, which makes it possible to reuse the data already created in ISCII.
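A minimal sketch of such a mapping follows. It covers only five Devanagari characters, and the ISCII byte values in the table are my reading of the ISCII-91 chart, so treat them as assumptions to verify. The class and method names are hypothetical. The second method illustrates the parallel layout of the Unicode Indic blocks by shifting Devanagari code points into another script's block; not every position is assigned in every script, so this is an illustration rather than a complete transliterator.

```java
import java.util.HashMap;
import java.util.Map;

final class IsciiToUnicode {
    // Partial, illustrative table: ISCII byte value -> Unicode Devanagari.
    private static final Map<Integer, Character> DEVANAGARI = new HashMap<>();
    static {
        DEVANAGARI.put(0xA4, (char) 0x0905); // letter A
        DEVANAGARI.put(0xB3, (char) 0x0915); // letter KA
        DEVANAGARI.put(0xCC, (char) 0x092E); // letter MA
        DEVANAGARI.put(0xDA, (char) 0x093E); // vowel sign AA
        DEVANAGARI.put(0xE8, (char) 0x094D); // halant (virama)
    }

    /** Converts ISCII bytes to a Unicode string; ASCII bytes pass through. */
    static String toDevanagari(byte[] iscii) {
        StringBuilder out = new StringBuilder();
        for (byte b : iscii) {
            int code = b & 0xFF;
            if (code < 0x80) {
                out.append((char) code);
                continue;
            }
            Character mapped = DEVANAGARI.get(code);
            if (mapped == null) {
                throw new IllegalArgumentException(
                        "No mapping in this sketch for 0x" + Integer.toHexString(code));
            }
            out.append(mapped.charValue());
        }
        return out.toString();
    }

    /** Moves Devanagari code points (U+0900..U+097F) into another block. */
    static String shiftScript(String devanagari, int targetBlockStart) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < devanagari.length(); i++) {
            char c = devanagari.charAt(i);
            if (c >= 0x0900 && c <= 0x097F) {
                out.append((char) (c - 0x0900 + targetBlockStart));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, the bytes 0xB3 0xDA 0xCC (KA, vowel sign AA, MA) convert to U+0915 U+093E U+092E, and shiftScript with the Bengali block start 0x0980 turns that into U+0995 U+09BE U+09AE.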

Tuesday, June 22, 2010

Retrieving Forgotten Password

Most systems today store a password only after applying a one-way hash such as MD5 or SHA-1. Such a system cannot tell a user their forgotten password, even after a security check such as asking for the answer to a security question chosen at registration, because the system itself does not know the password. In such cases the password can only be updated by the user. The update can be completed in two steps, and the system that provides access to the user base (the User Base Management System, UBMS) has to provide an API for each of them.
  • Creation of security token:

The security token is used to validate that only the legitimate user is asking for a new password. It is generated by the UBMS, which provides APIs for secured access to the central repository.
  • Calling change password functionality:

The security token generated in the first step is passed to the system along with the new password. The UBMS validates the token to find out whether the request comes from the authorized user.
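As background on why the password can only be replaced and not recovered, here is a small sketch of one-way hashing with the standard java.security.MessageDigest class. The class and method names are hypothetical. It uses plain SHA-1 only because that is what the text mentions; systems built today usually add a per-user salt and a deliberately slow algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PasswordDigest {
    /** Returns the SHA-1 digest of the password as 40 lower-case hex digits. */
    static String sha1Hex(String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    /** Login check: hash the candidate and compare with the stored digest. */
    static boolean matches(String candidate, String storedHex) {
        return sha1Hex(candidate).equals(storedHex);
    }
}
```

The system stores only the output of sha1Hex. It can check a candidate password with matches, but nothing in the stored value lets it print the original password back to the user.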

The update process can be completed in the following steps:
1. The application calls the token generation API provided by the UBMS:

public String generateToken(String emailId)
The API returns authToken, the authorization token.
2. The token is mailed to the user inside a link to the change-password page at the application end.
3. The user clicks the link, fills in the email id and the new password, and submits the form.
4. The application calls the change password API provided by the UBMS:

public boolean changePassword(String newPassword, String authToken)
The authToken value must be the same as the one returned in step 1.
5. The UBMS changes the password after validating authToken and returns true or false depending on the result of the validation.
6. The application uses the result to redirect the user accordingly.
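The steps above can be sketched with an in-memory stand-in for the UBMS. Only the two method signatures, generateToken and changePassword, come from the text. The class name, the register and checkPassword helpers, the single-use token rule and the choice of SHA-256 are assumptions made for illustration. Token expiry and the mailing step are left out.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

final class InMemoryUbms {
    private final SecureRandom random = new SecureRandom();
    private final Map<String, String> hashByEmail = new HashMap<>();
    private final Map<String, String> emailByToken = new HashMap<>();

    /** Hypothetical helper: creates a user, storing only the password hash. */
    void register(String emailId, String password) {
        hashByEmail.put(emailId, hash(password));
    }

    /** Step 1: issue a random token tied to the account, or null if unknown. */
    public String generateToken(String emailId) {
        if (!hashByEmail.containsKey(emailId)) {
            return null;
        }
        String token = new BigInteger(128, random).toString(16);
        emailByToken.put(token, emailId);
        return token;
    }

    /** Steps 4 and 5: validate the token, then replace the stored hash. */
    public boolean changePassword(String newPassword, String authToken) {
        // Removing the token makes it single-use (an assumption of this sketch).
        String emailId = emailByToken.remove(authToken);
        if (emailId == null) {
            return false;
        }
        hashByEmail.put(emailId, hash(newPassword));
        return true;
    }

    /** Hypothetical helper used to confirm that the update took effect. */
    boolean checkPassword(String emailId, String password) {
        return hash(password).equals(hashByEmail.get(emailId));
    }

    private static String hash(String password) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest).toString(16);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

The application would mail the value returned by generateToken inside the change-password link (step 2) and pass it back unchanged to changePassword (step 4). The boolean result then drives the redirect in step 6.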