Localization of Languages for eGovernance and RTIA

eGovINDIA, INDIA RTI, INDIA WBA,

As per PRESIDENTIAL ORDER of 1960, Language of Acts, Bills, etc. -


11. Language of Acts, Bills, etc. -

(a) The Committee has expressed the opinion that Parliamentary legislation may continue to be in English but an authorised translation should be provided in Hindi. The Ministry of Law may, in due course, initiate necessary legislation to provide for an authorised Hindi translation of Parliamentary legislation which may continue to be in English. Arrangements may be made by the Ministry of Law also for providing translations of Parliamentary legislation into the regional languages.

(b) The Committee has expressed the opinion that where the original text of Bills introduced in or Acts passed by the State legislature is in language other than Hindi, a Hindi translation may be published with it besides an English translation as provided in clause (3) of article 348.

In due course, legislation may be initiated for the publication of a Hindi translation of State Bills, Acts, and other statutory instruments, along with the text in the official language of the State.

The following Government Resolution, as adopted by both Houses of Parliament, is hereby published for general information:-

RESOLUTION

“WHEREAS under article 343 of the Constitution, Hindi shall be the official language of the Union, and under article 351 thereof it is the duty of the Union to promote the spread of the Hindi Language and to develop it so that it may serve as a medium of expression for all the elements of the composite culture of India;

This House resolves that a more intensive and comprehensive programme shall be prepared and implemented by the Government of India for accelerating the spread and development of Hindi and its progressive use for the various official purposes of the Union and an annual assessment report giving details of the measures taken and the progress achieved shall be laid on the Table of both Houses of Parliament and sent to all State Governments;

 

2. WHEREAS the Eighth Schedule of the Constitution specifies 14 major languages of India besides Hindi, and it is necessary in the interest of the educational and cultural advancement of the country that concerted measures should be taken for the full development of these languages;

The House resolves that a programme shall be prepared and implemented by the Government of India, in collaboration with the State Governments for the coordinated development of all these languages, alongside Hindi so that they grow rapidly in richness and become effective means of communicating modern knowledge;

3. WHEREAS it is necessary for promoting the sense of unity and facilitating communication between people in different parts of the country that effective steps should be taken for implementing fully in all States the three-language formula evolved by the Government of India in consultation with the State Governments;

This House resolves that arrangements should be made in accordance with that formula for the study of a modern Indian language, preferably one of the Southern languages, apart from Hindi and English in the Hindi speaking areas and of Hindi along with the regional languages and English in the non-Hindi speaking areas;

4. AND WHEREAS it is necessary to ensure that the just claims and interest of people belonging to different parts of the country in regard to the public services of the Union are fully safeguarded:

This House resolves –

(a) that compulsory knowledge of either Hindi or English shall be required at the stage of selection of candidates for recruitment to the Union services or posts except in respect of any special services or posts for which a high standard of knowledge of English alone or Hindi alone, or both, as the case may be, is considered essential for the satisfactory performance of the duties of any such service or post; and

 

(b) that all the languages included in the Eighth Schedule to the Constitution and English shall be permitted as alternative media for the All India and higher Central Services examinations after ascertaining the views of the Union Public Service Commission on the future scheme of the examinations, the procedural aspects and the timing.

October 25, 2006 Posted by localization | Govt. of INDIA

Text-to-Speech and Automatic Speech Recognition in Indian Languages (Matrubhasha)


http://www.ictrt.org.in/modules.php?name=Pages&pa=showpage&id=12

Introduction

In the present era of human-computer interaction, the educationally underprivileged and the rural communities of India are being deprived of the technologies that pervade the growing, interconnected web of computers and communications. One good solution to this problem would be computers that talk to the common man in the language he is comfortable communicating in. A significant percentage of the Indian population is educationally underprivileged, and there are still many areas where people lack the three Rs (reading, writing and arithmetic). Under such circumstances the digital divide is constantly widening: on the one hand we claim that India is leading in IT, and on the other hand the advances we make are totally inaccessible to a large number of our countrymen. We cannot expect rural or educationally underprivileged countrymen to use computers and IT products unless we remove the need to be literate, which stands as a barrier between them and computers.

Major Issues

In this information age, the storage and retrieval of information in a convenient manner has gained importance. Because of the near-universal adoption of the World Wide Web as a repository of information for unconstrained and wide dissemination, information is now broadly available on the Internet and is accessible from remote sites. However, the interaction between the computer and the user is largely through keyboard and screen-oriented systems. In the current Indian context, this restricts usage to a minuscule fraction of the population, who are both computer-literate and conversant with written English.

To enable a wider proportion of the population to benefit from information technology, there is a dire need for an interface other than the keyboard-and-screen interface that is widely in use at present. Speech, being a natural means of communication among human beings, can also provide an ideal platform for man-machine interaction. It is also desirable that the human-machine interface permit communication in one's native language. In a multilingual country like India, where the literacy rate is considerably low, this can be of immense value.

Certain efforts are currently being undertaken to develop operating systems and applications that support the local languages. Localization efforts have been undertaken by most of the leading OS vendors and promoters, including Microsoft (Windows), Red Hat (Linux), NCST (Indix), IIT Madras (IndLinux) etc. These operating systems support some of the leading Indian languages by using the international character coding standard (Unicode).

Speech technologies promise to be the next-generation user interface. Software applications with speech synthesis and voice recognition abilities have a better chance of communicating with a large percentage of the population, including the educationally underprivileged, the visually challenged and the computer-illiterate, if these applications can speak and understand the native language.

Hence we put forward an API (Application Programming Interface) model based on Unicode for text-to-speech synthesis and automatic speech recognition in Indian languages.

Why Unicode?

Unicode is the international standard for character encoding. It assigns each character a code point in the range U+0000 to U+10FFFF, which is stored using an encoding form such as UTF-8, UTF-16 or UTF-32; ASCII, by contrast, is a 7-bit code with only 128 characters. The Unicode Standard is the universal character-encoding standard used for representing text for computer processing, and it provides a consistent way of encoding multilingual plain text. Its design is based on the simplicity and consistency of ASCII but goes far beyond ASCII's limited ability to encode only the basic Latin alphabet. The Unicode Standard provides the capacity to encode all of the characters used for the written languages of the world. To keep character coding simple and efficient, it assigns each character a unique numeric value and name. It has been widely adopted by major software providers, as well as by governments, as the most suitable character representation for all major character sets.

The advantage of using Unicode as the character set is that it enables the TTS engine to recognize different Indian languages within the same document or character string, whereas a separate encoding format for each language would not. Content development applications such as the StarOffice and OpenOffice office suites are already being localized using Unicode, and Unicode already plays a significant part in localization and internationalisation. Since it handles the characters of all languages in a uniform way, it avoids the complexities of different character code architectures. All modern operating systems, from PCs to mainframes, support Unicode or are actively developing support for it, and the same is true of databases. In this scenario, Unicode is the best option for the problem under discussion.
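The point about several Indian languages sharing one character string can be illustrated with a short Python sketch. It is only an illustration, not part of the Matrubhasha API: the names INDIC_BLOCKS, script_of and split_by_script are hypothetical, and the sketch assumes that a TTS front end would pick a voice per script run. The block ranges themselves are the ones assigned by the Unicode Standard.

```python
# Unicode block ranges for the major Indic scripts (from the Unicode Standard).
INDIC_BLOCKS = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0980, 0x09FF, "Bengali"),
    (0x0A00, 0x0A7F, "Gurmukhi"),
    (0x0A80, 0x0AFF, "Gujarati"),
    (0x0B00, 0x0B7F, "Oriya"),
    (0x0B80, 0x0BFF, "Tamil"),
    (0x0C00, 0x0C7F, "Telugu"),
    (0x0C80, 0x0CFF, "Kannada"),
    (0x0D00, 0x0D7F, "Malayalam"),
]


def script_of(ch):
    """Return the script name for one character, or None if it is neutral."""
    cp = ord(ch)
    for lo, hi, name in INDIC_BLOCKS:
        if lo <= cp <= hi:
            return name
    if ch.isascii() and ch.isalpha():
        return "Latin"
    return None  # spaces, digits, punctuation carry no script of their own


def split_by_script(text):
    """Split a mixed-script string into (script, run) pairs, in order."""
    runs = []  # each entry is [script, list_of_characters]
    for ch in text:
        script = script_of(ch)
        if runs and (script is None or runs[-1][0] == script):
            # Neutral characters stay with the run that precedes them.
            runs[-1][1].append(ch)
        else:
            runs.append([script, [ch]])
    result = []
    for script, chars in runs:
        run = "".join(chars).strip()
        if script is not None and run:
            result.append((script, run))
    return result


if __name__ == "__main__":
    # A Hindi word, a Tamil word and an English word in one string.
    sample = "\u0928\u092e\u0938\u094d\u0924\u0947 \u0bb5\u0ba3\u0b95\u0bcd\u0b95\u0bae\u0bcd hello"
    for script, run in split_by_script(sample):
        print(script, run)
```

Because every script has its own code point range, no out-of-band language tag is needed to tell the runs apart. With one legacy 8-bit encoding per language, the same byte values would mean different characters depending on the encoding, so a mixed string could not be segmented this way.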

Why API Model?

If IT is to be completely within reach of the common countryman, it is not enough to have one operating system or one specific application that supports local-language speech synthesis or speech recognition. At the current juncture, when the growth and impact of technology are rising exponentially day by day, any software application is a candidate for localization. Hence we put forward this API model so that any software developer can incorporate speech capabilities into an application, thus extending the reach of the product to the masses.
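To make the idea of an API model concrete, here is a minimal Python sketch of what such an interface might look like. The post does not publish the actual Matrubhasha API, so every name here (SpeechEngine, synthesize, recognize, EchoEngine, announce) is an assumption made for illustration. The sketch only shows the design idea: applications program against one Unicode-based interface, and engines for individual languages plug in behind it.

```python
from abc import ABC, abstractmethod


class SpeechEngine(ABC):
    """Hypothetical interface: Unicode text in, audio out, and the reverse."""

    @abstractmethod
    def synthesize(self, text: str) -> bytes:
        """Turn a Unicode string into audio data."""

    @abstractmethod
    def recognize(self, audio: bytes) -> str:
        """Turn audio data back into a Unicode string."""


class EchoEngine(SpeechEngine):
    """Stand-in engine for the sketch: the 'audio' is just UTF-8 bytes.

    A real engine would produce and consume waveform data instead.
    """

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")

    def recognize(self, audio: bytes) -> str:
        return audio.decode("utf-8")


def announce(engine: SpeechEngine, message: str) -> bytes:
    """Application code depends only on the interface, not on one engine."""
    return engine.synthesize(message)


if __name__ == "__main__":
    engine = EchoEngine()
    audio = announce(engine, "\u0928\u092e\u0938\u094d\u0924\u0947")
    print(engine.recognize(audio))
```

Under this assumption, an application written once against the interface could gain a new language simply by being given a different engine object.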

Mr. Raman and the Matrubhasha Team

October 23, 2006 Posted by localization | Govt. of INDIA

ICT Research and Training Centre, India.

Welcome to the ICT Research and Training Centre, India.

 http://www.ictrt.org.in/

The Government of India is a member of the Development Gateway Foundation, a World Bank initiative.

An Information and Communication Technologies (ICT) – Research & Training (R&T) Centre of the Foundation has been set up in Bangalore, India, with the Centre for Development of Advanced Computing (C-DAC) as the Project Implementing Agency and the Indian Institute of Technology (IIT), Bombay as the first collaborating institution.

 

Language Technology

Dependence on English and other foreign languages has become mainstream, and it influences all avenues of our day-to-day life. The aim is to develop Indian language technologies in order to empower the ‘digitally impoverished’.

Projects:

BharateeyaOO

This project aims to enable support for Indian languages within the OpenOffice.org office suite on the main software platforms, so as to facilitate the digital representation, collection and distribution of information by ensuring seamless access to information technology across natural-language barriers. Indian language support in the suite will be provided through localization of the complete user interface and help in Hindi, Tamil, Kannada, Punjabi, Gujarati, Telugu, Bengali and Malayalam. Internationalization support will also be developed, featuring editing and complex text layout processing in Indian scripts, Indian language currency, calendar and spell checking.

Cross Lingual Information Retrieval

The web is a critical and vast source of information in today’s world. However, most of the information is in English, which is understood by less than 5% of the Indian population. Search engines are the primary mechanism for finding information on the web, but most of them do not allow querying in Indian languages. Even though a few offer screen layout and static text in Hindi, the queries and the results retrieved are still in English. So a person literate in Indian languages but not well versed in English is deprived of access to a vast store of information. To bridge this Digital Language Divide, one of the key technologies required is Cross Lingual Information Retrieval (CLIR). The proposed CLIR system aims at enabling a person to query the web for documents related to health issues, and obtain the results, in Hindi.
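One common way to approach CLIR is dictionary-based query translation: the Hindi query is translated word by word into English, and the English terms are matched against English documents. The Python sketch below illustrates only that general idea. It is not a description of the proposed system, whose design the post does not give. The dictionary HI_EN, the document list DOCS and the functions translate_query and search are all hypothetical toy examples, and the final step of presenting results in Hindi is left out.

```python
# Toy Hindi-to-English dictionary (illustrative entries only).
HI_EN = {
    "\u092c\u0941\u0916\u093e\u0930": "fever",                # bukhar
    "\u0907\u0932\u093e\u091c": "treatment",                  # ilaj
    "\u092e\u0932\u0947\u0930\u093f\u092f\u093e": "malaria",  # maleriya
}

# Toy English document collection (illustrative titles only).
DOCS = [
    "Malaria symptoms and treatment",
    "Fever in children",
    "Cricket scores",
]


def translate_query(query):
    """Translate a Hindi query word by word, skipping unknown words."""
    return [HI_EN[word] for word in query.split() if word in HI_EN]


def search(query):
    """Return documents that share terms with the translated query.

    Documents are ranked by the number of matching terms, best first.
    """
    terms = translate_query(query)
    scored = []
    for doc in DOCS:
        words = set(doc.lower().split())
        score = sum(term in words for term in terms)
        if score:
            scored.append((score, doc))
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [doc for _, doc in scored]


if __name__ == "__main__":
    hindi_query = "\u092e\u0932\u0947\u0930\u093f\u092f\u093e \u0907\u0932\u093e\u091c"
    print(search(hindi_query))
```

A real system would also have to handle words missing from the dictionary, words with several possible translations, and inflected forms, none of which this sketch attempts.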

Other Initiatives:

IndiX
LEAP OFFICE 2000
MaTra
SHAKTI OFFICE SUITE
Shree-Lipi Ankur Script Processor
Lila Hindi Pragya
Learn Indian Language Through Artificial Intelligence (LILA)
Speech Synthesis System

News And Events:

DGF Article – BharteeyaOO
GNU Linux enabled for five Indian Languages

 

October 23, 2006 Posted by localization | Govt. of INDIA