Ubiquitous learning challenges students to become adept at information retrieval, management and synthesis from a variety of sources. This sparks discovery activities that are student-centred and personalized. Personalized learning is best conducted in the natural language of the student. Language is an important tool for human communication, and at the moment the language dominating ICT is English. Although many efforts have been made, learning English is a slow and expensive process. It has also brought casualties and sacrifices, unfortunately at the expense of many local and indigenous languages and their cultural heritage. This paper presents an effort by a consortium of universities and research centres in Asia to address the language digital divide by establishing a World Language Observatory. Just as an astronomical observatory observes space for astronomical phenomena, a language observatory observes language phenomena in cyberspace. Software agents in the form of softbots are periodically sent into cyberspace by the mother Language Observatory in Japan to examine websites and identify their languages and contents, in an attempt to identify language communities in various regions of cyberspace. With the assistance of other language observatories around the world, a language census chart is then published annually on UNESCO's International Mother Language Day to inform the world of the current language situation in cyberspace, which has implications for education.