Amharic-English bilingual web search engine

  • Authors:
  • Mequannint Munye; Solomon Atnafu

  • Affiliations:
  • Jijiga University, Jijiga, Ethiopia; Addis Ababa University, Addis Ababa, Ethiopia

  • Venue:
  • Proceedings of the International Conference on Management of Emergent Digital EcoSystems
  • Year:
  • 2012

Abstract

As non-English content on the Web grows exponentially, the number of online non-English speakers who realize the importance of finding information in different languages is growing enormously. However, major general-purpose search engines such as Google and Yahoo have been lagging behind in providing indexes and search features for non-English languages. Amharic, a member of the Semitic language family and the official working language of the federal government of Ethiopia, is one of these languages, with rapidly growing content on the Web. As a result, the need for a bilingual search engine that handles the specific characteristics of queries in the user's native language (Amharic) and retrieves documents in both Amharic and English has become more apparent. In this research work, we designed a model for an Amharic-English search engine and, based on that model, developed a bilingual Web search engine that enables Web users to find the information they need in Amharic and English. In doing so, we identified different language-dependent query preprocessing components for query translation. We also developed a bidirectional dictionary-based translation system that incorporates a transliteration component to handle proper names, which are often missing from bilingual lexicons. We used an Amharic search engine and an open-source English search engine (Nutch) as the underlying search engines for Web document crawling, indexing, searching, ranking, and retrieval. To evaluate the effectiveness of our Amharic-English bilingual search engine, precision was measured on the top 10 retrieved Web documents. The experimental results showed that the Amharic-English cross-lingual retrieval engine achieved 74.12% of the performance of its corresponding English monolingual retrieval engine, and the English-Amharic cross-lingual retrieval engine achieved 78.82% of the performance of its corresponding Amharic monolingual retrieval engine.
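The relative figures reported above (74.12% and 78.82%) express cross-lingual precision as a percentage of the monolingual baseline. The following Python sketch shows one way such a figure can be computed. It assumes binary relevance judgments and precision over the top 10 results for a single query; the function names and the judgment lists are hypothetical, and the abstract does not state how the authors averaged over queries.

```python
def precision_at_k(relevance, k=10):
    """Fraction of the top-k results judged relevant.

    `relevance` is a list of 0/1 judgments in rank order (assumed format).
    """
    top = relevance[:k]
    return sum(top) / k


def relative_performance(cross_lingual, monolingual):
    """Cross-lingual score as a percentage of the monolingual baseline."""
    return 100.0 * cross_lingual / monolingual


# Hypothetical judgments for one query, not data from the paper.
cross_judgments = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]  # 6 of 10 relevant
mono_judgments = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]   # 8 of 10 relevant

cross_p10 = precision_at_k(cross_judgments)
mono_p10 = precision_at_k(mono_judgments)
ratio = relative_performance(cross_p10, mono_p10)
print(f"P@10 cross-lingual={cross_p10:.2f}, monolingual={mono_p10:.2f}, "
      f"relative={ratio:.2f}%")
```

A ratio below 100% indicates how much precision is lost by translating the query instead of issuing it directly in the document language.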
The bilingual advantage of the system was also evaluated by comparing its results with those of general-purpose search engines. The overall evaluation results are promising.
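The dictionary-based query translation with a transliteration fallback mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicon, the character map, and the function names are hypothetical toy values, and a real system would also apply the language-dependent preprocessing steps (which the abstract does not detail) before lookup.

```python
# Toy Amharic-to-English lexicon (illustrative entry only).
LEXICON = {
    "ዜና": ["news"],
}

# Toy character map from Ethiopic syllables to Latin script (assumed,
# covering only the characters used in the example below).
CHAR_MAP = {
    "አ": "a",
    "በ": "be",
}


def transliterate(word, char_map):
    """Map each character through char_map; unknown characters pass through."""
    return "".join(char_map.get(ch, ch) for ch in word)


def translate_query(query, lexicon, char_map):
    """Translate a whitespace-separated query term by term.

    Terms found in the lexicon are replaced by their translations;
    terms missing from it (often proper names) are transliterated.
    """
    translated = []
    for term in query.split():
        if term in lexicon:
            translated.extend(lexicon[term])
        else:
            translated.append(transliterate(term, char_map))
    return translated


# "አበበ" is a proper name absent from the toy lexicon, so it falls back
# to transliteration, while "ዜና" is translated by dictionary lookup.
print(translate_query("አበበ ዜና", LEXICON, CHAR_MAP))
```

Falling back to transliteration keeps out-of-vocabulary names in the translated query rather than dropping them, which matters because names are frequently the most selective search terms.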