Classical to slang conversion for retrieving Arabic documents using slang queries
Journal of Information Science
The widespread use of the internet has made large amounts of information and documents available in many languages. Arabic is among the most important of these languages in terms of both usage and structure. Search engines such as Google and Yahoo support searching in Arabic, yet return poor results when the query contains slang terms. The Arabic language itself also presents difficulties. The main goal of this research is to improve Arabic text-based searching when Arabic slang terms are used in queries. It proposes a framework that lets users query in slang and retrieve relevant documents posted in both forms, slang and classical. The framework is designed and implemented around a context-free grammar that maps the user's slang queries to their classical equivalents. On a classical dataset, classical-based queries gave a 3% improvement over slang-based queries in the average values of precision, recall, and F-measure. Slang-based queries gave a 13% improvement in the average values of these measures on a slang dataset and a 7% improvement on a hybrid dataset.
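For readers unfamiliar with the three measures reported above, the following is a minimal sketch of how precision, recall, and F-measure are conventionally computed for a single query. It assumes set-based retrieval and the balanced F-measure (F1); the abstract does not say which F-measure variant or averaging scheme the authors used, and the function name and document identifiers are hypothetical.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-measure) for one query.

    Assumes unranked, set-based retrieval and the balanced F-measure (F1).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical document identifiers, for illustration only.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d1", "d3", "d5"]
p, r, f = precision_recall_f(retrieved_docs, relevant_docs)
```

Averaging these per-query values over a query set, once for slang-based queries and once for their classical equivalents, would give figures comparable in kind to those the abstract reports.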