The need for effective identity matching systems has led to extensive research in the area of name search. For the most part, such work has been limited to English and other Latin-based languages. As a result, algorithms such as Soundex and n-gram matching are of limited utility for languages such as Arabic, whose morphological features differ greatly and rely heavily on phonetic information. The dearth of work in this field is partly caused by the lack of standardized test data. To address this, we have built a collection of 7,939 Arabic names, along with 50 training queries and 111 test queries. We use this collection to evaluate a variety of algorithms, including a derivative of Soundex tailored to Arabic (ASOUNDEX), measuring effectiveness with standard information retrieval measures. Our results show an improvement of 70% over existing approaches. © 2006 Wiley Periodicals, Inc.
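As a rough illustration of the n-gram matching mentioned above, the following Python sketch ranks candidate names by the Dice coefficient over character bigrams. It is not the implementation evaluated in the paper: the abstract does not specify the n-gram variant, the padding scheme, or the similarity measure, so the function names (`ngrams`, `dice_similarity`, `rank_names`), the space padding, and the transliterated sample names are all assumptions made for illustration.

```python
def ngrams(name, n=2):
    """Return the set of character n-grams of a name.

    The name is lowercased and padded with one space on each side so that
    the first and last characters also contribute n-grams (an assumption;
    other padding schemes are common).
    """
    padded = " " + name.lower() + " "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice_similarity(a, b, n=2):
    """Dice coefficient between the n-gram sets of two names (0.0 to 1.0)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    total = len(grams_a) + len(grams_b)
    if total == 0:
        return 0.0
    return 2 * len(grams_a & grams_b) / total


def rank_names(query, names, n=2):
    """Return candidate names sorted from most to least similar to the query."""
    return sorted(names, key=lambda name: dice_similarity(query, name, n),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical transliterated names, used only to exercise the functions.
    candidates = ["Ahmed", "Mohammed", "Khalid"]
    for name in rank_names("Mohamed", candidates):
        print(name, round(dice_similarity("Mohamed", name), 3))
```

A measure like this compares surface character sequences only, which is one reason the abstract describes such methods as having limited utility for Arabic and motivates a phonetic approach such as ASOUNDEX.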