We address foreign name search in a highly diverse user community. User sophistication ranges from highly experienced archivists to apprehensive users who shy away from technology; apprehensive users dominate system use. Thus, all system interfaces must assume minimal dependency on the user. Our foreign name search approach, called Segments, is language independent; thus, there is no need to determine the language of origin from the diverse candidate set of thirteen languages. We compare Segments against traditional n-gram-based and Soundex-based solutions. Actual and synthetic queries are used to search a names data set resident in the United States Holocaust Memorial Museum. We also search a subset of the 1990 United States Census Bureau Surnames data set to evaluate the performance of Segments on a predominantly language-specific (English) collection. Our results demonstrate statistically significant performance gains over both traditional approaches. The described approach supports search efforts at the United States Holocaust Memorial Museum.
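
For readers unfamiliar with the two baselines named above, the following is a minimal sketch of how Soundex and n-gram name matching are commonly implemented. It does not show Segments, which the abstract does not describe. The function names (`soundex`, `ngrams`, `dice`), the choice of American Soundex, the bigram size, the `#` padding character, and the Dice coefficient are all assumptions made for illustration; the abstract does not say which variants were evaluated.

```python
# Illustrative baselines only; names and parameter choices are assumptions,
# not the configuration used in the study.

# Standard American Soundex letter-to-digit table.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character American Soundex code of a name."""
    # Keep ASCII letters only; accented letters are dropped in this sketch.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def ngrams(name, n=2):
    """Return the set of character n-grams of a padded, lowercased name."""
    pad = "#" * (n - 1)
    padded = pad + name.lower() + pad
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice(a, b, n=2):
    """Dice coefficient over the n-gram sets of two names (0.0 to 1.0)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


if __name__ == "__main__":
    print(soundex("Robert"), soundex("Rupert"))  # both R163
    print(dice("night", "nacht"))                # 0.5
```

Soundex collapses a name to one coarse code, so two names either match or do not, and the code table is tuned to English pronunciation. The n-gram score is graded and makes no assumption about the language, but it reacts to every spelling difference. Both properties are relevant when a collection mixes names from many languages.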