Customizing search results for non-native speakers

Authors:
Theodoros Lappas;Michail Vlachos
Affiliations:
Boston University, Boston, MA, USA;IBM Research, Zurich, Switzerland
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

Blog posts, news articles, and other webpages are available on the web in multiple languages. Standard search engines rank candidate documents by their relevance to the given query. However, when many documents with overlapping content are written in a language other than the user's native tongue, it is beneficial to promote those that are easy enough for the user to read. Here, we show how to rank a collection of foreign-language documents based on both (a) relevance to the query and (b) the comprehension difficulty of the document. We design effective ranking operators that evaluate the difficulty of a foreign-language document with respect to the user's native language. We show that existing search engines can easily augment their scoring function by incorporating the proposed comprehensibility metrics. Finally, we provide extensive experimental evidence that the comprehensibility-aware ranking model significantly improves on the standard relevance-based ranking paradigm.
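
To make the idea of augmenting a relevance score with a comprehensibility score concrete, here is a minimal sketch. It is not the ranking operators proposed in the paper, which the abstract does not specify. It assumes a simple linear interpolation, and the names `Doc`, `combined_score`, `rerank`, and the weight `alpha` are hypothetical. Both input scores are assumed to be normalized to [0, 1] and supplied by some upstream component.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    relevance: float          # query relevance in [0, 1], e.g. from an existing engine
    comprehensibility: float  # assumed ease-of-reading score in [0, 1] for this user


def combined_score(doc: Doc, alpha: float = 0.5) -> float:
    """Linear interpolation (an assumption): alpha weights relevance,
    1 - alpha weights comprehensibility."""
    return alpha * doc.relevance + (1.0 - alpha) * doc.comprehensibility


def rerank(docs: list[Doc], alpha: float = 0.5) -> list[Doc]:
    """Return the documents sorted by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(d, alpha), reverse=True)


# Toy data: "a" is the most relevant but hardest to read.
docs = [
    Doc("a", relevance=0.9, comprehensibility=0.2),
    Doc("b", relevance=0.8, comprehensibility=0.9),
    Doc("c", relevance=0.5, comprehensibility=0.95),
]

ranked_ids = [d.doc_id for d in rerank(docs, alpha=0.5)]
relevance_only_ids = [d.doc_id for d in rerank(docs, alpha=1.0)]
```

With `alpha=1.0` the sketch falls back to pure relevance ranking. Lowering `alpha` promotes documents that are slightly less relevant but easier for the user to read, which is the trade-off the abstract describes.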