Blog posts, news articles, and other webpages are available on the web in multiple languages. Standard search engines evaluate the relevance of candidate documents to a given query. However, when several documents have overlapping content and many of them are written in a language other than the user's native one, it is beneficial to promote documents that are easy enough for the user to read. Here, we show how to rank a collection of foreign-language documents based on both (a) relevance to the query and (b) the comprehension difficulty of the document. We design effective ranking operators that evaluate the difficulty of a foreign-language document with respect to the user's native language. We show that existing search engines can easily augment their scoring function by incorporating the proposed comprehensibility metrics. Finally, we provide extensive experimental evidence that the comprehensibility-aware ranking model significantly improves on the standard relevance-based ranking paradigm.
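The abstract does not specify how the two signals are combined, so the following is only a minimal sketch of one plausible way a search engine might augment its scoring function. It assumes that both the relevance score and the comprehensibility score are already normalized to [0, 1], and it uses a simple linear interpolation. The names `Doc`, `combined_score`, `rerank`, and the `weight` parameter are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    doc_id: str
    relevance: float          # assumed in [0, 1], produced by the existing engine
    comprehensibility: float  # assumed in [0, 1], higher means easier to read


def combined_score(doc: Doc, weight: float = 0.5) -> float:
    """Linear interpolation of relevance and comprehensibility.

    This combination rule is an illustrative assumption, not the paper's
    ranking operator. weight=0 reproduces pure relevance ranking.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * doc.relevance + weight * doc.comprehensibility


def rerank(docs: List[Doc], weight: float = 0.5) -> List[Doc]:
    """Return the documents sorted by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(d, weight), reverse=True)


if __name__ == "__main__":
    candidates = [
        Doc("hard-but-relevant", relevance=0.9, comprehensibility=0.2),
        Doc("easy-and-relevant", relevance=0.8, comprehensibility=0.9),
    ]
    for d in rerank(candidates, weight=0.5):
        print(d.doc_id, round(combined_score(d, 0.5), 2))
```

Under this sketch, setting `weight` to 0 leaves the engine's original relevance ordering unchanged, while larger values increasingly promote documents that are easier for the user to read.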