Concept-based document readability in domain specific information retrieval

Authors:
Xin Yan; Dawei Song; Xue Li
Affiliations:
The University of Queensland, Australia; The Open University, Milton Keynes, United Kingdom; The University of Queensland, Australia
Venue:
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Year:
2006

Abstract

Domain-specific information retrieval is increasingly in demand. Not only domain experts but also average non-expert users are interested in searching online resources for domain-specific (e.g., medical and health) information. However, a typical problem for average users is that the search results are usually a mixture of documents with different levels of readability. Non-expert users may want to see the more readable documents at the top of the list, so the search results need to be re-ranked in descending order of readability. It is often impractical for domain experts to manually label the readability of documents in large databases, so computational models of readability need to be investigated. However, traditional readability formulas are designed for general-purpose text and are insufficient for the technical material encountered in domain-specific information retrieval. More advanced algorithms, such as the textual coherence model, are computationally expensive for re-ranking a large number of retrieved documents. In this paper, we propose an effective and computationally tractable concept-based model of text readability. In addition to the textual genres of a document, our model takes into account domain-specific knowledge, i.e., how the domain-specific concepts contained in the document affect the document's readability. Three major readability formulas are proposed and applied to health and medical information retrieval. Experimental results show that the proposed readability formulas yield remarkable improvements in correlation with users' readability ratings over four traditional readability measures.
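
The re-ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the four traditional readability measures or give the proposed concept-based formulas, so the example below assumes the well-known Flesch Reading Ease score as a stand-in for a traditional, general-purpose formula. The function names (`count_syllables`, `flesch_reading_ease`, `rerank_by_readability`) and the vowel-group syllable heuristic are hypothetical choices made for illustration, not the paper's method.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0  # nothing to score; a neutral fallback chosen for this sketch
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def rerank_by_readability(docs):
    """Return documents in descending order of readability score.

    sorted() is stable, so documents with equal scores keep their
    original retrieval order.
    """
    return sorted(docs, key=flesch_reading_ease, reverse=True)


if __name__ == "__main__":
    retrieved = [
        "Myocardial infarction necessitates immediate percutaneous intervention.",
        "A heart attack needs fast care.",
    ]
    for doc in rerank_by_readability(retrieved):
        print(round(flesch_reading_ease(doc), 1), doc)
```

A surface formula of this kind scores text using only sentence length and syllable counts, which is the sort of general-purpose measure the abstract argues is insufficient for technical material; it contains no notion of domain-specific concepts.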