Assessing user-specific difficulty of documents

  • Authors:
  • Mari-Sanna Paukkeri; Marja Ollikainen; Timo Honkela

  • Affiliations:
  • Department of Information and Computer Science, Aalto University School of Science, PO Box 15400, FI-00076 Aalto, Finland (all three authors)

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2013

Abstract

The web hosts a wide variety of text collections containing knowledge in different domains of expertise, such as technology or medicine. The texts are written for different purposes and thus for readers with different levels of expertise in the domain. Texts intended for professionals may be incomprehensible to a lay person, and texts for lay people may lack the detailed information a professional needs. Many information retrieval applications, such as search engines, would offer a better user experience if they could select the text sources that best fit the user's level of expertise. In this article, we propose a novel approach for assessing the difficulty level of a document: our method assesses difficulty for each user separately. This enables, for instance, offering information in a personalised manner based on the user's knowledge of different domains. The method is based on comparing the terms that appear in a document with the terms known by the user. We present two ways to collect information about the terminology the user knows: directly, by asking users to rate the difficulty of terms, or, as a novel automatic approach, indirectly, by analysing texts written by the users. We examine the applicability of the methodology with text documents in the medical domain. The results show that the method is able to distinguish between documents written for lay people and documents written for experts.
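
The term comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring formula, so the score used here (the fraction of distinct document terms missing from the user's vocabulary) is an assumption, as are the function names `tokenize`, `known_terms_from_writing` and `difficulty` and the toy texts. The user's vocabulary is approximated from texts the user has written, in the spirit of the indirect approach.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def known_terms_from_writing(user_texts):
    """Approximate the user's vocabulary as every token they have written.

    Assumption: a term the user has used is a term the user knows.
    """
    known = set()
    for text in user_texts:
        known.update(tokenize(text))
    return known


def difficulty(document, known_terms):
    """Return the fraction of distinct document terms not in known_terms.

    Assumed score for illustration only: 0.0 means every term is known,
    1.0 means none is. An empty document scores 0.0.
    """
    terms = set(tokenize(document))
    if not terms:
        return 0.0
    return len(terms - known_terms) / len(terms)


# Hypothetical toy data, not taken from the article.
user_texts = ["my head hurts and i have a fever"]
known = known_terms_from_writing(user_texts)

lay_doc = "i have a fever"
expert_doc = "pyrexia and cephalalgia"

print(difficulty(lay_doc, known))     # every term appears in the user's writing
print(difficulty(expert_doc, known))  # two of the three terms are unknown
```

Under these assumptions the lay text scores lower than the expert text for this user. A different user, for example one who writes with medical terminology, would get a different score for the same document, which is the point of assessing difficulty per user.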