Automatic Recognition of Text Difficulty from Consumers Health Information
CBMS '06 Proceedings of the 19th IEEE Symposium on Computer-Based Medical Systems
Assessing user-specific difficulty of documents
Information Processing and Management: an International Journal
Distinguishing between expert and non-expert documents is an important issue in the medical domain, for instance in the context of information retrieval. In our work, we address this issue through stylistic corpus analysis and the application of machine learning algorithms. Our hypothesis is that this distinction can be observed on the basis of a small number of criteria, and that such criteria can be language- and domain-independent. The criteria were acquired from a source corpus (Russian) and then tested on both the source corpus and a target corpus (French). The method achieves up to 90% precision and 93% recall on the source corpus, and 85% precision and 74% recall on the target corpus.
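For readers unfamiliar with the two evaluation measures reported above, the following minimal sketch shows how precision and recall are computed for a binary expert/non-expert classification. The label names, the function name `precision_recall`, and the toy data are illustrative assumptions and do not come from the paper.

```python
def precision_recall(gold, predicted, positive="expert"):
    """Compute precision and recall for one class (hypothetical helper)."""
    pairs = list(zip(gold, predicted))
    # True positives: documents correctly labelled as the positive class.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: documents wrongly labelled as the positive class.
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: positive-class documents the classifier missed.
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data, invented for illustration only.
gold = ["expert", "expert", "lay", "lay", "expert"]
predicted = ["expert", "lay", "lay", "expert", "expert"]
precision, recall = precision_recall(gold, predicted)
```

Precision is the share of documents labelled "expert" that really are expert documents; recall is the share of true expert documents that the classifier found.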