Identifying disease diagnosis factors by proximity-based mining of medical texts

Authors:
Rey-Long Liu; Shu-Yu Tung; Yun-Ling Lu
Affiliations:
Department of Medical Informatics, Tzu Chi University, Hualien, Taiwan; Winbond Electronics Corporation, HsinChu, Taiwan; Winbond Electronics Corporation, HsinChu, Taiwan
Venue:
ACIIDS'11 Proceedings of the Third International Conference on Intelligent Information and Database Systems - Volume Part II
Year:
2011

Abstract

Diagnosing diseases requires a large number of discriminating diagnosis factors, including the risk factors, symptoms, and signs of the diseases, as well as the examinations and tests used to detect those signs. Relationships between individual diseases and their discriminating diagnosis factors may thus form a diagnosis knowledge map, which may evolve as new medical findings are produced. However, manually constructing and maintaining a diagnosis knowledge map is both costly and difficult, and state-of-the-art text mining techniques have difficulty identifying diagnosis factors from medical texts. In this paper, we present a novel text mining technique, PDFI (Proximity-based Diagnosis Factors Identifier), that improves various kinds of identification techniques by encoding term proximity contexts into them. Empirical evaluation is conducted on a broad range of diseases whose symptoms and diagnosis are described in texts from MedlinePlus, which aims to provide reliable and up-to-date healthcare information on diseases. The results show that PDFI significantly improves a state-of-the-art identifier in ranking candidate diagnosis factors for the diseases. The contribution is of practical significance for developing intelligent systems that provide disease diagnosis support to healthcare consumers and professionals.
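
The abstract does not say how PDFI encodes term proximity, so the following is only a minimal sketch of the general idea of proximity-aware re-ranking, not the authors' method. It assumes each candidate term already has a base score from some underlying identifier, and it boosts a term when other candidate terms occur within a small token window around it. The function names, the `window` parameter, and the 1/distance decay are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def proximity_scores(tokens, base_scores, window=5):
    """Re-score candidate terms using nearby candidate terms.

    tokens:      list of tokens from one disease description.
    base_scores: dict mapping candidate term -> score from an underlying
                 identifier (assumed to be given; hypothetical input).
    window:      how many tokens to look at on each side (assumption).
    """
    # Record every position at which each candidate term occurs.
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        if tok in base_scores:
            positions[tok].append(i)

    scores = {}
    for term, term_positions in positions.items():
        context = 0.0
        for i in term_positions:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                other = tokens[j]
                if j != i and other != term and other in base_scores:
                    # Closer neighbours contribute more (1/distance decay,
                    # an illustrative choice, not taken from the paper).
                    context += base_scores[other] / abs(i - j)
        # Average the context over occurrences, then boost the base score.
        scores[term] = base_scores[term] * (1.0 + context / len(term_positions))
    return scores


def rank_candidates(tokens, base_scores, window=5):
    """Return candidate terms sorted by proximity-adjusted score, best first."""
    scores = proximity_scores(tokens, base_scores, window)
    return sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, a candidate that appears close to other strong candidates moves up the ranking, while an isolated candidate keeps its base score. Candidates that never occur in `tokens` are omitted from the result.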