Semantic text classification of disease reporting

Authors:
Yi Zhang; Bing Liu
Affiliations:
University of Illinois at Chicago, Chicago, IL; University of Illinois at Chicago, Chicago, IL
Venue:
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2007

Abstract

Traditional text classification studied in the IR literature is mainly based on topics. That is, each class or category represents a particular topic, e.g., sports, politics, or science. However, many real-world text classification problems require more refined classification based on semantic aspects of the text. For example, in a set of documents about a particular disease, some documents may report an outbreak of the disease, some may describe how to cure it, some may discuss how to prevent it, and still others may include all of the above. To classify text at this semantic level, the traditional "bag of words" model is no longer sufficient. In this paper, we report a text classification study at the semantic level and show that semantic and structural features of sentences are very useful for this kind of classification. Our experimental results on a disease outbreak dataset demonstrate the effectiveness of the proposed approach.