Making large-scale support vector machine learning practical
Advances in kernel methods
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Discovery of inference rules for question-answering
Natural Language Engineering
Paraphrase acquisition for information extraction
PARAPHRASE '03 Proceedings of the second international workshop on Paraphrasing - Volume 16
Hi-index | 0.00 |
Traditional text classification studied in the IR literature is mainly based on topics. That is, each class or category represents a particular topic, e.g., sports, politics or sciences. However, many real-world text classification problems require more refined classification based on some semantic aspects. For example, in a set of documents about a particular disease, some documents may report the outbreak of the disease, some may describe how to cure the disease, some may discuss how to prevent the disease, and yet some others may include all the above information. To classify text at this semantic level, the traditional "bag of words" model is no longer sufficient. In this paper, we report a text classification study at the semantic level and show that sentence semantic and structure features are very useful for such kind of classification. Our experimental results based on a disease outbreak dataset demonstrated the effectiveness of the proposed approach.