A Novel Conception Based Texts Classification Method

Authors:
Bai Rujiang;Liao Junhua
Venue:
AST '09 Proceedings of the 2009 International e-Conference on Advanced Science and Technology
Year:
2009

Abstract

Text classification is widely used to help users discover useful information on the Internet. However, current text classification systems are based on the “Bag of Words” (BOW) representation, which accounts only for term frequency in the documents and ignores important semantic relationships between key terms. To overcome this problem, previous work attempted to enrich text representation by means of manual intervention or automatic document expansion. Unfortunately, the improvement achieved is very limited, owing to the poor coverage of the dictionaries used and the ineffectiveness of term expansion. DBpedia, which appeared recently, contains rich semantic information. In this paper, we propose a method that compiles DBpedia knowledge into the document representation to improve text classification. It integrates the rich knowledge of DBpedia into text documents by resolving synonyms and introducing more general and associative concepts. To evaluate the proposed method, we performed an empirical evaluation using an SVM classifier on several real data sets. The experimental results show that the proposed framework, which integrates hierarchical, synonym, and associative relations with traditional text similarity measures based on the BOW model, significantly improves text classification performance.
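
The enrichment idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup tables `SYNONYMS`, `BROADER`, and `RELATED` are hypothetical toy stand-ins for knowledge that would come from DBpedia, the weights `broader_weight` and `related_weight` are assumed values, and the SVM classification step is omitted. The sketch only shows how resolving synonyms and adding more general and associative concepts can make two documents with no shared words comparable.

```python
from collections import Counter

# Hypothetical toy lookup tables standing in for DBpedia knowledge.
# They are illustrative assumptions, not real DBpedia data or a DBpedia API.
SYNONYMS = {"automobile": "car", "auto": "car"}   # surface form -> canonical term
BROADER = {"car": "vehicle", "truck": "vehicle"}  # term -> more general concept
RELATED = {"car": ["engine"]}                     # term -> associated concepts


def enrich_bow(text, broader_weight=0.5, related_weight=0.25):
    """Build a bag of words extended with canonical, broader, and related concepts.

    The weights are assumed values chosen for illustration only.
    """
    bow = Counter()
    for token in text.lower().split():
        term = SYNONYMS.get(token, token)  # resolve synonyms to one canonical term
        bow[term] += 1.0
        if term in BROADER:                # add the more general concept, down-weighted
            bow["concept:" + BROADER[term]] += broader_weight
        for rel in RELATED.get(term, []):  # add associated concepts, down-weighted further
            bow["concept:" + rel] += related_weight
    return bow


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    num = sum(a[k] * b[k] for k in a.keys() & b.keys())
    den = (sum(v * v for v in a.values()) ** 0.5) * (sum(v * v for v in b.values()) ** 0.5)
    return num / den if den else 0.0


if __name__ == "__main__":
    plain = cosine(Counter(["automobile"]), Counter(["truck"]))
    enriched = cosine(enrich_bow("automobile"), enrich_bow("truck"))
    print(plain, enriched)
```

With plain term counts, "automobile" and "truck" share no features, so their similarity is zero; after enrichment both carry the shared broader concept, so the similarity becomes positive. In a full pipeline, the enriched vectors would be passed to a classifier such as an SVM.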