Improving Text Classification by Using Encyclopedia Knowledge

Authors:
Pu Wang;Jian Hu;Hua-Jun Zeng;Lijun Chen;Zheng Chen
Venue:
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
Year:
2007

Citing 0
Cited 26

Enhancing text clustering by leveraging Wikipedia semantics
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Building semantic kernels for text classification using wikipedia
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining

Leveraging Web 2.0 Sources for Web Content Classification
WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01

Understanding user's query intent with wikipedia
Proceedings of the 18th international conference on World wide web

Clustering Documents Using a Wikipedia-Based Concept Representation
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining

Building a Text Classifier by a Keyword and Wikipedia Knowledge
ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications

Mining meaning from Wikipedia
International Journal of Human-Computer Studies

Wikipedia-assisted concept thesaurus for better web media understanding
Proceedings of the international conference on Multimedia information retrieval

Exploiting time-based synonyms in searching document archives
Proceedings of the 10th annual joint conference on Digital libraries

A self-supervised approach for extraction of attribute-value pairs from wikipedia articles
SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval

Adaptable term weighting framework for text classification
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part II

Unsupervised feature weighting based on local feature relatedness
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I

Grammatical dependency-based relations for term weighting in text classification
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I

A multi-layer text classification framework based on two-level representation model
Expert Systems with Applications: An International Journal

Large-scale question classification in cQA by leveraging Wikipedia semantic knowledge
Proceedings of the 20th ACM international conference on Information and knowledge management

Wikipedia-based semantic smoothing for the language modeling approach to information retrieval
ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval

Wikipedia-based smoothing for enhancing text clustering
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology

A social network-based approach to expert recommendation system
HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I

A semantic-based social network of academic researchers
IEA/AIE'12 Proceedings of the 25th international conference on Industrial Engineering and Other Applications of Applied Intelligent Systems: advanced research in applied artificial intelligence

Exploring the existing category hierarchy to automatically label the newly-arising topics in cQA
Proceedings of the 21st ACM international conference on Information and knowledge management

Improving cross-document knowledge discovery using explicit semantic analysis
DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery

A new term ranking method based on relation extraction and graph model for text classification
ACSC '11 Proceedings of the Thirty-Fourth Australasian Computer Science Conference - Volume 113

A semantic social network-based expert recommender system
Applied Intelligence

Improving semi-supervised text classification by using wikipedia knowledge
WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management

Improving question retrieval in community question answering using world knowledge
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

PSG: a two-layer graph model for document summarization
Frontiers of Computer Science: Selected Publications from Chinese Universities

Abstract

The exponential growth of text documents available on the Internet has created an urgent need for accurate, fast, and general-purpose text classification algorithms. However, the "bag of words" representation used by these classification methods is often unsatisfactory because it ignores relationships between important terms that do not co-occur literally. To deal with this problem, we integrate background knowledge (in our application, Wikipedia) into the process of classifying text documents. The experimental evaluation on Reuters newsfeeds and several other corpora shows that our classification results with encyclopedia knowledge are much better than those of the baseline "bag of words" methods.
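
The limitation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the `TERM_TO_CONCEPT` table is a hypothetical, hand-written stand-in for knowledge that would be derived from Wikipedia, and the function names `bag_of_words`, `enrich`, and `cosine` are illustrative assumptions. The sketch only shows why two documents that share no literal terms look unrelated under a plain bag of words, and how adding concept features can expose their relationship.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy mapping from surface terms to encyclopedia concepts.
# A real system would derive this from Wikipedia; here it is hand-written.
TERM_TO_CONCEPT = {
    "puma": "Cougar",
    "cougar": "Cougar",
    "car": "Automobile",
    "automobile": "Automobile",
}


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def enrich(bow):
    """Return a copy of the bag of words with concept features added."""
    enriched = Counter(bow)
    for term, count in bow.items():
        concept = TERM_TO_CONCEPT.get(term)
        if concept is not None:
            # Prefix concept features so they never collide with plain terms.
            enriched["CONCEPT:" + concept] += count
    return enriched


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc_a = bag_of_words("puma sighting reported")
doc_b = bag_of_words("cougar spotted nearby")

plain_similarity = cosine(doc_a, doc_b)                       # no shared terms
enriched_similarity = cosine(enrich(doc_a), enrich(doc_b))    # shared concept

print(plain_similarity, enriched_similarity)
```

In this toy case the two documents share no words, so their plain similarity is zero, while the shared concept feature gives the enriched vectors a nonzero similarity. How concepts are actually selected and weighted in the paper is not stated in the abstract.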