Random-Walk Term Weighting for Improved Text Classification

Authors:
Samer Hassan;Rada Mihalcea;Carmen Banea
Affiliations:
University of North Texas, USA;University of North Texas, USA;University of North Texas, USA
Venue:
ICSC '07 Proceedings of the International Conference on Semantic Computing
Year:
2007

Abstract

This paper describes a new approach to estimating term weights in a document and shows how the resulting weighting scheme can improve the accuracy of a text classifier. The method uses term co-occurrence as a measure of dependency between word features. A random-walk model is applied to a graph encoding words and their co-occurrence dependencies, producing scores that quantify how much a particular word feature contributes to a given context. Experiments on three standard classification datasets show that the random-walk-based approach outperforms the traditional term-frequency approach to feature weighting.
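
The idea of scoring words by a random walk over a co-occurrence graph can be illustrated with a minimal sketch. The abstract does not give the graph construction or the walk parameters, so everything specific here is an assumption made for illustration: the undirected edges, the co-occurrence window of 2, the PageRank-style update with damping 0.85, the fixed iteration count, and the function names cooccurrence_graph and random_walk_weights. It is not the authors' implementation.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Build an undirected graph linking words that co-occur.

    Assumption: two words are linked if they appear within `window`
    positions of each other. The abstract does not specify this.
    """
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # make sure every word becomes a node
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def random_walk_weights(tokens, window=2, damping=0.85, iterations=50):
    """Score each word with a PageRank-style random walk (illustrative).

    Assumptions: uniform initial scores, a fixed number of iterations,
    and a damping factor of 0.85. None of these come from the abstract.
    """
    graph = cooccurrence_graph(tokens, window)
    n = len(graph)
    if n == 0:
        return {}
    scores = {word: 1.0 / n for word in graph}
    for _ in range(iterations):
        updated = {}
        for word in graph:
            # Each neighbour passes on an equal share of its current score.
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[word])
            updated[word] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    tokens = "the cat sat on the mat the cat ran".split()
    for word, weight in sorted(random_walk_weights(tokens).items()):
        print(f"{word}\t{weight:.4f}")
```

In a classifier, scores of this kind would take the place of raw term counts in each document's feature vector. Words connected to many other well-connected words receive higher weights than words that merely occur often.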