Timeline adaptation for text classification

Authors:
Fumiyo Fukumoto;Yoshimi Suzuki;Atsuhiro Takasu
Affiliations:
Univ. of Yamanashi, Kofu, Japan;Univ. of Yamanashi, Kofu, Japan;National Institute of Informatics, Tokyo, Japan
Venue:
Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
Year:
2013

Abstract

In this paper, we address the text classification problem in which the test data was created in a different period of time from the training data, and present a method for text classification based on temporal adaptation. We first apply lexical chains to the training data to collect semantically related terms and group them into sets (we call these Sem sets). Semantically related terms in the documents are replaced with their representative term. From the results, we identify short terms that are salient for a specific period of time. Finally, we train SVM classifiers by applying a temporal weighting function to each selected short term in the training data, and classify the test data. The temporal weighting function weights each short term in the training data according to the temporal distance between the training and test data. Results on MedLine data show that the method is comparable to the current state-of-the-art biased-SVM method, and that it is especially effective when the test data is temporally far from the training data.
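
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the temporal weighting function, so the exponential decay used here is an assumption, not the authors' formula. The names `temporal_weight`, `weight_document`, `sem_sets`, `short_terms`, and `decay` are all hypothetical, and the resulting feature dictionaries would still need to be passed to an SVM trainer, which is omitted.

```python
import math
from collections import Counter


def temporal_weight(train_year, test_year, decay=0.5):
    """Hypothetical weight that shrinks as temporal distance grows.

    Exponential decay is an assumed form; the abstract only says the
    weight depends on the distance between training and test data.
    """
    return math.exp(-decay * abs(test_year - train_year))


def weight_document(tokens, train_year, test_year, sem_sets, short_terms, decay=0.5):
    """Build a weighted bag-of-words for one training document."""
    # Replace each term with its Sem-set representative, if it has one.
    normalized = [sem_sets.get(term, term) for term in tokens]
    counts = Counter(normalized)
    w = temporal_weight(train_year, test_year, decay)
    # Only short terms are re-weighted; all other terms keep raw counts.
    return {
        term: (count * w if term in short_terms else float(count))
        for term, count in counts.items()
    }


# Toy data (illustrative only, not taken from the paper).
sem_sets = {"tumour": "tumor", "neoplasm": "tumor"}
short_terms = {"sars"}
vec = weight_document(
    ["tumour", "neoplasm", "sars"],
    train_year=2003,
    test_year=2005,
    sem_sets=sem_sets,
    short_terms=short_terms,
)
print(vec)
```

In this sketch the two synonyms collapse into a single feature with an unchanged count, while the period-specific term is scaled down because the training document is two years away from the test data.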