Detecting Concept Drift with Support Vector Machines
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Lexical cohesion computed by thesaural relations as an indicator of the structure of text
Computational Linguistics
Understanding temporal aspects in document classification
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Learning classifiers from only positive and unlabeled data
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Exploiting temporal contexts in text classification
Proceedings of the 17th ACM conference on Information and knowledge management
Temporally-aware algorithms for document classification
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Topic dynamics: an alternative model of bursts in streams of topics
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Exponentially weighted moving average charts for detecting concept drift
Pattern Recognition Letters
Hi-index | 0.00 |
In this paper, we address the text classification problem that a period of time created test data is different from the training data, and present a method for text classification based on temporal adaptation. We first applied lexical chains for the training data to collect terms with semantic relatedness, and created sets (we call these Sem sets). Semantically related terms in the documents are replaced to their representative term. For the results, we identified short terms that are salient for a specific period of time. Finally, we trained SVM classifiers by applying a temporal weighting function to each selected short terms within the training data, and classified test data. Temporal weighting function is weighted each short term in the training data according to the temporal distance between training and test data. The results using MedLine data showed that the method was comparable to the current state-of-the-art biased-SVM method, especially the method is effective when testing on data far from the training data.