The task of information filtering is to classify documents from a stream as relevant or non-relevant according to a particular user interest, with the objective of reducing information load. When an information filter is used in an environment that changes over time, methods for adapting the filter should be considered in order to retain classification accuracy. We favor a methodology that attempts to detect changes and adapts the information filter only when necessary, so as to minimize the amount of user feedback required for providing new training data. Yet detecting changes may itself require costly user feedback. This paper describes two methods for detecting changes without user feedback. The first is based on monitoring an expected error rate, while the second observes the fraction of classification decisions made with a confidence below a given threshold. Furthermore, a heuristic for automatically determining this threshold is suggested, and the performance of this approach is explored experimentally as a function of the threshold parameter. Empirical results show that both methods work well in a simulated change scenario with real-world data.
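To make the second idea concrete, the sketch below monitors the fraction of low-confidence classification decisions over a sliding window and raises a change alert when that fraction grows too large. This is a minimal illustration of the general technique, not the paper's exact procedure; the class name, the parameter names (`threshold`, `alert_fraction`, `window`), and the fixed alerting rule are all assumptions introduced for the example.

```python
from collections import deque

class ConfidenceChangeDetector:
    """Illustrative change detector: flags a possible concept change when
    the fraction of decisions made with confidence below `threshold`
    exceeds `alert_fraction` within the last `window` decisions.
    (All parameters here are hypothetical choices, not values from the paper.)"""

    def __init__(self, threshold=0.7, alert_fraction=0.3, window=100):
        self.threshold = threshold          # confidence below this counts as "uncertain"
        self.alert_fraction = alert_fraction  # alarm level for the uncertain fraction
        self.recent = deque(maxlen=window)  # rolling record of low-confidence flags

    def observe(self, confidence):
        """Record one classification decision; return True if a change is suspected."""
        self.recent.append(confidence < self.threshold)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence collected yet
        return sum(self.recent) / len(self.recent) > self.alert_fraction
```

A filter would feed each classification's confidence score into `observe` and trigger re-training (or a request for user feedback) only when the detector fires, matching the goal of adapting only when a change is suspected.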