Adaptive information filtering remains an open challenge in information retrieval. One of the hardest problems is optimizing decision thresholds over time, based on partial relevance feedback on the system-retrieved documents, received in chronological order. We developed a new approach, margin-based local regression, that automatically adjusts the thresholds using two sliding windows: one over the true positive examples for which the system predicted "yes" with respect to a particular class, and one over the other documents processed by the system. Using the mean scores of the documents in the two windows, we monitor the temporal drift of the margin, which is a function of both the current classification model and the threshold calibration strategy and which suggests bounds for the optimal threshold at a given time. Evaluating this approach with a Rocchio-style classifier on the TREC 2001 and TREC 2002 adaptive filtering benchmarks, we obtained significant improvements in performance (measured by F_β with β = 0.5) over a baseline system that did not adapt the threshold over time, as well as the best result reported to date on the TREC 2002 benchmark corpus for adaptive filtering evaluations. These empirical results suggest that it is important to optimize thresholds using both system-accepted and system-rejected documents rather than system-accepted documents alone, and to make the thresholding function sensitive to the temporal shift of the centroids of on-topic and off-topic documents.
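The two-window idea described above can be illustrated with a small sketch. It is not the paper's margin-based local regression: it only tracks the mean score in each window and places the threshold at a fixed fraction of the gap between them. The class name MarginThreshold, the parameters window, eta and initial, and the rule that relevance is known only for accepted documents are all assumptions made for illustration.

```python
from collections import deque
from statistics import fmean


class MarginThreshold:
    """Hypothetical sliding-window threshold tracker (illustrative only)."""

    def __init__(self, window=5, eta=0.5, initial=0.5):
        # Scores of accepted documents that feedback confirmed as relevant.
        self.pos = deque(maxlen=window)
        # Scores of every other processed document (rejected or non-relevant).
        self.other = deque(maxlen=window)
        # Assumed fraction of the margin at which the threshold is placed.
        self.eta = eta
        self.threshold = initial

    def decide(self, score):
        """Accept the document if its score reaches the current threshold."""
        return score >= self.threshold

    def update(self, score, accepted, relevant=None):
        """Record one processed document and recompute the threshold.

        Assumption: relevance feedback exists only for accepted documents,
        so a rejected document always goes to the second window.
        """
        if accepted and relevant:
            self.pos.append(score)
        else:
            self.other.append(score)
        if self.pos and self.other:
            lower, upper = fmean(self.other), fmean(self.pos)
            # The two window means bound the threshold; move only if they
            # are ordered as expected (true positives scoring higher).
            if lower < upper:
                self.threshold = lower + self.eta * (upper - lower)
        return self.threshold
```

A caller would score each incoming document with its classifier, call decide, and then call update with the score and whatever feedback is available. Because both deques have a maximum length, old scores fall out of the windows, so the threshold follows the drift of the two means rather than their all-time averages.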