Using Bayesian priors to combine classifiers for adaptive filtering

Authors:
Yi Zhang
Affiliations:
Carnegie Mellon University, Pittsburgh, PA
Venue:
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2004



Abstract

An adaptive information filtering system monitors a document stream to identify the documents that match information needs specified by user profiles. As the system filters, it also refines its knowledge about the user's information needs based on long-term observations of the document stream and periodic feedback (training data) from the user. Low variance profile learning algorithms, such as Rocchio, work well at the early stage of filtering, when the system has very little training data. Low bias profile learning algorithms, such as logistic regression, work well at the later stage of filtering, when the system has accumulated enough training data. However, a practical system needs to work well consistently at all stages of the filtering process. This paper addresses this problem by proposing a new technique to combine different text classification algorithms via a constrained maximum likelihood Bayesian prior. This technique provides a trade-off between bias and variance, and the combined classifier may achieve consistently good performance at different stages of filtering. We implemented the proposed technique to combine two complementary classification algorithms: Rocchio and logistic regression. The new algorithm is shown to compare favorably with Rocchio, logistic regression, and the best methods in the TREC-9 and TREC-11 adaptive filtering tracks.
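
The idea of letting a low variance profile shape the prior of a low bias learner can be illustrated with a minimal sketch. It is not the paper's implementation, only one plausible reading of the abstract: a Rocchio-style centroid difference fixes a direction, a two-parameter fit (scale and offset) along that direction stands in for the "constrained maximum likelihood" step, and the result serves as the mean of a Gaussian prior for logistic regression. All function names, the toy data, the plain gradient-ascent fitting, and the `scale_var` and `prior_var` settings are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def log_likelihood(X, y, w):
    """Bernoulli log-likelihood of labels y under weights w."""
    p = np.clip(sigmoid(X @ w), 1e-12, 1.0 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def rocchio_direction(X, y):
    """Rocchio-style profile: positive centroid minus negative centroid."""
    w = np.zeros(X.shape[1])
    if np.any(y == 1):
        w += X[y == 1].mean(axis=0)
    if np.any(y == 0):
        w -= X[y == 0].mean(axis=0)
    return w

def map_logistic(X, y, prior_mean, prior_var, steps=500):
    """MAP logistic regression under a Gaussian prior N(prior_mean, prior_var * I).

    Uses gradient ascent with step 1/L, where L upper-bounds the curvature of
    the objective, so each step cannot decrease the log-posterior.
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    L = 0.25 * np.linalg.norm(X, 2) ** 2 + 1.0 / prior_var
    w = prior_mean.copy()
    for _ in range(steps):
        grad = X.T @ (y - sigmoid(X @ w)) - (w - prior_mean) / prior_var
        w += grad / L
    return w

def rocchio_prior_mean(X, y, scale_var=100.0):
    """Prior mean restricted to the Rocchio direction.

    Only a scale `a` and an offset `b` are fitted. The weak zero-mean prior
    (scale_var) is an assumption that keeps the fit finite on separable data.
    Assumes the last column of X is a constant 1 (bias feature).
    """
    direction = rocchio_direction(X, y)
    Z = np.column_stack([X @ direction, np.ones(len(y))])
    a, b = map_logistic(Z, y, np.zeros(2), scale_var)
    mean = a * direction
    mean[-1] += b
    return mean

def combined_classifier(X, y, prior_var=1.0):
    """Logistic regression pulled toward the Rocchio-based prior mean."""
    return map_logistic(X, y, rocchio_prior_mean(X, y), prior_var)

# Toy training set: two term features plus a constant bias column (hypothetical data).
X_demo = np.array([[1.0, 0.2, 1.0], [0.9, 0.1, 1.0], [0.3, 0.2, 1.0],
                   [0.2, 0.8, 1.0], [0.1, 1.0, 1.0], [0.8, 0.9, 1.0]])
y_demo = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
w_demo = combined_classifier(X_demo, y_demo, prior_var=1.0)
```

In this sketch `prior_var` plays the role of the bias/variance knob: a small value keeps the weights close to the Rocchio-based mean, which is the safer choice when little feedback is available, while a large value lets the data dominate, as in ordinary logistic regression.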