A two-stage text mining model for information filtering

Authors:
Yuefeng Li;Xujuan Zhou;Peter Bruza;Yue Xu;Raymond Y.K. Lau
Affiliations:
Queensland University of Technology, Brisbane, Australia;Queensland University of Technology, Brisbane, Australia;Queensland University of Technology, Brisbane, Australia;Queensland University of Technology, Brisbane, Australia;City University of Hong Kong, Hong Kong, China
Venue:
Proceedings of the 17th ACM conference on Information and knowledge management
Year:
2008

Abstract

Mismatch and overload are the two fundamental issues affecting the effectiveness of information filtering. Both term-based and pattern-based (phrase-based) approaches have been used to address them, but each has limitations with regard to effectiveness. This paper proposes a novel two-stage solution: an initial topic filtering stage followed by a pattern taxonomy mining stage. The first stage addresses mismatch by quickly filtering out documents that are probably irrelevant; the threshold it uses is motivated theoretically. The second stage addresses overload by applying pattern mining techniques to rationalize the data relevance of the reduced document set that remains after the first stage. Substantial experiments on RCV1 show that the proposed solution achieves encouraging performance.
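
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the scoring functions nor the derivation of the threshold, so every name here (`term_score`, `mine_pair_patterns`, `pattern_score`, `two_stage_filter`), the toy weights, and the fixed threshold are assumptions made for illustration. Frequent co-occurring term pairs stand in for pattern taxonomy mining, which is a deliberate simplification.

```python
from collections import Counter
from itertools import combinations


def term_score(doc_terms, topic_weights):
    """Stage 1 score: sum the weights of topic terms present in the document."""
    return sum(topic_weights.get(t, 0.0) for t in set(doc_terms))


def mine_pair_patterns(positive_docs, min_support=2):
    """Toy stand-in for pattern mining: term pairs that co-occur in at
    least `min_support` positive training documents."""
    counts = Counter()
    for terms in positive_docs:
        counts.update(combinations(sorted(set(terms)), 2))
    return {pair: c for pair, c in counts.items() if c >= min_support}


def pattern_score(doc_terms, patterns):
    """Stage 2 score: total support of the mined pairs found in the document."""
    present = set(doc_terms)
    return sum(s for (a, b), s in patterns.items() if a in present and b in present)


def two_stage_filter(docs, topic_weights, patterns, threshold):
    """Drop documents scoring below `threshold` (stage 1), then rank the
    survivors by pattern score (stage 2)."""
    survivors = {
        doc_id: terms
        for doc_id, terms in docs.items()
        if term_score(terms, topic_weights) >= threshold
    }
    return sorted(
        survivors,
        key=lambda d: pattern_score(survivors[d], patterns),
        reverse=True,
    )


# Hypothetical training data and incoming documents (already tokenized).
positive_docs = [
    ["stock", "market", "crash"],
    ["stock", "market", "rally"],
    ["market", "crash", "fear"],
]
topic_weights = {"stock": 1.0, "market": 1.0, "crash": 0.5}  # assumed weights
docs = {
    "d1": ["stock", "market", "crash", "today"],
    "d2": ["football", "match"],
    "d3": ["market", "news"],
    "d4": ["stock", "market"],
}

patterns = mine_pair_patterns(positive_docs, min_support=2)
ranking = two_stage_filter(docs, topic_weights, patterns, threshold=1.0)
print(ranking)
```

In this toy run the first stage discards `d2`, which contains no topic terms, and the second stage orders the remaining documents by how many mined pairs they contain. The threshold of 1.0 is arbitrary here, whereas the paper states that its first-stage threshold is motivated theoretically.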