Guaranteeing the quality of relevance features discovered in text documents for describing user preferences is a major challenge because of the large number of terms, patterns, and noise. Most popular text mining and classification methods have adopted term-based approaches. However, they all suffer from the problems of polysemy and synonymy. Over the years, it has often been hypothesized that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis. The innovative technique presented in this paper overcomes this difficulty. It discovers both positive and negative patterns in text documents as higher-level features in order to accurately weight low-level features (terms) based on their specificity and their distributions in the higher-level features. Substantial experiments using this technique on Reuters Corpus Volume 1 and TREC topics show that the proposed approach significantly outperforms both state-of-the-art term-based methods underpinned by Okapi BM25, Rocchio, or Support Vector Machine and pattern-based methods on precision, recall, and F measures.
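
The idea of weighting terms by their distribution in positive and negative patterns can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract gives no formulas, so the function names (`term_weights`, `specificity`), the even split of a pattern's support across its terms, and the three-way specificity labels are all assumptions made for illustration.

```python
from collections import defaultdict


def term_weights(pos_patterns, neg_patterns):
    """Weight terms from patterns given as (set_of_terms, support) pairs.

    Assumption: each pattern spreads its support evenly over its terms;
    positive patterns add weight, negative patterns subtract it.
    """
    weights = defaultdict(float)
    for terms, support in pos_patterns:
        for term in terms:
            weights[term] += support / len(terms)
    for terms, support in neg_patterns:
        for term in terms:
            weights[term] -= support / len(terms)
    return dict(weights)


def specificity(term, pos_patterns, neg_patterns):
    """Label a term by where it occurs (hypothetical labels)."""
    in_pos = any(term in terms for terms, _ in pos_patterns)
    in_neg = any(term in terms for terms, _ in neg_patterns)
    if in_pos and in_neg:
        return "general"
    if in_pos:
        return "positive-specific"
    if in_neg:
        return "negative-specific"
    return "unseen"


# Toy patterns (invented data, not taken from RCV1 or TREC).
pos = [({"mining", "pattern"}, 0.6), ({"mining"}, 0.4)]
neg = [({"pattern", "stock"}, 0.5)]

weights = term_weights(pos, neg)
labels = {t: specificity(t, pos, neg) for t in weights}
```

In this toy setting a term that appears only in positive patterns keeps its full weight, while a term shared with negative patterns is discounted, which is the intuition behind using higher-level features to adjust low-level term weights.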