We describe the results of extensive experiments using optimized rule-based induction methods on large document collections. The goal of these methods is to automatically discover classification patterns that can be used for general document categorization or personalized filtering of free text. Previous reports indicate that human-engineered rule-based systems, requiring many man-years of development effort, have been successfully built to “read” documents and assign topics to them. We show that machine-generated decision rules appear comparable to human performance while using the identical rule-based representation. In comparison with other machine-learning techniques, results on a key benchmark from the Reuters collection show a large gain in performance, from a previously reported 67% recall/precision breakeven point to 80.5%. In the context of a very high-dimensional feature space, several methodological alternatives are examined, including universal versus local dictionaries and binary versus frequency-related features.
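The recall/precision breakeven point quoted above is the operating point at which recall and precision are equal. The following is a minimal sketch of one common way to compute it, assuming a classifier that assigns each document a numeric score for a single topic; it is not the procedure used in the experiments, and the function name `breakeven_point` and the sample data are hypothetical. It relies on the fact that, when the top k documents are accepted and k equals the number of truly relevant documents, precision and recall share the same numerator and denominator.

```python
from typing import Sequence


def breakeven_point(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the value at which precision equals recall for one topic.

    scores: classifier score per document (higher means more likely on-topic).
    labels: 1 if the document truly belongs to the topic, else 0.
    Assumption: documents are accepted in descending score order.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one relevant document")

    # Rank document indices by descending score (ties keep input order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Accept exactly as many documents as there are relevant ones.
    # Then precision = tp / n_pos and recall = tp / n_pos, so they coincide.
    accepted = order[:n_pos]
    true_positives = sum(labels[i] for i in accepted)
    return true_positives / n_pos


# Hypothetical data: five documents, three of them relevant.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
example_labels = [1, 0, 1, 1, 0]
example_breakeven = breakeven_point(example_scores, example_labels)
```

In this hypothetical example two of the three top-ranked documents are relevant, so precision and recall both equal 2/3. A rule-based system that does not produce scores would instead be evaluated at several rule-set settings, with the breakeven value interpolated between the nearest recall/precision pairs.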