An Evaluation of Statistical Approaches to Text Categorization

  • Authors: Yiming Yang
  • Affiliations: School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213-3702, USA. yiming@cs.cmu.edu
  • Venue: Information Retrieval
  • Year: 1999

Abstract

This paper focuses on a comparative evaluation of a wide range of text categorization methods, including previously published results on the Reuters corpus and new results from additional experiments. A controlled study using three classifiers, kNN, LLSF and WORD, was conducted to examine the impact of configuration variations in five versions of Reuters on the observed performance of classifiers. Analysis and empirical evidence suggest that the evaluation results on some versions of Reuters were significantly affected by the inclusion of a large portion of unlabelled documents, making those results difficult to interpret and leading to considerable confusion in the literature. Using the results evaluated on the other versions of Reuters, which exclude the unlabelled documents, the performance of twelve methods is compared directly or indirectly. For indirect comparisons, kNN, LLSF and WORD were used as baselines, since they were evaluated on all versions of Reuters that exclude the unlabelled documents. As a global observation, kNN, LLSF and a neural network method had the best performance; except for a Naive Bayes approach, the other learning algorithms also performed relatively well.
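
For readers unfamiliar with the classifiers named in the abstract, the following is a minimal sketch of two baselines in the spirit of those compared: kNN (assign categories by vote of the nearest training documents) and LLSF (learn a linear least squares mapping from term vectors to category vectors). It assumes scikit-learn and NumPy; the toy documents and category names are invented for illustration and this is not the paper's implementation or its Reuters preprocessing.

```python
# Hypothetical sketch of kNN and LLSF text categorization baselines.
# Toy data only; not the paper's implementation.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Invented toy documents with Reuters-style category labels.
train_docs = [
    "company reports quarterly profit rise",
    "net earnings per share increased",
    "crude oil prices climbed on supply fears",
    "opec output cut lifts oil futures",
]
train_labels = ["earn", "earn", "crude", "crude"]
test_docs = [
    "oil supply worries push prices higher",
    "firm posts higher quarterly earnings",
]

# Shared tf-idf term weighting, a common choice in this literature.
vec = TfidfVectorizer()
X_train = vec.fit_transform(train_docs)
X_test = vec.transform(test_docs)

# --- kNN: categorize by majority vote of the nearest neighbours
# under cosine similarity between document vectors. ---
knn = KNeighborsClassifier(n_neighbors=3, metric="cosine")
knn.fit(X_train, train_labels)
print("kNN: ", knn.predict(X_test))

# --- LLSF: solve a least squares problem mapping term vectors to
# one-hot category vectors, then score test documents with the
# learned matrix and pick the highest-scoring category. ---
cats = sorted(set(train_labels))
Y = np.array([[1.0 if c == lbl else 0.0 for c in cats]
              for lbl in train_labels])
W, *_ = np.linalg.lstsq(X_train.toarray(), Y, rcond=None)
scores = X_test.toarray() @ W          # shape: (n_test, n_categories)
print("LLSF:", [cats[i] for i in scores.argmax(axis=1)])
```

The LLSF half illustrates why the method scales to many categories: one matrix solve yields scores for all categories at once, whereas kNN defers all work to query time.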