Evaluating text categorization
HLT '91 Proceedings of the workshop on Speech and Natural Language
The significance of the Cranfield tests on index languages
SIGIR '91 Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval
An evaluation of phrasal and clustered representations on a text categorization task
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
OHSUMED: an interactive retrieval evaluation and new large test collection for research
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
The effect of adding relevance information in a relevance feedback environment
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating and optimizing autonomous text classification systems
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Training algorithms for linear text classifiers
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Boosting and Rocchio applied to text filtering
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Improving text categorization methods for event tracking
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
An Evaluation of Statistical Approaches to Text Categorization
Information Retrieval
A study of thresholding strategies for text categorization
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Information Retrieval
High-performing feature selection for text classification
Proceedings of the eleventh international conference on Information and knowledge management
Exploiting Hierarchy in Text Categorization
Information Retrieval
Text Categorization Based on Regularized Linear Classification Methods
Information Retrieval
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
CONSTRUE/TIS: A System for Content-Based Indexing of a Database of News Stories
IAAI '90 Proceedings of the Second Conference on Innovative Applications of Artificial Intelligence
Hierarchically Classifying Documents Using Very Few Words
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Transductive Inference for Text Classification using Support Vector Machines
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
A repetition based measure for verification of text collections and for text categorization
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Text filtering in MUC-3 and MUC-4
MUC4 '92 Proceedings of the 4th conference on Message understanding
Design of the MUC-6 evaluation
MUC6 '95 Proceedings of the 6th conference on Message understanding
The SMART Retrieval System—Experiments in Automatic Document Processing
The SMART Retrieval System—Experiments in Automatic Document Processing
Building a filtering test collection for TREC 2002
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
IMMC: incremental maximum margin criterion
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
ICML '04 Proceedings of the twenty-first international conference on Machine learning
An analysis of the relative hardness of Reuters-21578 subsets
Journal of the American Society for Information Science and Technology
An experimental study on large-scale web categorization
WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
Site abstraction for rare category classification in large-scale web directory
WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
Automated text classification using a multi-agent framework
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Methods for learning classifier combinations: no clear winner
Proceedings of the 2005 ACM symposium on Applied computing
Text classification based on data partitioning and parameter varying ensembles
Proceedings of the 2005 ACM symposium on Applied computing
OCFS: optimal orthogonal centroid feature selection for text categorization
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Multi-label informed latent semantic indexing
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
IEEE Transactions on Knowledge and Data Engineering
Support vector machines classification with a very large-scale taxonomy
ACM SIGKDD Explorations Newsletter - Natural language processing and text mining
Learning hierarchical multi-category text classification models
ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning Gaussian processes from multiple tasks
ICML '05 Proceedings of the 22nd international conference on Machine learning
Efficient Text Classification by Weighted Proximal SVM
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Text Classification with Evolving Label-Sets
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Effective and Efficient Dimensionality Reduction for Large-Scale and Streaming Data Preprocessing
IEEE Transactions on Knowledge and Data Engineering
Two-stage statistical language models for text database selection
Information Retrieval
Background knowledge for ontology construction
Proceedings of the 15th international conference on World Wide Web
TA-RE: an exchange language for mining software repositories
Proceedings of the 2006 international workshop on Mining software repositories
Automatic expansion of domain-specific lexicons by term categorization
ACM Transactions on Speech and Language Processing (TSLP)
A geometric approach to monitoring threshold functions over distributed data streams
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Trading convexity for scalability
ICML '06 Proceedings of the 23rd international conference on Machine learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
Tackling concept drift by temporal inductive transfer
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Large scale semi-supervised linear SVMs
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Constructing informative prior distributions from domain knowledge in text classification
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
An analysis of the coupling between training set and neighborhood sizes for the kNN classifier
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Training linear SVMs in linear time
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Supervised probabilistic principal component analysis
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
A scaleable document clustering approach for large document corpora
Information Processing and Management: an International Journal
Scamseek: a language technology project fulfilling research objectives with industrial obligations
SEARCC '05 Proceedings of the 2005 South East Asia Regional Computer Science Confederation (SEARCC) Conference - Volume 46
Using KCCA for Japanese---English cross-language information retrieval and document classification
Journal of Intelligent Information Systems
Technologies That Make You Smile: Adding Humor to Text-Based Applications
IEEE Intelligent Systems
Multi-Output Regularized Feature Projection
IEEE Transactions on Knowledge and Data Engineering
Effective and efficient classification on a search-engine model
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Incremental hierarchical clustering of text documents
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Performance thresholding in practical text classification
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Text classification improved through multigram models
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Combining feature selectors for text classification
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Text induced spelling correction
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Making computers laugh: investigations in automatic humor recognition
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Classifying web documents in a hierarchy of categories: a comprehensive study
Journal of Intelligent Information Systems
Advanced learning algorithms for cross-language patent retrieval and classification
Information Processing and Management: an International Journal
A new suffix tree similarity measure for document clustering
Proceedings of the 16th international conference on World Wide Web
Kernel-Based Learning of Hierarchical Multilabel Classification Models
The Journal of Machine Learning Research
The Journal of Machine Learning Research
Identifying Document Topics Using the Wikipedia Category Network
WI '06 Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence
Document Classification Based on Support Vector Machine Using a Concept Vector Model
WI '06 Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
The effect of corpus size in combining supervised and unsupervised training for disambiguation
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Trust region Newton methods for large-scale logistic regression
Proceedings of the 24th international conference on Machine learning
An interactive algorithm for asking and incorporating feature feedback into support vector machines
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Feature selection methods for text classification
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Training Classifiers for Tree-structured Categories with Partially Labeled Data
Journal of VLSI Signal Processing Systems
Out-of-core SVD performance for document indexing
Applied Numerical Mathematics
Projected Gradient Methods for Nonnegative Matrix Factorization
Neural Computation
An integrated system for building enterprise taxonomies
Information Retrieval
A geometric approach to monitoring threshold functions over distributed data streams
ACM Transactions on Database Systems (TODS)
Clustering for unsupervised relation identification
Proceedings of the sixteenth ACM conference on information and knowledge management
Data & Knowledge Engineering
An investigation into the stability of contextual document clustering
Journal of the American Society for Information Science and Technology
Image and video indexing using networks of operators
Journal on Image and Video Processing
Disorder inequality: a combinatorial approach to nearest neighbor search
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Computers and Operations Research
Author identification: Using text sampling to handle the class imbalance problem
Information Processing and Management: an International Journal
A stopping criterion for active learning
Computer Speech and Language
On applying linear discriminant analysis for multi-labeled problems
Pattern Recognition Letters
Using community-generated contents as a substitute corpus for metadata generation
International Journal of Advanced Media and Communication
Adapting Support Vector Machines for F-term-based Classification of Patents
ACM Transactions on Asian Language Information Processing (TALIP)
Text classification: a recent overview
ICCOMP'05 Proceedings of the 9th WSEAS International Conference on Computers
Boosting multi-label hierarchical text categorization
Information Retrieval
Shape sensitive geometric monitoring
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Full-text indexing and information retrieval in P2P systems
Ph.D. '08 Proceedings of the 2008 EDBT Ph.D. workshop
Exploring hedge identification in biomedical literature
Journal of Biomedical Informatics
Fast nearest neighbor retrieval for Bregman divergences
Proceedings of the 25th international conference on Machine learning
Confidence-weighted linear classification
Proceedings of the 25th international conference on Machine learning
Efficient projections onto the l1-ball for learning in high dimensions
Proceedings of the 25th international conference on Machine learning
Training structural SVMs when exact inference is intractable
Proceedings of the 25th international conference on Machine learning
Local likelihood modeling of temporal text streams
Proceedings of the 25th international conference on Machine learning
Fully distributed EM for very large datasets
Proceedings of the 25th international conference on Machine learning
Efficient multiclass maximum margin clustering
Proceedings of the 25th international conference on Machine learning
Deep classification in large-scale text hierarchies
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Non-greedy active learning for text categorization using convex transductive experimental design
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Improving text classification accuracy using topic modeling over an additional corpus
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Algorithms for Sparse Linear Classifiers in the Massive Data Setting
The Journal of Machine Learning Research
Trust Region Newton Method for Logistic Regression
The Journal of Machine Learning Research
Scaling up text classification for large file systems
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Extracting shared subspace for multi-label classification
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
A sequential dual method for large scale multi-class linear SVMs
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
On updates that constrain the features' connections during learning
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
CutS3VM: a fast semi-supervised SVM algorithm
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
An integrated system for automatic customer satisfaction analysis in the services industry
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
A knowledge retrieval model using ontology mining and user profiling
Integrated Computer-Aided Engineering
Utilizing phrase-similarity measures for detecting and clustering informative RSS news articles
Integrated Computer-Aided Engineering
Effective and efficient classification on a search-engine model
Knowledge and Information Systems
Multilabel classification via calibrated label ranking
Machine Learning
On Applying Dimension Reduction for Multi-labeled Problems
MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
Neighborhood-Based Local Sensitivity
ECML '07 Proceedings of the 18th European conference on Machine Learning
Hierarchical Text Categorization Through a Vertical Composition of Classifiers
AI*IA '07 Proceedings of the 10th Congress of the Italian Association for Artificial Intelligence on AI*IA 2007: Artificial Intelligence and Human-Oriented Computing
Discovering Knowledge in a Large Organization through Support Vector Machines
ICCS '08 Proceedings of the 8th international conference on Computational Science, Part III
Tensor Space Models for Authorship Identification
SETN '08 Proceedings of the 5th Hellenic conference on Artificial Intelligence: Theories, Models and Applications
Efficient Pairwise Multilabel Classification for Large-Scale Problems in the Legal Domain
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines
The Journal of Machine Learning Research
Imbalanced text classification: A term weighting approach
Expert Systems with Applications: An International Journal
Relaxation in text search using taxonomies
Proceedings of the VLDB Endowment
Error-driven generalist+experts (edge): a multi-stage ensemble framework for text categorization
Proceedings of the 17th ACM conference on Information and knowledge management
Data weaving: scaling up the state-of-the-art in data clustering
Proceedings of the 17th ACM conference on Information and knowledge management
Incorporating topical support documents into a small training set in text categorization
Proceedings of the 17th ACM conference on Information and knowledge management
Category Classification and Topic Discovery of Japanese and English News Articles
Electronic Notes in Theoretical Computer Science (ENTCS)
An Ontology-Based Framework for Knowledge Retrieval
WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Implementing News Article Category Browsing Based on Text Categorization Technique
WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
Adapting SVM for data sparseness and imbalance: A case study in information extraction
Natural Language Engineering
Large scale multi-label classification via metalabeler
Proceedings of the 18th international conference on World wide web
Active Learning Strategies for Multi-Label Text Classification
ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
Scalable Web Mining with Newistic
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Identifying document topics using the Wikipedia category network
Web Intelligence and Agent Systems
A convex formulation for learning shared structures from multiple tasks
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Good learners for evil teachers
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Boosting with structural sparsity
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Topic model methods for automatically identifying out-of-scope resources
Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
Large-scale sparse logistic regression
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Effective multi-label active learning for text classification
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Active learning with confidence
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
International Journal of Approximate Reasoning
Visual analysis of documents with semantic graphs
Proceedings of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating Automated Analysis with Interactive Exploration
Applied Computational Humor and Prospects for Advertising
Proceedings of the 2006 conference on Rob Milne: A Tribute to a Pioneering AI Scientist, Entrepreneur and Mountaineer
On the Quantization Error in SOM vs. VQ: A Critical and Systematic Study
WSOM '09 Proceedings of the 7th International Workshop on Advances in Self-Organizing Maps
Parallel identification of the spelling variants in corpora
Proceedings of The Third Workshop on Analytics for Noisy Unstructured Text Data
Refined experts: improving classification in large taxonomies
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
A User Profiles Acquiring Approach Using Pseudo-Relevance Feedback
RSKT '09 Proceedings of the 4th International Conference on Rough Sets and Knowledge Technology
A lattice-based framework for enhancing statistical parsers with information from unlabeled corpora
CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Using LDA to detect semantically incoherent documents
CoNLL '08 Proceedings of the Twelfth Conference on Computational Natural Language Learning
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 2
OcVFDT: one-class very fast decision tree for one-class classification of data streams
Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data
Training Data Cleaning for Text Classification
ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
Cutting-plane training of structural SVMs
Machine Learning
Topic-Based Hard Clustering of Documents Using Generative Models
ISMIS '09 Proceedings of the 18th International Symposium on Foundations of Intelligent Systems
Boosting a Semantic Search Engine by Named Entities
ISMIS '09 Proceedings of the 18th International Symposium on Foundations of Intelligent Systems
Automatic Arabic document categorization based on the Naïve Bayes algorithm
Semitic '04 Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages
Wikipedia-based semantic interpretation for natural language processing
Journal of Artificial Intelligence Research
Avoidance of model re-induction in SVM-based feature selection for text categorization
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Combinatorial Framework for Similarity Search
SISAP '09 Proceedings of the 2009 Second International Workshop on Similarity Search and Applications
A partial-order based active cache for recommender systems
Proceedings of the third ACM conference on Recommender systems
Topic model analysis of metaphor frequency for psycholinguistic stimuli
CALC '09 Proceedings of the Workshop on Computational Approaches to Linguistic Creativity
Feature generation for text categorization using world knowledge
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Text segmentation via topic modeling: an analytical study
Proceedings of the 18th ACM conference on Information and knowledge management
Topic-dependent sentiment analysis of financial blogs
Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion
Labeling design documents based on operators' consensus-A case study of robotic design
Computers in Industry
SVO triple based latent semantic analysis for recognising textual entailment
RTE '07 Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing
PhraseRank for document clustering: reweighting the weight of phrase
Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human
Privacy-preserving similarity-based text retrieval
ACM Transactions on Internet Technology (TOIT)
A Clustering Framework Based on Adaptive Space Mapping and Rescaling
AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
Linear Programming Boosting by Column and Row Generation
DS '09 Proceedings of the 12th International Conference on Discovery Science
Relieving Polysemy Problem for Synonymy Detection
EPIA '09 Proceedings of the 14th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Using Nearest Neighbor Information to Improve Cross-Language Text Classification
MICAI '09 Proceedings of the 8th Mexican International Conference on Artificial Intelligence
Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
Multi-class confidence weighted algorithms
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Automatic content-based categorization of Wikipedia articles
People's Web '09 Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources
Language models for contextual error detection and correction
CLAGI '09 Proceedings of the EACL 2009 Workshop on Computational Linguistic Aspects of Grammatical Inference
Multilingual text induced spelling correction
MLR '04 Proceedings of the Workshop on Multilingual Linguistic Ressources
Modeling user multiple interests by an improved GCS approach
Expert Systems with Applications: An International Journal
A graph-theoretic framework for semantic distance
Computational Linguistics
Mining fuzzy frequent itemsets for hierarchical document clustering
Information Processing and Management: an International Journal
Dynamic hierarchical algorithms for document clustering
Pattern Recognition Letters
A shared-subspace learning framework for multi-label classification
ACM Transactions on Knowledge Discovery from Data (TKDD)
Pairwise-adaptive dissimilarity measure for document clustering
Information Sciences: an International Journal
SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent
The Journal of Machine Learning Research
The Journal of Machine Learning Research
Hash Kernels for Structured Data
The Journal of Machine Learning Research
A Fast Hybrid Algorithm for Large-Scale l1-Regularized Logistic Regression
The Journal of Machine Learning Research
Entropy-based authorship search in large document collections
ECIR'07 Proceedings of the 29th European conference on IR research
Scaling up semi-supervised learning: an efficient and effective LLGC variant
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Corpus building for corporate knowledge discovery and management: a case study of manufacturing
KES'07/WIRN'07 Proceedings of the 11th international conference, KES 2007 and XVII Italian workshop on neural networks conference on Knowledge-based intelligent information and engineering systems: Part I
The kernelHMM: learning kernel combinations in structured output domains
Proceedings of the 29th DAGM conference on Pattern recognition
Linear time maximum margin clustering
IEEE Transactions on Neural Networks
Proceedings of the 19th international conference on World wide web
Introducing global scaling parameters into Ncut
Proceedings of the 2010 ACM Symposium on Applied Computing
BVAI'07 Proceedings of the 2nd international conference on Advances in brain, vision and artificial intelligence
Using typical testors for feature selection in text categorization
CIARP'07 Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications
Maximum entropy modeling with feature selection for text categorization
AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
Data management by self-organizing maps
WCCI'08 Proceedings of the 2008 IEEE world conference on Computational intelligence: research frontiers
Automatic extraction of domain-specific stopwords from labeled documents
ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
Conditional probability tree estimation analysis and algorithms
Reuters Corpus Volume I (RCV1) is an archive of over 800,000 manually categorized newswire stories recently made available by Reuters, Ltd. for research purposes. Using this data for research on text categorization requires a detailed understanding of the real-world constraints under which the data was produced. Drawing on interviews with Reuters personnel and access to Reuters documentation, we describe the coding policy and quality control procedures used in producing the RCV1 data, the intended semantics of the hierarchical category taxonomies, and the corrections necessary to remove erroneous data. We refer to the original data as RCV1-v1 and to the corrected data as RCV1-v2. We benchmark several widely used supervised learning methods on RCV1-v2, illustrating the collection's properties, suggesting new directions for research, and providing baseline results for future studies. We make available detailed, per-category experimental results, as well as corrected versions of the category assignments and taxonomy structures, via online appendices.