Syntactic phrase indexing and term clustering have been widely explored as text representation techniques for text retrieval. In this paper we study phrasal and clustered indexing languages on a text categorization task, which lets us examine their properties in isolation from query interpretation issues. We show that optimal effectiveness occurs when using only a small proportion of the indexing terms available, and that effectiveness peaks at a higher feature set size and a lower effectiveness level for syntactic phrase indexing than for word-based indexing. We also present results suggesting that traditional term clustering methods are unlikely to provide significantly improved text representations. An improved probabilistic text categorization method is also presented.
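The finding that only a small proportion of the available indexing terms is needed can be illustrated with a feature-ranking sketch. The code below is not the method of the paper: it assumes, purely for illustration, that terms are ranked by mutual information with a binary category and that the top k are kept. The function names `mutual_information` and `select_features` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term):
    """Mutual information (in bits) between presence of `term` and a binary label.

    Assumption for illustration: each document is a set of indexing terms
    (words or phrases) and each label says whether the category applies.
    """
    n = len(docs)
    # Joint counts over (term present?, label) pairs.
    counts = Counter((term in doc, label) for doc, label in zip(docs, labels))
    mi = 0.0
    for (present, label), n_joint in counts.items():
        n_present = sum(v for (p, _), v in counts.items() if p == present)
        n_label = sum(v for (_, l), v in counts.items() if l == label)
        mi += (n_joint / n) * math.log2(n_joint * n / (n_present * n_label))
    return mi


def select_features(docs, labels, k):
    """Return the k terms with the highest mutual information (ties broken alphabetically)."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab, key=lambda t: (-mutual_information(docs, labels, t), t))
    return ranked[:k]


# Toy corpus (hypothetical): two documents in the category, two outside it.
docs = [
    {"wheat", "export", "price"},
    {"wheat", "harvest", "price"},
    {"bank", "rate", "price"},
    {"bank", "loan", "price"},
]
labels = [True, True, False, False]

top_terms = select_features(docs, labels, k=2)
print(top_terms)
```

In this toy corpus "price" occurs in every document and so carries no information about the category, while "wheat" and "bank" each separate the two classes perfectly. Sweeping `k` and measuring categorization effectiveness at each setting is one way to look for the kind of peak at a small feature set size that the abstract describes.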