Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
Automated learning of decision rules for text categorization
ACM Transactions on Information Systems (TOIS)
Training algorithms for linear text classifiers
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Context-sensitive learning methods for text categorization
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Feature selection, perceptron learning, and a usability case study for text categorization
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
An Evaluation of Statistical Approaches to Text Categorization
Information Retrieval
Machine Learning
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Transductive Inference for Text Classification using Support Vector Machines
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
SCIE '97 International Summer School on Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology
Ontology-based metadata generation from semi-structured information
Proceedings of the 1st international conference on Knowledge capture
A Hybrid Approach to Optimize Feature Selection Process in Text Classification
AI*IA 01 Proceedings of the 7th Congress of the Italian Association for Artificial Intelligence on Advances in Artificial Intelligence
Supervised document classification based upon domain-specific term taxonomies
International Journal of Metadata, Semantics and Ontologies
RitroveRAI: a web application for semantic indexing and hyperlinking of multimedia news
ISWC'05 Proceedings of the 4th international conference on The Semantic Web
Hi-index | 0.00 |
Although several attempts have been made to introduce Natural Language Processing (NLP) techniques in Information Retrieval, most ones failed to prove their effectiveness in increasing performances. In this paper Text Classification (TC) has been taken as the IR task and the effect of linguistic capabilities of the underlying system have been studied. A novel model for TC, extending a well know statistical model (i.e. Rocchio's formula [Ittner et al., 1995]) and applied to linguistic features has been defined and experimented. The proposed model represents an effective feature selection methodology. All the experiments result in a significant improvement with respect to other purely statistical methods (e.g. [Yang, 1999]), thus stressing the relevance of the available linguistic information. Moreover, the derived classifier reachs the performance (about 85%) of the best known models (i.e. Support Vector Machines (SVM) and K -Nearest Neighbour (KNN)) characterized by an higher computational complexity for training and processing.