Neural Network Based Document Clustering Using WordNet Ontologies
International Journal of Hybrid Intelligent Systems
Foundations and Trends in Information Retrieval
The category discrimination method (CDM) is a new learning algorithm designed for text categorization. It is motivated by the statistical problems that arise when natural language text is used as input to existing machine learning algorithms: too much noise, too many features, and skewed distributions. The CDM is based on research into how humans learn categories and concepts by contrast with other concepts. Its central formula is cue validity, borrowed from cognitive psychology, which is used to select, from all possible single-word features, the 'best' predictors of a given category. The hypothesis that the CDM outperforms two non-domain-specific algorithms, Bayesian classification and decision tree learners, is tested empirically.
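To make the feature-selection idea concrete, the following is a minimal sketch of ranking single-word features by cue validity. It assumes the standard cognitive-psychology definition, the conditional probability of a category given the cue, estimated here from document counts. The abstract does not give the CDM's exact formula, so the function names (`cue_validity`, `top_predictors`), the document-frequency estimate, and the tie-breaking rule are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict


def cue_validity(docs):
    """Estimate P(category | word) from labelled documents.

    docs: iterable of (category, tokens) pairs.
    Assumption: a word counts once per document (document frequency).
    Returns {category: {word: validity}}.
    """
    word_total = Counter()                 # documents containing each word
    word_by_cat = defaultdict(Counter)     # same count, split by category
    for category, tokens in docs:
        for word in set(tokens):
            word_total[word] += 1
            word_by_cat[category][word] += 1
    return {
        cat: {w: n / word_total[w] for w, n in counts.items()}
        for cat, counts in word_by_cat.items()
    }


def top_predictors(docs, category, k=3):
    """Return the k words with the highest cue validity for a category.

    Assumption: ties are broken by higher in-category document count,
    then alphabetically, to keep the ranking deterministic.
    """
    docs = list(docs)
    validity = cue_validity(docs)[category]
    in_cat = Counter()
    for cat, tokens in docs:
        if cat == category:
            in_cat.update(set(tokens))
    ranked = sorted(validity, key=lambda w: (-validity[w], -in_cat[w], w))
    return ranked[:k]


# Hypothetical toy corpus, for illustration only.
SAMPLE_DOCS = [
    ("sports", ["goal", "team", "win"]),
    ("sports", ["team", "match"]),
    ("finance", ["market", "win"]),
    ("finance", ["market", "stock"]),
]
```

In this toy corpus, "win" occurs in one sports and one finance document, so it discriminates poorly between the two categories, whereas "team" occurs only in sports documents. A practical system would likely also need a minimum-frequency threshold, since a word seen in a single document trivially reaches a validity of 1.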