In text classification, documents are typically represented in a vector space using the "Bag of Words" (BOW) approach. Despite its ease of use, the BOW representation cannot handle synonymy or polysemy and ignores semantic relatedness between words. In this paper, we address these shortcomings by embedding Omiotis, an existing WordNet-based measure of semantic relatedness between pairs of words, into a semantic kernel. The measure incorporates the TF-IDF weighting scheme, so the resulting kernel combines semantic and statistical information from text. Empirical evaluation on real data sets shows that embedding Omiotis in four different classifiers improves classification accuracy over the standard BOW representation.
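To make the idea of a semantic kernel concrete, the following is a minimal sketch in Python, not the kernel defined in the paper. It assumes the common smoothing form K(d1, d2) = d1ᵀ S d2, where d1 and d2 are TF-IDF document vectors and S is a term-by-term relatedness matrix. The values in `S` are made up for illustration and are not Omiotis scores; the helper names `tfidf` and `semantic_kernel`, the smoothed IDF variant, and the three-word vocabulary are likewise assumptions.

```python
import numpy as np


def tfidf(counts):
    """Return L2-normalized TF-IDF rows for a (documents x terms) count matrix."""
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[0]
    doc_freq = np.count_nonzero(counts, axis=0)
    # Smoothed IDF (one common variant, chosen here for illustration).
    idf = np.log((1.0 + n_docs) / (1.0 + doc_freq)) + 1.0
    weighted = counts * idf
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # leave empty documents as zero vectors
    return weighted / norms


def semantic_kernel(X, Y, S):
    """Kernel matrix K[i, j] = X[i] @ S @ Y[j] for term relatedness matrix S."""
    return X @ S @ Y.T


# Hypothetical vocabulary and relatedness scores (NOT real Omiotis values).
vocab = ["car", "automobile", "banana"]
S = np.array([
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Three toy documents, each using a single vocabulary term.
counts = np.array([
    [2, 0, 0],  # mentions "car"
    [0, 1, 0],  # mentions "automobile"
    [0, 0, 3],  # mentions "banana"
])

X = tfidf(counts)
K_bow = X @ X.T                   # plain BOW (linear) kernel
K_sem = semantic_kernel(X, X, S)  # semantically smoothed kernel

print(np.round(K_bow, 3))
print(np.round(K_sem, 3))
```

In this toy setup the plain BOW kernel sees the "car" and "automobile" documents as unrelated because they share no terms, while the smoothed kernel gives them a nonzero similarity. For the result to be a valid kernel, `S` should be symmetric positive semidefinite. A matrix computed this way could be passed to any classifier that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.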