There is growing interest in efficient models for text mining and an emerging need for data structures that capture relationships between words. Detailed knowledge of the taxonomic environment of the keywords used in text documents can provide valuable insight into their subject matter. Such insight can be used to enrich the data structures used in text mining tasks as these relationships become apparent. A popular, scalable technique for inferring such relationships while reducing dimensionality has been Latent Semantic Analysis. We present a new approach that uses an ontology of lexical abstractions to create abstraction profiles of documents, and uses these profiles to organize text through a process we call frequent abstraction analysis. We introduce TATOO, the Text Abstraction TOOlkit, a full implementation of this approach. We present our data model through an example of how taxonomically derived abstractions can supplement semantic data structures for the text classification task. © 2013 Wiley Periodicals, Inc.
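
The idea of an abstraction profile can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `HYPERNYM` taxonomy, the function names, and the reading of "frequent" as "present in at least `min_support` document profiles". It is not TATOO's API, and the abstract does not specify the actual ontology or algorithm.

```python
from collections import Counter

# Hypothetical toy taxonomy (child -> parent); not taken from the paper.
HYPERNYM = {
    "beagle": "dog",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
    "sedan": "car",
    "car": "vehicle",
    "truck": "vehicle",
}


def abstractions(word):
    """Return every ancestor of `word` in the toy taxonomy, nearest first."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain


def abstraction_profile(tokens):
    """Count how often each abstraction is reached from a document's keywords."""
    profile = Counter()
    for token in tokens:
        profile.update(abstractions(token))
    return profile


def frequent_abstractions(profiles, min_support):
    """Return abstractions present in at least `min_support` profiles (assumed criterion)."""
    support = Counter()
    for profile in profiles:
        support.update(profile.keys())
    return {name for name, count in support.items() if count >= min_support}


# Hypothetical documents, already reduced to keywords.
docs = {
    "d1": ["beagle", "cat"],
    "d2": ["sparrow", "dog"],
    "d3": ["sedan", "truck"],
}
profiles = {name: abstraction_profile(tokens) for name, tokens in docs.items()}
shared = frequent_abstractions(profiles.values(), min_support=2)
```

In this toy data, `d1` and `d2` have no keyword in common, yet both profiles contain `mammal` and `animal`, so those two abstractions are the ones that reach the support threshold. That is the kind of relationship a purely keyword-based representation would miss.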