Distributional clustering of words for text classification
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Discovering word senses from text
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
GATE: an architecture for development of robust HLT applications
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites
Computational Linguistics
Enhancing cluster labeling using wikipedia
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Analysis of structural relationships for hierarchical cluster labeling
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Cluster labeling for multilingual scatter/gather using comparable corpora
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
Existing mechanisms for concept discovery tend to extract every possible relationship between the terms in a document, based on the roles identified for those terms [3]. The proposed work aims to enhance this discovery process by employing machine learning and semantic modelling. We explore a framework for automatically discovering labeled clusters from a large collection of documents. The framework aims to extract concepts and to structure them into labeled concepts for use by text processing applications such as text summarization and text categorization. We have developed a mechanism for automatically inducing a set of words that captures the meaning of a collection of documents. The WordNet lexical database is used to extract root meanings and to determine the relationships among these terms.
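As an illustration only, the idea of labeling a cluster of terms through shared root meanings can be sketched as follows. This is not the mechanism developed in this work: the small `TOY_HYPERNYMS` table is a hypothetical stand-in for WordNet hypernym links, and the function names `hypernym_path` and `label_cluster` are assumptions introduced for the example. The sketch labels a cluster with the nearest ancestor shared by all of its terms.

```python
# Hypothetical toy taxonomy (child -> parent), standing in for WordNet
# hypernym links. A real system would query WordNet instead.
TOY_HYPERNYMS = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "animal",
    "car": "vehicle",
    "truck": "vehicle",
    "vehicle": "artifact",
}


def hypernym_path(term):
    """Return the term followed by its ancestors, nearest first."""
    path = [term]
    while path[-1] in TOY_HYPERNYMS:
        path.append(TOY_HYPERNYMS[path[-1]])
    return path


def label_cluster(terms):
    """Return the nearest ancestor shared by every term, or None."""
    paths = [hypernym_path(t) for t in terms]
    if not paths:
        return None
    # Nodes that appear on the path of every term in the cluster.
    common = set(paths[0]).intersection(*(set(p) for p in paths[1:]))
    # Walk the first path from most specific to most general.
    for node in paths[0]:
        if node in common:
            return node
    return None


if __name__ == "__main__":
    for cluster in (["dog", "wolf"], ["dog", "cat"], ["car", "truck"]):
        print(cluster, "->", label_cluster(cluster))
```

In this toy setting, terms with no shared ancestor receive no label, which is one simple way to flag a cluster that does not correspond to a single concept.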