Discovering a term taxonomy from term similarities using principal component analysis
EWMF'05/KDO'05 Proceedings of the 2005 joint international conference on Semantics, Web and Mining
We show that the singular value decomposition of a term similarity matrix induces a term hierarchy. This decomposition, used in Latent Semantic Analysis and Principal Component Analysis for text, aims at identifying “concepts” that can be used in place of the terms appearing in the documents. Unlike terms, concepts are uncorrelated by construction and hence are less sensitive to the particular vocabulary used in documents. In this work, we explore the relation between terms and concepts and show that for each term there exists a latent subspace dimension for which the term coincides with a concept. By varying the number of dimensions, terms that are similar to the concept but more specific than it can be identified, leading to a term hierarchy.
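
As a rough illustration of the decomposition the abstract refers to, the sketch below builds a term similarity matrix from a toy term-document matrix, takes its singular value decomposition, and rebuilds the similarities from the first k latent dimensions. It is a minimal sketch under stated assumptions, not the paper's hierarchy-construction procedure: the toy data, the choice of cosine similarity, and the names `term_similarity` and `rank_k_similarity` are all hypothetical and do not come from the source.

```python
import numpy as np

# Hypothetical toy data (not from the paper): rows are terms, columns are documents.
terms = ["animal", "dog", "cat", "vehicle", "car"]
A = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)


def term_similarity(term_doc):
    """Cosine similarity between term rows (an assumed similarity measure)."""
    norms = np.linalg.norm(term_doc, axis=1, keepdims=True)
    unit_rows = term_doc / norms
    return unit_rows @ unit_rows.T


def rank_k_similarity(sim, k):
    """Rebuild the similarity matrix from its first k singular triplets."""
    u, s, vt = np.linalg.svd(sim)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


S = term_similarity(A)

# Columns of U are the latent "concept" directions; they are mutually
# orthogonal, which is the sense in which concepts are uncorrelated here.
U, s, Vt = np.linalg.svd(S)

# Inspect how each term's similarity profile changes as dimensions are added.
for k in range(1, len(terms) + 1):
    S_k = rank_k_similarity(S, k)
    print(f"k={k}")
    for name, row in zip(terms, S_k):
        print(f"  {name:8s}", np.round(row, 2))
```

Comparing the rows printed for successive values of k shows how the representation of each term changes as latent dimensions are added, which is the quantity the abstract proposes to vary. How those changes are turned into a hierarchy is not specified in the abstract and is not attempted here.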