Learning semantic relatedness from term discrimination information

Authors:
D. Cai;C. J. van Rijsbergen
Affiliations:
Department of Computing Science, University of Glasgow, Glasgow G12 8RZ, UK;Department of Computing Science, University of Glasgow, Glasgow G12 8RZ, UK
Venue:
Expert Systems with Applications: An International Journal
Year:
2009

Quantified Score

Hi-index	12.05

Abstract

Formalizing and quantifying the intuitive notion of relatedness between terms has long been a major challenge for computing science, and an intriguing problem for other sciences. In this study, we address the challenge by considering a general notion of relatedness between terms and a given topic, and we introduce a formal definition of a relatedness measure based on term discrimination measures. Measurement of discrimination information (MDI) of terms is a fundamental issue in many areas of science. We focus on MDI and present an in-depth investigation into the concept of discrimination information conveyed by a term. The investigation is based on the information radius, an information measure relevant to a wide variety of applications. In particular, we formally interpret discrimination measures in terms of a simple but important property identified by this study, and argue that this interpretation is essential for guiding their application. The discrimination measures can then be used naturally and conveniently to formalize and quantify the relatedness between terms and a given topic. We also make some key points about the information radius, discrimination measures and relatedness measures. An example demonstrates how the relatedness measures can deal with some basic concepts of applications in the context of text information retrieval (IR). From a practical perspective, we summarize important features of, and differences between, the information radius and two other information measures. This study is part of an attempt to establish a theoretical framework, with MDI at its core, for effective estimation of semantic relatedness between terms. Because of its generality, our method can be expected to be useful across a wide range of application areas.
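
The abstract does not give formulas, so the following is only a minimal sketch of the standard two-distribution information radius (the weighted Jensen-Shannon divergence), not the paper's own definitions. The function name `information_radius`, the dictionary representation of term distributions, and the idea of reading each term's summand as a rough "discrimination contribution" are illustrative assumptions.

```python
import math


def information_radius(p, q, w_p=0.5, w_q=0.5):
    """Weighted information radius between two term distributions, in bits.

    p, q: dicts mapping term -> probability (each assumed to sum to 1).
    w_p, w_q: non-negative weights assumed to sum to 1.
    Returns (total, per_term), where per_term maps each term to its summand.
    Treating the summand as a per-term contribution is an illustrative
    assumption, not necessarily the measure defined in the paper.
    """
    per_term = {}
    for term in set(p) | set(q):
        pt = p.get(term, 0.0)
        qt = q.get(term, 0.0)
        mt = w_p * pt + w_q * qt  # probability under the weighted mixture
        contribution = 0.0
        # Terms with zero probability contribute nothing (0 * log 0 := 0).
        if pt > 0:
            contribution += w_p * pt * math.log2(pt / mt)
        if qt > 0:
            contribution += w_q * qt * math.log2(qt / mt)
        per_term[term] = contribution
    return sum(per_term.values()), per_term


# Hypothetical toy distributions: terms in on-topic vs. off-topic documents.
on_topic = {"retrieval": 0.5, "query": 0.3, "the": 0.2}
off_topic = {"protein": 0.6, "cell": 0.2, "the": 0.2}
total, per_term = information_radius(on_topic, off_topic)
```

In this toy setting, a term with the same probability in both distributions (here "the") contributes zero, while terms concentrated in one distribution contribute most of the total. Each summand is non-negative by the log-sum inequality, and with equal weights and base-2 logarithms the total lies between 0 and 1 bit.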