Reflective Random Indexing and indirect inference: A scalable method for discovery of implicit connections

Authors:
Trevor Cohen; Roger Schvaneveldt; Dominic Widdows
Affiliations:
Center for Cognitive Informatics and Decision Making, School of Health Information Sciences, University of Texas, Houston, USA; Applied Psychology Unit, Arizona State University, Arizona, USA; Google Inc., USA
Venue:
Journal of Biomedical Informatics
Year:
2010

Abstract

The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have previously been evaluated as methods for discovering such implicit connections. However, LSA in particular depends on a computationally demanding method of dimension reduction to obtain meaningful indirect inference, which limits its ability to scale to large text corpora. In this paper, we evaluate the ability of Random Indexing (RI), a scalable distributional model of word associations, to draw meaningful implicit relationships between terms in general and biomedical language. Proponents of this method have achieved performance comparable to LSA on several cognitive tasks while using a simpler and less computationally demanding method of dimension reduction. We demonstrate that the original implementation of RI is ineffective at inferring meaningful indirect connections, and we evaluate Reflective Random Indexing (RRI), an iterative variant of the method that is better able to perform indirect inference. RRI is shown to yield more clearly related indirect connections and to outperform existing RI implementations in predicting future direct co-occurrence in the MEDLINE corpus.
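
The contrast the abstract draws between RI and RRI can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the function names, the vector dimension, the number of nonzero entries, and the single reflection cycle are all assumptions chosen for illustration, and the sketch omits term weighting and vector normalization. It assumes a document-based variant in which documents receive sparse random index vectors, term vectors are sums of the vectors of the documents containing them, and one reflective cycle rebuilds document vectors from term vectors and then term vectors from those document vectors.

```python
import math
import random


def index_vector(dim, nonzero, rng):
    """Sparse ternary random vector: `nonzero` entries set to +1 or -1."""
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def add_into(target, source):
    """Add `source` into `target` element-wise, in place."""
    for i, value in enumerate(source):
        target[i] += value


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def term_vectors_from_docs(docs, doc_vecs, dim):
    """Each term vector is the sum of the vectors of documents containing it."""
    terms = {}
    for doc, dvec in zip(docs, doc_vecs):
        for term in doc:
            add_into(terms.setdefault(term, [0.0] * dim), dvec)
    return terms


def doc_vectors_from_terms(docs, term_vecs, dim):
    """Each document vector is the sum of the vectors of its terms."""
    doc_vecs = []
    for doc in docs:
        dvec = [0.0] * dim
        for term in doc:
            add_into(dvec, term_vecs[term])
        doc_vecs.append(dvec)
    return doc_vecs


def random_indexing(docs, dim=1000, nonzero=10, seed=0):
    """Basic RI (assumed document-based): random doc vectors -> term vectors."""
    rng = random.Random(seed)
    doc_vecs = [index_vector(dim, nonzero, rng) for _ in docs]
    return term_vectors_from_docs(docs, doc_vecs, dim)


def reflective_random_indexing(docs, dim=1000, nonzero=10, seed=0, cycles=1):
    """Start from basic RI, then alternate doc and term rebuilding steps."""
    term_vecs = random_indexing(docs, dim, nonzero, seed)
    for _ in range(cycles):
        doc_vecs = doc_vectors_from_terms(docs, term_vecs, dim)
        term_vecs = term_vectors_from_docs(docs, doc_vecs, dim)
    return term_vecs


# Hypothetical toy corpus: "raynaud" and "fish_oil" never share a document,
# but both co-occur with the same two bridging terms.
docs = [
    ["raynaud", "blood_viscosity"],
    ["raynaud", "platelet_aggregation"],
    ["fish_oil", "blood_viscosity"],
    ["fish_oil", "platelet_aggregation"],
    ["aspirin", "headache"],
]

ri = random_indexing(docs)
rri = reflective_random_indexing(docs)
ri_sim = cosine(ri["raynaud"], ri["fish_oil"])
rri_sim = cosine(rri["raynaud"], rri["fish_oil"])
print("RI  cosine(raynaud, fish_oil):", round(ri_sim, 3))
print("RRI cosine(raynaud, fish_oil):", round(rri_sim, 3))
```

In this toy setting the two terms share no documents, so their basic RI vectors overlap only by chance and their cosine stays near zero. After one reflective cycle each term vector also carries the index vectors of documents reached through the bridging terms; if the index vectors were exactly orthogonal with equal norms, the reflected cosine would work out to 12/20 = 0.6. Results on a real corpus would depend on weighting, normalization, and the number of cycles, none of which are modeled here.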