Literature discovery can be characterized as a goal-directed search for previously unknown, implicit knowledge captured within a collection of scientific articles. Swanson's serendipitous discovery of dietary fish oil as a treatment for Raynaud's disease, made while browsing Medline, an online collection of biomedical literature, exemplifies such a discovery. A series of experiments studies the impact of stop words, weighting schemes, discovery mechanisms, and contextual reduction on replicating the Raynaud/fish-oil and migraine/magnesium discoveries by operational means. Two aspects of discovery are brought into focus: (i) the discovery of intermediate terms, or B-terms, and (ii) the discovery of indirect A–C connections via the B-terms. A semantic space representation of the underlying corpus is computed, and discoveries are automated by computing associations between words in both higher-dimensional and contextually reduced spaces. It was found that the discovery of B-terms and A–C connections can be achieved to an encouraging degree with a standard stop word list. In addition, no single weighting scheme seems to suffice. Log-likelihood appears potentially effective for discovering B-terms, whereas odds ratio and simple co-occurrence frequencies both facilitate the discovery of A–C connections. With regard to discovery mechanism, both semantic similarity (via cosine) and information flow computation seem promising for computing A–C connections, but more research is needed to understand their relative strengths and weaknesses. Discovery in a contextually reduced semantic space yielded mixed results.
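The idea of an indirect A–C connection mediated by B-terms can be illustrated with a minimal sketch. It is not the experimental system described above. It assumes a tiny hypothetical corpus (not drawn from Medline), document-level co-occurrence counts as the semantic space, and raw frequencies as weights. The names `DOCS`, `cooccurrence_vectors`, `cosine`, and `b_terms` are illustrative only.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy documents; each is a list of already-tokenized terms.
DOCS = [
    ["raynaud", "blood", "viscosity"],
    ["raynaud", "platelet", "aggregation"],
    ["fish_oil", "blood", "viscosity"],
    ["fish_oil", "platelet", "aggregation"],
    ["aspirin", "headache"],
]


def cooccurrence_vectors(docs):
    """Map each term to a Counter of the terms it shares a document with."""
    vectors = defaultdict(Counter)
    for doc in docs:
        for left, right in combinations(sorted(set(doc)), 2):
            vectors[left][right] += 1
            vectors[right][left] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(weight * v[term] for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def b_terms(vectors, a, c):
    """Intermediate terms that co-occur with both the A-term and the C-term."""
    return sorted(set(vectors[a]) & set(vectors[c]))


VECTORS = cooccurrence_vectors(DOCS)

# "raynaud" (A) and "fish_oil" (C) never share a document in this toy corpus,
# yet their context vectors overlap through the shared B-terms.
print(VECTORS["raynaud"]["fish_oil"])
print(b_terms(VECTORS, "raynaud", "fish_oil"))
print(cosine(VECTORS["raynaud"], VECTORS["fish_oil"]))
```

In this sketch the A- and C-terms have no direct co-occurrence, and the cosine of their context vectors is the only signal linking them. Replacing the raw counts with log-likelihood or odds-ratio weights, the schemes compared in the abstract, would change only how the vector entries are computed.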