In this paper we discuss text data mining (TDM), mainly in the context of the biomedical domain, where we extract associations from MEDLINE articles and construct association graphs. We explore two techniques: the co-occurrence method and the transitive method. We propose a novel transitive method for finding associations that does not rely on metadata, and compare its results with those of a known transitive method that uses metadata in the text to find a link or relationship between objects of interest. The transitive methods do not require the terms (objects) to co-occur in order to determine that they are associated. The results show that our proposed method is as accurate as the known method that uses metadata. This, in turn, implies that relationships can be discovered even when metadata is unavailable or incomplete. A case study of a transitive association between a pair of genes (BRCA1-STAT1) illustrates the hypothesis-generating ability of our method. Based on the results, we conclude that our method can be used effectively for association extraction and for hypothesis generation, and that the generated hypotheses can later be validated through biological experiments.
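The distinction between co-occurrence and transitive association can be illustrated with a small sketch. This is not the method proposed in the paper, whose details the abstract does not give; it is a generic illustration that assumes plain substring matching of terms and a single intermediate term linking two objects. The function names `cooccurrence_graph` and `transitive_candidates`, the placeholder terms `GeneA`, `GeneB`, `GeneC`, and the toy sentences are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(docs, terms):
    """Build an undirected graph whose edge weight is the number of
    documents in which both terms appear (assumed: substring matching)."""
    graph = defaultdict(dict)
    for text in docs:
        lowered = text.lower()
        present = sorted(t for t in terms if t.lower() in lowered)
        for a, b in combinations(present, 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def transitive_candidates(graph, source):
    """Return terms that never co-occur with `source` but share at least
    one intermediate neighbour with it, mapped to those intermediates."""
    direct = set(graph.get(source, {}))
    candidates = defaultdict(set)
    for mid in direct:
        for target in graph.get(mid, {}):
            if target != source and target not in direct:
                candidates[target].add(mid)
    return {target: sorted(mids) for target, mids in candidates.items()}


# Toy corpus with placeholder names; not real MEDLINE data.
docs = [
    "GeneA interacts with GeneB.",
    "GeneB regulates GeneC.",
    "GeneA and GeneB are expressed together.",
]
terms = ["GeneA", "GeneB", "GeneC"]

graph = cooccurrence_graph(docs, terms)
hypotheses = transitive_candidates(graph, "GeneA")
print(hypotheses)
```

In this toy corpus `GeneA` and `GeneC` never appear in the same document, yet both co-occur with `GeneB`, so the pair is reported as a candidate association to be checked further. A realistic pipeline would need proper term recognition and some scoring of the intermediate links, both of which are left out here.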