Ontology Based Clustering for Improving Genomic IR

Authors:
Jian Wen;Zhoujun Li;Xiaohua Hu
Affiliations:
National University of Defence Technology, China;Beihang University, China;Drexel University, USA
Venue:
CBMS '07 Proceedings of the Twentieth IEEE International Symposium on Computer-Based Medical Systems
Year:
2007

Citing 0
Cited 2

Exploiting noun phrases and semantic relationships for text document clustering
Information Sciences: an International Journal

A Semi-supervised Topic-Driven Approach for Clustering Textual Answers to Survey Questions
ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications

Abstract

Recent work has shown that ontologies can improve the performance of information retrieval, especially for biomedical literature. Ontology-based methods can address the synonymy problem. In this paper, we propose a new framework for genomic information retrieval based on UMLS. In our framework, genomic information retrieval consists of three steps. First, documents are indexed using UMLS, so that each document is represented by concepts, and each concept weight is recalculated using the similarity between concepts. Second, documents are clustered using the fuzzy c-means method. Finally, a cluster language model is used for retrieval. Our method partly solves the synonymy and polysemy problems. The new method is evaluated on the TREC 2004/05 Genomics Track collections. Experiments show that the new method greatly improves retrieval performance compared with the basic language model.
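
The abstract names fuzzy c-means as the clustering step but gives no further detail. The following is a minimal sketch of the standard fuzzy c-means algorithm, assuming each document is a row of concept weights. The function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the Euclidean distance, and the toy matrix are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means on the rows of X (hypothetical helper).

    X: array of shape (n_docs, n_concepts) holding concept weights.
    Returns (centers, U), where U[i, j] is the membership of
    document i in cluster j and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_docs = X.shape[0]

    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((n_docs, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centers: means of the documents weighted by U**m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every document to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero

        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U


# Toy concept-weight matrix (assumed data): documents 0-1 share one
# set of concepts, documents 2-3 share another.
X = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment, for inspection only
```

Because memberships are soft, a document can belong to several clusters at once. The cluster language model step would presumably consume these memberships, but the abstract does not say how.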