Concurrent with progress in the biomedical sciences, an overwhelming amount of textual knowledge is accumulating in the biomedical literature. PubMed is the most comprehensive database for collecting and managing biomedical literature. To help researchers easily understand collections of PubMed abstracts, numerous clustering methods have been proposed to group similar abstracts based on their shared features. However, most of these methods do not explore the semantic relationships among groupings of documents, which could help better illuminate the groupings of PubMed abstracts. To address this issue, we propose an ontological clustering method called GOClonto for conceptualizing PubMed abstracts. GOClonto uses latent semantic analysis (LSA) and the Gene Ontology (GO) to identify key gene-related concepts and their relationships, and to allocate PubMed abstracts based on these key gene-related concepts. On two PubMed abstract collections, the experimental results show that GOClonto is able to identify key gene-related concepts and outperforms the STC (suffix tree clustering) algorithm, the Lingo algorithm, the Fuzzy Ants algorithm, and the clustering-based TRS (tolerance rough set) algorithm. Moreover, the two ontologies generated by GOClonto show notably informative conceptual structures.
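To make the LSA-based allocation step more concrete, the following is a minimal sketch and not the GOClonto implementation. It builds a term-document matrix from four toy abstracts, approximates the top two left singular vectors by power iteration, and assigns each abstract to the closest concept label in the latent space. The toy abstracts, the two concept labels (written in the style of GO term names), and every function name are assumptions made for illustration; the sketch does not query the Gene Ontology or reconstruct relationships between concepts.

```python
import math
import random


def term_doc_matrix(docs, vocab):
    # Rows are terms, columns are documents, entries are raw term counts.
    return [[doc.count(term) for doc in docs] for term in vocab]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def top_left_singular_vectors(A, k, iters=500, seed=0):
    # Approximate the top-k left singular vectors of A by power iteration
    # on A * A^T, removing previously found directions (Gram-Schmidt).
    # Assumes k does not exceed the rank of A.
    rng = random.Random(seed)
    n_terms, n_docs = len(A), len(A[0])
    basis = []
    for _ in range(k):
        v = normalize([rng.random() + 0.1 for _ in range(n_terms)])
        for _ in range(iters):
            t = [sum(A[i][j] * v[i] for i in range(n_terms)) for j in range(n_docs)]
            w = [sum(A[i][j] * t[j] for j in range(n_docs)) for i in range(n_terms)]
            for u in basis:
                p = dot(w, u)
                w = [a - p * b for a, b in zip(w, u)]
            v = normalize(w)
        basis.append(v)
    return basis


def project(basis, term_vector):
    # Coordinates of a term-space vector in the k-dimensional latent space.
    return [dot(u, term_vector) for u in basis]


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return 0.0 if na == 0 or nb == 0 else dot(a, b) / (na * nb)


# Toy, already-tokenized abstracts (hypothetical data).
abstracts = [
    ["kinase", "phosphorylation", "signaling", "kinase"],
    ["protein", "kinase", "phosphorylation", "pathway"],
    ["dna", "repair", "damage", "repair"],
    ["dna", "damage", "response", "repair"],
]
# Candidate concept labels (illustrative stand-ins for GO-derived concepts).
concepts = {
    "protein phosphorylation": ["protein", "phosphorylation"],
    "DNA repair": ["dna", "repair"],
}

vocab = sorted({term for doc in abstracts for term in doc})
A = term_doc_matrix(abstracts, vocab)
basis = top_left_singular_vectors(A, k=2)

concept_coords = {
    label: project(basis, [tokens.count(term) for term in vocab])
    for label, tokens in concepts.items()
}

assignments = []
for j in range(len(abstracts)):
    doc_coords = project(basis, [A[i][j] for i in range(len(vocab))])
    best = max(concept_coords, key=lambda c: cosine(doc_coords, concept_coords[c]))
    assignments.append(best)

print(assignments)
```

In practice a library SVD routine would replace the hand-written power iteration; it is used here only to keep the sketch free of external dependencies.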