Most pharmacogenomics (PGx) knowledge is contained in the text of published studies and is therefore not available for automated computation. Natural Language Processing (NLP) techniques for extracting relationships in specific domains often rely on hand-built rules and domain-specific ontologies to achieve good performance. In a new and evolving field such as PGx, these rules and ontologies may not yet exist. Recent progress in syntactic parsing, applied to a large corpus of pharmacogenomics text, provides new opportunities for automated relationship extraction. We describe an ontology of PGx relationships built from a lexicon of key pharmacogenomic entities and a syntactic parse of more than 87 million sentences from 17 million MEDLINE abstracts. We used the syntactic structure of PGx statements to systematically extract commonly occurring relationships and to map them to a common schema. The extracted relationships achieve a precision of 70-87.7% and involve not only key PGx entities such as genes, drugs, and phenotypes (e.g., VKORC1, warfarin, clotting disorder), but also critical entities formed when these key entities modify another term (e.g., VKORC1 polymorphism, warfarin response, clotting disorder treatment). The result of our analysis is a network of 40,000 relationships between more than 200 entity types with clear semantics. This network is used to guide the curation of PGx knowledge and to provide a computable resource for knowledge discovery.
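The idea of typing entities with a lexicon and mapping extracted relationships to a common schema can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that subject-verb-object triples have already been produced by a dependency parser, and the lexicon, the verb-to-relation table, and the function names (`entity_type`, `count_relations`) are hypothetical toy stand-ins.

```python
from collections import Counter

# Hypothetical toy lexicon: key PGx entity -> entity class.
LEXICON = {
    "VKORC1": "Gene",
    "warfarin": "Drug",
    "clotting disorder": "Phenotype",
}

# Hypothetical common schema: verb lemma -> normalized relation name.
RELATION_SCHEMA = {
    "affect": "affects",
    "influence": "affects",
    "alter": "affects",
    "treat": "treats",
}


def entity_type(phrase):
    """Type a noun phrase using the lexicon.

    An exact match yields the entity class ("warfarin" -> "Drug").
    A lexicon entity followed by further words yields a composite type
    ("warfarin response" -> "Drug response"). Unknown phrases yield None.
    """
    # Try longer lexicon entries first so multi-word entities win.
    for name in sorted(LEXICON, key=len, reverse=True):
        if phrase == name:
            return LEXICON[name]
        if phrase.startswith(name + " "):
            return f"{LEXICON[name]} {phrase[len(name) + 1:]}"
    return None


def count_relations(triples):
    """Map (subject, verb lemma, object) triples to typed relations and count them."""
    counts = Counter()
    for subj, verb, obj in triples:
        s_type, o_type = entity_type(subj), entity_type(obj)
        relation = RELATION_SCHEMA.get(verb)
        # Keep only triples whose parts are all covered by lexicon and schema.
        if s_type and o_type and relation:
            counts[(s_type, relation, o_type)] += 1
    return counts


# Assumed parser output for a few sentences (illustrative only).
triples = [
    ("VKORC1 polymorphism", "influence", "warfarin response"),
    ("VKORC1 polymorphism", "affect", "warfarin response"),
    ("warfarin", "treat", "clotting disorder"),
    ("aspirin", "treat", "headache"),  # not in the toy lexicon, so dropped
]
relation_counts = count_relations(triples)
```

In this sketch, different surface verbs ("influence", "affect") collapse into one normalized relation between typed entities, which is the kind of aggregation that would let frequently occurring relationships stand out across a large corpus.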