Motivation: An enormous number of protein-protein interaction relationships are buried in millions of research articles published over the years, and the number is growing. Rediscovering them automatically is a challenging bioinformatics task. Solutions to this problem also reach far beyond bioinformatics.
Results: We study a new approach that involves automatically discovering English expression patterns, optimizing them and using them to extract protein-protein interactions. In a sister paper, we described how to generate English expression patterns related to protein-protein interactions, and this approach alone has already achieved precision and recall rates significantly higher than those of other automatic systems. This paper continues to present our theory, focusing on how to improve the patterns. A minimum description length (MDL)-based pattern-optimization algorithm is designed to reduce and merge patterns. This has significantly increased generalization power, and hence the recall and precision rates, as confirmed by our experiments.
Availability: http://spies.cs.tsinghua.edu.cn
Contact: zxy-dcs@tsinghua.edu.cn
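The abstract does not spell out the MDL-based optimization itself, so the following is only a toy sketch of the general idea of merging patterns when doing so shortens a two-part description length. Everything in it is an assumption made for illustration and not the authors' algorithm: the pattern representation (fixed-length token tuples), the `*` wildcard, the crude bit-cost model, and the names `description_length`, `merge` and `optimize`.

```python
import math
from itertools import combinations

WILDCARD = "*"  # hypothetical generalization symbol, not taken from the paper


def matches(pattern, sentence):
    """A pattern matches a token sequence of equal length, position by position."""
    return len(pattern) == len(sentence) and all(
        p == WILDCARD or p == s for p, s in zip(pattern, sentence)
    )


def description_length(patterns, sentences, vocab_size):
    """Toy two-part code length in bits: L(patterns) + L(sentences | patterns)."""
    bits_per_token = math.log2(vocab_size + 1)  # +1 for the wildcard symbol
    model_bits = sum(len(p) for p in patterns) * bits_per_token
    data_bits = 0.0
    for s in sentences:
        covering = [p for p in patterns if matches(p, s)]
        if covering:
            # Name the most specific covering pattern, then spell out its wildcard slots.
            best = min(covering, key=lambda p: p.count(WILDCARD))
            data_bits += math.log2(len(patterns))
            data_bits += best.count(WILDCARD) * bits_per_token
        else:
            # Uncovered sentences are encoded token by token.
            data_bits += len(s) * bits_per_token
    return model_bits + data_bits


def merge(p, q):
    """Generalize two equal-length patterns by wildcarding the positions that differ."""
    if len(p) != len(q):
        return None
    return tuple(a if a == b else WILDCARD for a, b in zip(p, q))


def optimize(patterns, sentences, vocab_size):
    """Greedily accept any pairwise merge that lowers the total description length."""
    patterns = set(patterns)
    current = description_length(patterns, sentences, vocab_size)
    improved = True
    while improved:
        improved = False
        for p, q in combinations(sorted(patterns), 2):
            merged = merge(p, q)
            if merged is None:
                continue
            candidate = (patterns - {p, q}) | {merged}
            cost = description_length(candidate, sentences, vocab_size)
            if cost < current:
                patterns, current, improved = candidate, cost, True
                break
    return patterns, current


if __name__ == "__main__":
    examples = [
        ("PROT", "binds", "PROT"),
        ("PROT", "activates", "PROT"),
        ("PROT", "inhibits", "PROT"),
    ]
    # Start with one pattern per example sentence, then let the merges run.
    final_patterns, final_bits = optimize(examples, examples, vocab_size=4)
    print(final_patterns, round(final_bits, 2))
```

In this toy setting the three verb-specific patterns collapse into a single `("PROT", "*", "PROT")` pattern, because storing one general pattern plus the verb for each sentence costs fewer bits than storing three separate patterns. A realistic cost model would also have to penalize over-general patterns that match non-interaction sentences, which this sketch does not attempt.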