Protein name extraction, one of the basic tasks in the automatic extraction of information from biological texts, remains challenging. In this paper, we explore two machine learning techniques for this task and report experimental results. The first method uses a bigram language model to extract protein names. The second uses an automatic rule learning method to identify protein names in biological texts. In both cases, we generalize protein names using hierarchically categorized syntactic token types. We evaluated both methods on two datasets. The bigram language model achieved an F-score of 67.7% on the YAPEX dataset and 66.8% on the GENIA corpus. The rule learning method achieved an F-score of 61.8% on the YAPEX dataset and 61.0% on the GENIA corpus. These comparative experiments show that both techniques are applicable to automatic protein name extraction, a prerequisite for large-scale processing of biomedical literature.
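To make the idea of generalizing names by token type concrete, the following is a minimal sketch of a bigram model over token-type sequences. It is not the method evaluated in the paper. The token-type categories, the add-one smoothing, the two-model comparison, and the tiny training lists are all assumptions chosen for illustration, and the names `token_type`, `BigramModel`, and `looks_like_protein` are hypothetical.

```python
import math
from collections import Counter

# Assumed flat set of token types; the paper's actual hierarchy is not reproduced here.
TOKEN_TYPES = ["NUMBER", "ALPHANUM", "ALL_CAPS", "LOWER",
               "CAPITALIZED", "MIXED_CASE", "OTHER"]
# Possible next symbols: any token type or the end-of-sequence marker.
VOCAB_SIZE = len(TOKEN_TYPES) + 1


def token_type(token):
    """Map a surface token to a coarse syntactic token type (illustrative rules)."""
    if token.isdigit():
        return "NUMBER"
    if token.isalnum() and any(c.isdigit() for c in token):
        return "ALPHANUM"
    if token.isalpha() and token.isupper():
        return "ALL_CAPS"
    if token.isalpha() and token.islower():
        return "LOWER"
    if token.isalpha() and token.istitle():
        return "CAPITALIZED"
    if token.isalpha():
        return "MIXED_CASE"
    return "OTHER"


class BigramModel:
    """Bigram model over token-type sequences with add-one smoothing."""

    def __init__(self, phrases):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        for phrase in phrases:
            padded = ["<s>"] + [token_type(t) for t in phrase] + ["</s>"]
            for prev, cur in zip(padded, padded[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def log_prob(self, phrase):
        """Smoothed log-probability of the phrase's token-type sequence."""
        padded = ["<s>"] + [token_type(t) for t in phrase] + ["</s>"]
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + VOCAB_SIZE
            total += math.log(numerator / denominator)
        return total


# Toy training phrases, invented for this sketch (not drawn from YAPEX or GENIA).
protein_phrases = [p.split() for p in
                   ["interleukin 2", "p53", "NF kappa B", "protein kinase C"]]
other_phrases = [p.split() for p in
                 ["the cells were", "in this study", "was observed in"]]

protein_model = BigramModel(protein_phrases)
other_model = BigramModel(other_phrases)


def looks_like_protein(phrase):
    """Assumed decision rule: choose whichever model scores the phrase higher."""
    return protein_model.log_prob(phrase) > other_model.log_prob(phrase)
```

Because the model sees only token types, a candidate such as `["cyclin", "D1"]` can be scored even though neither word occurs in the toy training phrases. This illustrates why generalizing over token types helps with previously unseen names.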