This paper introduces a state-of-the-art, linguistically motivated statistical parser to the biomedical text mining community, and proposes a method of adapting it to the biomedical domain requiring only limited resources for data annotation. The parser was originally developed using the Penn Treebank and is therefore tuned to newspaper text. Our approach takes advantage of a lexicalized grammar formalism, Combinatory Categorial Grammar (CCG), to train the parser at a lower level of representation than full syntactic derivations. The CCG parser uses three levels of representation: a first level consisting of part-of-speech (POS) tags; a second level consisting of more fine-grained CCG lexical categories; and a third, hierarchical level consisting of CCG derivations. We find that simply retraining the POS tagger on biomedical data leads to a large improvement in parsing performance, and that using annotated data at the intermediate lexical category level of representation improves parsing accuracy further. We describe the procedure involved in evaluating the parser, and obtain accuracies for biomedical data in the same range as those reported for newspaper text, and higher than those previously reported for the biomedical resource on which we evaluate. Our conclusion is that porting newspaper parsers to the biomedical domain, at least for parsers which use lexicalized grammars, may not be as difficult as first thought.
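The three levels of representation can be illustrated with a minimal sketch. It is not the parser's implementation: the example sentence, the tag and category assignments, and the names `Atom`, `Functor`, `forward_apply` and `backward_apply` are assumptions made for illustration, and only the two basic CCG application rules are shown.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic CCG category such as S, NP or N."""
    name: str


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg (forward) or result\\arg (backward)."""
    result: "Category"
    slash: str  # "/" looks right for its argument, "\\" looks left
    arg: "Category"


Category = Union[Atom, Functor]


def forward_apply(left: Category, right: Category) -> Optional[Category]:
    """Forward application: X/Y  Y  =>  X."""
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left: Category, right: Category) -> Optional[Category]:
    """Backward application: Y  X\\Y  =>  X."""
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


S, NP, N = Atom("S"), Atom("NP"), Atom("N")

# Illustrative sentence (an assumption, not taken from the paper's data).
words = ["the", "protein", "binds", "receptors"]

# Level 1: part-of-speech tags (Penn Treebank style).
pos_tags = ["DT", "NN", "VBZ", "NNS"]

# Level 2: CCG lexical categories, one per word.
# Simplification: the object noun is given NP directly.
lexical_categories = [
    Functor(NP, "/", N),                     # the: NP/N
    N,                                       # protein: N
    Functor(Functor(S, "\\", NP), "/", NP),  # binds: (S\NP)/NP
    NP,                                      # receptors: NP
]

# Level 3: a derivation that combines the lexical categories.
subject = forward_apply(lexical_categories[0], lexical_categories[1])      # NP
verb_phrase = forward_apply(lexical_categories[2], lexical_categories[3])  # S\NP
sentence = backward_apply(subject, verb_phrase)                            # S
```

In this sketch, annotating data at the first or second level means supplying only `pos_tags` or `lexical_categories` for a sentence, which is less work than writing out the full derivation at the third level.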