This paper describes our system for detecting hedges and their scope in natural language text, developed for the CoNLL-2010 shared task. We formalize both tasks as sequence labeling problems and solve them with conditional random field (CRF) models. For the first task, we use a greedy forward procedure to select features for the classifier. These features include the part-of-speech tag, word form, lemma, and chunk tag of the tokens in the sentence. For the second task, our system exploits rich syntactic features drawn from dependency structures and phrase structures, which yields better performance than using flat sequence features alone. Our system achieves the third-best score on the biological data set for the first task and an F1 score of 0.5265 for the second task.
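The greedy forward procedure mentioned for the first task can be illustrated with a minimal sketch. The abstract does not state the scoring criterion or the stopping rule, so both are assumptions here: `evaluate` is a hypothetical callback standing in for "train a CRF with these feature templates and measure F1 on development data", and the loop stops when no remaining feature improves the score. The function name and the toy scores are illustrative only and are not taken from the paper.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_forward_selection(
    candidates: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Grow a feature set one feature at a time, always adding the
    candidate that raises the score the most (assumed stopping rule:
    stop as soon as no candidate improves the score)."""
    remaining = list(candidates)
    selected: List[str] = []
    best_score = evaluate(frozenset())
    while remaining:
        # Score every one-feature extension of the current set.
        scored = [(evaluate(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining feature helps
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


def toy_score(features: FrozenSet[str]) -> float:
    """Hypothetical stand-in for a CRF train-and-evaluate run.
    The gains are made-up numbers, not results from the paper."""
    gains = {"word": 0.40, "lemma": 0.10, "pos": 0.05, "chunk": -0.01}
    return sum(gains[f] for f in features)


chosen = greedy_forward_selection(["pos", "word", "lemma", "chunk"], toy_score)
print(chosen)  # ['word', 'lemma', 'pos']
```

In this toy run the feature with the largest gain is picked first, and `chunk` is never added because it lowers the made-up score. In a real setting each call to `evaluate` would involve a full training run, so the number of candidates directly determines the cost of the search.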