In this paper, we describe the Baseball Announcers' Language Linked with General Annotation of Meaningful Events (BALLGAME) project, a text corpus for research in computational semantics. We collected pitch-by-pitch event data for a sample of baseball games and used this data to build an annotated corpus composed of transcripts of radio broadcasts of those games. Our annotation links text from the broadcast to events in a formal representation of the semantics of the baseball game. We describe our corpus model and the annotation tool used to create the corpus, and conclude by discussing applications of this corpus in semantics research and natural language processing.