The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a community-wide annotation effort on a subset of the American National Corpus. The MASC infrastructure enables contributed annotations to be incorporated into a single, usable format that can then be analyzed as is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, community-based effort to create much-needed language resources for NLP. This paper describes the MASC project, its corpus, and its annotations, and serves as a call for contributions of data and annotations from the language processing community.