Over the past decade, natural language processing has been transformed by the adoption of statistical methods. The statistical approach began with shallow problems such as part-of-speech tagging, progressed to syntactic parsing, and is now being applied to higher-level semantic tasks. We present a statistical system for identifying the semantic relationships, or semantic roles, filled by constituents of a sentence. The system operates at the level of frame semantics, which provides an intermediate representation between the detail of complete theories of semantics and simpler domain-specific slot-filler representations. Given an input sentence, the system labels constituents with roles such as SPEAKER, MESSAGE, and TOPIC, identifying participants in various types of actions or states.

The system is based on statistical classifiers that were trained on roughly 50,000 sentences hand-labeled with semantic roles in the FrameNet semantic labeling project. We parsed each training sentence and extracted various lexical and syntactic features, including the syntactic category of the constituent, its grammatical function, and its position in the sentence. These features were combined with knowledge of the target verb, noun, or adjective, as well as information such as the prior probabilities of various combinations of semantic roles. We also used various methods of lexical clustering to generalize across possible fillers of roles. Test sentences were parsed, annotated with these features, and then passed through the classifiers.

Our system achieves 80% accuracy in identifying the semantic role of presegmented constituents. On the harder task of simultaneously segmenting constituents and identifying their semantic roles, the system achieves 65% precision and 61% recall.
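To make the classification step concrete, the following is a minimal sketch of one way such features could drive a role classifier. It is not the system described above. It assumes a simple relative-frequency estimate of a role given (phrase type, position, target word), backing off to (phrase type, position) and then to the role prior when a combination was never seen. The toy training tuples, the feature set, and the names `TRAIN`, `train`, and `predict` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (phrase_type, position_relative_to_target, target_word, role).
# Real training data would come from parsed, hand-labeled sentences.
TRAIN = [
    ("NP", "before", "say", "SPEAKER"),
    ("NP", "before", "say", "SPEAKER"),
    ("SBAR", "after", "say", "MESSAGE"),
    ("PP", "after", "say", "TOPIC"),
    ("NP", "before", "tell", "SPEAKER"),
    ("PP", "after", "tell", "TOPIC"),
    ("SBAR", "after", "tell", "MESSAGE"),
]


def train(examples):
    """Count roles under three conditioning contexts, from most to least specific."""
    full = defaultdict(Counter)     # keyed by (phrase_type, position, target)
    backoff = defaultdict(Counter)  # keyed by (phrase_type, position)
    prior = Counter()               # unconditioned role counts
    for phrase_type, position, target, role in examples:
        full[(phrase_type, position, target)][role] += 1
        backoff[(phrase_type, position)][role] += 1
        prior[role] += 1
    return full, backoff, prior


def predict(model, phrase_type, position, target):
    """Return the most frequent role in the most specific context that has data."""
    full, backoff, prior = model
    candidates = (
        full.get((phrase_type, position, target)),
        backoff.get((phrase_type, position)),
        prior,
    )
    for counts in candidates:
        if counts:
            return counts.most_common(1)[0][0]
    return None


model = train(TRAIN)
print(predict(model, "NP", "before", "say"))         # seen with this target word
print(predict(model, "SBAR", "after", "announce"))   # unseen target: backs off
```

The back-off order here is a design choice for illustration only. It shows why generalizing across targets matters when a given target word has few or no labeled examples.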