Identifying semantic relations in text

Authors:
Daniel Gildea;Daniel Jurafsky
Affiliations:
University of California, Berkeley, and International Computer Science Institute;University of Colorado, Boulder
Venue:
Exploring artificial intelligence in the new millennium
Year:
2003

Abstract

Over the past decade, natural language processing has been transformed by the adoption of statistical methods. The statistical approach began with shallow problems such as part-of-speech tagging, progressed to syntactic parsing, and is now being applied to higher-level semantic tasks. We present a statistical system for identifying the semantic relationships, or semantic roles, filled by constituents of a sentence. The system operates at the level of frame semantics, which provides an intermediate representation between the detail of complete theories of semantics and simpler domain-specific slot-filler representations. Given an input sentence, the system labels constituents with roles such as SPEAKER, MESSAGE, and TOPIC, identifying participants in various types of actions or states.

The system is based on statistical classifiers that were trained on roughly 50,000 sentences hand-labeled with semantic roles in the FrameNet semantic labeling project. We parsed each training sentence and extracted various lexical and syntactic features, including the syntactic category of the constituent, its grammatical function, and its position in the sentence. These features were combined with knowledge of the target verb, noun, or adjective, as well as information such as the prior probabilities of various combinations of semantic roles. We also used various methods of lexical clustering to generalize across possible fillers of roles. Test sentences were parsed, annotated with these features, and then passed through the classifiers.

Our system achieves 80% accuracy in identifying the semantic role of presegmented constituents. At the harder task of simultaneously segmenting constituents and identifying their semantic role, the system achieves 65% precision and 61% recall.
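
As a rough illustration of the kind of pipeline the abstract describes, the sketch below estimates the most likely role for a constituent from relative frequencies over a few features, and scores labeled spans with precision and recall. It is not the authors' implementation: the toy training tuples, the choice of features (phrase type, position, target word), the fallback to target-only counts, and all function names are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (phrase_type, position, target_word, role).
# Roles are taken from the abstract; the tuples themselves are invented.
TRAIN = [
    ("NP", "before", "say", "SPEAKER"),
    ("NP", "before", "say", "SPEAKER"),
    ("SBAR", "after", "say", "MESSAGE"),
    ("PP", "after", "say", "TOPIC"),
    ("NP", "before", "tell", "SPEAKER"),
    ("PP", "after", "tell", "TOPIC"),
]


def train(examples):
    """Count roles per full feature tuple and per target word alone."""
    full = defaultdict(Counter)
    by_target = defaultdict(Counter)
    for phrase_type, position, target, role in examples:
        full[(phrase_type, position, target)][role] += 1
        by_target[target][role] += 1
    return full, by_target


def predict(model, phrase_type, position, target):
    """Return the most frequent role for the features.

    Falls back to target-only counts when the full feature tuple is unseen
    (an assumed fallback for this sketch), and to None for unseen targets.
    """
    full, by_target = model
    counts = full.get((phrase_type, position, target)) or by_target.get(target)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def precision_recall(predicted, gold):
    """Score a set of (start, end, role) spans against the gold set."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


model = train(TRAIN)
```

Calling `predict(model, "SBAR", "after", "say")` looks up the full feature tuple, while an unseen tuple such as `("ADVP", "after", "say")` uses the counts for `say` alone. In `precision_recall`, a span counts as correct only when both its boundaries and its role match, which mirrors the harder joint segmentation and labeling task mentioned in the abstract.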