Combining relations for information extraction from free text

Authors:
Mstislav Maslennikov;Tat-Seng Chua
Affiliations:
National University of Singapore, Singapore;National University of Singapore, Singapore
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2010

Citing 30
Cited 0

Attention, intentions, and the structure of discourse

Computational Linguistics
Learning Information Extraction Rules for Semi-Structured and Free Text

Machine Learning - Special issue on natural language learning
Learning dictionaries for information extraction by multi-level bootstrapping

AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Snowball: extracting relations from large plain-text collections

DL '00 Proceedings of the fifth ACM conference on Digital libraries
First steps in building a model for the retrieval of court decisions

International Journal of Human-Computer Studies
Partially Supervised Classification of Text Documents

ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Extracting Patterns and Relations from the World Wide Web

WebDB '98 Selected papers from the International Workshop on The World Wide Web and Databases
A maximum entropy approach to information extraction from semi-structured and free text

Eighteenth national conference on Artificial intelligence
A Global Rule Induction Approach to Information Extraction

ICTAI '03 Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence
The syntax-discourse interface: effects of the main-subordinate distinction on attention structure

The syntax-discourse interface: effects of the main-subordinate distinction on attention structure
Unsupervised learning of soft patterns for generating definitions from online news

Proceedings of the 13th international conference on World Wide Web
Probabilistic reasoning for entity & relation recognition

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Unsupervised learning of generalized names

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Sentence level discourse parsing using syntactic and lexical information

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Using predicate-argument structures for information extraction

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
A bootstrapping approach to named entity classification using successive learners

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Counter-training in discovery of semantic patterns

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Question answering passage retrieval using dependency relations

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Anaphora and Discourse Structure

Computational Linguistics
A bootstrapping method for learning semantic lexicons using extraction pattern contexts

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Ontology-driven discourse analysis for information extraction

Data & Knowledge Engineering - Special issue: Natural language and database and information systems: NLDB 03
Dependency tree kernels for relation extraction

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Modeling commonality among related classes in relation extraction

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A composite kernel to extract relations between entities with both flat and structured features

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Cascading use of soft and hard matching pattern rules for weakly supervised information extraction

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
ARE: instance splitting strategies for dependency relation-based information extraction

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Exploiting subjectivity classification to improve information extraction

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
Adaptive information extraction from text by rule induction and generalisation

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Unsupervised named-entity extraction from the Web: An experimental study

Artificial Intelligence
Automatically generating extraction patterns from untagged text

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2

Quantified Score

Hi-index	0.00

Visualization

Abstract

Relations between entities of the same semantic type tend to be sparse in free texts. Therefore, combining relations is the key to effective information extraction (IE) on free text datasets with a small set of training samples. Previous approaches to bootstrapping for IE used different types of relations, such as dependency or co-occurrence, and faced the problems of paraphrasing and misalignment of instances. To cope with these problems, we propose a framework that integrates several types of relations. After extracting candidate entities, our framework evaluates relations between them at the phrasal, dependency, semantic frame, and discourse levels. For each of these levels, we build a classifier that outputs a score for relation instances. In order to integrate these scores, we propose three strategies: (1) integrate evaluation scores from each relation classifier; (2) incorporate the elimination of negatively labeled instances in a previous strategy; and (3) add cascading of extracted relations into strategy (2). Our framework improves the state-of-art results for supervised systems by 8%, 15%, 3%, and 5% on MUC4 (terrorism); MUC6 (management succession); ACE RDC 2003 (news, general types); and ACE RDC 2003 (news, specific types) domains respectively.