Discovering relations between named entities from a large raw corpus using tree similarity-based clustering

Authors:
Min Zhang;Jian Su;Danmei Wang;Guodong Zhou;Chew Lim Tan
Affiliations:
Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Department of Computer Science, National University of Singapore, Singapore
Venue:
IJCNLP'05: Proceedings of the Second International Joint Conference on Natural Language Processing
Year:
2005

Abstract

We propose a tree-similarity-based unsupervised learning method to extract relations between named entities from a large raw corpus. Our method treats relation extraction as a clustering problem on shallow parse trees. First, we modify previous tree kernels for relation extraction to estimate the similarity between parse trees more efficiently. Then, the similarity between parse trees is used in a hierarchical clustering algorithm to group entity pairs into clusters. Finally, each cluster is labeled with an indicative word, and unreliable clusters are pruned. Evaluation on the New York Times (1995) corpus shows that our method outperforms the only previous work by 5 in F-measure. It also shows that our method performs well on both high-frequency and low-frequency entity pairs. To the best of our knowledge, this is the first work to use a tree similarity metric in relation clustering.
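The three steps in the abstract (tree similarity, hierarchical clustering, cluster labeling with pruning) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the nested-tuple tree encoding, the position-wise node-matching similarity (a simplified stand-in for the modified tree kernel), average-link merging, the `threshold` and `min_size` parameters, and the `E1`/`E2` entity placeholders are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Assumed encoding: a tree is (label, [children]); a leaf is (label, []).

def raw_sim(a, b):
    """Count matching nodes, comparing children position by position."""
    if a[0] != b[0]:
        return 0
    return 1 + sum(raw_sim(x, y) for x, y in zip(a[1], b[1]))

def tree_sim(a, b):
    """Normalise raw_sim to [0, 1]; a tree compared with itself scores 1."""
    return raw_sim(a, b) / sqrt(raw_sim(a, a) * raw_sim(b, b))

def cluster(trees, threshold):
    """Average-link agglomerative clustering over tree indices.

    Merging stops once the best pair of clusters falls below threshold.
    """
    clusters = [[i] for i in range(len(trees))]

    def link(c1, c2):
        total = sum(tree_sim(trees[i], trees[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        if link(clusters[i], clusters[j]) < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe: combinations guarantees i < j
    return clusters

def leaves(tree):
    """Return the leaf labels (the words) of a tree, left to right."""
    if not tree[1]:
        return [tree[0]]
    return [w for child in tree[1] for w in leaves(child)]

def label(ids, trees, placeholders=("E1", "E2")):
    """Pick the most frequent non-placeholder word as the cluster label."""
    counts = Counter(w for i in ids for w in leaves(trees[i])
                     if w not in placeholders)
    return counts.most_common(1)[0][0] if counts else None

def prune(clusters, min_size=2):
    """Drop clusters with too few members to be considered reliable."""
    return [c for c in clusters if len(c) >= min_size]

# Toy shallow parse trees for four entity-pair contexts (invented data).
def np_(e):
    return ("NP", [(e, [])])

trees = [
    ("S", [np_("E1"), ("VP", [("acquired", [])]), np_("E2")]),
    ("S", [np_("E1"), ("VP", [("acquired", []), ("recently", [])]), np_("E2")]),
    ("S", [np_("E1"), ("PP", [("president", []), ("of", [])]), np_("E2")]),
    ("S", [np_("E1"), ("PP", [("president", [])]), np_("E2")]),
]

groups = prune(cluster(trees, threshold=0.8))
labels = [label(g, trees) for g in groups]
print(groups, labels)
```

On this toy input the two "acquired" contexts and the two "president" contexts end up in separate clusters. The quadratic search for the best pair at every merge is chosen for readability only; a real corpus would need a cached similarity matrix or a library clustering routine.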