Relation extraction from Wikipedia using subtree mining

Authors:
Dat P. T. Nguyen;Yutaka Matsuo;Mitsuru Ishizuka
Affiliations:
University of Tokyo, Bunkyo-ku, Tokyo, Japan;AIST, Tokyo, Japan;University of Tokyo, Bunkyo-ku, Tokyo, Japan
Venue:
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Year:
2007


Abstract

Wikipedia's exponential growth and reliability have made it a promising data source for intelligent systems. The first challenge in using Wikipedia is to make the encyclopedia machine-processable. In this study, we address the problem of extracting relations among entities from Wikipedia's English articles; the extracted relations can in turn help intelligent systems satisfy users' information needs. Our proposed method first anchors the appearances of entities in Wikipedia articles using heuristic rules supported by the articles' encyclopedic style. It therefore uses neither a Named Entity Recognizer (NER) nor a coreference resolution tool, both of which are sources of errors for relation extraction. It then classifies the relationships among entity pairs using an SVM with features extracted from the web structure and subtrees mined from the syntactic structure of the text. The innovations behind our work are the following: a) our method makes use of Wikipedia characteristics for entity allocation and entity classification, which are essential for relation extraction; b) our algorithm extracts a core tree, which accurately reflects the relationship between a given entity pair, and subsequently identifies key features with respect to the relationship from the core tree. We demonstrate the effectiveness of our approach through evaluation on manually annotated data from actual Wikipedia articles.
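The idea of a core tree connecting an entity pair can be illustrated with a minimal sketch. The abstract does not define the core tree or the subtree-mining step, so the code below rests on a simplifying assumption: the core structure is approximated by the dependency path that joins the two entities through their lowest common ancestor. The toy parse, the function names (`path_to_root`, `core_path`, `path_features`), and the feature format are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Set

# Toy dependency parse stored as child -> head (None marks the root).
# The sentence and the parse are invented for illustration only.
HEADS: Dict[str, Optional[str]] = {
    "founded": None,
    "Gates": "founded",
    "Microsoft": "founded",
    "in": "founded",
    "1975": "in",
}


def path_to_root(node: str, heads: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while heads[path[-1]] is not None:
        path.append(heads[path[-1]])
    return path


def core_path(a: str, b: str, heads: Dict[str, Optional[str]]) -> List[str]:
    """Nodes on the tree path a -> lowest common ancestor -> b.

    Assumption: this path stands in for the paper's core tree.
    """
    up_a = path_to_root(a, heads)
    up_b = path_to_root(b, heads)
    ancestors_of_a = set(up_a)
    # First node on b's upward path that also lies on a's upward path.
    lca = next(n for n in up_b if n in ancestors_of_a)
    left = up_a[: up_a.index(lca) + 1]   # a ... lca
    right = up_b[: up_b.index(lca)]      # b ... child of lca
    return left + right[::-1]


def path_features(path: List[str]) -> Set[str]:
    """Binary features: every node and every adjacent pair on the path."""
    feats = {f"node={n}" for n in path}
    feats |= {f"edge={x}>{y}" for x, y in zip(path, path[1:])}
    return feats


pair_path = core_path("Gates", "Microsoft", HEADS)
pair_feats = path_features(pair_path)
```

In a fuller pipeline, feature sets like `pair_feats` would be turned into vectors and passed to an SVM classifier, one example per entity pair. That step, the web-structure features, and the frequent-subtree mining itself are omitted here.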