We study the problem of automatically extracting information networks, formed by recognizable entities and the relations among them, from social media sites. Our approach uses state-of-the-art natural language processing tools to identify entities and to extract the sentences that relate them, and then applies text-clustering algorithms to identify the relations within the information network. We propose a new term-weighting scheme that significantly improves on the state of the art in relation extraction, both when used in conjunction with the standard tf·idf scheme and when used as a pruning filter. We describe an effective method for identifying benchmarks for open information extraction; it relies on a curated online database and yields a benchmark comparable to the hand-crafted evaluation datasets in the literature. From this benchmark, we derive a much larger dataset that mimics realistic conditions for open information extraction. We report on extensive experiments on both datasets, which shed light not only on the accuracy achieved by state-of-the-art open information extraction tools but also on how to tune such tools for better results.
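To make the clustering step concrete, here is a minimal sketch of the standard tf·idf weighting mentioned above, applied to the words that connect pairs of entities. It does not implement the proposed term-weighting scheme, which the abstract does not define. The example contexts and the function names `tfidf_vectors` and `cosine` are illustrative assumptions only.

```python
import math
from collections import Counter


def tfidf_vectors(contexts):
    """Build plain tf-idf vectors, one per entity-pair context string."""
    docs = [Counter(text.lower().split()) for text in contexts]
    n = len(docs)
    # Document frequency: number of contexts in which each term occurs.
    df = Counter(term for doc in docs for term in doc)
    # idf = log(n / df); a term present in every context gets weight 0.
    return [
        {term: tf * math.log(n / df[term]) for term, tf in doc.items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical contexts: the words found between two recognized entities.
contexts = [
    "was acquired by",
    "has been acquired by",
    "was born in",
]

vectors = tfidf_vectors(contexts)
sim_acquisition = cosine(vectors[0], vectors[1])  # two acquisition contexts
sim_mixed = cosine(vectors[0], vectors[2])        # acquisition vs. birthplace
print(sim_acquisition > sim_mixed)
```

A clustering algorithm could then group entity pairs whose context vectors are similar, so that each cluster corresponds to a candidate relation. Which clustering algorithm and similarity threshold to use are left open here.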