Set expansion refers to expanding a partial set of "seed" objects into a more complete set. One system that performs set expansion is SEAL (Set Expander for Any Language), which expands entities automatically by utilizing resources from the Web in a language-independent fashion. In a previous study, SEAL showed good set expansion performance using three seed entities; however, when given a larger set of seeds (e.g., ten), SEAL's expansion method performs poorly. In this paper, we present Iterative SEAL (iSEAL), which allows a user to provide many seeds. Briefly, iSEAL makes several calls to SEAL, each call using a small number of seeds. We also show that iSEAL can be used in a "bootstrapping" manner, where each call to SEAL uses a mixture of user-provided and self-generated seeds. We show that the bootstrapping version of iSEAL obtains better results than SEAL even when using fewer user-provided seeds. In addition, we compare the performance of various ranking algorithms used in iSEAL, and show that the choice of ranking method has a small effect on performance when all seeds are user-provided, but a large effect when iSEAL is bootstrapped. In particular, we show that Random Walk with Restart is nearly as good as Bayesian Sets with user-provided seeds, and performs best with bootstrapped seeds.
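The iterative scheme described above (several expander calls, each on a small number of seeds, optionally mixing in self-generated seeds) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `iterative_expand`, `toy_expand`, and `TOY_LISTS` are hypothetical names, the expander is a toy stand-in for SEAL, and summing scores across calls is a simplification that replaces the ranking algorithms compared in the paper (such as Random Walk with Restart or Bayesian Sets).

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical expander interface: a few seeds in, {candidate: score} out.
Expander = Callable[[Sequence[str]], Dict[str, float]]

# Toy stand-in for web pages containing lists (an assumption, not real data).
TOY_LISTS = [
    ["ford", "toyota", "nissan", "honda"],
    ["toyota", "honda", "bmw"],
    ["ford", "apple", "pear"],
]


def toy_expand(seeds: Sequence[str]) -> Dict[str, float]:
    """Score every item of each toy list that contains all of the seeds."""
    scores: Dict[str, float] = defaultdict(float)
    for items in TOY_LISTS:
        if all(seed in items for seed in seeds):
            for item in items:
                scores[item] += 1.0
    return dict(scores)


def iterative_expand(
    user_seeds: Sequence[str],
    expand: Expander,
    seeds_per_call: int = 2,
    iterations: int = 5,
    bootstrap: bool = False,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Call the expander several times on small seed subsets and merge results.

    Scores are simply summed across calls (an assumed, simplified ranking).
    """
    rng = rng or random.Random(0)
    pool = list(user_seeds)
    totals: Dict[str, float] = defaultdict(float)
    for i in range(iterations):
        if bootstrap and i > 0 and totals:
            # Mix one user-provided seed with the best self-generated candidates.
            ranked = sorted(totals, key=totals.get, reverse=True)
            generated = [c for c in ranked if c not in pool][: seeds_per_call - 1]
            seeds = [rng.choice(pool)] + generated
        else:
            # Supervised variant: every seed comes from the user.
            seeds = rng.sample(pool, min(seeds_per_call, len(pool)))
        for candidate, score in expand(seeds).items():
            totals[candidate] += score
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(totals, key=lambda c: (-totals[c], c))


if __name__ == "__main__":
    print(iterative_expand(["ford", "toyota"], toy_expand, iterations=3))
    print(iterative_expand(["toyota"], toy_expand, iterations=2, bootstrap=True))
```

Keeping each call small mirrors the observation that the expander degrades when given many seeds at once; the `bootstrap` flag switches between the all-user-seed variant and the mixed variant.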