Automatic acquisition of hyponyms from large text corpora
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Acquisition of categorized named entities for web search
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Fast Random Walk with Restart and Its Applications
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Semantic taxonomy induction from heterogenous evidence
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Proceedings of the 16th international conference on World Wide Web
Weakly-supervised discovery of named entities using web search queries
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Language-Independent Set Expansion of Named Entities Using the Web
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
Iterative Set Expansion of Named Entities Using the Web
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Bootstrapping named entity recognition with automatically generated gazetteer lists
EACL '06 Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Automatic set expansion for list question answering
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Unsupervised named-entity extraction from the Web: An experimental study
Artificial Intelligence
Unsupervised named-entity recognition: generating gazetteers and resolving ambiguity
AI'06 Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence
A web service for automatic word class acquisition
Proceedings of the 3rd International Universal Communication Symposium
The role of queries in ranking labeled instances extracted from text
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Nonlinear evidence fusion and propagation for hyponymy relation mining
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Recovering semantics of tables on the web
Proceedings of the VLDB Endowment
User Behaviors in Related Word Retrieval and New Word Detection: A Collaborative Perspective
ACM Transactions on Asian Language Information Processing (TALIP)
Towards semantic category verification with arbitrary precision
ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
Proceedings of the 20th ACM international conference on Information and knowledge management
WebSets: extracting sets of entities from the web using unsupervised information extraction
Proceedings of the fifth ACM international conference on Web search and data mining
A semi-supervised approach to extracting multiword entity names from user reviews
Proceedings of the 1st Joint International Workshop on Entity-Oriented and Semantic Search
Ensemble semantics for large-scale unsupervised relation extraction
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Collectively representing semi-structured data from the web
AKBC-WEKEX '12 Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction
WebPut: efficient web-based data imputation
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Acquisition of open-domain classes via intersective semantics
Proceedings of the 23rd international conference on World wide web
Hi-index | 0.01 |
An important and well-studied problem is the production of semantic lexicons from a large corpus. In this paper, we present a system named ASIA (Automatic Set Instance Acquirer), which takes in the name of a semantic class as input (e.g., "car makers") and automatically outputs its instances (e.g., "ford", "nissan", "toyota"). ASIA is based on recent advances in web-based set expansion - the problem of finding all instances of a set given a small number of "seed" instances. This approach effectively exploits web resources and can be easily adapted to different languages. In brief, we use language-dependent hyponym patterns to find a noisy set of initial seeds, and then use a state-of-the-art language-independent set expansion system to expand these seeds. The proposed approach matches or outperforms prior systems on several English-language benchmarks. It also shows excellent performance on three dozen additional benchmark problems from English, Chinese and Japanese, thus demonstrating language-independence.