The recognition of names and their associated categories within unstructured text traditionally relies on semantic lexicons and gazetteers. The effort required to assemble large lexicons confines recognition to either a limited domain (e.g., medical imaging) or a small set of predefined, broad categories of interest (e.g., persons, countries, organizations, products). This is a serious limitation in an information-seeking context, where the categories of potential interest to users are more diverse (e.g., universities, agencies, retailers, celebrities), often refined (e.g., SLR digital cameras, programming languages, multinational oil companies), and usually overlapping (e.g., the same entity may simultaneously be a brand name, a technology company, and an industry leader). We present a lightly supervised method for acquiring named entities in arbitrary categories. The method applies lightweight lexico-syntactic extraction patterns to the unstructured text of Web documents. It departs from traditional approaches to named entity recognition in that: 1) it does not require any start-up seed names or training; 2) it does not encode any domain knowledge in its extraction patterns; 3) it is only lightly supervised and data-driven; 4) it does not impose any a priori restriction on the categories of extracted names. We illustrate applications of the method in Web search, and describe experiments on 500 million Web documents and news articles.
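To make the idea of a lightweight lexico-syntactic extraction pattern concrete, the following is a minimal Python sketch. It is not the pattern set used in the paper, which the abstract does not specify. It assumes a single generic "such as / including" cue, approximates a name as a run of capitalized tokens, and takes the one lowercase word before the cue as the category. The function name `extract_named_entities` and all regular expressions are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# ASSUMPTION: a name is a run of capitalized tokens (e.g. "Stanford University").
NAME = r"[A-Z][\w-]*(?: [A-Z][\w-]*)*"
# Separators inside an enumeration: ", ", " and ", " or ", ", and ", ", or ".
SEP = r"(?:,? (?:and|or) |, )"

# ASSUMPTION: one generic cue ("such as" / "including"); the category is the
# single lowercase word immediately before the cue. No seed names and no
# domain-specific vocabulary appear in the pattern.
PATTERN = re.compile(
    r"\b(?P<category>[a-z]+),? (?:such as|including) "
    r"(?P<names>" + NAME + r"(?:" + SEP + NAME + r")*)"
)
SPLIT = re.compile(r",? (?:and|or) |, ")


def extract_named_entities(text):
    """Return a dict mapping each category word to the set of names found for it."""
    found = defaultdict(set)
    for match in PATTERN.finditer(text):
        category = match.group("category")
        for name in SPLIT.split(match.group("names")):
            found[category].add(name)
    return dict(found)


if __name__ == "__main__":
    sample = (
        "Many programming languages such as Python, Java and Haskell are widely used. "
        "She visited universities including Stanford University, MIT, and Oxford."
    )
    for category, names in extract_named_entities(sample).items():
        print(category, sorted(names))
```

Because the category is read from the text rather than fixed in advance, the same pattern can yield names for any category that happens to be mentioned, and one name can be collected under several categories. A realistic system would need better category and name boundaries, plus filtering of noisy matches, than this sketch provides.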