The KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain-independent, and scalable manner. The paper presents an overview of KNOWITALL's novel architecture and design principles, emphasizing its distinctive ability to extract information without any hand-labeled training examples. In its first major run, KNOWITALL extracted over 50,000 class instances, but the run raised a challenge: How can we improve KNOWITALL's recall and extraction rate without sacrificing precision?

This paper presents three distinct ways to address this challenge and evaluates their performance. Pattern Learning learns domain-specific extraction rules, which enable additional extractions. Subclass Extraction automatically identifies sub-classes in order to boost recall (e.g., "chemist" and "biologist" are identified as sub-classes of "scientist"). List Extraction locates lists of class instances, learns a "wrapper" for each list, and extracts the elements of each list. Since each method bootstraps from KNOWITALL's domain-independent methods, the methods also obviate the need for hand-labeled training examples. The paper reports on experiments, focused on building lists of named entities, that measure the relative efficacy of each method and demonstrate their synergy. In concert, our methods gave KNOWITALL a 4-fold to 8-fold increase in recall at a precision of 0.90, and discovered over 10,000 cities missing from the Tipster Gazetteer.
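The List Extraction idea (locate a list, learn a "wrapper", extract the list's elements) can be illustrated with a minimal Python sketch. This is not KNOWITALL's implementation: the function names `learn_wrapper` and `apply_wrapper`, the choice to represent a wrapper as the pair of HTML tags surrounding each known instance, and the sample page are all assumptions made for illustration.

```python
import re

# Hypothetical helpers: the tag immediately before / after a known instance.
TAG_BEFORE = re.compile(r"<[^<>]+>\s*$")
TAG_AFTER = re.compile(r"^\s*<[^<>]+>")


def learn_wrapper(page, seeds):
    """Return the (prefix_tag, suffix_tag) pair shared by all seeds, or None."""
    wrappers = set()
    for seed in seeds:
        pos = page.find(seed)
        if pos == -1:
            return None  # a seed does not occur on this page
        before = TAG_BEFORE.search(page[:pos])
        after = TAG_AFTER.search(page[pos + len(seed):])
        if not before or not after:
            return None  # seed is not directly enclosed by tags
        wrappers.add((before.group().strip(), after.group().strip()))
    # Accept the wrapper only if every seed has the same surrounding tags.
    return wrappers.pop() if len(wrappers) == 1 else None


def apply_wrapper(page, wrapper):
    """Extract every text span enclosed by the learned tag pair."""
    prefix, suffix = wrapper
    pattern = re.escape(prefix) + r"\s*([^<>]+?)\s*" + re.escape(suffix)
    return [match.strip() for match in re.findall(pattern, page)]


if __name__ == "__main__":
    # Assumed sample page; "Paris" and "Berlin" stand in for instances
    # already extracted by domain-independent methods.
    page = (
        "<h1>Cities</h1>\n<ul>\n"
        "<li>Paris</li>\n<li>Berlin</li>\n<li>Tokyo</li>\n</ul>"
    )
    wrapper = learn_wrapper(page, ["Paris", "Berlin"])
    print(apply_wrapper(page, wrapper))  # ['Paris', 'Berlin', 'Tokyo']
```

In this sketch the two seed instances are enough to recover a third list element ("Tokyo") that was not known in advance, which is the sense in which a per-list wrapper can raise recall. A realistic wrapper learner would need to handle nested markup, multiple lists per page, and noisy seeds.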