Turning web text and search queries into factual knowledge: hierarchical class attribute extraction

Authors:
Marius Pasca
Affiliations:
Google Inc., Mountain View, California
Venue:
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Year:
2008

Citing 12
Cited 10

TnT: a statistical part-of-speech tagger

ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Automatic acquisition of hyponyms from large text corpora

COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Discovering relations among named entities from large corpora

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Ontologizing semantic relations

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Semantic taxonomy induction from heterogenous evidence

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Organizing and searching the world wide web of facts -- step two: harnessing the wisdom of the crowds

Proceedings of the 16th international conference on World Wide Web
Yago: a core of semantic knowledge

Proceedings of the 16th international conference on World Wide Web
Cross-lingual query suggestion using query logs of different languages

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A context pattern induction method for named entity extraction

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
SemEval-2007 task 17: English lexical sample, SRL and all words

SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Open information extraction from the web

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Locating complex named entities in web text

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence

Web-derived resources for web information retrieval: from conceptual hierarchies to attribute hierarchies

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Outclassing Wikipedia in open-domain information extraction: weakly-supervised acquisition of attributes over conceptual hierarchies

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Deriving generalized knowledge from corpora using WordNet abstraction

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Automatic generation of topic pages using query-based aspect models

Proceedings of the 18th ACM conference on Information and knowledge management
Latent variable models of concept-attribute attachment

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
Open information extraction using Wikipedia

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Entity disambiguation for knowledge base population

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Interoperability and information brokers in public safety: an approach toward seamless emergency communications

Journal of Theoretical and Applied Electronic Commerce Research
Medicinal property knowledge extraction from herbal documents for supporting question answering system

PAKDD'11 Proceedings of the 15th international conference on New Frontiers in Applied Data Mining
Finding additional semantic entity information for search engines

Proceedings of the Seventeenth Australasian Document Computing Symposium

Quantified Score

Hi-index	0.00

Visualization

Abstract

A seed-based framework for textual information extraction allows for weakly supervised acquisition of open-domain class attributes over conceptual hierarchies, from a combination of Web documents and query logs. Automatically-extracted labeled classes, consisting of a label (e.g., painkillers) and an associated set of instances (e.g., vicodin, oxycontin), are linked under existing conceptual hierarchies (e.g., brain disorders and skin diseases are linked under the concepts BrainDisorder and SkinDisease respectively). Attributes extracted for the labeled classes are propagated upwards in the hierarchy, to determine the attributes of hierarchy concepts (e.g., Disease) from the attributes of their subconcepts (e.g., BrainDisorder and SkinDisease).