Automatic creation of domain templates

Authors:
Elena Filatova;Vasileios Hatzivassiloglou;Kathleen McKeown
Affiliations:
Columbia University;The University of Texas at Dallas;Columbia University
Venue:
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Year:
2006

References (15):
An Algorithm that Learns What's in a Name
  Machine Learning - Special issue on natural language learning
Automatic labeling of semantic roles
  Computational Linguistics
Optimized Substructure Discovery for Semi-structured Data
  PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Efficiently mining frequent trees in a forest
  Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Generating natural language summaries from multiple on-line sources
  Computational Linguistics - Special issue on natural language generation
Multidocument summarization via information extraction
  HLT '01 Proceedings of the first international conference on Human language technology research
Complexity of event structure in IE scenarios
  COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Learning to paraphrase: an unsupervised approach using multiple-sequence alignment
  NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
An improved extraction pattern representation model for automatic IE pattern acquisition
  ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Issues and methodology for template design for information extraction
  HLT '94 Proceedings of the workshop on Human Language Technology
Principles of template design
  HLT '94 Proceedings of the workshop on Human Language Technology
Topic themes for multi-document summarization
  Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Statistical acquisition of content selection rules for natural language generation
  EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
The Proposition Bank: An Annotated Corpus of Semantic Roles
  Computational Linguistics
Tell me what you do and I'll tell you what you are: learning occupation-related activities for biographies
  HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing

Cited by (10):
Automatically generating Wikipedia articles: a structure-aware approach
  ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
Generating templates of entity summaries with an entity-aspect model and pattern mining
  ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Learning web query patterns for imitating Wikipedia articles
  COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Template-based information extraction without the templates
  HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
An aspect-driven random walk model for topic-focused multi-document summarization
  AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Editorial: Occupation inference through detection and classification of biographical activities
  Data & Knowledge Engineering
Text stream processing
  Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics
Event linking: grounding event reference in a news archive
  ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Automatically building templates for entity summary construction
  Information Processing and Management: an International Journal
Ontology-enriched multi-document summarization in disaster management using submodular function
  Information Sciences: an International Journal

Abstract

Many Natural Language Processing (NLP) applications have recently improved the quality of their output by using machine learning techniques to mine Information Extraction (IE) patterns that capture information from the input text. Currently, mining IE patterns requires knowing in advance what type of information the patterns should capture. In this work we propose a novel methodology for corpus analysis based on cross-examination of several document collections, each representing a different instance of the same domain. We show that this methodology can be used for automatic domain template creation. Because automatic domain template creation is a relatively new problem, there is no well-defined procedure for evaluating domain template quality. We therefore propose a methodology for identifying what information should be present in a template. Using this information, we evaluate the automatically created domain templates through the text snippets retrieved according to those templates.
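
The abstract does not spell out the procedure, so the following is only a toy illustration of what cross-examining several collections from the same domain could look like, not the authors' algorithm. It assumes whitespace tokenization and a hypothetical threshold `min_df`, and keeps the terms that occur in a large share of documents in every collection; the function names and sample texts are invented for this sketch.

```python
from collections import Counter


def document_frequency(collection):
    """Fraction of documents in one collection that contain each term."""
    counts = Counter()
    for doc in collection:
        # Count each term once per document (assumed whitespace tokenization).
        counts.update(set(doc.lower().split()))
    n = len(collection)
    return {term: c / n for term, c in counts.items()}


def shared_salient_terms(collections, min_df=0.5):
    """Terms whose document frequency is >= min_df in every collection.

    `min_df` is a hypothetical threshold chosen for illustration only.
    """
    if not collections:
        return []
    per_collection = [document_frequency(c) for c in collections]
    # Only terms present in all collections can qualify.
    shared = set(per_collection[0])
    for df in per_collection[1:]:
        shared &= set(df)
    return sorted(
        t for t in shared if all(df[t] >= min_df for df in per_collection)
    )


# Two invented collections, each standing for one instance of an
# "earthquake" domain.
quake_a = [
    "earthquake killed dozens",
    "rescuers searched after earthquake killed many",
]
quake_b = [
    "tremor killed hundreds",
    "officials said the quake killed residents",
]

print(shared_salient_terms([quake_a, quake_b]))
```

Terms that are frequent in only one collection (here, "earthquake") are filtered out as instance-specific, while terms that recur across instances (here, "killed") remain as candidates for domain-level template slots. A realistic system would need linguistic analysis well beyond this word-level sketch.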