TINA: a natural language system for spoken language applications
Computational Linguistics
BoosTexter: A Boosting-based Systemfor Text Categorization
Machine Learning - Special issue on information retrieval
Machine Learning
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Learning surface text patterns for a Question Answering system
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
COGEX: a logic prover for question answering
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Verbnet: a broad-coverage, comprehensive verb lexicon
Verbnet: a broad-coverage, comprehensive verb lexicon
Question answering based on semantic structures
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
A semantic approach to recognizing textual entailment
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Question answering based on semantic roles
DeepLP '07 Proceedings of the Workshop on Deep Linguistic Processing
Using semantic and syntactic graphs for call classification
FeatureEng '05 Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing
The PASCAL recognising textual entailment challenge
MLCW'05 Proceedings of the First international conference on Machine Learning Challenges: evaluating Predictive Uncertainty Visual Object Classification, and Recognizing Textual Entailment
Hi-index | 0.00 |
The task of information distillation is to extract snippets from massive multilingual audio and textual document sources that are relevant for a given templated query. We present an approach that focuses on the sentence extraction phase of the distillation process. It selects document sentences with respect to their relevance to a query via statistical classification with support vector machines. The distinguishing contribution of the approach is a novel method to generate classification features. The features are extracted from charts, compilations of elements from various annotation layers, such as word transcriptions, syntactic and semantic parses, and information extraction (IE) annotations. We describe a procedure for creating charts from documents and queries, while paying special attention to query slots (free-text descriptions of names, organizations, topic, events and so on, around which templates are centered), and suggest various types of classification features that can be extracted from these charts. While observing a 30% relative improvement due to non-lexical annotation layers, we perform a detailed analysis of the contributions of each of these layers to classification performance.