This paper presents new approaches to headline generation for English newspaper texts, with an eye toward producing document surrogates for document selection in cross-language information retrieval. Document selection is difficult in this setting because the user must judge relevance from (often poor) translations of the retrieved documents. To facilitate this decision-making process, we need translations that can be assessed rapidly and accurately; our approach is to provide an English headline for the non-English document. We describe two approaches to headline generation and their application to the recent DARPA TIDES-2003 Surprise Language Exercise for Hindi. For comparison, we also implemented an alternative method for surrogate generation: a system that produces topic lists for (Hindi) articles. We present the results of a series of experiments comparing these approaches. We demonstrate in both automatic and human evaluations that our linguistically motivated approach outperforms the two other surrogate-generation methods: a statistical system and a topic discovery system.