In this paper, we propose an event-based approach to Chinese sentence compression that requires no training corpus. We enhance linguistically motivated heuristics with event word significance and event information density. This is shown to improve both the preservation of important information and the tolerance to POS tagging and parsing errors, which are more common in Chinese than in English. The heuristics need only identify possibly removable constituents rather than select specific constituents for removal, and are therefore easier to develop and to port to other languages and domains. Experimental results show that around 72% of our automatic compressions are grammatically and semantically correct, preserving around 69% of the most important information on average.
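The division of labor described above (heuristics propose candidate constituents; event information decides which are actually dropped) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the span format, the `event_significance` scores, and the threshold are all hypothetical.

```python
# Hypothetical sketch: linguistic heuristics supply *possibly removable*
# spans, and event-word significance vetoes the removal of any span that
# still carries important event information. All names/scores are illustrative.

def compress(tokens, removable_spans, event_significance, threshold=0.5):
    """tokens: list of words; removable_spans: (start, end) index spans
    proposed by heuristics; event_significance: word -> score in [0, 1]."""
    drop = set()
    for start, end in removable_spans:
        span = tokens[start:end]
        # Retain a candidate span if it contains a significant event word.
        score = max((event_significance.get(w, 0.0) for w in span), default=0.0)
        if score < threshold:
            drop.update(range(start, end))
    return [w for i, w in enumerate(tokens) if i not in drop]

tokens = "The committee , meeting yesterday in Beijing , approved the merger".split()
removable = [(2, 8)]                      # heuristic: parenthetical clause
sig = {"approved": 0.9, "merger": 0.8}    # toy event-word significance
print(" ".join(compress(tokens, removable, sig)))
# -> The committee approved the merger
```

Because the heuristics only nominate candidates, adding a new removal pattern never forces a deletion by itself; the event-significance check still guards important content, which is what makes the approach tolerant of upstream POS and parsing errors.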