A well-recognized limitation of research on supervised sentence compression is the dearth of available training data. We propose a new and plentiful source of such data, obtained by mining the revision history of Wikipedia for sentence compressions and expansions. Using only a fraction of the available Wikipedia data, we have collected a training corpus of over 380,000 sentence pairs, two orders of magnitude larger than the widely used Ziff-Davis corpus. Using this new data, we propose a novel lexicalized noisy channel model for sentence compression, which achieves improved results on the grammaticality and compression-rate criteria, with a slight decrease on the importance criterion.
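To make the mining idea concrete, here is a minimal sketch of how compression and expansion pairs might be detected between two adjacent revisions of an article. It rests on assumptions that the abstract does not state: sentences are already split and whitespace-tokenized, and a pair counts as a compression only when the shorter sentence can be obtained from the longer one purely by deleting words. The function names `is_subsequence` and `compression_pairs` are hypothetical and are not taken from the paper.

```python
from itertools import product


def is_subsequence(short, long):
    """Return True if the tokens of `short` occur in `long` in the same order."""
    remaining = iter(long)
    # `tok in remaining` advances the iterator, so order is enforced.
    return all(tok in remaining for tok in short)


def compression_pairs(old_sentences, new_sentences):
    """Collect (long, short) pairs related by word deletion only.

    Assumption (not from the source): every old sentence is compared with
    every new sentence, and sentences are whitespace-tokenized.
    """
    pairs = []
    for old, new in product(old_sentences, new_sentences):
        a, b = old.split(), new.split()
        if len(a) == len(b):
            continue  # same length: identical or a substitution, not a deletion
        long_toks, short_toks = (a, b) if len(a) > len(b) else (b, a)
        if short_toks and is_subsequence(short_toks, long_toks):
            pairs.append((" ".join(long_toks), " ".join(short_toks)))
    return pairs


if __name__ == "__main__":
    old_rev = ["The cat sat quietly on the mat .", "It was raining ."]
    new_rev = ["The cat sat on the mat .", "It was raining ."]
    for long_s, short_s in compression_pairs(old_rev, new_rev):
        print(long_s, "=>", short_s)
```

Because each pair is stored as (long, short) regardless of edit direction, an edit that adds words (an expansion) yields the same kind of training pair as one that removes them. A real pipeline would likely align sentences first rather than compare all pairs, and would filter out vandalism and reverts.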