We propose a method for automatically summarizing threaded, multi-topical texts, in particular online discussions and e-mail conversations. Such corpora have a reply-to structure among posts: multiple topics are discussed simultaneously and with some continuity, although each post is typically short. We focus on the multi-topical aspect of these corpora and propose two linguistically motivated features, lexical chains and cue words, which capture the topics and the topic structure. In particular, we introduce the structured lexical chain, which combines traditional lexical chains with the thread structure. In experiments on the Innovation Jam 2008 Corpus and the BC3 Mailing List Corpus, we show the effectiveness of these features in two task settings: key-sentence extraction and keyword extraction. We also present a detailed analysis of the results with some intuitive examples.
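
As a rough illustration of what combining lexical chains with the thread structure could look like, the sketch below links occurrences of a word only when they appear in posts connected by reply-to links. It is not the method of the paper. It assumes exact word repetition as the only cohesion relation (a real lexical chainer would also use thesaural relations such as synonymy), and the names `tokenize` and `structured_chains`, the stopword list, and the sample posts are all hypothetical.

```python
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "we", "it", "in", "for", "on"}


def tokenize(text):
    """Lowercase, strip simple punctuation, and drop stopwords."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def structured_chains(posts):
    """Group repeated words into chains that follow reply-to links.

    posts maps post_id -> (parent_id or None, text) and must form a tree.
    Two occurrences of a word join the same chain only if the posts are
    linked by an unbroken parent/child path of posts containing that word.
    """
    by_word = defaultdict(set)
    for pid, (_, text) in posts.items():
        for word in tokenize(text):
            by_word[word].add(pid)

    chains = []
    for word, pids in by_word.items():
        groups = defaultdict(list)
        for pid in pids:
            # Walk up the thread while the parent also contains the word.
            top = pid
            while posts[top][0] in pids:
                top = posts[top][0]
            groups[top].append(pid)
        for members in groups.values():
            if len(members) > 1:  # a chain needs at least two posts
                chains.append((word, sorted(members)))
    return sorted(chains)


# Hypothetical thread: post 1 is the root, 2 and 3 reply to 1, and so on.
posts = {
    1: (None, "We should improve battery recycling."),
    2: (1, "Battery recycling needs funding."),
    3: (1, "What about water usage?"),
    4: (3, "Water usage is rising."),
    5: (2, "Funding could come from water taxes."),
}

chains = structured_chains(posts)
for word, members in chains:
    print(word, members)
```

In this toy thread, "water" in post 5 does not join the chain formed by posts 3 and 4, because post 5 sits in a different branch of the reply-to tree. A chain built without the thread structure would merge all three.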