In the early days of email, widely used conventions for indicating quoted reply content and email signatures made it easy to segment email messages into their functional parts. Today, the explosion of different email formats and styles, coupled with the ad hoc ways in which people vary the structure and layout of their messages, means that simple techniques for identifying quoted replies that used to yield 95% accuracy now find less than 10% of such content. In this paper, we describe Zebra, an SVM-based system for segmenting the body text of email messages into nine zone types based on graphic, orthographic and lexical cues. Zebra performs this task with an accuracy of 87.01%; when the number of zones is abstracted to two or three zone classes, this increases to 93.60% and 91.53% respectively.
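
To make the contrast concrete, the following minimal Python sketch shows the kind of convention-based baseline the abstract alludes to (a leading ">" for quoted replies, a "-- " line before a signature), alongside a per-line feature extractor in the spirit of graphic, orthographic and lexical cues. It is not Zebra's implementation: the function names, the three labels and every feature are assumptions chosen for illustration, and no SVM is trained here.

```python
import re

# Classic plain-text conventions: ">" marks quoted reply lines and a line
# containing only "-- " (or "--") introduces the signature block.
QUOTE_PREFIX = re.compile(r"^\s*>")
SIGNATURE_DELIMITERS = {"-- ", "--"}


def naive_zones(body: str) -> list[tuple[str, str]]:
    """Label each line 'quoted', 'signature' or 'author' from conventions alone.

    The three labels are illustrative; they are not the paper's zone inventory.
    """
    labelled = []
    in_signature = False
    for line in body.splitlines():
        if line in SIGNATURE_DELIMITERS:
            in_signature = True
        if in_signature:
            label = "signature"
        elif QUOTE_PREFIX.match(line):
            label = "quoted"
        else:
            label = "author"
        labelled.append((label, line))
    return labelled


def line_features(line: str) -> dict[str, float]:
    """Hypothetical per-line cues that a learned classifier could consume."""
    stripped = line.strip()
    length = len(stripped)
    letters = [c for c in stripped if c.isalpha()]
    punctuation = sum(1 for c in stripped if not c.isalnum() and not c.isspace())
    return {
        # Graphic cues: layout of the line on the page.
        "graphic_indent": float(len(line) - len(line.lstrip())),
        "graphic_length": float(length),
        # Orthographic cues: character-level make-up of the line.
        "ortho_punct_ratio": punctuation / length if length else 0.0,
        "ortho_upper_ratio": (
            sum(1 for c in letters if c.isupper()) / len(letters) if letters else 0.0
        ),
        # Lexical cues: presence of indicative words (assumed word lists).
        "lexical_greeting": float(bool(re.match(r"(hi|hello|dear)\b", stripped, re.I))),
        "lexical_signoff": float(bool(re.match(r"(regards|cheers|thanks)\b", stripped, re.I))),
    }
```

The baseline only works when senders follow the conventions, which is the weakness the abstract describes. Feature vectors like those from `line_features` could instead be passed to a trained classifier, so that a zone can still be recognised when the ">" or "-- " markers are absent.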