The popularity of email has prompted researchers to look for ways to help users better organize the enormous amount of information stored in their email folders. One challenge that has not been studied extensively in text mining is the reconstruction of hidden emails. A hidden email is an original email that has been quoted in subsequent emails but is not itself present in the folder; it may have been deleted or may never have been received. This paper proposes a method for reconstructing hidden emails from the embedded quotations found in messages further down the thread hierarchy. To do so, we model all the quoted fragments in a precedence graph, from which hidden emails are regenerated as bulletized documents. The bulletized model is our solution for cases in which a total ordering of the fragments is not possible. We give a necessary and sufficient condition for each component of the precedence graph to be captured in a single bulletized email, and we develop heuristics that minimize the number of regenerated emails when the condition is not met. Finally, we present empirical results showing the scalability of our approach.
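To make the idea of a precedence graph concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each surviving email contributes an ordered list of quoted fragments, adds an edge between consecutive fragments, and then checks whether those edges force a single total order. The function names `build_precedence_graph` and `total_order`, and the representation of fragments as plain strings, are hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_precedence_graph(quoted_sequences):
    """Return (nodes, edges); edges[a] holds fragments quoted directly after a.

    Hypothetical helper: each input sequence is the ordered list of fragments
    quoted by one email.
    """
    nodes = set()
    edges = defaultdict(set)
    for sequence in quoted_sequences:
        nodes.update(sequence)
        for earlier, later in zip(sequence, sequence[1:]):
            edges[earlier].add(later)
    return nodes, edges


def total_order(nodes, edges):
    """Return the unique total order of the fragments, or None if none exists.

    Uses Kahn's topological sort: the order is unique only if exactly one
    fragment is available at every step.
    """
    indegree = {node: 0 for node in nodes}
    for src in edges:
        for dst in edges[src]:
            indegree[dst] += 1
    ready = [node for node in nodes if indegree[node] == 0]
    order = []
    while ready:
        if len(ready) > 1:
            return None  # at least two fragments are incomparable
        node = ready.pop()
        order.append(node)
        for dst in edges.get(node, ()):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    # A shorter result means the quotations contradict each other (a cycle).
    return order if len(order) == len(nodes) else None


# Two emails whose quotations chain together, and two that do not.
chained = build_precedence_graph([["a", "b"], ["b", "c"]])
forked = build_precedence_graph([["a", "b"], ["a", "c"]])
print(total_order(*chained))
print(total_order(*forked))
```

In the second case the quotations never say whether `b` or `c` came first, which is the kind of situation the abstract's bulletized model is meant to handle; how the paper actually groups such fragments is not specified here.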