The variety of email-related tasks, together with the growing daily email load, has created a need for automated email management tools. In this paper, we provide an empirical evaluation of representational schemes and retrieval strategies for email. In particular, we study the impact of both textual and non-textual email content on case representation for email task management. Our first contribution is Stack, an email representation based on stacking. Multiple casebases are created, each using a different case representation built from attributes corresponding to semi-structured email content. A k-NN classifier is applied to each casebase, and the outputs are used to form a new case representation. Our second contribution is a new evaluation method that allows the creation of random, chronological, stratified train-test trials that respect both temporal and class-distribution aspects, which are crucial for the email domain. The Enron corpus was used to create a dataset for the email deletion prediction task. Evaluation results show significant improvements with Stack over single-casebase retrieval and over multiple-casebase retrieval combined using majority vote.
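The stacking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each email is a dictionary of token sets keyed by a view name, that similarity within a view is Jaccard overlap, and that the new case representation is simply the tuple of per-view k-NN predictions. The names `stack_predict`, `knn_predict`, the views `subject` and `sender`, and the toy emails are all hypothetical.

```python
from collections import Counter

def jaccard(a, b):
    """Similarity between two token sets (1.0 when both are empty)."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 1.0

def overlap(a, b):
    """Fraction of positions where two level-0 prediction tuples agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def knn_predict(query, cases, labels, k, sim, skip=None):
    """Majority label among the k cases most similar to `query`.
    `skip` excludes one index, giving leave-one-out predictions."""
    ranked = sorted(
        (i for i in range(len(cases)) if i != skip),
        key=lambda i: sim(query, cases[i]),
        reverse=True,
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def stack_predict(train, labels, query, views, k=3):
    """Assumed stacking scheme: one casebase per view at level 0,
    then k-NN over the tuple of level-0 predictions at level 1."""
    casebases = {v: [email[v] for email in train] for v in views}
    # Level-1 training cases: leave-one-out level-0 predictions per view.
    meta_train = [
        tuple(
            knn_predict(casebases[v][i], casebases[v], labels, k, jaccard, skip=i)
            for v in views
        )
        for i in range(len(train))
    ]
    # Level-1 query case: level-0 predictions for the unseen email.
    meta_query = tuple(
        knn_predict(query[v], casebases[v], labels, k, jaccard) for v in views
    )
    return knn_predict(meta_query, meta_train, labels, k, overlap), meta_query

# Toy data, invented purely for illustration.
train = [
    {"subject": {"meeting", "agenda"}, "sender": {"boss"}},
    {"subject": {"meeting", "notes"}, "sender": {"boss"}},
    {"subject": {"project", "meeting"}, "sender": {"colleague"}},
    {"subject": {"sale", "offer"}, "sender": {"newsletter"}},
    {"subject": {"discount", "offer"}, "sender": {"newsletter"}},
    {"subject": {"sale", "discount"}, "sender": {"promo"}},
]
labels = ["keep", "keep", "keep", "delete", "delete", "delete"]
views = ["subject", "sender"]

query = {"subject": {"offer", "sale", "today"}, "sender": {"newsletter"}}
label, meta_case = stack_predict(train, labels, query, views)
print(label, meta_case)
```

The leave-one-out step keeps a training email from voting for itself when its level-1 case is built. A held-out fold would be a reasonable alternative, and the abstract does not say which form of level-0 output Stack uses.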