Automatic organization of email messages into folders is an open problem and a challenge for machine learning techniques. Beyond email overload, which affects many users worldwide, further difficulties arise from the semantics each user applies: the number of folders and their meaning are personal, and in many cases this hampers learning methods. This paper addresses automatic organization of email messages into folders using supervised learning algorithms. The textual fields of the email message (subject and body) are considered for learning, with different representations, feature selection methods, and classifiers. The participant fields are embedded into a vector-space model representation. The classification decisions from the different email fields are combined by majority voting. Experiments on a subset of the Enron Corpus and on a private email data set show a significant improvement over single classifiers on these fields as well as over previous work.
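As an illustration of the combination step, majority voting over per-field decisions might be sketched as follows. This is a minimal sketch and not the paper's implementation: the function name `majority_vote`, the field names, and the tie-breaking rule (`tie_break_order`) are assumptions introduced here, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions, tie_break_order=None):
    """Combine per-field folder predictions by majority voting.

    predictions: dict mapping an email field name (e.g. "subject") to the
        folder label predicted by the classifier trained on that field.
    tie_break_order: hypothetical list of field names, most trusted first,
        consulted only when several folders tie for the top vote count.
        This rule is an assumption, not something stated in the abstract.
    """
    if not predictions:
        raise ValueError("need at least one prediction")

    # Count how many field classifiers voted for each folder.
    counts = Counter(predictions.values())
    top = max(counts.values())
    winners = {label for label, n in counts.items() if n == top}

    if len(winners) == 1:
        return next(iter(winners))

    # Tie: take the vote of the most trusted field whose label is a winner.
    order = tie_break_order or list(predictions)
    for field in order:
        if predictions.get(field) in winners:
            return predictions[field]

    # No listed field voted for a tied folder: fall back deterministically.
    return sorted(winners)[0]


if __name__ == "__main__":
    votes = {"subject": "projects", "body": "projects", "participants": "personal"}
    print(majority_vote(votes))
```

With three field classifiers, two agreeing votes decide the folder; the tie-break order only matters when every field disagrees or when an even number of fields is used.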