E-mail has become an essential component of cooperation in business. As a result, the large number of messages produced manually or generated automatically can quickly lead to information overload for users. Many research projects have examined this issue, but surprisingly few have addressed the files attached to e-mails, which in many cases carry a substantial part of the semantics of the message. This paper focuses on the clustering and visualization of attached files. Relying on the multinomial mixture model, we use the Classification EM algorithm (CEM) to cluster the set of files, and MDSDCA to visualize the resulting classes of documents. Like the Multidimensional Scaling (MDS) method, the MDSDCA algorithm, which is based on the Difference of Convex functions, aims to minimize the stress criterion. Because MDSDCA is iterative, we propose an initialization approach that avoids starting from random values. The approach is evaluated in experiments on simulated and textual data.
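To make the clustering step more concrete, here is a minimal sketch of a Classification EM loop for a multinomial mixture, written with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function name `cem_multinomial`, the random initial partition, the smoothing constant `eps`, and the stopping rule are all hypothetical choices, and the input is assumed to be a document-term count matrix built from the attached files.

```python
import numpy as np

def cem_multinomial(X, k, n_iter=50, seed=0, eps=1e-9):
    """Hard-assignment (classification) EM for a multinomial mixture.

    X is an (n_docs, n_terms) matrix of term counts; k is the number of
    classes. All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(0, k, size=n)  # assumed: random initial partition
    for _ in range(n_iter):
        # M-step: estimate proportions and term probabilities per class
        pi = np.empty(k)
        theta = np.empty((k, d))
        for j in range(k):
            members = X[labels == j]
            pi[j] = (len(members) + eps) / (n + k * eps)
            counts = members.sum(axis=0) + eps  # smoothing avoids log(0)
            theta[j] = counts / counts.sum()
        # E-step: log posterior of each class, up to an additive constant
        log_post = np.log(pi) + X @ np.log(theta).T
        # C-step: assign each document to its most probable class
        new_labels = log_post.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # the partition is stable
        labels = new_labels
    return labels, pi, theta
```

Calling `cem_multinomial(X, k=3)` returns the hard partition together with the estimated proportions and per-class term probabilities. Like other EM variants, the result depends on the starting partition, so in practice one would typically run several initializations and keep the partition with the best classification likelihood.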