Clustering plays an important role in many large-scale data analyses, providing users with an overall understanding of their data. Nonetheless, clustering is not an easy task because of noisy features and outliers in the data, and thus the results obtained from automatic algorithms often do not make clear sense. To remedy this problem, automatic clustering should be complemented with interactive visualization strategies. This paper proposes an interactive visual analytics system for document clustering, called iVisClustering, based on a widely used topic modeling method, latent Dirichlet allocation (LDA). iVisClustering provides a summary of each cluster in terms of its most representative keywords and visualizes soft clustering results in parallel coordinates. The main view of the system provides a 2D plot that visualizes cluster similarities and the relations among data items with a graph-based representation. iVisClustering provides several other views, which contain useful interaction methods. With the help of these visualization modules, we can interactively refine the clustering results in various ways. Keywords can be adjusted so that they characterize each cluster better. In addition, our system can filter out noisy data and re-cluster the data accordingly. A cluster hierarchy can be constructed using a tree structure; for this purpose, the system supports cluster-level interactions such as sub-clustering, removing unimportant clusters, merging clusters that have similar meanings, and moving clusters to any other node in the tree structure. Furthermore, the system provides document-level interactions such as moving mis-clustered documents to another cluster and removing useless documents. Finally, we show how interactive clustering is performed with iVisClustering on real-world document data sets. © 2012 Wiley Periodicals, Inc.