This paper proposes an online mixture modeling methodology in which individual components can have different marginal distributions and dependency structures. Mixture models have been widely studied and applied in many areas, including density estimation, fraud/failure detection, and image segmentation. Previous research has focused almost exclusively on mixture models whose components are all of a single type (e.g., a Gaussian mixture model). However, the growing need to model complicated data calls for more flexible mixture models (e.g., in medical analytics, a mixture of a lognormal distribution for medical costs and a Gaussian distribution for blood pressure). Our key ideas are: 1) separating marginal distributions from their dependency structures using copulas, and 2) an online extension of the recently developed "expectation minimization of description length," which together enable us to efficiently learn the types of both the marginal distributions and the copulas as well as their parameters. The proposed method not only performs well in applications but also offers scalable, automatic model selection, which greatly reduces the intensive modeling costs in data mining processes. We show that the proposed method outperforms state-of-the-art methods when applied to density estimation and anomaly detection.
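The first key idea, separating marginals from dependencies, rests on Sklar's theorem: any joint density factors into its marginal densities times a copula density evaluated at the marginal CDFs. The sketch below (not the paper's implementation; a minimal illustration using a bivariate Gaussian copula and the lognormal/Gaussian example from the abstract, with an assumed parameterization) shows how heterogeneous marginals can be coupled this way:

```python
import numpy as np
from scipy import stats

def gaussian_copula_density(u, v, rho):
    """Density of a bivariate Gaussian copula with correlation rho
    at a point (u, v) in the unit square."""
    a = stats.norm.ppf(u)  # map uniform margins back to the normal scale
    b = stats.norm.ppf(v)
    det = 1.0 - rho ** 2
    return np.exp(-(rho ** 2 * (a ** 2 + b ** 2) - 2.0 * rho * a * b)
                  / (2.0 * det)) / np.sqrt(det)

def joint_density(x, y, rho=0.5):
    """Sklar's theorem: f(x, y) = c(F1(x), F2(y)) * f1(x) * f2(y).
    Here f1 is a lognormal marginal (e.g., medical costs) and f2 a
    Gaussian marginal (e.g., blood pressure); shape parameters are
    illustrative assumptions."""
    f1, F1 = stats.lognorm.pdf(x, s=1.0), stats.lognorm.cdf(x, s=1.0)
    f2, F2 = stats.norm.pdf(y), stats.norm.cdf(y)
    return gaussian_copula_density(F1, F2, rho) * f1 * f2
```

With `rho = 0` the copula density is identically 1 and the joint density reduces to the product of the two marginals, i.e., independence; a heterogeneous mixture component in the paper's sense would pair such a copula with a choice of marginal families selected by the description-length criterion.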