This paper proposes an online mixture modeling methodology in which individual components can have different marginal distributions and dependency structures. Mixture models have been widely studied and applied in many areas, including density estimation, fraud/failure detection, and image segmentation. Previous research has focused almost exclusively on mixture models whose components are all of a single type (e.g., a Gaussian mixture model). However, the growing need to model complicated data calls for more flexible mixture models (e.g., in medical analytics, a mixture of a lognormal distribution for medical costs and a Gaussian distribution for blood pressure). Our key ideas are: 1) separating marginal distributions from their dependency structures using copulas, and 2) an online extension of the recently developed "expectation minimization of description length," which together enable us to efficiently learn the types of both the marginal distributions and the copulas as well as their parameters. The proposed method not only performs well in applications but also offers scalable, automatic model selection, which greatly reduces the intensive modeling costs in data mining processes. We show that the proposed method outperforms state-of-the-art methods when applied to density estimation and anomaly detection.
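The first key idea, separating marginals from dependencies, rests on Sklar's theorem: any joint density factors into its marginal densities times a copula density evaluated at the marginal CDFs. The sketch below (not the paper's implementation; a minimal illustration using a bivariate Gaussian copula and the lognormal/Gaussian example from the abstract, with an assumed parameterization) shows how heterogeneous marginals can be coupled this way:

```python
import numpy as np
from scipy import stats

def gaussian_copula_density(u, v, rho):
    """Density of a bivariate Gaussian copula with correlation rho
    at a point (u, v) in the unit square."""
    a = stats.norm.ppf(u)  # map uniform margins back to the normal scale
    b = stats.norm.ppf(v)
    det = 1.0 - rho ** 2
    return np.exp(-(rho ** 2 * (a ** 2 + b ** 2) - 2.0 * rho * a * b)
                  / (2.0 * det)) / np.sqrt(det)

def joint_density(x, y, rho=0.5):
    """Sklar's theorem: f(x, y) = c(F1(x), F2(y)) * f1(x) * f2(y).
    Here f1 is a lognormal marginal (e.g., medical costs) and f2 a
    Gaussian marginal (e.g., blood pressure); shape parameters are
    illustrative assumptions."""
    f1, F1 = stats.lognorm.pdf(x, s=1.0), stats.lognorm.cdf(x, s=1.0)
    f2, F2 = stats.norm.pdf(y), stats.norm.cdf(y)
    return gaussian_copula_density(F1, F2, rho) * f1 * f2
```

With `rho = 0` the copula density is identically 1 and the joint density reduces to the product of the two marginals, i.e., independence; a heterogeneous mixture component in the paper's sense would pair such a copula with a choice of marginal families selected by the description-length criterion.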