Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures
BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Bayesian classification (AutoClass): theory and results
Advances in knowledge discovery and data mining
Efficient Approximations for the Marginal Likelihood of Bayesian Networks with Hidden Variables
Machine Learning - Special issue on learning with probabilistic representations
A view of the EM algorithm that justifies incremental, sparse, and other variants
Proceedings of the NATO Advanced Study Institute on Learning in graphical models
Very fast EM-based mixture model clustering using multiresolution kd-trees
Proceedings of the 1998 conference on Advances in neural information processing systems II
Efficient clustering of high-dimensional data sets with application to reference matching
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
On-line EM Algorithm for the Normalized Gaussian Network
Neural Computation
Fast learning from sparse data
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
An expectation-maximization algorithm working on data summary
Second international workshop on Intelligent systems design and application
Scalable Model-based Clustering by Working on Data Summaries
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Scalable Model-Based Clustering for Large Databases Based on Data Summarization
IEEE Transactions on Pattern Analysis and Machine Intelligence
Convergence Theorems for Generalized Alternating Minimization Procedures
The Journal of Machine Learning Research
A Scalable Framework For Segmenting Magnetic Resonance Images
Journal of Signal Processing Systems
Sampling-based estimators for subset-based queries
The VLDB Journal — The International Journal on Very Large Data Bases
Data Mining and Knowledge Discovery
Scalable model-based cluster analysis using clustering features
Pattern Recognition
Active curve axis Gaussian mixture models
Pattern Recognition
Fast online estimation of the joint probability distribution
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Data compression by volume prototypes for streaming data
Pattern Recognition
A fast implementation of the EM algorithm for mixture of multinomials
ADMA'06 Proceedings of the Second international conference on Advanced Data Mining and Applications
VoCS'08 Proceedings of the 2008 international conference on Visions of Computer Science: BCS International Academic Conference
A fast convergence clustering algorithm merging MCMC and EM methods
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires significant computational resources and has been dismissed as impractical for large databases. We present two approaches that significantly reduce the computational cost of applying the EM algorithm to databases with a large number of cases, including databases of high dimensionality. Both approaches are based on partial E-steps, for which we can use the results of Neal and Hinton (in Jordan, M. (Ed.), Learning in Graphical Models, pp. 355–371. The Netherlands: Kluwer Academic Publishers) to obtain the standard convergence guarantees of EM. The first approach is a version of the incremental EM algorithm, described in Neal and Hinton (1998), which cycles through data cases in blocks. The number of cases in each block dramatically affects the efficiency of the algorithm, and we provide a method for selecting a near-optimal block size. The second approach, which we call lazy EM, periodically evaluates the significance of each data case and then, for several subsequent iterations, actively uses only the significant cases. We demonstrate that both methods can substantially reduce computational costs through their application to high-dimensional real-world and synthetic mixture modeling problems for large databases.
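The block-cycling idea behind the incremental EM approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: it fits a one-dimensional Gaussian mixture, where a partial E-step recomputes responsibilities for one block of cases at a time and each M-step combines that block's fresh sufficient statistics with the stored statistics of all other blocks. The function name `incremental_em`, the quantile-based initialization, and the fixed block size are illustrative assumptions.

```python
# Sketch of block-wise (incremental) EM for a 1-D Gaussian mixture.
# A partial E-step updates the sufficient statistics of one block only;
# the M-step then combines them with the stored statistics of the other
# blocks, so parameters improve after every block, not every full sweep.
import numpy as np

def incremental_em(x, k=2, block_size=100, n_passes=20):
    n = len(x)
    # Illustrative initialization: spread component means over data quantiles.
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    blocks = [slice(i, min(i + block_size, n)) for i in range(0, n, block_size)]
    # Per-block sufficient statistics (responsibility-weighted sums).
    S0 = np.zeros((len(blocks), k))  # sum_i r_ik
    S1 = np.zeros((len(blocks), k))  # sum_i r_ik * x_i
    S2 = np.zeros((len(blocks), k))  # sum_i r_ik * x_i^2
    for _ in range(n_passes):
        for b, sl in enumerate(blocks):
            xb = x[sl]
            # Partial E-step: responsibilities for this block only.
            log_p = (-0.5 * (xb[:, None] - mu) ** 2 / var
                     - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
            r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
            r /= r.sum(axis=1, keepdims=True)
            S0[b] = r.sum(axis=0)
            S1[b] = (r * xb[:, None]).sum(axis=0)
            S2[b] = (r * xb[:, None] ** 2).sum(axis=0)
            # M-step from the accumulated statistics of all blocks.
            T0, T1, T2 = S0.sum(axis=0), S1.sum(axis=0), S2.sum(axis=0)
            pi = T0 / T0.sum()
            mu = T1 / T0
            var = np.maximum(T2 / T0 - mu ** 2, 1e-6)
    return pi, mu, var
```

In practice the data should be shuffled first so that each block holds a representative mix of cases; the abstract's point is that the choice of `block_size` strongly influences efficiency, which this sketch exposes as a parameter.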