Two-Timescale Stochastic EM Algorithms
Belhal Karimi, Ping Li (Baidu Research, United States)
D2-S6-T3: Estimation & Learning
||Tuesday, 13 July, 23:40 - 00:00
||Wednesday, 14 July, 00:00 - 00:20
The Expectation-Maximization (EM) algorithm is a popular choice for learning latent variable models. Variants of EM were introduced by Neal and Hinton (1998), who used incremental updates to scale to large datasets, and by Wei and Tanner (1990) and Delyon et al. (1999), who used Monte Carlo (MC) approximations to bypass the conditional expectation of the latent data, which is intractable for most nonconvex models. In this paper, we propose a general class of methods, called Two-Timescale EM methods, based on a two-stage scheme of stochastic updates to tackle the essential nonconvex optimization task of fitting latent variable models. We motivate this two-timescale design by the variance reduction that each stage provides on one of the two sources of noise: the index sampling of the incremental update and the MC approximation. We establish finite-time and global convergence bounds for nonconvex objective functions. Numerical applications to models such as a deformable template model for image analysis and Gaussian Mixture Models illustrate our findings.
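To make the two-timescale idea concrete, below is a minimal sketch of such a loop on a toy problem: a one-dimensional two-component Gaussian mixture with unit variances and equal weights, where only the means are estimated. The specific stepsize schedules (gamma for the fast MC-smoothing stage, rho for the slow incremental stage), the variable names, and the per-sample statistics are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two-component 1-D Gaussian mixture, unit variances,
# equal weights; only the means are estimated.
n = 1000
true_means = np.array([-2.0, 2.0])
x = true_means[rng.integers(0, 2, size=n)] + rng.standard_normal(n)

K, M, T = 2, 5, 20000           # components, MC samples per step, iterations
mu = np.array([-0.5, 0.5])      # initial means (illustrative)
s1 = np.zeros((n, K))           # per-sample smoothed responsibilities (fast)
s2 = np.zeros((n, K))           # per-sample smoothed weighted data (fast)
S1 = np.full(K, 1.0 / K)        # global sufficient statistics (slow)
S2 = mu * S1

for k in range(1, T + 1):
    gamma = 1.0 / k ** 0.6      # fast stepsize: smooths the MC noise (assumed schedule)
    rho = 1.0 / k ** 0.9        # slow stepsize: incremental global update (assumed schedule)
    i = rng.integers(n)         # index sampling (the incremental-update noise source)

    # MC approximation of the E-step: sample latent labels z ~ p(. | x_i, mu)
    logp = -0.5 * (x[i] - mu) ** 2
    p = np.exp(logp - logp.max())
    p /= p.sum()
    z = rng.choice(K, size=M, p=p)
    mc1 = np.bincount(z, minlength=K) / M   # MC responsibility estimate

    # Fast timescale: reduce the MC-approximation noise for sample i.
    s1[i] += gamma * (mc1 - s1[i])
    s2[i] += gamma * (mc1 * x[i] - s2[i])

    # Slow timescale: reduce the index-sampling noise in the global statistics.
    S1 += rho * (s1[i] - S1)
    S2 += rho * (s2[i] - S2)

    # M-step: closed-form update of the means from the running statistics.
    mu = S2 / np.maximum(S1, 1e-8)

print("estimated means:", np.sort(mu))
```

Each stage averages out one noise source: the gamma update smooths the Monte Carlo estimate of the per-sample statistics, while the rho update averages the randomly indexed per-sample statistics into the global ones used by the M-step.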