I first learned this solution from Botond Cseke. I'm not sure where it originates; it is essentially Laplace's method for approximating integrals with a Gaussian distribution, where the parameters of the Gaussian may come from any of a number of approximate inference approaches.
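To make the Laplace idea concrete, here is a minimal one-dimensional sketch of my own, not the solution discussed in this post. It approximates $\log \int e^{f(x)}\,dx$ by fitting a Gaussian at the mode of $f$. The function name `laplace_log_integral`, the Newton mode search, and the test integrand $f(x) = n \log x - x$ (whose exact integral over $x > 0$ is $n!$) are all illustrative assumptions.

```python
import math


def laplace_log_integral(f, df, d2f, x_init, n_steps=50):
    """Approximate log of the integral of exp(f(x)) dx for scalar x.

    Hypothetical helper: f, df, d2f are the log-integrand and its first
    and second derivatives; x_init must be close enough to the mode for
    Newton's method to converge.
    """
    x = x_init
    for _ in range(n_steps):
        x = x - df(x) / d2f(x)  # Newton step toward the mode of f
    curvature = -d2f(x)  # precision of the fitted Gaussian (must be > 0)
    # Gaussian integral: exp(f(x*)) * sqrt(2 * pi / curvature), in log form
    return f(x) + 0.5 * math.log(2.0 * math.pi / curvature)


# Assumed example: f(x) = n*log(x) - x, so the exact answer is log(n!)
n = 10
approx = laplace_log_integral(
    f=lambda x: n * math.log(x) - x,
    df=lambda x: n / x - 1.0,
    d2f=lambda x: -n / x**2,
    x_init=5.0,
)
exact = math.lgamma(n + 1)  # log(n!)
print(approx, exact)
```

For this integrand the approximation reduces to Stirling's formula, so the two printed values should agree closely but not exactly. In the inference setting, the mode and curvature would instead be supplied by whatever approximate inference method produced the Gaussian.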
If I have a Bayesian statistical model with hyperparameters $\Theta$ and no closed-form posterior, how can I optimize $\Theta$?