{"id":172,"date":"2014-06-13T09:58:39","date_gmt":"2014-06-13T09:58:39","guid":{"rendered":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/?p=172"},"modified":"2025-02-26T13:21:38","modified_gmt":"2025-02-26T13:21:38","slug":"monitoring-convergence-in-high-dimensions","status":"publish","type":"post","link":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2014\/06\/13\/monitoring-convergence-in-high-dimensions\/","title":{"rendered":"Monitoring Convergence in High Dimensions"},"content":{"rendered":"<p>In Chapter 5 of \u2018<em>Bayesian Analysis with Stata<\/em>\u2019 I discussed methods for monitoring the convergence of a set of MCMC simulations. Obviously it is important to demonstrate the effectiveness of different approaches to the assessment of convergence using output from real MCMC analyses, but often\u00a0real problems\u00a0are difficult to judge because we\u00a0do not\u00a0know the exact truth behind the convergence of the chain. What is more, with real chains\u00a0it can be\u00a0difficult to control the specific convergence characteristics that one wants to investigate.<\/p>\n<p>An alternative way to judge a method for monitoring convergence is to use an artificial chain with a known structure, and in the book I used artificially generated one-dimensional series to illustrate some of the standard methods for assessing convergence. It is relatively straightforward to generalize that approach into higher dimensions and I will discuss that approach here.<\/p>\n<p>Our problem is to simulate an autocorrelated\u00a0chain for a multidimensional set of parameters that together have a specified posterior distribution. A simple method for doing this is to make the target distribution a mixture of multivariate normals. 
We could\u00a0generate the chain by running a chosen set of MCMC\u00a0samplers with a specified target posterior that does not depend on any data, but in my\u00a0method I simulate autocorrelated points directly from the target distribution.<\/p>\n<p>For illustration, I ran a simulation of a chain of length 1000 with 3 parameters. In this case, the\u00a0simulated posterior is\u00a0composed of a single multivariate normal distribution with\u00a0mean (5,5,5) and covariance matrix (2,2,2\\2,3,3\\2,3,4) but the chain starts with a poor initial guess of (4,10,0). The other\u00a0feature that must be specified is the autocorrelation between successive points, which I set to be quite high at 0.95.<\/p>\n<p>Here is a plot of the simulations. The first 100 points are shown in red and are joined in the order in which they were generated and the remainder, which represent the chain after the early drift, are shown in blue. We can see the impact of the\u00a0poor initial values\u00a0most clearly in the plot of theta2 against theta3.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/convplot.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-176\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/convplot-1024x745.png\" alt=\"convplot\" width=\"620\" height=\"451\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/convplot-1024x745.png 1024w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/convplot-300x218.png 300w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/convplot.png 1028w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>The plot is also quite instructive in that it shows an unexpected blue tail of low values that was generated later in the chain. We are used to seeing random samples from a distribution in which occasional extreme values are generated but\u00a0such extremes\u00a0tend to be isolated. 
In a chain with high autocorrelation,\u00a0one extreme value is likely to be followed by another. This is why highly correlated chains need to be run for so long even after they have recovered from the initial values.<\/p>\n<p>We might imagine that this early drift would be obvious in the marginal trace plots but in fact\u00a0it is\u00a0not that clear.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/trace.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-177\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/trace-1024x745.png\" alt=\"trace\" width=\"620\" height=\"451\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/trace-1024x745.png 1024w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/trace-300x218.png 300w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/trace.png 1028w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>The early drift can be detected once we know that it is there and the blue tail is evident at around iteration 600, though not if we look at theta3.\u00a0Using these trace plots alone,\u00a0I doubt whether many people would have spotted the convergence\u00a0problems.<\/p>\n<p>In the book I tried to stress the importance of thinking about convergence\u00a0as a multi-dimensional issue and as an example of the type of methods that could be used I suggested plotting the Mahalanobis\u00a0distance. This means plotting the Mahalanobis distance of the three dimensional points from the mean of the chain. 
The plot produced by <strong>mcmcmahal<\/strong> is,<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/mahal.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-179\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/mahal-1024x745.png\" alt=\"mahal\" width=\"620\" height=\"451\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/mahal-1024x745.png 1024w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/mahal-300x218.png 300w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/mahal.png 1028w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>I think that the features of the chain are clearer in this plot.<\/p>\n<p>Next week I will return to this issue and say a little more about how\u00a0my simulation program works and illustrate some convergence plots created from simulated posteriors\u00a0with more complex shape. Meanwhile if you want to try these methods for yourself you can download the Stata\u00a0<a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/06\/code.pdf\">code<\/a>\u00a0from here.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In Chapter 5 of \u2018Bayesian Analysis with Stata\u2019 I discussed methods for monitoring the convergence of a set of MCMC simulations. 
Obviously it is important to demonstrate the effectiveness of different approaches to the assessment of convergence using output from real MCMC analyses, but often\u00a0real problems\u00a0are difficult to judge because we\u00a0do not\u00a0know the exact truth [&hellip;]<\/p>\n","protected":false},"author":134,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[23,22],"class_list":["post-172","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-convergence","tag-mcmc"],"_links":{"self":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/172","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/comments?post=172"}],"version-history":[{"count":4,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/172\/revisions"}],"predecessor-version":[{"id":181,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/172\/revisions\/181"}],"wp:attachment":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/media?parent=172"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/categories?post=172"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/tags?post=172"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}