{"id":90,"date":"2014-04-11T11:32:31","date_gmt":"2014-04-11T11:32:31","guid":{"rendered":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/?p=90"},"modified":"2025-02-26T13:21:39","modified_gmt":"2025-02-26T13:21:39","slug":"setting-a-wishart-prior","status":"publish","type":"post","link":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2014\/04\/11\/setting-a-wishart-prior\/","title":{"rendered":"Setting a Wishart prior"},"content":{"rendered":"<p>When I\u00a0started this blog I\u00a0decided to try and post once a week for the first few months and so\u00a0I has to plan\u00a0ahead. I found that there were many topics that I wanted to cover that involved multivariate normal models for which the conjugate prior on the precision matrix is a Wishart distribution. Now I have already explained (<a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2014\/03\/14\/why-i-dont-use-winbugs-priors\/\" target=\"_blank\">Why I don\u2019t use WinBUGS priors<\/a>) that I am not a fan of very vague priors, so\u00a0over the coming weeks I will need to set informative Wishart priors and I thought that I would dedicate this posting to the general problem of\u00a0specifying this type of\u00a0prior.<\/p>\n<p>To illustrate the problem, I performed a small experiment in which I generated some random values from a two dimensional Wishart distribution. Here is the code from a Stata do file that does the job. The program is written using Stata commands but it calls WinBUGS to perform the simulation. The program uses several of the downloadable wbs programs that are described in the book \u2018<em>Bayesian Analysis with Stata<\/em>\u2019 that will be published in a few weeks time.<\/p>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>wbsmodel wishartsim.do model.txt<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\/*<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>model {<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0\u00a0\u00a0 T[1:2,1:2] ~ dwish(R[1:2,1:2],DF)<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>}<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>*\/<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>local DF = 5<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>matrix B = (1, 0.5 \\ 0.5, 1)<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>matrix T = syminv(B)<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>matrix R = `DF'*B<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>wbslist (matrix R) (DF=`DF') using data.txt, replace<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>wbslist (matrix=T) using init.txt , replace<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>wbsscript using script.txt , replace set(T) update(1000) \/\/\/<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 model(model.txt) data(data.txt) init(init.txt) coda(out) \/\/\/<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 log(log.txt)<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>wbsrun using script.txt<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>type log.txt<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>wbscoda using out , clear<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>gen v1 = .<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>gen v2 = .<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>gen rho = .<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>forvalues i=1\/1000 {<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0\u00a0 matrix T[1,1] = T_1_1[`i']<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 <\/b><b>matrix T[1,2] = T_1_2[`i']<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0\u00a0 matrix T[2,1] = T_2_1[`i']<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 matrix T[2,2] = T_2_2[`i']<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 matrix V = syminv(T)<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 <\/b><b>qui replace v1 = V[1,1] in `i'<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 qui replace v2 = V[2,2] in `i'<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>\u00a0\u00a0 <\/b><b>qui replace rho = V[1,2]\/sqrt(V[1,1]*V[2,2]) in `i'<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>}<\/b><\/span><\/pre>\n<pre style=\"padding-left: 60px\"><span style=\"color: #0000ff\"><b>histogram rho , start(-1) width(0.05)<\/b><\/span><\/pre>\n<p>T is the precision matrix (inverse of the covariance matrix) that follows the Wishart distribution and R and DF are the two parameters that the user has to choose when setting a prior. WinBUGS is used to simulate 1000 random matrices and then each one is inverted and the variances and correlation are extracted. The command wbsmodel was discussed in one of\u00a0my previous postings\u00a0under the title\u00a0<a title=\"records\" href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2014\/03\/21\/keeping-good-records\/\">keeping good records<\/a>.<\/p>\n<p>Keep in mind that there are two commonly used parameterizations of the Wishart. I have followed WinBUGS in parameterizing in terms of a matrix R and a scalar DF, but many people parameterize in terms of the inverse of R and the scalar DF.<\/p>\n<p>I ran the program with different values for the degrees of freedom, DF, and matrix R=DF*(1,0.5.5,1) and then I created the following histograms of the simulated correlations,<b> rho<\/b>. Mathematically, DF cannot be less than the dimension of the precision matrix, so I start with DF=2. <b><\/b><\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_rho.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-91 aligncenter\" alt=\"wish_rho\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_rho.png\" width=\"573\" height=\"417\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_rho.png 716w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_rho-300x218.png 300w\" sizes=\"auto, (max-width: 573px) 100vw, 573px\" \/><\/a><\/p>\n<p>Clearly, the larger the parameter DF, the stronger is the user\u2019s prior. The basis matrix from which R is formed, i.e. B=(1, 0.5\\ 0.5, 1), is a guess at the covariance matrix, so in this case we start with the belief that the variances are one and the covariance is 0.5, which implies that <b>rho<\/b> is also 0.5. We can see that as DF increases so our beliefs concentrate around that value. However, the parameter that must be set in the Wishart is not B itself but R=DF*B.<\/p>\n<p>I also plotted histograms for the variances of the first variable and as before the distribution of prior beliefs concentrates around the value in the basis matrix, in this case converging to 1. I have truncated the xaxis of these histograms at 20 and in the case of DF=2 this has excluded a few extremely large values. People who favour vague priors need to keep this in mind because when DF is low, the sampler will try some very extreme (perhaps unbelievable) variances and these could cause numerical problems.<\/p>\n<p align=\"center\"><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_var.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-92\" alt=\"wish_var\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_var.png\" width=\"573\" height=\"417\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_var.png 716w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_var-300x218.png 300w\" sizes=\"auto, (max-width: 573px) 100vw, 573px\" \/><\/a><\/p>\n<p>Finally we need to be aware that the simulated variances and correlations will not be independent. Here is a plot of the log-variance against correlation for DF=5.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_lowess.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-93 aligncenter\" alt=\"wish_lowess\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_lowess.png\" width=\"573\" height=\"417\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_lowess.png 716w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish_lowess-300x218.png 300w\" sizes=\"auto, (max-width: 573px) 100vw, 573px\" \/><\/a><\/p>\n<p>As the lowess trend line emphasises, when rho is close to 1 or -1 the variance is noticeably larger than it is in the centre of its range. Some people have taken this pattern as grounds for criticising the use of the use of the Wishart but for me, the important point is that seeking independence is a bit like seeking a uniform prior for one of the parameters; even is you achieve it, it only applies to that particular parameterization and a transformation to a new scale will destroy the uniformity and\/or the independence. So if we were to find a distribution that made the correlation and variance independent, this would still leave us with a relationship between say, the variance and covariance.<\/p>\n<p>The real question is, does the pattern of the Wishart distribution reasonably approximate your actual prior beliefs? if not, then find another distribution but don\u2019t expect to find one where everything is independent and uniformly distributed.<\/p>\n<p>One of the difficulties with the Wishart is that there is only one parameter to describe the strength of our beliefs and this has to apply to the whole matrix, so we cannot be more certain about some variances than others. Instead we are forced to compromise.<\/p>\n<p>Imagine that we have decided that we want to centre our beliefs on B=(1, 0.5\\ 0.5, 1) and now we want to choose DF. If we believe that the variances will, almost certainly, be less than 5 then our extreme case is Bmax=(5, 2.5\\ 2.5, 5). This implies precision matrices T=inv(B) = (1.33, -0.66\\-0.66, 1.33) and Tmin= inv(Bmax) = (0.27, -0.13\\-0.13, 0.27).<\/p>\n<p>The range of precisions between the centre and the minimum\u00a0is 1.33 to 0.27, so the standard deviation of the precision must be about half that range, or 0.55. According to the delta method, the standard deviation of a precision is\u00a0approximately mean(precision)*sqrt(2\/DF), so 0.55=1.33sqrt(2\/DF) and we need to choose DF to be about 12.<\/p>\n<p>Here are the simulations of a Wishart distribution with B=(1, 0.5\\ 0.5, 1)\u00a0and DF=12. The variance has the property that we wanted and we would have to be willing to accept the consequential prior on the correlation.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish12.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-94 aligncenter\" alt=\"wish12\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish12.png\" width=\"493\" height=\"359\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish12.png 1028w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish12-300x218.png 300w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/04\/wish12-1024x745.png 1024w\" sizes=\"auto, (max-width: 493px) 100vw, 493px\" \/><\/a><\/p>\n<p>If in doubt, simulate some data and do the experiment.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When I\u00a0started this blog I\u00a0decided to try and post once a week for the first few months and so\u00a0I has to plan\u00a0ahead. I found that there were many topics that I wanted to cover that involved multivariate normal models for which the conjugate prior on the precision matrix is a Wishart distribution. Now I have [&hellip;]<\/p>\n","protected":false},"author":134,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[14,4,5,13],"class_list":["post-90","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-prior","tag-stata","tag-winbugs","tag-wishart"],"_links":{"self":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/90","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/comments?post=90"}],"version-history":[{"count":10,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/90\/revisions"}],"predecessor-version":[{"id":105,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/90\/revisions\/105"}],"wp:attachment":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/media?parent=90"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/categories?post=90"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/tags?post=90"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}