{"id":783,"date":"2015-08-21T09:09:34","date_gmt":"2015-08-21T09:09:34","guid":{"rendered":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/?p=783"},"modified":"2025-02-26T13:21:37","modified_gmt":"2025-02-26T13:21:37","slug":"frequentist-accuracy","status":"publish","type":"post","link":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2015\/08\/21\/frequentist-accuracy\/","title":{"rendered":"Frequentist accuracy"},"content":{"rendered":"<p>Every few weeks I\u00a0try to find\u00a0an hour or so to browse through the latest issues of the main statistics journals. This gives me\u00a0the opportunity\u00a0to\u00a0read papers on\u00a0topics that I know relatively little about; it\u00a0is\u00a0my way of keeping up to date with new developments\u00a0beyond my day to day work on genetic epidemiology. Of course,\u00a0it is only possible to read\u00a0a handful of these papers and one of the ways that I select those that I will read\u00a0is to follow a few well-known authors who are good at explaining their work to non-specialists.<\/p>\n<p>One of the authors\u00a0that I look out for\u00a0is Bradley Efron and a few weeks ago I\u00a0came across\u00a0one of his latest\u00a0papers. It has a Bayesian theme, so I thought that it would be worth mentioning in this blog. 
The reference is,<br \/>\n<strong>Efron B<\/strong><br \/>\n<strong> Frequentist accuracy of Bayesian estimates<\/strong><br \/>\n<strong><em> Journal of the Royal Statistical Society, Series B<\/em> 2015;77:617-646<\/strong><\/p>\n<p>The paper addresses the question, how do you assess the accuracy of a Bayesian parameter estimate when you use a convenience or reference prior?<\/p>\n<p>My short answer to this question is that you should not use a convenience or reference prior. In my opinion you should <span style=\"text-decoration: underline\">always<\/span> think about your priors and, if you know little about the parameters, choose a prior that is realistically vague, rather than relying on a standard zero-centred normal prior with a huge variance, or a Jeffreys prior, or whatever.<\/p>\n<p>Nonetheless there are plenty of people who advocate objective Bayesian methods (<a href=\"http:\/\/www.stat.cmu.edu\/~rsteorts\/btheory\/berger_objective_2006.pdf\">Berger 2006<\/a> is probably the best reference) and, since I do not accept the objective Bayesian argument, I feel compelled to try to be fair to them.<\/p>\n<p>Objective Bayes comes in many forms and there is an honourable tradition of trying to find priors that produce posterior distributions with good theoretical properties, often good frequentist properties. Unfortunately, this has proved very difficult and in practice what passes for objective Bayes may just be the ad hoc use of supposedly noninformative priors. These priors cannot be said to reflect anyone\u2019s real prior opinion, so the question for people who use them is, what do your results mean?<\/p>\n<p>The usual Bayesian interpretation of a 95% credible interval is that you believe that there is a 0.95 probability that the parameter lies in the interval. However, the calculation of the interval is conditional on both the model and the prior. 
If you do not believe in the prior, how can you make statements of belief based on the posterior?<\/p>\n<p>Ad hoc objective Bayesians can, of course, say that their analysis applies to a hypothetical individual who holds these reference beliefs, but often the objective priors are so extreme that it is hard to imagine any real person with that degree of ignorance.<\/p>\n<p>Alternatively, they can appeal to a type of robustness and say that their prior gives results very similar to other noninformative priors, and so they can claim to have a standard analysis that will apply to anyone who does not have strong prior knowledge. This would be fine were it not for the fact that robustness to the choice of prior is problem specific. In complex models, seemingly noninformative priors can influence the results. So to justify the robustness, the researchers really ought to consider a wide range of noninformative priors.<\/p>\n<p>Efron investigated a frequentist solution to assessing the accuracy of a Bayesian analysis. It would apply to any analysis, whether or not an objective prior was used, but its main application is likely to be in objective Bayesian analysis, where the justification for the prior lies in the good frequentist properties of the results that it produces.<\/p>\n<p>I am going to use a slightly different notation to Efron, so be warned.<\/p>\n<p>The idea is to estimate some parameter that interests us using its posterior mean; let\u2019s call it \u03b2(x) to emphasise that it is an estimate based on the real data, x. As part of the Bayesian analysis we will have specified a data generating model f(x|\u03b8), where in my notation \u03b8 is the full set of parameters, and a prior, \u03c0(\u03b8), that might or might not reflect someone\u2019s real beliefs. 
The Bayesian analysis also provides the posterior mean estimates \u03b8(x).<\/p>\n<p>So we could use the data generating model f(x*|\u03b8(x)) to simulate a new data set x*, and then analyse x* to obtain the posterior mean estimate of the important parameter, \u03b2(x*). Given enough computing power we could create thousands of simulated datasets and thousands of Bayesian estimates of \u03b2.<\/p>\n<p>A histogram of the \u03b2\u2019s would be an approximation to the sampling distribution of \u03b2, from which we could obtain the frequentist standard deviation, which in this context is usually called the standard error. Provided that the sampling distribution is shaped roughly like a normal distribution, the standard error will lead us to an approximate 95% confidence interval.<\/p>\n<p>The important point is that this 95% confidence interval will have the usual frequentist interpretation, i.e. 95% of such intervals would include the true \u03b2, and this interpretation does not depend on believing in the prior, \u03c0(\u03b8).<\/p>\n<p style=\"text-align: left\">As you might imagine, this process would be computationally very demanding, so Efron provides a short cut in the form of an approximation to the standard error. He shows that the variance of the sampling distribution is approximately<br \/>\n<strong>Cov(\u03b2,\u03b1)\u2032 V(\u00b5|\u03b8) Cov(\u03b2,\u03b1)<\/strong><br \/>\nand the standard error is the square root of this variance.<\/p>\n<p>In this formula, V(\u00b5|\u03b8) is the posterior covariance matrix of \u00b5, the predicted values of x, and \u03b1 captures the sensitivity of the fit to small changes in the data through the log of the density; it is the vector of derivatives,<br \/>\n<strong>d log f(x|\u03b8)\/dx<sub>i<\/sub><\/strong><\/p>\n<p>This is not a trivial calculation. 
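Stated as an algorithm, though, the approximation is easy to sketch. Below is a minimal illustration in Python rather than Stata; the function name efron_se and the array layout are my own inventions, not Efron's. It assumes we already hold posterior draws of \u03b2, of the gradient vector \u03b1, and of the predicted values \u00b5.

```python
import numpy as np

def efron_se(beta, alpha, mu):
    # beta:  (S,)   posterior draws of the scalar estimand beta
    # alpha: (S, n) draws of the gradient d log f(x|theta)/dx_i
    # mu:    (S, n) draws of the predicted data values mu
    S = beta.shape[0]
    # V(mu|theta): n x n posterior covariance matrix of mu
    V = np.atleast_2d(np.cov(mu, rowvar=False))
    # Cov(beta, alpha): n-vector of posterior covariances
    c = (beta - beta.mean()) @ (alpha - alpha.mean(axis=0)) / (S - 1)
    # delta-method variance: Cov(beta,alpha)' V(mu|theta) Cov(beta,alpha)
    return float(np.sqrt(c @ V @ c))
```

With exact posterior covariances this reproduces the formula above; with MCMC draws every covariance is itself subject to Monte Carlo error, so a long, well-mixed chain is needed.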
If there are n observations and one \u03b2, the matrix V will be n\u00d7n and Cov(\u03b2,\u03b1) will be n\u00d71.<\/p>\n<p>As you would expect in a paper by Bradley Efron, he goes on to consider using the ideas of bootstrapping to improve the confidence interval, and he also discusses some simplifications that result when the data generating model is a member of the exponential family.<\/p>\n<p>Let\u2019s see how we might find the approximate standard error in Stata.<\/p>\n<p>In the article Efron presents data from an experiment in which mouse nuclei were introduced into colonies of human cells. Batches of colonies were infused with one of five proportions of mouse nuclei and left for between 1 and 5 days. Each colony was eventually classified as \u2018thriving\u2019 or not, and the data consist of the number thriving out of the number of colonies.<\/p>\n<table border=\"2\">\n<tbody>\n<tr>\n<td width=\"103\"><\/td>\n<td style=\"text-align: left\" width=\"103\">Day D=1<\/td>\n<td style=\"text-align: left\" width=\"103\">D=2<\/td>\n<td style=\"text-align: left\" width=\"103\">D=3<\/td>\n<td style=\"text-align: left\" width=\"103\">D=4<\/td>\n<td style=\"text-align: left\" width=\"103\">D=5<\/td>\n<\/tr>\n<tr>\n<td width=\"103\">Proportion I=1<\/td>\n<td width=\"103\">5\/31<\/td>\n<td width=\"103\">3\/28<\/td>\n<td width=\"103\">20\/45<\/td>\n<td width=\"103\">24\/47<\/td>\n<td width=\"103\">29\/35<\/td>\n<\/tr>\n<tr>\n<td width=\"103\">I=2<\/td>\n<td width=\"103\">15\/77<\/td>\n<td width=\"103\">36\/78<\/td>\n<td width=\"103\">43\/71<\/td>\n<td width=\"103\">56\/71<\/td>\n<td width=\"103\">66\/74<\/td>\n<\/tr>\n<tr>\n<td width=\"103\">I=3<\/td>\n<td width=\"103\">48\/126<\/td>\n<td width=\"103\">68\/116<\/td>\n<td width=\"103\">145\/171<\/td>\n<td width=\"103\">98\/119<\/td>\n<td width=\"103\">114\/129<\/td>\n<\/tr>\n<tr>\n<td width=\"103\">I=4<\/td>\n<td width=\"103\">29\/92<\/td>\n<td width=\"103\">35\/52<\/td>\n<td width=\"103\">57\/85<\/td>\n<td width=\"103\">38\/50<\/td>\n<td width=\"103\">72\/77<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left\" width=\"103\">I=5<\/td>\n<td style=\"text-align: left\" width=\"103\">11\/53<\/td>\n<td style=\"text-align: left\" width=\"103\">20\/52<\/td>\n<td style=\"text-align: left\" width=\"103\">20\/48<\/td>\n<td style=\"text-align: left\" width=\"103\">40\/55<\/td>\n<td style=\"text-align: left\" width=\"103\">56\/61<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>I entered these data into Stata as 4 columns of 25 rows and called the columns I, D, x and n. Then I saved the data in a file called <strong>cell.dta<\/strong> and plotted the percentages that thrived.<\/p>\n<p><span style=\"color: #0000ff\">gen p=100*x\/n<\/span><\/p>\n<p><span style=\"color: #0000ff\">twoway (line p I if D==1) (line p I if D==2) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">    (line p I if D==3) (line p I if D==4) (line p I if D==5) , legend(ring(0) pos(5) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">    label(1 \"Day 1\") label(2 \"Day 2\") label(3 \"Day 3\") label(4 \"Day 4\") label(5 \"Day 5\") ) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">    xtitle(\"Proportion I\") ytitle(\"Percentage thriving\")<\/span><\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/percentbyI2-e1439379289728.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-806\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/percentbyI2-e1439379289728.png\" alt=\"percentbyI2\" width=\"600\" height=\"439\" \/><\/a><\/p>\n<p>Perhaps it would be better to plot the logit of the 
proportion thriving rather than the percentage, but that plot looks almost identical and the percentage scale is easier to understand. There seems to be some justification for including a quadratic term in the proportion I, but a similar plot against day shows no evidence of a quadratic term.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/percentbyD-e1439379420449.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-807\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/percentbyD-e1439379420449.png\" alt=\"percentbyD\" width=\"600\" height=\"439\" \/><\/a><\/p>\n<p>Despite this, Efron uses a binomial model with probability p<sub>jk<\/sub>, where<br \/>\n<strong>logit(p<sub>jk<\/sub>) = \u03b8<sub>0<\/sub> + \u03b8<sub>1<\/sub>I + \u03b8<sub>2<\/sub>I<sup>2<\/sup> + \u03b8<sub>3<\/sub>D + \u03b8<sub>4<\/sub>D<sup>2<\/sup><\/strong><br \/>\nj denotes the proportion (1 to 5) and k denotes the day.<br \/>\nFor illustration I will adopt the same structure.<\/p>\n<p>Efron supposes that our interest is in <strong>\u03b2 = Average(p<sub>j5<\/sub>)\/Average(p<sub>j1<\/sub>)<\/strong>, that is, the ratio of the average probability of thriving on day 5 to the average probability on day 1.<\/p>\n<p>Now we will do what I consider to be the wrong thing and use a prior that I do not believe in. Let\u2019s choose \u03b8<sub>i<\/sub> ~ N(0, sd=100).<\/p>\n<p>This is a simple model to fit in Stata. 
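For readers who do not use Stata, the log posterior can be sketched in Python; the function name and argument layout are my own (x and n are the observed and total counts, I and D the proportion and day columns), and, as in the Stata version, the additive constant log C(n, x) is dropped.

```python
import numpy as np

def log_posterior(theta, I, D, x, n):
    # theta: the five parameters (theta0..theta4)
    t0, t1, t2, t3, t4 = theta
    # linear predictor of the logit model
    eta = t0 + t1*I + t2*I**2 + t3*D + t4*D**2
    # binomial log-likelihood, dropping the constant log C(n, x)
    loglik = np.sum(x*eta - n*np.log1p(np.exp(eta)))
    # independent N(0, sd=100) priors: log density is -theta_i^2/20000 + const
    logprior = -np.dot(theta, theta)/20000.0
    return loglik + logprior
```

The prior term matches the Stata program that follows: with sd = 100, the normal log density contributes minus theta squared over 2 times 100 squared, i.e. over 20000.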
The log posterior can be calculated by,<br \/>\n<span style=\"color: #0000ff\">program logpost<\/span><br \/>\n<span style=\"color: #0000ff\">   args lnf b<\/span><br \/>\n<span style=\"color: #0000ff\">   local theta0 = `b'[1,1]<\/span><br \/>\n<span style=\"color: #0000ff\">   local theta1 = `b'[1,2]<\/span><br \/>\n<span style=\"color: #0000ff\">   local theta2 = `b'[1,3]<\/span><br \/>\n<span style=\"color: #0000ff\">   local theta3 = `b'[1,4]<\/span><br \/>\n<span style=\"color: #0000ff\">   local theta4 = `b'[1,5]<\/span><br \/>\n<span style=\"color: #0000ff\">   tempvar eta lnL<\/span><br \/>\n<span style=\"color: #0000ff\">   gen `eta' = `theta0' + `theta1'*I + `theta2'*I*I + `theta3'*D + `theta4'*D*D<\/span><br \/>\n<span style=\"color: #0000ff\">   gen `lnL' = x*`eta' - n*log(1+exp(`eta'))<\/span><br \/>\n<span style=\"color: #0000ff\">   qui su `lnL'<\/span><br \/>\n<span style=\"color: #0000ff\">   scalar `lnf' = r(sum) - (`theta0'*`theta0'+`theta1'*`theta1'+ \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">        `theta2'*`theta2'+`theta3'*`theta3'+`theta4'*`theta4')\/20000<\/span><br \/>\n<span style=\"color: #0000ff\">end<\/span><br \/>\nand the model can be fitted using mcmcrun. Notice the centring of I and D; this gives an equivalent model, but the reparameterization reduces the correlation between the parameters and so improves the mixing.<\/p>\n<p><span style=\"color: #0000ff\">use cell.dta, clear<\/span><\/p>\n<p><span style=\"color: #0000ff\">replace I = I - 2.5<br \/>\nreplace D = D - 2.5<\/span><\/p>\n<p><span style=\"color: #0000ff\">matrix b = (0.0, 0.0, 0.0, 0.0, 0.0)<\/span><br \/>\n<span style=\"color: #0000ff\">matrix s = (0.05, 0.05, 0.05, 0.05, 0.05)<\/span><br \/>\n<span style=\"color: #0000ff\">mcmcrun logpost b using temp.csv, replace samplers( 5(mhsnorm , sd(s)) ) adapt \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">    param(theta0 theta1 theta2 theta3 theta4) burn(5000) update(20000) thin(10)<\/span><br \/>\n<span style=\"color: #0000ff\">matrix list s<\/span><br \/>\n<span style=\"color: #0000ff\">insheet using temp.csv, clear<\/span><br \/>\n<span style=\"color: #0000ff\">mcmcstats_v2 theta* , newey(50) corr<\/span><br \/>\n<span style=\"color: #0000ff\">mcmctrace theta*<\/span><\/p>\n<pre>----------------------------------------------------------------------------\r\nParameter         n     mean       sd      sem   median         95% 
CrI\r\n----------------------------------------------------------------------------\r\n theta0        2000    0.583    0.090   0.0027    0.582 (   0.407,   0.759 )\r\n theta1        2000    0.478    0.060   0.0020    0.478 (   0.362,   0.597 )\r\n theta2        2000   -0.316    0.036   0.0012   -0.316 (  -0.388,  -0.245 )\r\n theta3        2000    0.792    0.049   0.0012    0.793 (   0.696,   0.891 )\r\n theta4        2000   -0.041    0.034   0.0009   -0.041 (  -0.108,   0.028 )\r\n----------------------------------------------------------------------------<\/pre>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/efrontrace-e1439451573449.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-809\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/efrontrace-e1439451573449.png\" alt=\"efrontrace\" width=\"600\" height=\"439\" \/><\/a><\/p>\n<p>We can use these results to calculate the ratio beta. First I read the data and saved the n\u2019s as locals; then I re-read the MCMC results and calculated the fitted probabilities from the logit model, together with the expected numbers thriving. 
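The same post-processing step can be sketched in Python for comparison (beta_from_draws is a hypothetical helper of my own; theta is assumed to be an array of posterior draws on the centred scale used above):

```python
import numpy as np

def beta_from_draws(theta, I0=2.5, D0=2.5):
    # theta: (S, 5) array of posterior draws of theta0..theta4,
    # fitted with I and D centred by subtracting 2.5
    props = np.arange(1, 6) - I0    # centred proportion levels j = 1..5
    days = np.arange(1, 6) - D0     # centred days k = 1..5
    # eta[s, j, k]: linear predictor for proportion j, day k under draw s
    eta = (theta[:, 0, None, None]
           + theta[:, 1, None, None]*props[None, :, None]
           + theta[:, 2, None, None]*props[None, :, None]**2
           + theta[:, 3, None, None]*days[None, None, :]
           + theta[:, 4, None, None]*days[None, None, :]**2)
    p = 1.0/(1.0 + np.exp(-eta))    # invlogit
    # ratio of the average day-5 probability to the average day-1 probability
    return p[:, :, 4].mean(axis=1) / p[:, :, 0].mean(axis=1)
```

Each posterior draw of theta yields one draw of beta, so the returned vector can be summarised exactly as in the Stata code that follows.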
Beta is just a function of those probabilities.<\/p>\n<p><span style=\"color: #0000ff\">use cell.dta, clear<\/span><br \/>\n<span style=\"color: #0000ff\">forvalues i=1\/25 {<\/span><br \/>\n<span style=\"color: #0000ff\">   local n`i' = n[`i']<\/span><br \/>\n<span style=\"color: #0000ff\">}<\/span><br \/>\n<span style=\"color: #0000ff\">insheet using temp.csv, clear<\/span><\/p>\n<p><span style=\"color: #0000ff\">local k = 0<\/span><br \/>\n<span style=\"color: #0000ff\">forvalues i=1\/5 {<\/span><br \/>\n<span style=\"color: #0000ff\">   forvalues d=1\/5 {<\/span><br \/>\n<span style=\"color: #0000ff\">      local ++k<\/span><br \/>\n<span style=\"color: #0000ff\">      gen eta`k' = theta0+theta1*(`i'-2.5)+theta2*(`i'-2.5)*(`i'-2.5)+ \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">          theta3*(`d'-2.5)+theta4*(`d'-2.5)*(`d'-2.5)<\/span><br \/>\n<span style=\"color: #0000ff\">      gen p`k' = invlogit(eta`k')<\/span><br \/>\n<span style=\"color: #0000ff\">      gen mu`k' = `n`k''*p`k'<\/span><br \/>\n<span style=\"color: #0000ff\">   }<\/span><br \/>\n<span style=\"color: #0000ff\">}<\/span><br \/>\n<span style=\"color: #0000ff\">gen beta = (p5+p10+p15+p20+p25)\/(p1+p6+p11+p16+p21)<\/span><br \/>\n<span style=\"color: #0000ff\">mcmcstats_v2 beta, newey(50)<\/span><\/p>\n<p><span style=\"color: #0000ff\">mcmcdensity beta<\/span><\/p>\n<pre>----------------------------------------------------------------------------\r\nParameter         n     mean       sd      sem   median         95% 
CrI\r\n----------------------------------------------------------------------------\r\n beta          2000    3.367    0.261   0.0057    3.356 (   2.896,   3.938 )\r\n----------------------------------------------------------------------------<\/pre>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/efronbetaden-e1439451778910.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-810\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/efronbetaden-e1439451778910.png\" alt=\"efronbetaden\" width=\"600\" height=\"439\" \/><\/a><\/p>\n<p>So, had we believed in the prior, we could now say that we believe with probability 0.95 that beta lies between 2.90 and 3.94. But we do not really believe in the prior, so we need some other way of expressing the results.<\/p>\n<p>In order to study the frequentist properties of our analysis, we need to calculate the covariance matrix V and the vector of 25 derivatives with respect to each of the x\u2019s.<\/p>\n<p>For the binomial model the log-likelihood has the form<br \/>\nSum<sub>j,k<\/sub> {x<sub>jk<\/sub>log(p<sub>jk<\/sub>) + (n<sub>jk<\/sub>-x<sub>jk<\/sub>)log(1-p<sub>jk<\/sub>) }<br \/>\nA little algebra shows that in terms of \u03b8 this becomes<br \/>\nSum<sub>j,k<\/sub> {x<sub>jk<\/sub>[\u03b8<sub>0<\/sub> + \u03b8<sub>1<\/sub>j + \u03b8<sub>2<\/sub>j<sup>2<\/sup> + \u03b8<sub>3<\/sub>k + \u03b8<sub>4<\/sub>k<sup>2<\/sup>] - n<sub>jk<\/sub>log(1+exp[\u03b8<sub>0<\/sub> + \u03b8<sub>1<\/sub>j + \u03b8<sub>2<\/sub>j<sup>2<\/sup> + \u03b8<sub>3<\/sub>k + \u03b8<sub>4<\/sub>k<sup>2<\/sup>]) }<br \/>\nIf we differentiate with respect to x<sub>jk<\/sub> we get<br \/>\n\u03b1<sub>jk<\/sub> = \u03b8<sub>0<\/sub> + \u03b8<sub>1<\/sub>j + \u03b8<sub>2<\/sub>j<sup>2<\/sup> + \u03b8<sub>3<\/sub>k + 
\u03b8<sub>4<\/sub>k<sup>2<\/sup><br \/>\nNow we are in a position to find V and the covariances, and to combine them to get the approximate standard error.<\/p>\n<p><span style=\"color: #0000ff\">matrix accum V = mu*, nocon dev<\/span><br \/>\n<span style=\"color: #0000ff\">matrix V = V\/2000<\/span><\/p>\n<p><span style=\"color: #0000ff\">matrix COV = J(1,25,0)<\/span><br \/>\n<span style=\"color: #0000ff\">forvalues i=1\/25 {<\/span><br \/>\n<span style=\"color: #0000ff\">   qui corr beta eta`i' , cov<\/span><br \/>\n<span style=\"color: #0000ff\">   matrix COV[1,`i'] = r(cov_12)<\/span><br \/>\n<span style=\"color: #0000ff\">}<\/span><\/p>\n<p><span style=\"color: #0000ff\">matrix VAR = COV*V*COV'<\/span><br \/>\n<span style=\"color: #0000ff\">di sqrt(VAR[1,1])<\/span><\/p>\n<p>The answer is that the approximate standard error is 0.252.<\/p>\n<p>You will have noticed that this standard error is very close to the standard deviation of the posterior, which was 0.261.<\/p>\n<p>The slightly disappointing thing about this method is that, in most cases, the frequentist standard error will turn out to be almost identical to the posterior standard deviation. So the slightly ignorant approach of treating the credible interval as interchangeable with a confidence interval will often work.<\/p>\n<p>Nonetheless, it seems to me that deriving the frequentist properties of a Bayesian solution offers the only practical hope for objective Bayesian analysis. Of course, it is simpler and better to use priors that you actually believe in.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Every few weeks I\u00a0try to find\u00a0an hour or so to browse through the latest issues of the main statistics journals. 
This gives me\u00a0the opportunity\u00a0to\u00a0read papers on\u00a0topics that I know relatively little about; it\u00a0is\u00a0my way of keeping up to date with new developments\u00a0beyond my day to day work on genetic epidemiology. Of course,\u00a0it is only possible [&hellip;]<\/p>\n","protected":false},"author":134,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[78,79,76,77],"class_list":["post-783","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-confidence-interval","tag-efron","tag-objective-bayes","tag-standard-error"],"_links":{"self":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/783","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/comments?post=783"}],"version-history":[{"count":20,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/783\/revisions"}],"predecessor-version":[{"id":955,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/783\/revisions\/955"}],"wp:attachment":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/media?parent=783"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/categories?post=783"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/tags?post=783"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}