{"id":451,"date":"2015-02-06T09:52:28","date_gmt":"2015-02-06T09:52:28","guid":{"rendered":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/?p=451"},"modified":"2025-02-26T13:21:38","modified_gmt":"2025-02-26T13:21:38","slug":"poisson-regression-with-two-random-effects","status":"publish","type":"post","link":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2015\/02\/06\/poisson-regression-with-two-random-effects\/","title":{"rendered":"Poisson regression with two random effects"},"content":{"rendered":"<p><strong>Here we go again<\/strong><\/p>\n<p>There has been quite a long gap since my last post because over the New Year and for a large part of January I was visiting my daughter who is working in Colombia. It was my first time in South America and I was immediately\u00a0struck by the beauty and the vitality of the country. It was a refreshing change from visiting continental Europe, which appears tired and lacking in any sense of optimism. Columbia is definitely a country that is moving forward.<\/p>\n<p>After such a gap it is natural to look back over last year\u2019s postings and to think a little about the direction that the blog should take in the future. I notice that there has been a tendency for the topics that I cover to become more and more complex as time progresses. This is perhaps natural but it goes away from my reasons for writing the book. <em>Bayesian Analysis with Stata<\/em> was meant to be a guide to Bayesian analysis for a typical Stata user, by which I mean someone who is a user of statistical techniques but who is not a research statistician. I will try to correct the balance over the coming year by including more practical applications.<\/p>\n<p><strong>Downloading my programs and data<\/strong><\/p>\n<p>One of the problems that I had in the first year was that of making the software and data that I refer to available to readers. 
The blog itself can link to pdf files but not to files with any other format, so I can\u2019t use it to make available the do, ado or dta files that I refer to. To make matters worse, my University has adopted a web protocol that Stata cannot read, so I cannot post the code on my website and allow users to <strong>net install<\/strong> it in Stata.<\/p>\n<p>There does not seem to be a totally satisfactory solution to this problem, but I have decided to add a page to my personal homepage (<a href=\"https:\/\/www2.le.ac.uk\/Members\/trj\">https:\/\/www2.le.ac.uk\/Members\/trj<\/a>) called <strong>blog downloads<\/strong> and I will use this page to store zipped versions of the programs and data that I refer to. You will not be able to access them directly from within Stata as I would wish, but it will be a small job to download and unpack the files. As time allows I will\u00a0go back over last year&#8217;s postings and add the code from those postings to the same webpage.<\/p>\n<p><strong>A Bayesian GLM with random effects<\/strong><\/p>\n<p>In the spirit of starting the year with something\u00a0straightforward I thought that I would take a simple Bayesian model and fit it using Stata, Mata and then WinBUGS in order to re-establish the general approach. I have chosen to use the well-known epilepsy data of Thall and Vail and to fit a Poisson regression with\u00a0two random effects. Of course, nothing is quite as simple as you expect and it\u00a0turns out that\u00a0this example throws up several interesting points, not least how best to program random effects in Stata.<\/p>\n<p><strong>The Model<\/strong><\/p>\n<p>The epilepsy data were introduced by Thall and Vail in 1990 (<a href=\"http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/2242408\">http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/2242408<\/a>) and come from a drug trial on 59 patients with epilepsy. The response is the count of the number of seizures that each subject has in each of 4 consecutive 2-week periods. 
The explanatory variables are the treatment, the total number of seizures in an 8-week period prior to treatment and\u00a0the subject\u2019s age. Because so many people have looked at these data, the form of the Poisson regression has become standardised and I will not attempt to revisit the form but just accept,<br \/>\n<span style=\"color: #0000ff\">log(\u03bc) = \u03b20 + \u03b21log(baseline\/4) + \u03b22trt + \u03b23log(age) + \u03b24trt*log(baseline\/4) + \u03b25v4 + u + e<\/span><br \/>\nwhere v4 is an indicator for the fourth visit, u is a subject-specific random effect and e is a random effect with a different value for each visit of each subject. So u can be thought of as allowing for the unmeasured characteristics of the subject and e as allowing for the unmeasured events occurring to that person in that particular two-week period.<\/p>\n<p>Most\u00a0Bayesian analyses of these data\u00a0have used very vague priors but, since I do not really approve of\u00a0such priors, I will opt for what I like to call realistically vague priors. Let us assume that prior to the trial we anticipate that the subject will have around one seizure per day, so the two-week counts will be about 14. If we centre the covariates, then we would expect the constant to be about log(14)=2.6. I would anticipate that the coefficient of the baseline term, log(baseline\/4), will be about\u00a01.0 since this would\u00a0imply no change between baseline and the trial.\u00a0Finally I will centre the priors of the other regression coefficients on zero because I cannot confidently predict their direction of effect.<\/p>\n<p>Assuming that people vary, so that some average only one seizure\u00a0every two weeks and some average 2 per day, log(\u03bc) will range from 0 to 3.5, a variance of about 0.8 (sd about 0.9). If the linear predictor explains half of this variance then the random effects will have variances of about 0.2 each, which implies a standard deviation of 0.45. 
These of course are guesses, but the standard deviations will not be 4.5 and are unlikely to be 0.045, so there is little point in making our priors ridiculously wide.<\/p>\n<p>In the end I decided on,<br \/>\n<span style=\"color: #0000ff\">Cons: \u03b20 ~ N(2.6,1)<\/span><br \/>\n<span style=\"color: #0000ff\">log(base\/4): \u03b21 ~ N(1,1)<\/span><br \/>\n<span style=\"color: #0000ff\">Trt: \u03b22 ~ N(0,1)<\/span><br \/>\n<span style=\"color: #0000ff\">log(age): \u03b23 ~ N(0,1)<\/span><br \/>\n<span style=\"color: #0000ff\">trt*log(base\/4): \u03b24 ~ N(0,1)<\/span><br \/>\n<span style=\"color: #0000ff\">V4: \u03b25 ~ N(0,1)<\/span><br \/>\n<span style=\"color: #0000ff\">Precision of u: \u03c4u ~ G(8,0.5)<\/span><br \/>\n<span style=\"color: #0000ff\">Precision of e: \u03c4e ~ G(8,0.5)<\/span><\/p>\n<p>In fact my guesses are a little off target: the data suggest that typically people in the trial have about 1 seizure every two days, not one every day, and the range is greater than I allowed for; some subjects have hardly any seizures and the upper limit is higher than I anticipated, with a few having over 50 within a single two-week period. Of course, I cannot change\u00a0my priors in the light of the data.<\/p>\n<p><strong>Calculating the posterior<\/strong><\/p>\n<p>Fitting this model is computationally quite challenging because, as well as the 6 regression coefficients, \u03b2, we also have two variances and (4&#215;59+59=295) values for the two random effects. Computation would be simpler if we were able to integrate the posterior over the 295 values of the random effects, but with a Poisson model this integration is not tractable, so either we do it numerically or we estimate the 295 values as part of the analysis. 
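As an aside, it is easy to check what the G(8,0.5) priors on the precisions imply for the random-effect standard deviations. The Python sketch below is my own illustration, not part of the analysis, and it assumes the shape/scale parameterisation of the gamma distribution, so that G(8,0.5) has mean 8*0.5 = 4:

```python
import math
import random

# Draw precisions tau ~ Gamma(shape=8, scale=0.5) and look at the
# implied random-effect standard deviation 1/sqrt(tau).
random.seed(2015)
taus = [random.gammavariate(8, 0.5) for _ in range(100_000)]
sds = sorted(1 / math.sqrt(t) for t in taus)

mean_tau = sum(taus) / len(taus)
lo = sds[int(0.025 * len(sds))]   # 2.5% point of the prior on sd
hi = sds[int(0.975 * len(sds))]   # 97.5% point of the prior on sd
print(f'mean precision = {mean_tau:.2f}')
print(f'95% prior interval for sd: ({lo:.2f}, {hi:.2f})')
```

The prior mean precision comes out close to 4 and the implied standard deviation sits roughly between 0.4 and 0.75, nicely bracketing the guess of 0.45 while ruling out 4.5 and 0.045.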
The computation will be a particular challenge for Stata because it is slow relative to Mata and WinBUGS.<\/p>\n<p><strong>Analysis in Stata<\/strong><\/p>\n<p>The standard way\u00a0that I use for\u00a0fitting a Bayesian model in Stata is to place the values of all of the parameters in a row matrix. In this case, were we to pack all 303 parameters into a matrix, we would expect the resulting program to be slowed down by the\u00a0repeated need to unpack the matrix. Probably we are not interested in the 295 values of the random effects, and we would have integrated them out of the posterior had that been simple, so a convenient alternative is to place the 6 regression coefficients and the two precisions in a matrix but to place the values of the random effects in Stata variables. Only the first 8 parameters will be saved in the results file, but the coding can be made much more efficient with vectors instead of matrices; this speeds up the program and makes for simpler code.<\/p>\n<p>In the file <strong>epilepsy.dta<\/strong>, the data are stored in seven variables with one line for each person<br \/>\n<span style=\"color: #0000ff\">y_1, y_2, y_3, y_4 = the counts for each of the four two-week periods<\/span><br \/>\n<span style=\"color: #0000ff\">Base = the count for the baseline period<\/span><br \/>\n<span style=\"color: #0000ff\">Trt = the treatment coded 0,1<\/span><br \/>\n<span style=\"color: #0000ff\">Age = subject\u2019s age in years<\/span><\/p>\n<p>Let us start by centring the 
data. This should improve mixing.<br \/>\n<span style=\"color: #0000ff\">use epilepsy.dta , clear<\/span><br \/>\n<span style=\"color: #0000ff\">gen LB = log(Base\/4)<\/span><br \/>\n<span style=\"color: #0000ff\">qui su LB<\/span><br \/>\n<span style=\"color: #0000ff\">gen cLB = LB - r(mean)<\/span><br \/>\n<span style=\"color: #0000ff\">gen lnAge = log(Age)<\/span><br \/>\n<span style=\"color: #0000ff\">qui su lnAge<\/span><br \/>\n<span style=\"color: #0000ff\">gen clnAge = lnAge - r(mean)<\/span><br \/>\n<span style=\"color: #0000ff\">gen BXT = LB*Trt<\/span><br \/>\n<span style=\"color: #0000ff\">qui su BXT<\/span><br \/>\n<span style=\"color: #0000ff\">qui gen cBXT = BXT - r(mean)<\/span><br \/>\n<span style=\"color: #0000ff\">qui su Trt<\/span><br \/>\n<span style=\"color: #0000ff\">gen cTrt = Trt - r(mean)<\/span><\/p>\n<p>Next we will define variables to contain the 295 random effects.\u00a0U will contain the 59 subject-level effects and the E\u2019s will contain the random effects for periods within subject. I&#8217;ll fill them with random starting values.<br \/>\n<span style=\"color: #0000ff\">gen U = rnormal(0,0.5)<\/span><br \/>\n<span style=\"color: #0000ff\">gen E1 = rnormal(0,0.5)<\/span><br \/>\n<span style=\"color: #0000ff\">gen E2 = rnormal(0,0.5)<\/span><br \/>\n<span style=\"color: #0000ff\">gen E3 = rnormal(0,0.5)<\/span><br \/>\n<span style=\"color: #0000ff\">gen E4 = rnormal(0,0.5)<\/span><\/p>\n<p>We will need a program to evaluate the log-posterior. 
The program below does the job although I have made no attempt to make it efficient.<br \/>\n<span style=\"color: #0000ff\">program logpost<br \/>\nargs logp b<\/span><\/p>\n<p><span style=\"color: #0000ff\">local cons = `b'[1,1]<\/span><br \/>\n<span style=\"color: #0000ff\">local base = `b'[1,2]<\/span><br \/>\n<span style=\"color: #0000ff\">local trt = `b'[1,3]<\/span><br \/>\n<span style=\"color: #0000ff\">local bxt = `b'[1,4]<\/span><br \/>\n<span style=\"color: #0000ff\">local age = `b'[1,5]<\/span><br \/>\n<span style=\"color: #0000ff\">local v4 = `b'[1,6]<\/span><br \/>\n<span style=\"color: #0000ff\">local tU = `b'[1,7]<\/span><br \/>\n<span style=\"color: #0000ff\">local tE = `b'[1,8]<\/span><br \/>\n<span style=\"color: #0000ff\">scalar `logp' = 0<\/span><br \/>\n<span style=\"color: #0000ff\">tempvar eta mu1 mu2 mu3 mu4 LL<\/span><br \/>\n<span style=\"color: #008000\">* log-likelihood of y_1 to y_4<\/span><br \/>\n<span style=\"color: #0000ff\">gen `eta' = `cons' + `base'*cLB + `trt'*cTrt + `bxt'*cBXT + `age'*clnAge<\/span><br \/>\n<span style=\"color: #0000ff\">gen `mu1' = exp(`eta' + U + E1)<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity poisson `logp' y_1 `mu1'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `mu2' = exp(`eta' + U + E2)<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity poisson `logp' y_2 `mu2'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `mu3' = exp(`eta' + U + E3)<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity poisson `logp' y_3 `mu3'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `mu4' = exp(`eta' + `v4' + U + E4)<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity poisson `logp' y_4 `mu4'<\/span><br \/>\n<span style=\"color: #008000\">* log-density of the random 
effects<\/span><br \/>\n<span style=\"color: #0000ff\">local sdE = 1\/sqrt(`tE')<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' E1 0 `sdE'<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' E2 0 `sdE'<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' E3 0 `sdE'<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' E4 0 `sdE'<\/span><br \/>\n<span style=\"color: #0000ff\">local sdU = 1\/sqrt(`tU')<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' U 0 `sdU'<\/span><br \/>\n<span style=\"color: #008000\">* priors<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' `cons' 2.6 1<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' `base' 1 1<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' `trt' 0 1<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' `bxt' 0 1<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' `age' 0 1<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity normal `logp' `v4' 0 1<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity gamma `logp' `tU' 8 0.5<\/span><br \/>\n<span style=\"color: #0000ff\">logdensity gamma `logp' `tE' 8 0.5<\/span><br \/>\n<span style=\"color: #0000ff\">end<\/span><\/p>\n<p>As we will be working with some parameters in vectors rather than matrices, it makes sense to write our own sampler, which I will call mhsEPY because it is based on a Metropolis-Hastings algorithm. 
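Before turning to the sampler, the logic of logpost can be cross-checked with a short Python sketch. This is my own illustration, not part of the Stata code: it evaluates the same log-posterior up to additive constants (so it mirrors what logdensity accumulates rather than reproducing its exact values), with the counts in a list y and the random effects in lists U and E:

```python
import math

def log_poisson(y, mu):
    # Poisson log-density, dropping the constant -log(y!)
    return -mu + y * math.log(mu)

def log_normal(x, m, sd):
    # normal log-density, dropping the constant -0.5*log(2*pi)
    return -math.log(sd) - 0.5 * ((x - m) / sd) ** 2

def log_gamma(x, shape, scale):
    # Gamma(shape, scale) log-density, dropping the normalising constant
    return (shape - 1) * math.log(x) - x / scale

def logpost(b, U, E, y, cLB, cTrt, cBXT, clnAge):
    cons, base, trt, bxt, age, v4, tauU, tauE = b
    sdU, sdE = 1 / math.sqrt(tauU), 1 / math.sqrt(tauE)
    lp = 0.0
    for i in range(len(y)):                  # loop over subjects
        eta = (cons + base * cLB[i] + trt * cTrt[i]
               + bxt * cBXT[i] + age * clnAge[i])
        for j in range(4):                   # visits; v4 enters only at visit 4
            mu = math.exp(eta + (v4 if j == 3 else 0.0) + U[i] + E[i][j])
            lp += log_poisson(y[i][j], mu)
            lp += log_normal(E[i][j], 0.0, sdE)
        lp += log_normal(U[i], 0.0, sdU)
    # priors on the regression coefficients and the two precisions
    for coef, m in zip((cons, base, trt, bxt, age, v4), (2.6, 1, 0, 0, 0, 0)):
        lp += log_normal(coef, m, 1.0)
    lp += log_gamma(tauU, 8, 0.5) + log_gamma(tauE, 8, 0.5)
    return lp
```

A check like this is reassuring because moving any parameter far from its prior, or inflating the constant so that the fitted means dwarf the counts, should visibly lower the returned value.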
Let us reverse the natural order and see how it is called before we look at its contents.<\/p>\n<p><span style=\"color: #0000ff\">matrix b = (2.6, 1, 0, 0, 0, 0, 4, 4)<\/span><br \/>\n<span style=\"color: #0000ff\">mcmcrun logpost b using mcmcepi.csv , \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0 param(cons base trt bxt age v4 tauU tauE) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0 samp( mhsEPY , dim(8) ) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0 burn(5000) update(50000) thin(5) replace<\/span><\/p>\n<p>The matrix b only contains 8 parameters, so those are the only ones that are saved to the file <strong>mcmcepi.csv<\/strong>. We tell <strong>mcmcrun<\/strong> to update using mhsEPY and we say that it will update 8 parameters, while in fact it updates those 8 and then updates the 295 random effects. If you like, you can think of the updating of the random effects as a form of Monte Carlo numerical integration.<\/p>\n<p>In my programs, samplers are called with three arguments: the name of the program for evaluating the log-posterior, the name of the matrix of parameters and the number of the first parameter to be updated at that stage. 
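Every update in the sampler that follows, whether done by mhsnorm, mhslogn or the inline code for the random effects, uses the same random-walk Metropolis rule: propose a normal step and accept when the log of a uniform draw falls below the change in the log-posterior. A minimal Python sketch of that rule (my own illustration), checked against a standard normal target:

```python
import math
import random

def mh_step(x, logp, sd):
    # One random-walk Metropolis update: propose x + N(0, sd) and
    # accept when log(u) < logp(new) - logp(old).
    new = x + random.gauss(0, sd)
    if math.log(random.random()) < logp(new) - logp(x):
        return new
    return x

# Check the sampler on a target we know: a standard normal,
# whose log-density is -x^2/2 up to a constant.
random.seed(42)
x, draws = 0.0, []
for _ in range(20_000):
    x = mh_step(x, lambda v: -0.5 * v * v, 1.0)
    draws.append(x)

mean = sum(draws) / len(draws)
var = sum(d * d for d in draws) / len(draws) - mean ** 2
print(f'mean = {mean:.3f}, variance = {var:.3f}')
```

With a proposal sd of 1 the sampled mean and variance come out close to 0 and 1; this is the kind of sanity check worth running before tuning proposal standard deviations such as the sd(0.1) and sd(0.3) used below.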
Here is my new sampler,<\/p>\n<p><span style=\"color: #0000ff\">program mhsEPY<\/span><br \/>\n<span style=\"color: #0000ff\">args logpost b ipar<\/span><\/p>\n<p><span style=\"color: #0000ff\">local cons = `b'[1,1]<\/span><br \/>\n<span style=\"color: #0000ff\">local base = `b'[1,2]<\/span><br \/>\n<span style=\"color: #0000ff\">local trt = `b'[1,3]<\/span><br \/>\n<span style=\"color: #0000ff\">local bxt = `b'[1,4]<\/span><br \/>\n<span style=\"color: #0000ff\">local age = `b'[1,5]<\/span><br \/>\n<span style=\"color: #0000ff\">local v4 = `b'[1,6]<\/span><br \/>\n<span style=\"color: #0000ff\">local tauU = `b'[1,7]<\/span><br \/>\n<span style=\"color: #0000ff\">local tauE = `b'[1,8]<\/span><br \/>\n<span style=\"color: #008000\">*--------------------<\/span><br \/>\n<span style=\"color: #008000\">* Update first 8 parameters<\/span><br \/>\n<span style=\"color: #008000\">*--------------------<\/span><br \/>\n<span style=\"color: #0000ff\">mhsnorm `logpost' `b' 1, sd(0.1)<br \/>\nmhsnorm `logpost' `b' 2, sd(0.1)<br \/>\nmhsnorm `logpost' `b' 3, sd(0.25)<br \/>\nmhsnorm `logpost' `b' 4, sd(0.1)<br \/>\nmhsnorm `logpost' `b' 5, sd(0.1)<br \/>\nmhsnorm `logpost' `b' 6, sd(0.1)<br \/>\nmhslogn `logpost' `b' 7, sd(0.3)<br \/>\nmhslogn `logpost' `b' 8, sd(0.3)<\/span><\/p>\n<p><span style=\"color: #008000\">*--------------------<\/span><br \/>\n<span style=\"color: #008000\">* Update E1, E2, E3, E4<\/span><br \/>\n<span style=\"color: #008000\">*--------------------<\/span><br \/>\n<span style=\"color: #0000ff\">tempvar eta nu1 nu2 nu3 nu4 LL1 LL2 LL3 LL4 lu1 lu2 lu3 lu4 \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">newE1 newE2 newE3 newE4 nu1p nu2p 
nu3p nu4p newLL1 newLL2 newLL3 newLL4<\/span><br \/>\n<span style=\"color: #0000ff\">gen `eta' = `cons' + `base'*cLB + `trt'*cTrt + `bxt'*cBXT + `age'*clnAge<\/span><br \/>\n<span style=\"color: #0000ff\">gen `nu1' = `eta' + U + E1<\/span><br \/>\n<span style=\"color: #0000ff\">gen `LL1' = -exp(`nu1') + y_1*`nu1' - 0.5*E1*E1*`tauE'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newE1' = E1 + rnormal(0,0.1)<\/span><br \/>\n<span style=\"color: #0000ff\">gen `nu1p' = `eta' + U + `newE1'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newLL1' = -exp(`nu1p') + y_1*`nu1p' - 0.5*`newE1'*`newE1'*`tauE'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `lu1' = log(runiform())<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace E1 = `newE1' if `lu1' &lt; `newLL1' - `LL1'<\/span><\/p>\n<p><span style=\"color: #0000ff\">gen `nu2' = `eta' + U + E2<\/span><br \/>\n<span style=\"color: #0000ff\">gen `LL2' = -exp(`nu2') + y_2*`nu2' - 0.5*E2*E2*`tauE'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newE2' = E2 + rnormal(0,0.1)<\/span><br \/>\n<span style=\"color: #0000ff\">gen `nu2p' = `eta' + U + `newE2'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newLL2' = -exp(`nu2p') + y_2*`nu2p' - 0.5*`newE2'*`newE2'*`tauE'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `lu2' = log(runiform())<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace E2 = `newE2' if `lu2' &lt; `newLL2' - `LL2'<\/span><\/p>\n<p><span style=\"color: #0000ff\">gen `nu3' = `eta' + U + E3<\/span><br \/>\n<span style=\"color: #0000ff\">gen `LL3' = -exp(`nu3') + y_3*`nu3' - 0.5*E3*E3*`tauE'<\/span><br \/>\n<span 
style=\"color: #0000ff\">gen `newE3' = E3 + rnormal(0,0.1)<\/span><br \/>\n<span style=\"color: #0000ff\">gen `nu3p' = `eta' + U + `newE3'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newLL3' = -exp(`nu3p') + y_3*`nu3p' - 0.5*`newE3'*`newE3'*`tauE'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `lu3' = log(runiform())<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace E3 = `newE3' if `lu3' &lt; `newLL3' - `LL3'<\/span><\/p>\n<p><span style=\"color: #0000ff\">gen `nu4' = `eta' + `v4' + U + E4<\/span><br \/>\n<span style=\"color: #0000ff\">gen `LL4' = -exp(`nu4') + y_4*`nu4' - 0.5*E4*E4*`tauE'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newE4' = E4 + rnormal(0,0.1)<\/span><br \/>\n<span style=\"color: #0000ff\">gen `nu4p' = `eta' + `v4' + U + `newE4'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newLL4' = -exp(`nu4p') + y_4*`nu4p' - 0.5*`newE4'*`newE4'*`tauE'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `lu4' = log(runiform())<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace E4 = `newE4' if `lu4' &lt; `newLL4' - `LL4'<\/span><br \/>\n<span style=\"color: #008000\">*--------------------<\/span><br \/>\n<span style=\"color: #008000\">* Update U<\/span><br \/>\n<span style=\"color: #008000\">*--------------------<\/span><br \/>\n<span style=\"color: #0000ff\">tempvar nu LL newU newLL lu<\/span><br \/>\n<span style=\"color: #0000ff\">gen `nu' = `eta' + U + E1<\/span><br \/>\n<span style=\"color: #0000ff\">gen `LL' = -exp(`nu') + y_1*`nu' - 0.5*U*U*`tauU'<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace 
`nu' = `eta' + U + E2<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `LL' = `LL' - exp(`nu') + y_2*`nu'<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `nu' = `eta' + U + E3<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `LL' = `LL' - exp(`nu') + y_3*`nu'<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `nu' = `eta' + `v4' + U + E4<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `LL' = `LL' - exp(`nu') + y_4*`nu'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newU' = U + rnormal(0,0.1)<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `nu' = `eta' + `newU' + E1<\/span><br \/>\n<span style=\"color: #0000ff\">gen `newLL' = -exp(`nu') + y_1*`nu' - 0.5*`newU'*`newU'*`tauU'<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `nu' = `eta' + `newU' + E2<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `newLL' = `newLL' - exp(`nu') + y_2*`nu'<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `nu' = `eta' + `newU' + E3<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `newLL' = `newLL' - exp(`nu') + y_3*`nu'<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `nu' = `eta' + `v4' + `newU' + E4<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace `newLL' = `newLL' - exp(`nu') + y_4*`nu'<\/span><br \/>\n<span style=\"color: #0000ff\">gen `lu' = log(runiform())<\/span><br \/>\n<span style=\"color: #0000ff\">qui replace U = `newU' if `lu' &lt; `newLL' - `LL'<\/span><br \/>\n<span style=\"color: #0000ff\">end<\/span><\/p>\n<p>Each of the parameters is updated in an M-H step so we have to find appropriate proposal 
distributions that produce good mixing and this will mean some trial and error while we play with the proposal standard deviations. In the code we take advantage of the conditional independence of the random effects, which means that we can run the updating calculations for the whole vector in one go.<\/p>\n<p>I chose to run a burn-in of 5,000 and a run of 50,000 thinned to 10,000. So I needed 55,000 updates of the 303 parameters and in Stata this is not quick. On my desktop it took about 42 minutes, which equates to about 46 seconds for every 1,000 updates. I could probably save a lot of time by replacing the calls to <strong>logdensity<\/strong> with code for the normal and gamma log-densities but even then the code would still be quite slow and probably we should be using Mata or WinBUGS for such a large problem.<\/p>\n<p>The trace plot is not very impressive<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/01\/trace.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-454\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/01\/trace-1024x753.png\" alt=\"trace\" width=\"620\" height=\"456\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/01\/trace-1024x753.png 1024w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/01\/trace-300x221.png 300w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>And this is reflected in the large standard errors on the baseline, the treatment effect and interaction between them.<\/p>\n<pre>-----------------------------------------------------------\r\n Parameter n    mean   sd    sem   median     95% CrI\r\n ----------------------------------------------------------\r\n cons   10000  1.588 0.081 0.0037  1.589 ( 1.427, 1.738 )\r\n base   10000  0.910 0.155 0.0145  0.898 ( 0.651, 1.251 )\r\n trt    10000 -0.791 0.419 0.0526 -0.823 ( -1.540, 0.039 )\r\n bxt    10000  0.257 0.219 0.0417  0.262 ( -0.194, 0.657 
)\r\n age    10000  0.363 0.335 0.0175  0.366 ( -0.294, 1.003 )\r\n v4     10000 -0.091 0.091 0.0022 -0.090 ( -0.268, 0.089 )\r\n sigmaU 10000  0.500 0.056 0.0010  0.495 ( 0.403, 0.622 )\r\n sigmaE 10000  0.401 0.034 0.0009  0.401 ( 0.336, 0.471 )\r\n ----------------------------------------------------------\r\n<\/pre>\n<p>In this table sigma=1\/sqrt(tau).<\/p>\n<p>Clearly this Stata program needs more work:<br \/>\n(a) to make the code more efficient so that it runs faster<br \/>\n(b) to improve the mixing of the sampler<\/p>\n<p>If this were a real problem rather than an illustrative example I would probably abandon Stata at this point and switch to Mata. However, that is a bit defeatist, so in my next posting I will return to the Stata code and try to improve it.<\/p>\n<p>The data and programs for this analysis can be downloaded from <a href=\"https:\/\/www2.le.ac.uk\/Members\/trj\">https:\/\/www2.le.ac.uk\/Members\/trj<\/a>\u00a0by following the link to <strong>blog downloads<\/strong>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here we go again There has been quite a long gap since my last post because over the New Year and for a large part of January I was visiting my daughter who is working in Colombia. 
It was my first time in South America and I was immediately\u00a0struck by the beauty and the vitality [&hellip;]<\/p>\n","protected":false},"author":134,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[46,47,4],"class_list":["post-451","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-poisson","tag-random-effects","tag-stata"],"_links":{"self":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/451","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/comments?post=451"}],"version-history":[{"count":11,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/451\/revisions"}],"predecessor-version":[{"id":476,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/451\/revisions\/476"}],"wp:attachment":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/media?parent=451"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/categories?post=451"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/tags?post=451"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}