{"id":417,"date":"2014-11-28T09:17:47","date_gmt":"2014-11-28T09:17:47","guid":{"rendered":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/?p=417"},"modified":"2025-02-26T13:21:38","modified_gmt":"2025-02-26T13:21:38","slug":"modelling-heart-biopsies","status":"publish","type":"post","link":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2014\/11\/28\/modelling-heart-biopsies\/","title":{"rendered":"Modelling heart biopsies"},"content":{"rendered":"<p>Last week I introduced the JAGS program as an alternative to WinBUGS and this week I\u00a0started with the intention of comparing JAGS and WinBUGS using a sample dataset. I decided to base the comparison on the biopsy example\u00a0taken from\u00a0the WinBUGS help files.<\/p>\n<p>Predictably, by the time that I had explained the model and fitted it by calling WinBUGS from Stata, I had already exceeded the quota of words that I allow myself each week, so I have split the posting in two. This week I will analyse the biopsy data in Stata by calling WinBUGS and next time I will make the\u00a0comparison with JAGS.<\/p>\n<p><strong>The example: Heart Biopsies<\/strong><\/p>\n<p>These data are supplied with WinBUGS as one of its examples and come from an early study of biopsies taken from transplanted hearts. The data\u00a0were originally analysed in,<\/p>\n<p>Spiegelhalter DJ and Stovin PG<br \/>\nAn analysis of repeated biopsies following cardiac transplantation.<br \/>\n<em>Statistic in Medicine<\/em> 1983 2(1):33-40<\/p>\n<p>During the first year after a heart transplant, patients have regular biopsies to check for rejection. Biopsies are done under local anaesthetic and involve passing a wire through a vein until it reaches the heart. Once there, 4 or 5 tiny tissue samples are taken from different parts of the\u00a0transplanted heart and later these are graded by a pathologist for cellular changes that are indicative of rejection. Rejection grading is usually done on a four point scale, none-mild-moderate-severe. A moderate or severe grade would probably lead to an adjustment of the patient&#8217;s immunosuppression treatment.<\/p>\n<p>At the time of this particular study the practice was to take 3 biopsies but, for a few subjects only 2 usable samples were obtained. In all, the dataset contains grades for 2 or 3 biopsies from 157 hearts. A typical data entry might be (1,2,0,0) meaning that there were 3 biopsies taken from this heart and one biopsy\u00a0was graded &#8216;none&#8217; and two were graded &#8216;mild&#8217;.<\/p>\n<p>Cellular changes related to rejection are not uniform across the heart and a serious concern is the false negative rate whereby a heart is being rejected but by chance the biopsies are all taken from healthy areas of tissue; several biopsies are taken at the same time in order to minimise this risk. The overall state of a heart is defined by the grade of its worst affected region (not the worst biopsy). So when three biopsies are taken from a &#8216;moderate&#8217; heart, the results might be (mild,mild,moderate) or (mild,moderate,moderate) or even (none,mild,mild) if the &#8216;moderate&#8217; region is missed.<\/p>\n<p>A model for these data has two components, a set of probabilities p1, p2, p3, p4 representing the proportion of hearts that are none, mild, moderate or severe and a\u00a0matrix of classification error probabilities, e[,], connecting the heart\u2019s grade with the\u00a0grade of a single biopsy.<\/p>\n<p><span style=\"color: #008080\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Biopsy Grade\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Constraint<\/span><br \/>\n<span style=\"color: #008080\"> Grade of Heart\u00a0\u00a0\u00a0\u00a0 None\u00a0\u00a0\u00a0\u00a0 Mild\u00a0\u00a0\u00a0\u00a0 Moderate\u00a0\u00a0\u00a0 Severe<\/span><br \/>\n<span style=\"color: #008080\"> none\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 1\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0<\/span><br \/>\n<span style=\"color: #008080\"> mild\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e21\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e22\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e21+e22=1<\/span><br \/>\n<span style=\"color: #008080\"> moderate\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e31\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e32\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e33\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0e31+e32+e33=1<\/span><br \/>\n<span style=\"color: #008080\"> severe\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e41\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0e42\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e43\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 e44\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0e41+e42+e43+e44=1<\/span><\/p>\n<p>Obviously a heart that is mildly affected could never produce a severely affected biopsy, so there are constraints on the e\u2019s and there are only six free parameters. We will make one further, rather strong, assumption, which is that the three biopsies are independent of one another given the overall grade of the heart.\u00a0The under this model\u00a0the probability of the data entry (1,2,0,0) is p2*3*e21*e22<sup>2<\/sup> + p3*3*e31*e32<sup>2<\/sup> + p4*3*e41*e42<sup>2<\/sup><\/p>\n<p>It would be an interesting exercise to try and set meaningful priors on the p&#8217;s and the e&#8217;s but for the purposes of this exercise I will follow the WinBUGS manual and use flat Dirichlet priors over all of the sets of probabilities that must sum to one.<\/p>\n<p>Here is the WinBUGS model file for this problem<\/p>\n<p><strong><span style=\"color: #0000ff\">model<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\"> {<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\"> for (i in 1:ns){<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 niops[i] &lt;- sum(biopsies[i, ])<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 true[i] ~ dcat(p[]) <\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 biopsies[i, 1:4] ~ dmulti(error[true[i], ], nbiops[i])<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 }<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\"> error[2,1:2] ~ ddirch(prior[1:2])<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\"> error[3,1:3] ~ ddirch(prior[1:3])<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\"> error[4,1:4] ~ ddirch(prior[1:4])<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\"> p[1:4] ~ ddirch(prior[]);<\/span><\/strong><br \/>\n<strong><span style=\"color: #0000ff\"> }<\/span><\/strong><\/p>\n<p>In this program, ns=157, biopsies[157,4] is the data matrix, true[157] is the vector of actual (unobserved) heart grades, p[4] is the vector of heart grade probabilities for this population, error[4,4] is the matrix of biopsy grade probabilities conditional on the true heart grade and prior[4] is a matrix of prior parameters. Notice that WinBUGS uses the same priors for p[] and for each row of error[]. This only makes sense because flat priors are used for all of the sets of probabilities and prior[] is set to (1,1,1,1) in the data file.<\/p>\n<p>Now let\u2019s see how\u00a0WinBUGS handles the structural zeros in the values of error[]. Specifying this matrix correctly is essential if the model is to fit. Part of error[] is structurally known and so needs to appear in the data file and part of error[] is to be estimated and so it also needs to appear in the initial values file. In each file the remaining part is replaced by a missing value, NA.<\/p>\n<p>So in the data file we need<br \/>\n1\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0 0<br \/>\nNA NA\u00a0\u00a0 0\u00a0\u00a0\u00a0\u00a0\u00a0 0<br \/>\nNA NA\u00a0 NA\u00a0\u00a0\u00a0 0<br \/>\nNA NA\u00a0 NA NA<\/p>\n<p>The initial values must fill in the missing spaces and an obvious choice would be<br \/>\nNA\u00a0\u00a0\u00a0\u00a0\u00a0 NA\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 NA\u00a0\u00a0\u00a0 NA<br \/>\n0.5\u00a0\u00a0\u00a0\u00a0\u00a0 0.5\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 NA\u00a0\u00a0\u00a0 NA<br \/>\n0.333 0.333 0.333\u00a0 NA<br \/>\n0.25\u00a0\u00a0\u00a0 0.25\u00a0\u00a0 0.25\u00a0\u00a0\u00a0 0.25<\/p>\n<p>I copied and pasted the biopsy data from WinBUGS into a text file that I called <strong>biopsy.txt<\/strong>, I then used the <strong>wbsdecode<\/strong> command to read these data into Stata and I kept the 157&#215;4 dataset in the form of 4 Stata variables that I called grade1, grade2, grade3 and grade4. These data were saved in a file called <strong>biopsy.dta<\/strong>. Here is the Stata code that does that job,<\/p>\n<p><span style=\"color: #3366ff\"><strong>. wbsdecode using biopsy.txt , clear<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> . rename biopsies_# grade#<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> . keep grade*<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> . save biopsy.dta<\/strong><\/span><\/p>\n<p>Now we are in a position to write a Stata program for fitting the model using WinBUGS.<\/p>\n<p><span style=\"color: #3366ff\"><strong>*&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> * write data file<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> use biopsy.dta, clear<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> matrix error = I(4)<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> forvalues r=2\/4 {<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong>\u00a0\u00a0\u00a0 forvalues c=1\/4 {<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if `c&#8217; `r&#8217; | `r&#8217; == 1 matrix error[`r&#8217;,`c&#8217;] = .<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 else matrix error[`r&#8217;,`c&#8217;] = 1\/`r&#8217;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong>\u00a0\u00a0\u00a0 }<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> }<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> wbslist (p=c(0.25,0.25,0.25,0.25)) (matrix error, format(%11.8f) ) \/\/\/<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> using init.txt , replace<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> * write model file<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> wbsmodel biopsy.do model.txt<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> \/*<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> model<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> {<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> for (i in 1:ns){<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong>\u00a0\u00a0\u00a0\u00a0 nbiops[i] &lt;- sum(biopsies[i, ])<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong>\u00a0\u00a0\u00a0\u00a0 true[i] ~ dcat(p[])<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong>\u00a0\u00a0\u00a0\u00a0 biopsies[i, 1:4] ~ dmulti(error[true[i], ], nbiops[i])<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> }<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> error[2,1:2] ~ ddirch(prior[1:2])<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> error[3,1:3] ~ ddirch(prior[1:3])<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> error[4,1:4] ~ ddirch(prior[1:4])<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> p[1:4] ~ ddirch(prior[])<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> }<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *\/<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> * write script file<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> wbsscript using script.txt , replace \/\/\/<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> model(model.txt) data(data.txt) init(init.txt) \/\/\/<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> log(log.txt) burnin(1000) update(10000) \/\/\/<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> set(error p) coda(bs)<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> * run and read the results<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> wbsrun using script.txt<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> type log.txt<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> wbscoda using bs , clear<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> * inspect the results<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> *&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> mcmcstats *<\/strong><\/span><br \/>\n<span style=\"color: #3366ff\"> <strong> mcmctrace p*<\/strong><\/span><\/p>\n<p>Convergence is not a problem so I will only show the trace plots of p[].<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-420\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig1-1024x752.png\" alt=\"jags2_fig1\" width=\"620\" height=\"455\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig1-1024x752.png 1024w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig1-300x220.png 300w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>Here are the results.<\/p>\n<pre> -------------------------------------------------------------\r\n Parameter     n  mean    sd    sem median     95% CrI\r\n -------------------------------------------------------------\r\n error_2_1 10000 0.587 0.066 0.0013 0.588 ( 0.456, 0.714 )\r\n error_2_2 10000 0.413 0.066 0.0013 0.412 ( 0.286, 0.544 )\r\n error_3_1 10000 0.342 0.046 0.0007 0.340 ( 0.256, 0.436 )\r\n error_3_2 10000 0.037 0.018 0.0002 0.035 ( 0.010, 0.078 )\r\n error_3_3 10000 0.621 0.048 0.0007 0.622 ( 0.522, 0.711 )\r\n error_4_1 10000 0.099 0.042 0.0005 0.094 ( 0.034, 0.197 )\r\n error_4_2 10000 0.022 0.023 0.0003 0.015 ( 0.001, 0.086 )\r\n error_4_3 10000 0.204 0.061 0.0008 0.198 ( 0.101, 0.337 )\r\n error_4_4 10000 0.675 0.073 0.0010 0.679 ( 0.523, 0.804 )\r\n p_1       10000 0.153 0.050 0.0011 0.155 ( 0.049, 0.246 )\r\n p_2       10000 0.311 0.055 0.0010 0.307 ( 0.216, 0.432 )\r\n p_3       10000 0.389 0.044 0.0006 0.388 ( 0.305, 0.477 )\r\n p_4       10000 0.147 0.030 0.0003 0.145 ( 0.094, 0.211 )\r\n ---------------------------------------------------------------<\/pre>\n<p><span style=\"color: #800000\"><strong><em>The only problem is that I do not like the answer.<\/em><\/strong><\/span><\/p>\n<p>The estimates show that this population of hearts splits approximately; none 15%, mild 31%, moderate 39% and severe 15%. If the heart is truly in severe rejection there is a 68% chance that\u00a0a biopsy\u00a0will be graded as &#8216;severe&#8217; and a 20% that it will be graded as &#8216;moderate&#8217; and a 12% that it will be graded &#8216;none&#8217; or &#8216;mild&#8217;, but &#8216;none&#8217; is much more likely than &#8216;mild&#8217;.<\/p>\n<p>For moderately rejected hearts there is a 62% of a &#8216;moderate&#8217; biopsy and a 34% chance of\u00a0a biopsy graded\u00a0&#8216;none&#8217; but almost no chance of\u00a0a biopsy that is\u00a0graded as &#8216;mild&#8217;.<\/p>\n<p>It is estimated that about 1\/3 of hearts show mild rejection, so it is not as though the mild category has been defined so narrowly that it is never seen in practice. These data give the picture of one or more regions of rejection with other regions with no rejection but nothing in between. In fact we hardly ever see mild biopsies unless the heart\u2019s grade is mild.<\/p>\n<p>It must be the case that,<br \/>\n<span style=\"color: #800000\">\u2022 the mild category is truly a rare biopsy in a heart that is being rejected,<\/span><br \/>\n<span style=\"color: #800000\"> \u2022 something has gone wrong with the fitting,<\/span><br \/>\n<span style=\"color: #800000\"> \u2022 the model is wrong,<\/span><br \/>\n<span style=\"color: #800000\"> \u2022 we have not used our prior knowledge properly.<\/span><\/p>\n<p>First we should look at the fit of the model but with discrete data of this type it is not obvious what scale we should use. In the end I decided to look at the average biopsy grade for each person and to contrast it with the predicted average under this model.<\/p>\n<p>The average grade can be calculated from the data file as<br \/>\n<strong><span style=\"color: #3366ff\">. gen y = (grade1+2*grade2+3*grade3+4*grade4)\/(grade1+grade2+grade3+grade4)<\/span><\/strong><\/p>\n<p>The predicted values of y, which I call ym, can be found by adding a couple of lines (shown in green) to the model file<br \/>\n<strong><span style=\"color: #3366ff\">model<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> {<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> for (i in 1:ns){<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\">\u00a0\u00a0\u00a0\u00a0 nbiops[i] &lt;- sum(biopsies[i, ])<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\">\u00a0\u00a0\u00a0\u00a0 true[i] ~ dcat(p[])<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\">\u00a0\u00a0\u00a0\u00a0 biopsies[i, 1:4] ~ dmulti(error[true[i], ], nbiops[i])<\/span><\/strong><br \/>\n<span style=\"color: #008080\"><strong>\u00a0\u00a0\u00a0\u00a0 y[i, 1:4] ~ dmulti(error[true[i], ], nbiops[i])<\/strong><\/span><br \/>\n<span style=\"color: #008080\"> <strong>\u00a0\u00a0\u00a0\u00a0 ym[i] &lt;- (y[i,1]+2*y[i,2]+3*y[i,3]+4*y[i,4])\/nbiops[i]<\/strong><\/span><br \/>\n<strong><span style=\"color: #3366ff\"> }<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> error[2,1:2] ~ ddirch(prior[1:2])<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> error[3,1:3] ~ ddirch(prior[1:3])<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> error[4,1:4] ~ ddirch(prior[1:4])<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> p[1:4] ~ ddirch(prior[])<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> }<\/span><\/strong><\/p>\n<p>Now we can use <strong>mcmccheck<\/strong> to create some residual plots. Here is the predictive distribution for heart 119 with the observation (none,mild,moderate) and an observed mean of (1+2+3)\/3=2.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-421\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig2-1024x752.png\" alt=\"jags2_fig2\" width=\"620\" height=\"455\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig2-1024x752.png 1024w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig2-300x220.png 300w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>The predicted values of 1 arise from the possibility that this person is actually moderate or severe but the affected region is missed entirely and the biopsies are (none,none,none). The score of 4 would come about if the heart were actual severe, in which case there would be the chance that the biopsies would be graded (severe,severe,severe) instead of the observed (none,mild,moderate). In reality this heart has a mean score of 2 and falls just below expectation so that the standardized residual is about -0.5.<\/p>\n<p>Here is the summary residual plot,<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-422\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig3-1024x752.png\" alt=\"jags2_fig3\" width=\"620\" height=\"455\" srcset=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig3-1024x752.png 1024w, https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2014\/11\/jags2_fig3-300x220.png 300w\" sizes=\"auto, (max-width: 620px) 100vw, 620px\" \/><\/a><\/p>\n<p>We must be careful not to judge these plots in the same way that we would the residuals from a linear model. The underlying scale is essentially discrete and so we would not expect normally distributed errors. Notice how the mean predictions suggest only 3 main groups with means falling between none and mild, mild and moderate and moderate and severe.<\/p>\n<p>The extreme residuals in the summary plot come from people with grades such as (mild,mild,mild) and it could be that grades such as (mild,mild,mild) actually reflect a lack of independence in the sampling. Perhaps sometimes the biopsies are taken from the same region of the heart instead of spreading them around.<\/p>\n<p>Here is a small simulation to show the impact of independence. Instead of sampling from an area\u00a0I will sample from a line of length 1000. In the centre of this line is an affected region of length 200 which is graded as y=2, the remainder of the line has a grade of y=1. First we take two independent samples at random<br \/>\n<strong><span style=\"color: #3366ff\">set obs 1000<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> gen y = 1<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> replace y = 2 in 400\/599<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> gen sample1 = .<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> gen sample2 = .<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> forvalues i = 1\/1000 {<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\">\u00a0\u00a0\u00a0\u00a0 local j1 = int(runiform()*1000+1)<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\">\u00a0\u00a0\u00a0\u00a0 qui replace sample1 = y[`j1&#8242;] in `i&#8217;<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\">\u00a0\u00a0\u00a0\u00a0 local j2 = int(runiform()*1000+1)<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\">\u00a0\u00a0\u00a0\u00a0 qui replace sample2 = y[`j2&#8242;] in `i&#8217;<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> }<\/span><\/strong><br \/>\n<strong><span style=\"color: #3366ff\"> tabu sample1 sample2<\/span><\/strong><\/p>\n<p>Since a proportion 0.2 of the line is affected, we will obtain two affected samples with probability 0.2*0.2=0.04. The actual results vary slightly because of sampling error but here is a typical set that shows the anticipated pattern,<\/p>\n<pre>                    sample2\r\n sample1    |   1            2     | Total\r\n -----------+----------------------+----------\r\n 1          | 638          182     | 820\r\n 2          | 142           38     | 180\r\n -----------+----------------------+----------\r\n Total      | 780          220     | 1,000<\/pre>\n<p>Now change the sampling so that sample1 is chosen at random but sample 2 is taken within \u00b150 of sample1.<br \/>\n<strong><span style=\"color: #3366ff\">local j2 = max(min(`j1&#8242; + int(runiform()*100+1)-50,1000),1)<\/span><\/strong><\/p>\n<p>Now a typical set of results is,<\/p>\n<pre>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 sample2\r\n sample1    |   1          2       | Total\r\n -----------+----------------------+----------\r\n 1          | 758         26       |   784\r\n 2          |  26        190       |   216\r\n -----------+----------------------+----------\r\n Total      | 784        216       | 1,000<\/pre>\n<p>Grade 2 still occurs about 20% of the time but clearly (2,2) and (1,1) have become much more common and (1,2) and (2,1) are much rarer. We could get the reverse effect if we were to insist that the two samples did not come from within \u00b150 of each other.<\/p>\n<p>My suspicion is that for this particular data set, the assumption of independence of the sets of biopsies taken on the same occasion is an oversimplification. The oversimplified model will not predict the data very well and any attempt to interpret error[,] may be misleading.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week I introduced the JAGS program as an alternative to WinBUGS and this week I\u00a0started with the intention of comparing JAGS and WinBUGS using a sample dataset. I decided to base the comparison on the biopsy example\u00a0taken from\u00a0the WinBUGS help files. Predictably, by the time that I had explained the model and fitted it [&hellip;]<\/p>\n","protected":false},"author":134,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[43,44,42,4,5],"class_list":["post-417","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-dirichlet","tag-independence","tag-jags","tag-stata","tag-winbugs"],"_links":{"self":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/417","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/comments?post=417"}],"version-history":[{"count":8,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/417\/revisions"}],"predecessor-version":[{"id":441,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/417\/revisions\/441"}],"wp:attachment":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/media?parent=417"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/categories?post=417"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/tags?post=417"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}