{"id":317,"date":"2014-10-03T09:42:19","date_gmt":"2014-10-03T09:42:19","guid":{"rendered":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/?p=317"},"modified":"2025-02-26T13:21:38","modified_gmt":"2025-02-26T13:21:38","slug":"using-r-with-stata-part-iii","status":"publish","type":"post","link":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2014\/10\/03\/using-r-with-stata-part-iii\/","title":{"rendered":"Using R with Stata: Part III"},"content":{"rendered":"<p>This is another in a series of posting about using conducting advanced statistical analyses in Stata by sending a job to R and then reading the results back into Stata. Our task for testing this process is to use the R package DPpackage to fit a Bayesian Dirichlet process mixture (DPM) model for smoothing a scatter plot. The data for demonstrating the code consist of 298 IgG measurements made on children of different ages.<\/p>\n<p>In <a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/\">my last posting<\/a> I described how we can use Stata\u2019s \u2013shell- command to submit an R job and this time we will consider how\u00a0the R script\u00a0can be\u00a0created by\u00a0a Stata ado file so that a Stata user need know nothing of R when they fit the model.<\/p>\n<p>Previously, we worked with a text file called DPC.R than contains the R script. Repeating what we had last time, the file contains,<\/p>\n<p><span style=\"color: #0000ff\">setwd(&#8216;E:\/IGG&#8217;)<\/span><br \/>\n<span style=\"color: #0000ff\"> require(foreign)<\/span><br \/>\n<span style=\"color: #0000ff\"> require(DPpackage)<\/span><br \/>\n<span style=\"color: #0000ff\"> D &lt;- read.dta(&#8216;igg.dta&#8217;)<\/span><br \/>\n<span style=\"color: #0000ff\"> zbar &lt;- c(7,3)<\/span><br \/>\n<span style=\"color: #0000ff\"> Sigma &lt;- matrix(c(4,1,1,1),2,2)<\/span><br \/>\n<span style=\"color: #0000ff\"> MyPrior &lt;- list(alpha=5, k0=0.25, nu1=10, m1=zbar, psiinv1=7*solve(Sigma))<\/span><br \/>\n<span style=\"color: #0000ff\"> MyDesign &lt;- list(nburn=5000, nsave=5000, nskip=3, ndisplay=1000)<\/span><br \/>\n<span style=\"color: #0000ff\"> xp &lt;- c(1,2,3,4,5)<\/span><br \/>\n<span style=\"color: #0000ff\"> fitIgG &lt;- DPcdensity(y=D$igg, x=D$age, xpred=xp,ngrid=100,<\/span><br \/>\n<span style=\"color: #0000ff\"> compute.band=TRUE, type.band=&#8221;HPD&#8221;, prior=MyPrior, mcmc=MyDesign,<\/span><br \/>\n<span style=\"color: #0000ff\"> state=NULL, status=TRUE)<\/span><br \/>\n<span style=\"color: #0000ff\"> nClu &lt;- as.data.frame(fitIgG$save.state$thetasave[,7])<\/span><br \/>\n<span style=\"color: #0000ff\"> names(nClu) &lt;- &#8220;nclus&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> write.dta(nClu,file=&#8217;nclusters.dta&#8217;)<\/span><br \/>\n<span style=\"color: #0000ff\"> cDen &lt;- as.data.frame(cbind(fitIgG$grid,t(rbind(fitIgG$densp.m,<\/span><br \/>\n<span style=\"color: #0000ff\"> fitIgG$densp.l,fitIgG$densp.h))))<\/span><br \/>\n<span style=\"color: #0000ff\"> head(cDen)<\/span><br \/>\n<span style=\"color: #0000ff\"> names(cDen) &lt;- c(&#8220;IgG&#8221;,&#8221;m1&#8243;,&#8221;m2&#8243;,&#8221;m3&#8243;,\u201dm4\u201d,\u201dm5\u201d,<\/span><br \/>\n<span style=\"color: #0000ff\"> &#8220;l1&#8243;,&#8221;l2&#8243;,&#8221;l3&#8243;,&#8221;l4&#8243;,&#8221;l5&#8243;,&#8221;h1&#8243;,&#8221;h2&#8243;,&#8221;h3&#8243;,&#8221;h4&#8243;,&#8221;h5&#8221;)<\/span><br \/>\n<span style=\"color: #0000ff\"> write.dta(cDen,file=&#8217;cDensity.dta&#8217;)<\/span><\/p>\n<p>Last time we prepared this script, submitted it from within Stata and read the results into Stata for further processing. This approach is fine for those of us who know R but let&#8217;s now suppose that we want to prepare a program for those Stata users who know nothing of R. We will need to prepare a program that writes the script for them as well as executing it.<\/p>\n<p>There are lots of ways of going this but I will describe my preferred method.<\/p>\n<p>Within my PERSONAL folder I have a subfolder called RTEMPLATES and here I store the templates for the R scripts that run the jobs that I want to make available. Here is the template corresponding to DPC.R.<\/p>\n<p><span style=\"color: #0000ff\">setwd(&#8216;@&#8217;)<\/span><br \/>\n<span style=\"color: #0000ff\">if( !require(&#8216;foreign&#8217;) ) {<br \/>\ninstall.packages(&#8216;foreign&#8217;,repos=&#8217;http:\/\/cran.ma.imperial.ac.uk\/&#8217;,dep=TRUE)<br \/>\nif( !require(&#8216;foreign&#8217;) ) stop() }<br \/>\nif( !require(&#8216;DPpackage&#8217;) ) {<br \/>\ninstall.packages(&#8216;DPpackage&#8217;,repos=&#8217;http:\/\/cran.ma.imperial.ac.uk\/&#8217;,dep=TRUE)<br \/>\nif( !require(&#8216;DPpackage&#8217;) ) stop() }<\/span><br \/>\n<span style=\"color: #0000ff\">D &lt;- read.dta(&#8216;@&#8217;)<\/span><br \/>\n<span style=\"color: #0000ff\"> zbar &lt;- c(@)<\/span><br \/>\n<span style=\"color: #0000ff\"> Sigma &lt;- matrix(c(@),2,2)<\/span><br \/>\n<span style=\"color: #0000ff\"> MyPrior &lt;- list(alpha=@, k0=@, nu1=@, m1=zbar, psiinv1=@*solve(Sigma))<\/span><br \/>\n<span style=\"color: #0000ff\"> MyDesign &lt;- list(nburn=@, nsave=@, nskip=@, ndisplay=1000)<\/span><br \/>\n<span style=\"color: #0000ff\"> xp &lt;- c(@)<\/span><br \/>\n<span style=\"color: #0000ff\"> fitIgG &lt;- DPcdensity(y=D\u00a3@, x=D\u00a3@, xpred=xp, ngrid=@,<\/span><br \/>\n<span style=\"color: #0000ff\"> compute.band=TRUE, type.band=&#8221;HPD&#8221;, prior=MyPrior, mcmc=MyDesign,<\/span><br \/>\n<span style=\"color: #0000ff\"> state=NULL, status=TRUE)<\/span><br \/>\n<span style=\"color: #0000ff\"> nClu &lt;- as.data.frame(fitIgG\u00a3save.state\u00a3thetasave[,7])<\/span><br \/>\n<span style=\"color: #0000ff\"> names(nClu) &lt;- &#8220;nclus&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> write.dta(nClu,file=&#8217;nclusters.dta&#8217;)<\/span><br \/>\n<span style=\"color: #0000ff\"> cDen &lt;- as.data.frame(cbind(fitIgG\u00a3grid,t(rbind(fitIgG\u00a3densp.m,<\/span><br \/>\n<span style=\"color: #0000ff\"> fitIgG\u00a3densp.l,fitIgG\u00a3densp.h))))<\/span><br \/>\n<span style=\"color: #0000ff\"> names(cDen) &lt;- c(@)<\/span><br \/>\n<span style=\"color: #0000ff\"> write.dta(cDen,file=&#8217;cDensity.dta&#8217;)<\/span><\/p>\n<p>A template is almost the same as the script itself but with three minor changes; the symbol @ replaces items that are problem dependent, such as the values of the priors, these will be replaced\u00a0before the program is run;\u00a0the symbol \u00a3 replaces $ and there is a little extra code associated with loading packages. The \u00a3&#8217;s stop Stata from treating lines containing code, such as D$age, as if they contain a Stata global and the\u00a0extra code associated with loading packages\u00a0is intended to protect the Stata user from the need to install the package.<\/p>\n<p>In R, packages are installed over the internet and stored in a library; from there they can be loaded when needed. The extra code tries to load the package using R&#8217;s require() function but\u00a0if that fails, the most likely reason is that the package has never been installed into the user&#8217;s R library, so an attempt is made to do that. In my example, I try to install from the R mirror site at Imperial College in London. Good practice says that you ought to install from\u00a0your closest mirror site, so if your intended Stata user is not in southern England then you ought to replace this address with the address of a mirror site in their own country. You can find the addresses of the mirror sites by issuing the following command in R ,<\/p>\n<p><span style=\"color: #0000ff\">&gt;\u00a0M &lt;- getCRANmirrors()<\/span><\/p>\n<p>and then looking at M. If your program is intended for Stata users in many different countries then you would need to make the Stata user choose a mirror site, perhaps by creating an ado file that shows the user a list of available sites that you obtain via the getCRANmirrors function.<\/p>\n<p>I have also written a program called <strong>rtemplate.ado<\/strong> that fills in the missing elements of an R template with its arguments. So the command,<\/p>\n<p><span style=\"color: #0000ff\">rtemplate DPcdensity.txt temp.R a b c \u201c3,4\u201d<\/span><\/p>\n<p>would read the template DPcdensity.txt from RTEMPLATES and create a file temp.R in the current working directory.\u00a0In temp.R,\u00a0the first @ is replaced by \u201ca\u201d the second by \u201cb\u201d etc. At the same time pounds are converted into dollars. The source for this ado file is very simple,<\/p>\n<p><span style=\"color: #0000ff\">program rtemplate<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 gettoken template 0 : 0<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>gettoken outfile 0 : 0<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>tempname fhin fhout<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local personal : sysdir PERSONAL<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>file open `fhout&#8217; using &#8220;`outfile'&#8221;, write replace<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>file open `fhin&#8217; using &#8220;`personal&#8217;RTEMPLATES\/`template'&#8221;, read<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>file read `fhin&#8217; inLine<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>while !r(eof) {<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>tokenize `&#8221;`inLine'&#8221;&#8216; , parse(&#8220;@\u00a3&#8221;)<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local j = 1<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>while `&#8221;&#8220;j&#8221;&#8221;&#8216; != &#8220;&#8221; {<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>if `&#8221;&#8220;j&#8221;&#8221;&#8216; == &#8220;@&#8221; {<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><\/span>gettoken item 0 : 0<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><\/span>file write `fhout&#8217; `&#8221;`item'&#8221;&#8216;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><\/span>}<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><\/span>else if `&#8221;&#8220;j&#8221;&#8221;&#8216; == &#8220;\u00a3&#8221; file write `fhout&#8217; &#8220;$&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\"><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>else file write `fhout&#8217; `&#8221;&#8220;j&#8221;&#8221;&#8216;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local ++j<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\"><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>\u00a0\u00a0\u00a0\u00a0 <\/span>}<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><\/span>file write `fhout&#8217; _n<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\"><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>\u00a0\u00a0\u00a0\u00a0 <\/span>file read `fhin&#8217; inLine<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>}<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>file close `fhin&#8217;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>file close `fhout&#8217;<\/span><br \/>\n<span style=\"color: #0000ff\"> end<\/span><\/p>\n<p>Now all I need to do is to write a program that calculates the missing items, @, and sends them to rtemplate.ado in the correct order. For instance,<\/p>\n<p><span style=\"color: #0000ff\">program r_DPcdensity<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>syntax varlist [ using\/ ], XPredict(numlist) PMean(name) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>PCOVariance(name) PAlpha(real) PK0(real) PNU1(real) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>[ Burnin(integer 1000) Updates(integer 5000) Skip(integer 3) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>NGrid(integer 50)\u00a0 ] <span style=\"color: #0000ff\">LOGfile(string)<\/span><\/span><\/p>\n<p><span style=\"color: #008000\">* current working directory<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>if &#8220;`using'&#8221; == &#8220;&#8221; tempfile using<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local wd = subinstr(&#8220;`c(pwd)'&#8221;,&#8221;\\&#8221;,&#8221;\/&#8221;,.)<\/span><br \/>\n<span style=\"color: #008000\"> * temporary data file contain x and y<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>tokenize `varlist&#8217;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local y &#8220;`1&#8242;&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local x &#8220;`2&#8242;&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>tempfile dfile<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local dfile = subinstr(&#8220;`dfile'&#8221;,&#8221;\\&#8221;,&#8221;\/&#8221;,.)<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>preserve<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>keep `varlist&#8217;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>save &#8220;`dfile'&#8221;, replace<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>restore<\/span><br \/>\n<span style=\"color: #008000\"> * mean of base distribution<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local a1 = `pmean'[1,1]<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local a2 = `pmean'[1,2]<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local mv = &#8220;`a1&#8242;,`a2&#8242;&#8221;<\/span><br \/>\n<span style=\"color: #008000\"> * covariance matrix of base distribution<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local a1 = `pcovariance'[1,1]<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local a2 = `pcovariance'[1,2]<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local a3 = `pcovariance'[2,2]<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local cv = &#8220;`a1&#8242;,`a2&#8242;,`a2&#8242;,`a3&#8242;&#8221;<\/span><br \/>\n<span style=\"color: #008000\"> * df of base distribution<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local df = `pnu1&#8242; &#8211; 2 &#8211; 1<\/span><br \/>\n<span style=\"color: #008000\"> * values of xpredict &amp; names for m,l,h<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local xlist &#8220;&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local mlist &#8220;&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local llist &#8220;&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local hlist &#8220;&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local i = 0<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>foreach v of numlist `xpredict&#8217; {<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local ++i<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local mlist `&#8221;`mlist&#8217;,&#8221;m`i'&#8221;&#8221;&#8216;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local llist `&#8221;`llist&#8217;,&#8221;l`i'&#8221;&#8221;&#8216;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>local hlist `&#8221;`hlist&#8217;,&#8221;h`i'&#8221;&#8221;&#8216;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>if &#8220;`xlist'&#8221; == &#8220;&#8221; local xlist &#8220;`v'&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>else local xlist &#8220;`xlist&#8217;,`v'&#8221;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>}<\/span><br \/>\n<span style=\"color: #008000\"> * make the script<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>rtemplate DPcdensity &#8220;`using'&#8221; &#8220;`wd'&#8221; &#8220;`dfile'&#8221; &#8220;`mv'&#8221; &#8220;`cv'&#8221; \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>&#8220;`palpha'&#8221; &#8220;`pk0&#8242;&#8221; &#8220;`pnu1&#8242;&#8221; &#8220;`df'&#8221; &#8220;`burnin'&#8221; &#8220;`updates'&#8221; \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>&#8220;`skip'&#8221; &#8220;`xlist'&#8221; &#8220;`y'&#8221; &#8220;`x'&#8221; &#8220;`ngrid'&#8221; `&#8221;`y&#8217;`mlist&#8217;`llist&#8217;`hlist'&#8221;&#8216;<\/span><br \/>\n<span style=\"color: #008000\"> * run the R script<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>cap rm `logfile&#8217;<\/span><br \/>\n<span style=\"color: #0000ff\"> <span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 <\/span>execute R CMD BATCH `using&#8217; `logfile&#8217;<\/span><br \/>\n<span style=\"color: #0000ff\"> end<\/span><\/p>\n<p>Now a Stata user can have all of the complexity of R hidden from them and all they need to do is to type commands such as,<\/p>\n<p><span style=\"color: #0000ff\">.\u00a0use igg.dta, clear<\/span><br \/>\n<span style=\"color: #0000ff\">. matrix mu = (7,3)<\/span><br \/>\n<span style=\"color: #0000ff\">. matrix sigma = (4,1\\1,1)<\/span><br \/>\n<span style=\"color: #0000ff\">. r_DPcdensity igg age using temp.R , xp(1 2 3 4 5) priorm(mu) priorcov(sigma) \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 priora(5) priork0(0.25) priornu1(10) log(log.out)<\/span><br \/>\n<span style=\"color: #0000ff\">. type log.out<\/span><br \/>\n<span style=\"color: #0000ff\">. use nclusters.dta, clear<\/span><br \/>\n<span style=\"color: #0000ff\">. histogram nclus , start(0.5) width(1) freq title(Number of Clusters)<\/span><\/p>\n<p>and they would fit a Bayesian DPM model with their chosen priors and plot a histogram of the number of clusters at each iteration of the sampler. The computation will be done in R but the Stata user need know nothing of that.<\/p>\n<p>The key to making the facilities of R available within Stata it to prepare good quality ado files with a structure similar to r_DPcdensity.ado and good quality help files that explain the options. The ado file given above is very basic and would benefit from more checking of the options and some helpful error messages. None the less, writing a good version of r_DPcdensity.ado is very, very much easier than rewriting the source code of the R\u00a0function DPcdensity.<\/p>\n<p>Next time I will complete this series of postings on using R with Stata by revisiting the way in which results are transferred from R to Stata. Rather than re-package the results into neat .dta files as I have done so far, I will show how the entire set of results can be dumped to a text file and read by Stata.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is another in a series of posting about using conducting advanced statistical analyses in Stata by sending a job to R and then reading the results back into Stata. Our task for testing this process is to use the R package DPpackage to fit a Bayesian Dirichlet process mixture (DPM) model for smoothing a [&hellip;]<\/p>\n","protected":false},"author":134,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[33,38,4],"class_list":["post-317","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-r","tag-r-packages","tag-stata"],"_links":{"self":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/317","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/comments?post=317"}],"version-history":[{"count":8,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/317\/revisions"}],"predecessor-version":[{"id":356,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/317\/revisions\/356"}],"wp:attachment":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/media?parent=317"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/categories?post=317"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/tags?post=317"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}