Stata14 and the future of this blog

I have not posted for the last few weeks, not because I have nothing to say but rather because I have been thinking about the future of this blog.

For some time I have meant to say something about the Bayesian analysis facilities that were introduced in Stata14, and while preparing for that topic I was forced to question the sense of advocating my own parallel set of Stata commands. When I wrote the book, Bayesian analysis with Stata, there were no Bayesian options in Stata and I was assured repeatedly that none were planned. Then, shortly after publication, there was a change of mind and some very basic Bayesian commands were added to the latest release of Stata.

Clearly there is no sense in having two competing systems, and equally obviously StataCorp have the resources and experience to produce far better commands than I could. It must be said, though, that the official additions to Stata14 are pretty basic and cannot produce the same variety of Bayesian analyses as my own commands.

It looks to me as though StataCorp have rushed out a very limited command for Metropolis-Hastings sampling without paying much attention to the way in which it will fit within the broader Bayesian software that they will eventually need to produce. While it is true that the Bayesian commands in Stata14 are limited, experience suggests that StataCorp will build quickly on those foundations and eventually they will produce something very impressive. I would not be surprised if they wrote a general Gibbs sampling command using Mata, similar to OpenBUGS or JAGS, or perhaps they will opt for HMC and write a Stan-like program in Mata.

StataCorp seem to have a policy of not linking Stata to other software. One can see the sense of this from an economic point of view as it makes users dependent on Stata and encourages them to buy the new releases. It also allows StataCorp to control the quality of the analyses that Stata produces. The downside is that developments are slower than they need to be and StataCorp is constantly engaged in reinventing the wheel. I think that the experience of more flexible approaches, as typified by R and Wikipedia, is that looser control has many advantages and quality is better maintained by the continual testing that results from heavy use.

Anyway, there is little point in my continuing to develop my own Stata commands for Bayesian analysis. It is a race that I cannot win.

There is, I think, still scope for a blog explaining Bayesian methods and illustrating different types of Bayesian analyses but this ought to be based on the official version of Stata and not my own commands. This creates a problem because the official commands are still too basic to do anything really interesting. In that sense, this blog is a few years ahead of its time.

There is one final factor that needs to be taken into account. I enjoy writing this blog; I certainly would not do it otherwise.

I have not made a final decision, but I will probably leave this blog open and post less frequently, and meantime give thought to starting a different blog to keep myself amused.

If I were to write another book, it would be called Bayesian methods in genetic epidemiology and I would try to follow the style of Bayesian analysis with Stata, by which I mean I would attempt to emphasise practical application and the understanding of the basic concepts, but not worry too much about the rigour of the explanation. There are many people analysing genetic data who have not had a formal training in statistics and it interests me to try to explain complex statistical ideas to such non-specialists, though I realise that this is challenging and my attempts will not always be successful.

Writing a book is a major undertaking and I do not have time for it at present but I am attracted by the idea of self-publishing a book as a blog and releasing it in regular instalments, rather as Dickens released Pickwick Papers (and no, I do not have any delusions about my writing style, which I know to be very limited).

The trouble is that Stata is not a good program for handling genetic epidemiology datasets, which are often huge. StataCorp made some initial decisions that have been overtaken by events and StataCorp has shown itself slow to adapt. Clearly the idea of only having one spreadsheet of data open at a time is too limiting and restricting that spreadsheet to a few thousand columns makes it virtually impossible to use Stata for many modern applications. I’m sure that Stata will evolve, but progress is so slow that I fear for its long-term market share.

Anyway, a serially released book called Bayesian methods in genetic epidemiology would have to use R, so it would not be a natural continuation of this blog. I will give myself a few months to think it over and if I have the energy, I will make a start on Bayesian methods in genetic epidemiology as a fresh blog in the New Year. Meantime this blog will continue, at least for a while, but perhaps with a posting every month rather than a posting every week.

4 responses to “Stata14 and the future of this blog”

andrea November 13, 2015 at 2:56 pm | Permalink

Dear John, while I fully understand why you want to post less frequently, let me say that your blog has been a constant source of inspiration for me and, I am sure, many other readers. Thank you for all the time you’ve spent sharing your knowledge about Bayesian data analysis.
1. John November 16, 2015 at 9:20 am | Permalink
  
  Thank you so much for the kind words. Although this particular blog has run its course, I do plan to start another in the new year and hopefully you will find that interesting.
Robert Grant January 6, 2016 at 7:19 pm | Permalink

Let me second that John, your work has been groundbreaking for Stata, diligent and generous. They would not have gone where they are now without your focus on Bayes. There’s a lot still to be done. Don’t give up!

And I keep meaning to say thank you for all the good ideas I’ve nicked from your code, so let me do so now in public.
1. John January 7, 2016 at 8:31 am | Permalink
  
  Thank you for the words of support but when Stata introduced their own command for Bayesian analysis I started to look for an alternative project and I soon became interested in the idea of a blog on Bayesian methods in genetic epidemiology. So my decision is not entirely a negative one. I do plan to start a new blog soon with much the same style as the old one.
