{"id":850,"date":"2015-09-04T08:01:59","date_gmt":"2015-09-04T08:01:59","guid":{"rendered":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/?p=850"},"modified":"2025-02-26T13:21:37","modified_gmt":"2025-02-26T13:21:37","slug":"google-flu-trends-part-ii","status":"publish","type":"post","link":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2015\/09\/04\/google-flu-trends-part-ii\/","title":{"rendered":"Google Flu Trends Part II"},"content":{"rendered":"<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2015\/08\/28\/google-flu-trends\/\">Last week I discussed Google Flu Trends<\/a>, a project that provided weekly estimates of the number of flu cases in 25 different countries based on the frequency of internet searches for influenza-related terms (<a href=\"https:\/\/en.wikipedia.org\/wiki\/Google_Flu_Trends\">https:\/\/en.wikipedia.org\/wiki\/Google_Flu_Trends<\/a>). Sadly, Google has stopped producing these estimates, but historical data are still available for download. Presumably Google gave up when faced with criticism of the accuracy of the method. Much of that criticism appears to have been fuelled by a general unease about the use of &#8216;big data&#8217;; in any case, I hope that it did not come from statisticians, who should be familiar with measurement error and surrogates.<\/p>\n<p>The plot below shows the estimates of the number of flu cases each week for Chile. The vertical lines denote periods of 52 weeks. 
As Chile is in the southern hemisphere, the winter peak in flu cases occurs around the middle of the year.<br \/>\n<a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/chile-e1439993972637.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-829\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/08\/chile-e1439993972637.png\" alt=\"chile\" width=\"600\" height=\"439\" \/><\/a><\/p>\n<p>The pattern seems to be an annual cycle that is almost zero at the start and end of the year, onto which is superimposed an annual spike corresponding to a flu epidemic. These spikes are of varying severity and duration. Perhaps there is also an upward drift in the total number of estimated cases per year, which I suppose to be a result of increased internet use rather than a genuine increase in the number of flu cases, though the latter is not impossible given a growing population and increased urbanisation.<\/p>\n<p>Anyway, our aim is to model these data in a way that enables us to determine the starts of the annual epidemics. To that end, <a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/2015\/08\/28\/google-flu-trends\/\">in my last posting<\/a> I developed a Metropolis-Hastings sampler called <strong>mhsint<\/strong> that updates integer-based parameters, such as the week in which an epidemic starts or the week in which it ends. What we need now is a Bayesian model for these data.<\/p>\n<p>The model that I am going to adopt just follows my description. 
I will suppose that the counts follow a Poisson distribution and that the log of the mean number of cases has a linear drift over time plus a sine term with frequency \u03c0\/52 (\u22480.0604), so that it rises and falls back to zero over the 52 weeks of a year:<br \/>\n<span style=\"color: #0000ff\">log(mean) = a + b time\/1000 + c sin(0.060415*week)<\/span><\/p>\n<p>On top of this basic trend I&#8217;ll impose an annual epidemic of strength H<sub>i<\/sub> that starts in week S<sub>i<\/sub> and ends in week E<sub>i<\/sub>, where i denotes the year. So<br \/>\n<span style=\"color: #0000ff\">log(mean) = a + b time\/1000 + c sin(0.060415*week) + H<sub>i<\/sub> \u03b4(year==i &amp; week&gt;=S<sub>i<\/sub> &amp; week&lt;=E<sub>i<\/sub>)<\/span><\/p>\n<p>Here \u03b4 is 0 or 1 depending on whether the condition holds.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Preparing the data<\/strong><\/p>\n<p>The data can be downloaded from <a href=\"https:\/\/www.google.org\/flutrends\/about\/\">https:\/\/www.google.org\/flutrends\/about\/<\/a> by clicking on <strong>world<\/strong> or the country that interests you. There is a parallel dataset for Dengue Fever. I copied the portion of the file that contains the data and, since it is comma-delimited, saved it as a .csv file that I called googleflu.csv.<\/p>\n<p>A little preprocessing is required to turn dates into week numbers and to drop incomplete years. 
Eventually we obtain year = 1&#8230;12 corresponding to 2003&#8230;2014, week = 1&#8230;52 or 53 (depending on where the first day of the first week falls in that year) and time = 1&#8230;627 denoting the overall week number.<\/p>\n<p><span style=\"color: #0000ff\">insheet using googleflu.csv , clear<\/span><br \/>\n<span style=\"color: #0000ff\">gen day = date(date,\"YMD\")<\/span><br \/>\n<span style=\"color: #0000ff\">gen time = 1+(day-day[1])\/7<\/span><br \/>\n<span style=\"color: #0000ff\">gen year = real(substr(date,1,4))<\/span><br \/>\n<span style=\"color: #0000ff\">gen start = year != year[_n-1]<\/span><br \/>\n<span style=\"color: #0000ff\">gen t = start*time<\/span><br \/>\n<span style=\"color: #0000ff\">sort year time<\/span><br \/>\n<span style=\"color: #0000ff\">by year: egen w = sum(t)<\/span><br \/>\n<span style=\"color: #0000ff\">gen week = time - w + 1<\/span><br \/>\n<span style=\"color: #0000ff\">drop w t start<\/span><br \/>\n<span style=\"color: #0000ff\">drop if year==2002 | year==2015<\/span><br \/>\n<span style=\"color: #0000ff\">drop if chile == .<\/span><br \/>\n<span style=\"color: #0000ff\">replace year = year - 2002<\/span><br \/>\n<span style=\"color: #0000ff\">keep year week time chile<\/span><br \/>\n<span style=\"color: #0000ff\">save tempflu.dta, replace<\/span><\/p>\n<p><strong>Priors<\/strong><\/p>\n<p>The intercept parameter, called a in my model formula, represents the log of the predicted number of cases at the start of the first year, the height of the Chilean summer. I&#8217;ll suppose that a ~ N(0,sd=1); a value of 0 corresponds to an expectation of 1 case per week.<\/p>\n<p>I&#8217;ve scaled time by dividing by 1000 because I expect the trend per week to be very small. If the coefficient b were 1 then after 1000 weeks (\u224820 years) the predicted number of cases would have drifted up by a factor of about 3. 
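That factor of about 3 is just exp(1); a one-line check, sketched in Python for illustration (the analysis itself is in Stata):

```python
import math

# With the linear predictor a + b*time/1000, moving t weeks forward
# multiplies the expected count by exp(b * t / 1000).
def drift_factor(b, t_weeks):
    return math.exp(b * t_weeks / 1000)

print(drift_factor(1, 1000))  # e = 2.718..., i.e. roughly 3
```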
I&#8217;ll suppose that b ~ N(1,sd=1).<\/p>\n<p>Each year the sine term goes from 0 to a peak of 1 and back to 0. In the absence of an epidemic I would imagine that the background predicted number of cases might increase by a factor of perhaps 10 between the summer and the winter, so I&#8217;ll make the coefficient of the sine term c ~ N(2,sd=1).<\/p>\n<p>As to the extent of the epidemics, I imagine that they might increase the background rate by a factor of 10 or even 100, but an epidemic cannot lower the number of cases, so the H<sub>i<\/sub> must all be positive. I&#8217;ll make H<sub>i<\/sub> ~ Gamma(2,0.5), that is, shape 2 and scale 0.5.<\/p>\n<p>It is more difficult to place a prior on the start and end of an epidemic because S<sub>i<\/sub> and E<sub>i<\/sub> will not be independent. I&#8217;ll suppose that the epidemic is likely to start around week 20 and make<br \/>\n<span style=\"color: #0000ff\">P(S<sub>i<\/sub>=k) \u221d 1\/(1+abs(k-20))\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 for 2 &lt;= S<sub>i<\/sub> &lt; 50<\/span><br \/>\nand<br \/>\n<span style=\"color: #0000ff\">P(E<sub>i<\/sub>=h|S<sub>i<\/sub>=k) \u221d 1\/(1+abs(h-k-8))\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 for E<sub>i<\/sub> &gt; S<sub>i<\/sub> &amp; E<sub>i<\/sub> &lt;= 52<\/span><br \/>\nwhich implies that I expect an epidemic to last about 8 weeks.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Calculating the log-posterior<\/strong><\/p>\n<p>The code for calculating the log-posterior follows the description of the model. The only place where we must be careful is in guarding against an invalid model. This can happen when the limits of the epidemic move outside [1,52], when the epidemic ends before it starts, or when the value of the log-posterior becomes infinite.<\/p>\n<p>It is an unfortunate aspect of Stata that infinity is represented by a large positive number, so any time the log-posterior becomes infinite the Metropolis-Hastings algorithm will automatically accept the move. 
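For intuition about the start-week prior defined above, it can be tabulated and normalised numerically; here is a small Python sketch (illustrative only — Metropolis-Hastings needs only the unnormalised form, and the analysis itself is in Stata):

```python
def start_prior(first=2, last=49, centre=20):
    """Normalised version of P(S=k) proportional to 1/(1+|k-centre|)
    on the valid weeks first..last."""
    w = {k: 1.0 / (1 + abs(k - centre)) for k in range(first, last + 1)}
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}

p = start_prior()
print(max(p, key=p.get))  # the mode is at week 20
```

The distribution is symmetric about week 20 until the boundaries at weeks 2 and 49 cut it off.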
In this model this can happen if we try a very large value for the severity of a spike, because the log-posterior depends on the exponential of that value and scalars in Stata can only go up to about 10<sup>300<\/sup> before they exceed the double precision of the computer.<\/p>\n<p>In the following code, the parameters are arranged in a row matrix, b, so that a, b and c come first, then the 12 H<sub>i<\/sub>&#8216;s, then the 12 S<sub>i<\/sub>&#8216;s and finally the 12 E<sub>i<\/sub>&#8216;s. The contributions of the priors on H<sub>i<\/sub>, S<sub>i<\/sub> and E<sub>i<\/sub> are accumulated in the local macro hterm as the loop runs over the 12 years.<\/p>\n<p><span style=\"color: #0000ff\">program logpost<\/span><br \/>\n<span style=\"color: #0000ff\">args lnf b<\/span><br \/>\n<span style=\"color: #0000ff\">local invalid = 0<\/span><br \/>\n<span style=\"color: #0000ff\">tempvar eta lnL<\/span><br \/>\n<span style=\"color: #0000ff\">gen `eta' = `b'[1,1]+`b'[1,2]*time\/1000+`b'[1,3]*sin(0.060415*week)<\/span><br \/>\n<span style=\"color: #0000ff\">local hterm = 0<\/span><br \/>\n<span style=\"color: #0000ff\">forvalues i=1\/12 {<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local h = `b'[1,`i'+3]<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local hterm = `hterm' + log(`h') - 2*`h'<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local S = `b'[1,`i'+15]<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local E = `b'[1,`i'+27]<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 if `E' &lt;= `S' | `E' &gt; 52 | `S' &lt; 2 | `h' &gt; 250 local invalid = 1<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local hterm = `hterm' - log(1+abs(`S'-20)) - log(1+abs(`E'-`S'-8))<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 replace `eta' = `eta' + `h' if year == `i' &amp; week &gt;= `S' &amp; week &lt;= `E'<\/span><br \/>\n<span style=\"color: #0000ff\">}<\/span><br \/>\n<span style=\"color: #0000ff\">if `invalid' == 1 scalar `lnf' = -9e300<\/span><br \/>\n<span style=\"color: #0000ff\">else {<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 gen `lnL' = chile*`eta' - exp(`eta')<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 qui su `lnL'<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 scalar `lnf' = r(sum) + `hterm' - `b'[1,1]*`b'[1,1]\/2 - (`b'[1,2]-1)*(`b'[1,2]-1)\/2 \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0 - (`b'[1,3]-2)*(`b'[1,3]-2)\/2<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 if `lnf' == . scalar `lnf' = -9e300<\/span><br \/>\n<span style=\"color: #0000ff\">}<\/span><br \/>\n<span style=\"color: #0000ff\">end<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><strong>Fitting the Model<\/strong><\/p>\n<p>In fitting the model I have used the new Metropolis-Hastings sampler, mhsint, that was introduced last time. 
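For readers who do not use Stata, the same log-posterior can be sketched in Python with NumPy (a hypothetical transcription assuming arrays week, year, time and chile prepared as above; it mirrors the logpost program, including the -9e300 guard):

```python
import numpy as np

def logpost(par, week, year, time, chile, n_years=12):
    """par = [a, b, c, H_1..H_n, S_1..S_n, E_1..E_n]; returns the log-posterior."""
    a, b, c = par[0], par[1], par[2]
    H = par[3:3 + n_years]
    S = par[3 + n_years:3 + 2 * n_years]
    E = par[3 + 2 * n_years:3 + 3 * n_years]
    eta = a + b * time / 1000 + c * np.sin(0.060415 * week)
    hterm = 0.0
    for i in range(n_years):
        h, s, e = H[i], S[i], E[i]
        if e <= s or e > 52 or s < 2 or h > 250:
            return -9e300                       # invalid model: force rejection
        hterm += np.log(h) - 2 * h              # Gamma(2, 0.5) prior on H_i
        hterm -= np.log(1 + abs(s - 20))        # prior on the start week
        hterm -= np.log(1 + abs(e - s - 8))     # prior on the end week given the start
        spike = (year == i + 1) & (week >= s) & (week <= e)
        eta = np.where(spike, eta + h, eta)
    lnf = np.sum(chile * eta - np.exp(eta))     # Poisson log-likelihood (no constant)
    lnf += hterm - a**2 / 2 - (b - 1)**2 / 2 - (c - 2)**2 / 2
    return lnf if np.isfinite(lnf) else -9e300
```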
There is a version of that program, downloadable from my webpage, that allows a range parameter to control the maximum allowable move. In this analysis the range is set to 3, so the proposed moves for the start and end of an epidemic can be randomly up or down by 1, 2 or 3 weeks.<\/p>\n<p><span style=\"color: #0000ff\">matrix b = (0, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 20,20,20,20,20,20,20,20,20,20,20,20, \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 30,30,30,30,30,30,30,30,30,30,30,30)<\/span><br \/>\n<span style=\"color: #0000ff\">matrix s = J(1,39,0.1)<\/span><br \/>\n<span style=\"color: #0000ff\">mcmcrun logpost b using temp.csv, replace \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 param(a b c h1-h12 s1-s12 e1-e12) burn(1000) update(10000) thin(5) adapt \/\/\/<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 samplers( 3(mhsnorm , sd(s)) 12(mhslogn , sd(s)) 24(mhsint , range(3)) ) jpost<\/span><\/p>\n<p>There are a couple of coding points to note: I use the adapt option to tune the standard deviations of the MH samplers during the burn-in, and I set the jpost option since I am calculating the joint posterior and not the conditional posteriors of the individual parameters.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Examining the Results<\/strong><\/p>\n<p>The list of summary statistics for the parameters is rather long, so I have only shown the H<sub>i<\/sub>, S<sub>i<\/sub>, E<sub>i<\/sub> values for 2 of the 12 years.<\/p>\n<pre>----------------------------------------------------------------------------\r\nParameter         n     mean      sd     sem   median          95% CrI\r\n----------------------------------------------------------------------------\r\n a             2000   -0.755   0.067  0.0095   -0.757 ( -0.903, -0.620 )\r\n b             2000    2.979   0.107  0.0105    2.980 (  2.764,  3.194 )\r\n c             2000    2.186   0.062  0.0073    2.188 (  2.059,  2.312 )\r\n h4            2000    0.398   0.193  0.0097    0.365 (  0.105,  0.897 )\r\n h5            2000    1.648   0.068  0.0024    1.647 (  1.512,  1.788 )\r\n s4            2000   28.418  10.649  1.5287   28.000 (  3.000, 45.000 )\r\n s5            2000   19.851   0.358  0.0120   20.000 ( 19.000, 20.000 )\r\n e4            2000   50.090   2.849  0.1463   51.000 ( 42.000, 52.000 )\r\n e5            2000   26.878   0.355  0.0118   27.000 ( 26.000, 27.000 )\r\n----------------------------------------------------------------------------<\/pre>\n<p>The intercept, a, and the two regression coefficients, b and c, are reasonably close to my prior estimates, although perhaps the upward drift is a little larger than I was expecting.<\/p>\n<p>Most of the epidemics are easy to detect but I have chosen 
to display results for one year without a clear epidemic and one with. The epidemic in year 4 is poorly defined: it has a low height and might start any time between week 3 and week 45. In contrast, the epidemic in year 5 is much more clear-cut. It raised the expected winter level by a factor of exp(1.6)\u22485 and almost certainly started in week 20 and finished in week 27.<\/p>\n<p>Convergence is not a major problem, although the estimated epidemic in year 4 does jump around, and were we particularly interested in that year we would need a much longer run in order to capture its posterior. A more realistic solution might be to allow the model to have years in which there is no epidemic. Perhaps we will return to try that modification at a later date.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/09\/trace1-e1441178876817.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-864\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/09\/trace1-e1441178876817.png\" alt=\"trace1\" width=\"600\" height=\"436\" \/><\/a><\/p>\n<p>If we save the posterior mean estimates we can use them to plot a fitted curve.<\/p>\n<p><span style=\"color: #0000ff\">insheet using temp.csv , clear<\/span><br \/>\n<span style=\"color: #0000ff\">mcmcstats *<\/span><br \/>\n<span style=\"color: #0000ff\">forvalues i=1\/39 {<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local p`i' = r(mn`i')<\/span><br \/>\n<span style=\"color: #0000ff\">}<\/span><br \/>\n<span style=\"color: #0000ff\">use tempflu.dta, clear<\/span><br \/>\n<span style=\"color: #0000ff\">gen eta = `p1' + `p2'*time\/1000 + `p3'*sin(0.060415*week)<\/span><br \/>\n<span style=\"color: #0000ff\">forvalues i=1\/12 {<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local j = 3 + `i'<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local h = 
`p`j''<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local j = 15 + `i'<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local S = `p`j''<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local j = 27 + `i'<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 local E = `p`j''<\/span><br \/>\n<span style=\"color: #0000ff\">\u00a0\u00a0\u00a0\u00a0 replace eta = eta + `h' if year == `i' &amp; week &gt;= `S' &amp; week &lt;= `E'<\/span><br \/>\n<span style=\"color: #0000ff\">}<\/span><br \/>\n<span style=\"color: #0000ff\">gen mu = exp(eta)<\/span><br \/>\n<span style=\"color: #0000ff\">twoway (line chile time ) (line mu time, lcol(red) lpat(dash)) , leg(off)<\/span><\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/09\/fit1-e1441098970945.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-856\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/09\/fit1-e1441098970945.png\" alt=\"fit1\" width=\"600\" height=\"436\" \/><\/a><\/p>\n<p>Here the black line represents the data and the red line is the fitted curve based on the posterior means. 
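The same reconstruction can be written compactly outside Stata; a hypothetical Python/NumPy sketch, assuming the 39 posterior means are held in a list in the same order as the parameter vector:

```python
import numpy as np

def fitted_mean(post_mean, week, year, time, n_years=12):
    """Rebuild mu = exp(eta) from posterior-mean parameters, as in the loop above."""
    a, b, c = post_mean[0], post_mean[1], post_mean[2]
    H = post_mean[3:3 + n_years]
    S = post_mean[3 + n_years:3 + 2 * n_years]
    E = post_mean[3 + 2 * n_years:3 + 3 * n_years]
    eta = a + b * time / 1000 + c * np.sin(0.060415 * week)
    for i in range(n_years):
        spike = (year == i + 1) & (week >= S[i]) & (week <= E[i])
        eta = np.where(spike, eta + H[i], eta)
    return np.exp(eta)
```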
We seem to pick up the flu epidemics quite well, and the lack of an epidemic in year 4 and the strong epidemic in year 5 accord with the numerical results.<\/p>\n<p>Since the model uses a log-link function, we can see the fit better if we look at the log-counts and the log of the fitted values.<\/p>\n<p><a href=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/09\/fit1log-e1441098999786.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-857\" src=\"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/files\/2015\/09\/fit1log-e1441098999786.png\" alt=\"fit1log\" width=\"600\" height=\"436\" \/><\/a><\/p>\n<p>The upward drift is clearer here, although perhaps it was not linear over the whole range of the data. My overall impression is that the model has been relatively successful in capturing the trend.<\/p>\n<p>There is much more to do but this is enough for one week.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week I discussed Google Flu Trends a project that provided weekly estimates of the number of flu cases in 25 different countries based on the frequency of internet searches for influenza related terms (https:\/\/en.wikipedia.org\/wiki\/Google_Flu_Trends). Sadly Google have stopped producing these estimates but historical data are still available for download. 
Presumably\u00a0Google gave\u00a0up when faced with [&hellip;]<\/p>\n","protected":false},"author":134,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[81,80],"class_list":["post-850","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-bayesian-model","tag-google-flu-trends"],"_links":{"self":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/850","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/comments?post=850"}],"version-history":[{"count":9,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/850\/revisions"}],"predecessor-version":[{"id":865,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/posts\/850\/revisions\/865"}],"wp:attachment":[{"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/media?parent=850"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/categories?post=850"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/staffblogs.le.ac.uk\/bayeswithstata\/wp-json\/wp\/v2\/tags?post=850"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}