Lies, damned lies, and COVID-19 statistics?

A few days ago WHO reported that the UK had had over 300 000 confirmed cases of COVID-19, but now WHO is reporting the cumulative total is many fewer. How come?

Keith S. Taber

I have been keeping an eye on the way the current pandemic has been developing around the world by looking at the World Health Organisation website (at https://covid19.who.int) which offers regularly updated statistics, globally, regionally, and in those countries with the most cases.

An example of the stats. reported by the WHO (June 23rd 2020)
Note: on this day the UK Prime Minister reported: "In total, 306,210 people have now tested positive for coronavirus" which almost matches the figure shown by WHO (306 214) the next day.

Whilst the information is very interesting (and in view of what it represents, very saddening) there are some strange patterns in the graphs presented – reminding one that measurements can never just be assumed to be precise, accurate and reliable. Some of the data looks unlikely to be accurate, and in at least one case what is presented is downright impossible.

Questionable stats.

One type of anomaly that stands out is how some countries where the pandemic is active suddenly have a day with no new cases – before the level returns to trend.

This appeared to be the case in both Spain and Italy on 22nd March, and then two months later the same thing happened in Iran. One assumes this has more to do with reporting procedures than blessed days when no one was found to have the infection – although if a reporting gap were the cause, should there not be some compensation in the following days (perhaps so in Spain above, but apparently not in Italy, and certainly not in Iran)?

Less easy to explain away is a peak found in the graph for Chile.

Suddenly for one day, 18th June, a much larger number of cases is reported: but then there is an immediate return to the baseline:

How is it possible that suddenly on one day there are seven times as many cases reported – as a blip superimposed on an otherwise fairly flat trend-line? Perhaps there is a rational explanation – but unfortunately the WHO site, although rich in stats, does not seem to offer interpretation or explanations *. Without a rationale, one wonders just how trustworthy the stats actually are.
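The two kinds of oddity described so far (a day with no cases, and a one-day spike) are the sort of thing a simple check could flag automatically. What follows is only a rough sketch of one way that might be done, comparing each day with the median of its neighbours. The function name, the three-day window and the five-fold threshold are my own assumptions, and the numbers are invented for illustration: they are not WHO figures.

```python
from statistics import median


def flag_anomalies(daily_counts, window=3, spike_factor=5.0):
    """Return (index, kind) pairs for days that break sharply from their neighbours.

    `window` and `spike_factor` are arbitrary illustrative choices, not
    anything used by WHO.
    """
    flags = []
    for i, count in enumerate(daily_counts):
        # Up to `window` days either side, excluding the day being checked.
        neighbours = daily_counts[max(0, i - window):i] + daily_counts[i + 1:i + 1 + window]
        if not neighbours:
            continue
        baseline = median(neighbours)
        if baseline <= 0:
            # No active trend to compare against.
            continue
        if count < 0:
            flags.append((i, "negative"))
        elif count == 0:
            flags.append((i, "zero"))
        elif count > spike_factor * baseline:
            flags.append((i, "spike"))
    return flags


# Invented numbers for illustration only; these are NOT WHO figures.
example = [5000, 5200, 5100, 0, 5300, 5400, 36000, 5500, 5600]
print(flag_anomalies(example))
```

A check like this only says that a day looks out of line with its neighbours; it cannot say whether the cause is a reporting delay, a backlog being cleared, or a genuine change.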

Obviously false information

Even if there are explanations for some of these odd patterns due to the practicalities of reporting, and the ongoing development of systems of testing and reporting, in different jurisdictions, there is one anomaly that cannot be feasibly explained – where the data is surely, and clearly, wrong.

An example of the stats reported by the WHO (July 6th 2020)

So the graph above shows the nations with the most reported cases as of the last few days. This is a more recent update than the similar image at the top of this page. Yet, the cumulative total of confirmed cases for the United Kingdom in this figure is something like 20 000 cases LESS than the figure quoted in the EARLIER set of graphs. (Note that this has allowed the UK to have a lower cumulative total than either Chile or Peru – which would not have been the case without this reduction in cumulative total.)

The total number of confirmed cases in the UK is now (7th July) LESS than it was a week ago (see above). How come? Well, a close look at the graph below explains this. The drop in cumulative numbers is due to the number of new cases that WHO gives as reported on 3rd July, when there were -29 726 new cases. Yes, that's right: minus 29 thousand odd cases.

The WHO data show negative cases (-525 new cases) for the UK on May 21st as well, but on the 3rd July the magnitude of the negative number of confirmed cases is over three times as large as the highest number of positive new cases reported on any single day (April 12th, i.e., 8719 new cases).
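The comparison is easy to check from the two WHO figures quoted here. The variable names in this small sketch are simply my own labels.

```python
negative_adjustment = -29_726  # WHO figure for UK new cases on 3rd July
peak_daily_cases = 8_719       # WHO figure for the UK's highest single day (12th April)

# How many times larger is the negative entry than the worst real day?
ratio = abs(negative_adjustment) / peak_daily_cases
print(f"{ratio:.2f}")
```

So the negative entry is roughly 3.4 times the size of the largest daily count of new cases.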

I can imagine that if it was identified that a previous miscalculation had occurred it might be necessary to revise previous data. But surely an adjustment would be made to the earlier data, rather than the cumulative total being corrected by inserting a large negative number of cases on some arbitrary date in order to put the total right. [Note: the most recent data I can find on the UK government site cites 309,360 confirmed cases as of 26th June (2020-06-26 COVID-19 Press Conference Slides) so as of yet the UK data does not show the reduction in cumulative total being published by WHO.**]
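One plausible mechanism, and I stress that this is my assumption rather than anything WHO has stated, is that the daily figures are derived by subtracting each day's reported cumulative total from the next. If a country then revises its running total downwards, a negative 'day' falls out automatically. The totals in this sketch are invented purely to show the effect.

```python
def daily_from_cumulative(cumulative):
    """Daily new cases implied by a series of cumulative totals."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


# Invented totals for illustration only (NOT real figures).
# The final total has been revised downwards by the reporting country.
as_published = [300_000, 301_000, 302_000, 273_000]
print(daily_from_cumulative(as_published))   # the last 'day' comes out negative

# The alternative: revise the earlier totals so the series stays consistent.
backdated = [271_000, 272_000, 272_500, 273_000]
print(daily_from_cumulative(backdated))      # no negative days
```

Both series end on the same total; only the second gives a daily record that makes sense.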

Yet surely someone at WHO must have spotted that the anomaly is bizarre and brings their reports into question. The negative cases claimed for the UK on that one day are so great that the UK line has since burrowed into the graphics for completely different countries. (See below. On the day the UK graph was located above the graphic for Mexico, the UK line went down so far that it crossed below the line for Mexico.)

Of course, each unit in these figures represents someone, a fellow human somewhere in the world, who has been found to be infected with a very serious, and sometimes fatal, virus. Fixating on the stats can distract from the real human drama that many of these cases represent. Yet, when the data reflect something so important, and when data are so valuable in understanding and responding to the global pandemic, such an obvious flaw in the data is disappointing and worrying.

*I could not find a link to send an email; a tweet did not get a response from @WHO; and an invitation to type my question on the website was met by the site bot with a suggestion to return to the data I was asking about.

** If I subsequently learn of the reason for the report of negative numbers of cases in these statistics, I will post an update here.

Update at 2020-07-12: duplicate testing

As of Saturday 11 July 2020 at 6:20pm
The UK government reports
Total number of lab-confirmed UK cases
288,953
Total number of people who have had a positive test result

So this is less than they were reporting a week earlier, despite their graph (for England, where most cases are because it is the most populous country of the UK) not showing any dip:

However, I did find this explanation:

"The data published on this website are constantly being reviewed and corrected. Cumulative counts can occasionally go down from one day to the next, and on some occasions there have been major revisions that have a significant effect on local, regional, National or UK totals. Data are provided daily from several different electronic data collection systems and these can experience technical issues which can affect daily figures, usually resulting in lower daily counts. The missing data are normally included in the data published the following day.

From 2 July 2020, Pillar 2 data [from "swab testing for the wider population" i.e., than just "for those with a clinical need, and health and care workers"] has been reported separately by all 4 Nations. Pillar 2 data for England has had duplicate tests for the same person removed by PHE [Public Health England] from 2 July 2020. This means that the cumulative total number of UK lab-confirmed cases is now around 30,000 lower than reported on 1 July 2020."

https://coronavirus.data.gov.uk/about

So that explains the mystery – but duplicate reporting at that level seems extraordinary! It does not support confidence in official statistics. An error of c.10% suggests a systemic flaw in the methodology being used. It also makes one wonder about the accuracy of some of the figures being quoted for elsewhere in the world.
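As a rough check on that 'c.10%': the explanation quoted above compares the new total with the one reported on 1 July, which I do not have to hand, so this sketch uses the 309,360 figure from the 26th June press conference slides as a stand-in. That substitution is my assumption, and the result is only approximate.

```python
duplicates_removed = 30_000   # "around 30,000", per the UK government explanation
earlier_total = 309_360       # UK government figure for 26th June (stand-in for 1 July)

# Share of the previously reported total that turned out to be duplicates.
share = duplicates_removed / earlier_total
print(f"{share:.1%}")
```

On these figures a little under one in ten of the previously reported cases was a duplicate.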