Nearly 16,000 Coronavirus cases went unreported in England.
Error Type: Data error; incorrect use of Excel.
Reported on: 5th October 2020
Public Health England (PHE) misreported COVID-19 cases in England. Over a period of eight days 15,841 positive cases of COVID-19 were not reported or passed into the contact-tracing process.
What caused the error?
Public Health England chose Excel as the tool to collate csv files containing the results from the companies carrying out the COVID tests. The process was designed to pull the csv files automatically into Excel, collate them and then upload them into the central system.
Now, whilst it does seem pretty crazy to choose Excel to do this for such a vital process, anyone who has worked in data will know just how common this is.
The real problem came about because the developers chose to use the .xls file format rather than the newer .xlsx file format. The old (pre 2007) .xls Excel format is limited to 65,000 rows, whilst the more recent .xlsx format can handle over a million.
The consequence of this is that as the csv files were collated they capped out at 65,000 rows, with any rows in excess of this being lost into the ether.
How could the error have been avoided?
When designing any process, and even more so for a process of national importance, developers should assess the tools available to them a choose the right software for the job.
In this case its fair to say Excel was the wrong choice, and certainly the pre 2007 version.