From engineering to medicine, professionals are often forced to rely on tests that are not always completely accurate. In this post, I shall look at the tests that millions of people were obliged to use during the pandemic, to check whether they had COVID-19. The two most common tests were Lateral Flow and PCR. Lateral Flow was quicker and more convenient, while PCR took longer (because the sample had to be sent to a lab) and was supposedly more accurate.
There was also a difference in the data collected from these tests. Whereas all the results from the PCR tests should have been available in the labs, the results from the lateral flow tests were only reported under certain circumstances. There was no obligation to report a negative test unless you needed access to something, and people sometimes chose not to report positive tests because of the restrictions that might follow. And of course people only took the tests when they had to, or wanted to. When people had to pay for the tests, this obviously made a big difference.
To compensate for these limitations, some random screening was carried out, which was designed to produce more reliable and representative datasets. However, these datasets were much smaller.
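The cost of a smaller dataset can be made concrete with a confidence interval. The following is a minimal sketch, using the standard Wilson score interval for a proportion; the function name and the sample figures are hypothetical, chosen only for illustration, and are not actual screening results.

```python
import math


def proportion_interval(positives: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (roughly 95% when z = 1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = positives / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical figures: the same 2% positive rate in a small and a large sample.
small = proportion_interval(20, 1_000)
large = proportion_interval(2_000, 100_000)
print(f"n=1,000:   {small[0]:.4f} to {small[1]:.4f}")
print(f"n=100,000: {large[0]:.4f} to {large[1]:.4f}")
```

Note that the interval only describes sampling error. It says nothing about the bias that arises when people choose whether to test or report, which is the problem that random screening is meant to address.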
So what can we do with this kind of data? Firstly, it tells us something about the disease – whether it is distributed evenly across the country or concentrated in certain places, and how quickly it is spreading. If we can combine the test results with other information about the test subjects, we may be able to get some demographic information – for example, how the disease is affecting people of different ages, genders or races, and how it is affecting different job categories. And if we have information from the health service, we can estimate how many of those testing positive end up in hospital.
This kind of information allows us to make predictions – for example, future demand for hospital beds, possible shortages of key workers. It also allows us to assess the effects of various protective measures – for example, to what extent do mask-wearing, social distancing and working from home reduce the rate of transmission.
Besides telling us about the disease, the data should also be able to tell us something about the tests. And the accuracy of the predictions provides a feedback loop, which may enable us to reassess either the test data or the predictive models.
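As an illustration of what the data might tell us about the tests, suppose some people took both tests at around the same time. The sketch below compares one test against the other, treating PCR as the reference. That is an assumption made for the sake of the example, since PCR is not perfectly accurate either. The function name and the counts are made up.

```python
from collections import Counter


def compare_to_reference(pairs):
    """Summarise paired results given as (index_result, reference_result) booleans."""
    counts = Counter(pairs)
    tp = counts[(True, True)]    # both tests positive
    fn = counts[(False, True)]   # index test missed a reference positive
    fp = counts[(True, False)]   # index test positive, reference negative
    tn = counts[(False, False)]  # both tests negative
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "agreement": (tp + tn) / total if total else None,
    }


# Made-up paired results: (lateral_flow_positive, pcr_positive)
paired = (
    [(True, True)] * 70
    + [(False, True)] * 30
    + [(True, False)] * 5
    + [(False, False)] * 895
)
stats = compare_to_reference(paired)
print(stats)
```

These figures are relative to the reference test, so any errors in the reference are silently attributed to the test being assessed.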
In her book The Body Multiple, Annemarie Mol discusses the differences between two alternative tests for atherosclerosis, and describes how clinicians deal with cases where the two tests appear to provide conflicting results, as well as cases where there may be other reasons to question the test results. Instead of having a single view of the disease, she talks about its multiplicity or manyfoldedness.
But questioning the test results in a particular case, or highlighting particular issues with a given test, does not mean denying the overall value of the test. Most of the time we can continue to regard a test as useful, even as we are considering ways of improving it.
If and when we introduce a new or improved test, we may then wish to translate data between tests. In other words, if test A produced result X, then we would have expected test B to produce result Y. While this kind of translation may be useful for statistical purposes, we need to be careful about its use in individual cases.
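One way to picture this kind of translation is as a table of conditional probabilities, estimated from people who took both tests. The sketch below is only an illustration: the function names and counts are hypothetical, and it assumes that the population being translated resembles the paired sample (for instance, in prevalence).

```python
def translation_table(cross_tab):
    """Estimate P(B = b | A = a) from paired counts.

    cross_tab[a][b] is the number of subjects with result a on test A
    and result b on test B.
    """
    table = {}
    for a, row in cross_tab.items():
        row_total = sum(row.values())
        if row_total == 0:
            continue  # no paired data for this result of test A
        table[a] = {b: n / row_total for b, n in row.items()}
    return table


def expected_b_counts(a_counts, table):
    """Translate aggregate counts of test A results into expected test B counts."""
    expected = {}
    for a, n_a in a_counts.items():
        for b, p in table[a].items():
            expected[b] = expected.get(b, 0.0) + n_a * p
    return expected


# Made-up paired sample: outer key is the test A result, inner key the test B result.
cross_tab = {
    "positive": {"positive": 70, "negative": 5},
    "negative": {"positive": 30, "negative": 895},
}
table = translation_table(cross_tab)

# Made-up aggregate results from test A alone.
expected = expected_b_counts({"positive": 150, "negative": 1850}, table)
print(expected)
```

The output is an expected count across a population, not a prediction for any one person, which is why the translation is better suited to statistical purposes than to individual cases.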
For many people, the second discourse (questioning the tests) appears to undermine the first (using the test data to understand the disease). If we can’t always trust the data, can we ever trust the data? During the COVID pandemic, many rival interpretations of the data emerged; some people chose interpretations that confirmed their preconceptions, while others turned away from any kind of data-driven reasoning.
The COVID pandemic became a politically contentious field, so what if we look at other kinds of testing? In safety engineering, components and whole products are subjected to a range of tests, which assess the risk of certain kinds of failure. Obviously there are manufacturers and service providers with a commercial interest in how (and by whom) these tests are carried out, and there may be regulators and researchers looking at how these tests can be improved, or how various forms of cheating can be detected. But ordinary consumers don’t generally spend hours on YouTube complaining about the accuracy and validity of these tests.
Meanwhile even basic corporate reporting may be subject to this kind of multiplicity, as illustrated in my recent post on Data Estimation (July 2022).
So there is a level of complexity here, which not all data users may feel comfortable with, but which data professionals may not feel comfortable about hiding. In a traditional report, these details are often pushed into footnotes, and in an online dashboard there may be symbols inviting the user to drill down for further detail. But is that good enough?
Annemarie Mol, The Body Multiple: Ontology in Medical Practice (Duke University Press 2002)
Wikipedia: COVID-19 testing