How Good Descriptions Increase Our Scientific Understanding of the World

Understanding the Covid-19 Pandemic

By Johannes Findl, Javier Suárez

November 29, 2021

Scientists aim to understand the world around us. They want to know why planets move, how mountains are formed, why diseases spread, or how species evolve. One way to achieve understanding of how the world around us works is to find regularities, natural laws—like Newton’s theory of gravity—or to uncover how things influence each other—like Pasteur’s discovery that microbes cause diseases. In this process, scientists build theories or models to explain the phenomena around us. As a result, we can understand their behaviour, and in some cases even acquire technological control over them. Among philosophers and scientists, there is the widespread belief that to acquire understanding of phenomena, scientists need to explain how they behave. But are explanations the only way in which scientists can gain understanding of phenomena which surround us?

What is clear is that scientists are not only in the business of constructing explanations of phenomena. They often only classify and describe things around us—like the different kingdoms in biology. Another example is the periodic table developed by Mendeleev and Meyer. In the history of chemistry, the discovery of numerous new chemical elements and their properties is inseparably linked to the chemists’ extensive use of and reliance upon the periodic table’s description of the order of the elements.

In our recent work, "Descriptive understanding and prediction in Covid-19 modelling", we investigated whether scientists could also gain understanding of phenomena by merely describing them, without needing to refer to an explanation that would ultimately account for their behaviour. This is a fairly controversial idea, since one might think that merely describing the world is by no means sufficient for understanding it. Intuitively, understanding phenomena would also require appealing to a scientific explanation which accounts for them in terms of laws of nature or causal connections. Otherwise, scientists would be at risk of conflating correlation with causation, and while the latter provides understanding, the former clearly does not.

As a matter of fact, scientists are very conscious of this problem. They are careful to avoid the mistake of treating correlated events as if one were the cause of the other. Imagine Lucy has the flu and drinks orange juice. Suppose she recovers within one week. This does not mean that the orange juice cured her. The intake of orange juice was simply correlated with the recovery process: the two occurred over the same period. A description of the chain of events seems not to be good enough to discriminate between correlation and causation, because we would limit ourselves to taking notes about orange juice consumption and flu recovery. What we really want, it seems, is an explanation. But this is not always the case in scientific research. Sometimes scientists gain a better understanding of the world simply by building a good description of it, a phenomenon we call descriptive understanding.

Take how researchers built COVID-19 models as an example. We investigated the statistical-epidemiological model of COVID-19 developed by the Institute for Health Metrics and Evaluation (commonly known as IHME) to analyse whether and how epidemiologists gained understanding of the pandemic in the very process of building and developing it. Statistical-epidemiological models do not introduce causal knowledge about the spread of a disease, but simply describe and try to predict its trajectory on the basis of a limited and fairly general data set. In our particular case, the IHME scientists built their model to predict the trajectory of the mortality rate for some locations over time on the basis of trajectories that had already been observed in other locations, so that political action could be taken in proportion to the peaks and declines of deaths expected to result from COVID-19.

Researchers built the IHME statistical model of COVID-19 assuming very general knowledge about disease spread, mostly inferred from knowledge of previous pandemics, like the Spanish flu, and from the data available from Wuhan City, which they then extrapolated to other locations. For different locations around the world, the model made it possible to predict how the mortality rate would evolve if no restrictive measures (like partial or total confinements) were taken, as well as how it would fluctuate if certain restrictive measures were introduced. These locations included big cities like London, New York, Berlin, or Madrid, among others.

Unfortunately, the early version of the IHME model yielded predictions for each of these cities that were soon found to be inaccurate, since they diverged too much from what was later observed. A rational option, plausibly favoured by those who believe that theories should be abandoned when they are proven false (à la Popper), would have been to abandon these statistical models and instead build something else, for example causal epidemiological models. But the epidemiologists working at the IHME decided instead to modify the assumptions upon which their statistical models were built, adjusting the models in accordance with the newly available evidence.

One of these major adjustments occurred in April 2020. Scientists had realized that the assumption according to which the effects of restrictive measures in Wuhan would be similar in other locations was wrong. In fact, the effects of restrictive measures varied considerably across locations, in part due to sociological factors (people respond differently to restrictive measures in different parts of the world) and in part due to structural factors of the locations themselves, such as population density, distribution of the population across the city, existence of a divide between residential and commercial areas, etc.

Therefore, epidemiologists adjusted their statistical models for each of the locations, so that the new versions of the model would include parameters specifically derived from local information. This resulted in 17 different statistical models—one per location—which were all derived from the first statistical model developed by the IHME. These models proved much more accurate than their predecessor in predicting the trajectory of the mortality rates.
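
The contrast between one shared assumption and location-specific parameters can be illustrated with a toy sketch. It is not the IHME model: the logistic curve, the parameter values, the candidate grid, and the two synthetic locations are all assumptions invented for illustration, and no real mortality data is used. The sketch only shows why a purely descriptive curve fits better once each location is allowed its own parameter.

```python
import math


def logistic_curve(day, final_deaths, growth_rate, peak_day):
    """Cumulative deaths on a given day under an assumed logistic shape."""
    return final_deaths / (1.0 + math.exp(-growth_rate * (day - peak_day)))


def squared_error(observed, growth_rate, final_deaths=1000.0, peak_day=30.0):
    """Sum of squared differences between a series and the assumed curve."""
    return sum(
        (value - logistic_curve(day, final_deaths, growth_rate, peak_day)) ** 2
        for day, value in enumerate(observed)
    )


def fit_growth_rate(series_list, candidates):
    """Pick the candidate growth rate with the lowest total error over all series."""
    return min(
        candidates,
        key=lambda rate: sum(squared_error(series, rate) for series in series_list),
    )


# Synthetic trajectories (hypothetical, not real data): the two locations
# differ only in how fast cumulative deaths rise.
days = range(60)
synthetic = {
    "location_a": [logistic_curve(d, 1000.0, 0.10, 30.0) for d in days],
    "location_b": [logistic_curve(d, 1000.0, 0.25, 30.0) for d in days],
}

# Candidate growth rates from 0.05 to 0.35 in steps of 0.01.
candidates = [round(0.05 + 0.01 * i, 2) for i in range(31)]

# One shared parameter for every location (the "same effects everywhere" assumption).
pooled_rate = fit_growth_rate(list(synthetic.values()), candidates)

# One parameter per location, fitted from that location's own series.
local_rates = {
    name: fit_growth_rate([series], candidates)
    for name, series in synthetic.items()
}

pooled_error = sum(squared_error(s, pooled_rate) for s in synthetic.values())
local_error = sum(squared_error(s, local_rates[name]) for name, s in synthetic.items())

print("pooled rate:", pooled_rate, "total error:", pooled_error)
print("local rates:", local_rates, "total error:", local_error)
```

Neither fit says anything about why the two locations differ. The location-specific version simply describes each trajectory more accurately, which is the sense of "descriptive" at stake here.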

These models did not convey any causal knowledge specific to COVID-19, nor knowledge concerning natural laws regarding pandemics in general. However, they conveyed understanding of the ways in which different aspects of each respective location would affect the evolution of the pandemic, namely: understanding of the way in which people's compliance with the restrictive measures implemented by their governments would affect the mortality rate over time there. These models thus helped to generate understanding of epidemiological dynamics.

We coined the concept of descriptive understanding to refer to the type of understanding of the world that the IHME epidemiologists obtained by developing, updating, and revising their model. The label descriptive alludes to the model's growing capacity to yield ever more accurate predictions despite the lack of causal or lawful components. This capacity is a common function, among many others, of the purely descriptive work typically found in many sciences. For example, a good description of the circulatory system allows medical doctors to locate arteries when they are performing surgery, as they can predict where an artery will be found when they open the body. But this descriptive knowledge is not causal: it does not license specific interventions or associations beyond the mere prediction that allows doctors to be cautious and prevents them from making mistakes in their medical practice.

We discovered that building the early versions of the IHME COVID-19 model can be analysed in terms of obtaining and deepening this type of knowledge. While not conferring any causal knowledge about COVID-19, it allowed epidemiologists to see which of their assumptions were wrong. It allowed them to see that it was simply wrong to assume that the same restrictive measures would have the same effects in different locations, regardless of the local features with regard to which those effects may vary.

In learning that, though, epidemiologists gained new and better understanding of the pandemic: they comprehended its spread better than they did when they developed the first versions of the IHME model.

This new modality of understanding, descriptive understanding, will require further scrutiny by scientists and philosophers. While we are not sure that descriptive understanding will be found in every discipline, or that it can be considered a universal scientific achievement, we are nevertheless certain that it plays a significant role in many disciplines, and we invite readers to explore its applications further!