9 - Modelling Process

Epidemiological modelling and its use to manage COVID-19

Insights into mechanistic models, by the DYNAMO team

Over the next few weeks, we will present some key elements of epidemiological modelling through short educational articles. These articles will help you to better understand and decipher the assumptions underlying the epidemiological models that are currently widely used, and how these assumptions can impact predictions regarding the spread of pathogens, particularly SARS-CoV-2. The objective is to discover the advantages and limitations of mechanistic modelling, an approach that is at the core of the DYNAMO team's work. The example models are inspired by models used during the crisis, but are sometimes simplified to make them accessible.

#9 – The modelling process

In the previous articles, we have shown some of the major stages in the life of a model, based on concrete examples. We have seen the role of assumptions, knowledge and data in model development. Beyond the case of a specific disease, modelling is in fact part of a broader scientific approach to knowledge development, which we will explain in more detail here.

Modelling compared to other methods of knowledge acquisition

To better understand a biological system, three complementary approaches are used: observation, experimentation and modelling.

  • Observation studies the real system and provides knowledge and data in a specific context (a period, an area). However, observation biases exist: screening tests are imperfect, statistically reliable sampling is not always feasible, and the conditions of observation (observer, weather, sample storage, data entry) may vary. In addition, it is impossible to observe the system exhaustively (some processes are said to be unobservable) or to quantify the variability inherent in observing any biological system. Extrapolation to another period or area is therefore difficult without large-scale, long-term longitudinal monitoring, which is rare and hard to maintain due to its cost in time and resources, especially in the current research context.
  • Experimentation provides very precise information on a particular process under controlled conditions. It allows the observation of processes for which no routine collection protocol is available (e.g. monitoring the intra-host immune response), at the cost of simplifying the experimental system (model species, living conditions, etc.). In addition, the amount of data collected is generally small due to the very high costs of these experiments.
  • Modelling studies the system as a whole, in an integrative manner. This approach requires less time and fewer resources, while allowing the comparison of a very wide range of situations, all other things being equal. It is even possible to evaluate measures that do not yet exist, without ethical restrictions, since the system is virtual (a minimal example is sketched after this list). However, a model is a simplified representation of reality, based on hypotheses and fed with knowledge and data (observational or experimental). This approach must be informed by interactions between disciplines and with the end-users of the model outcomes.
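To make this concrete, here is a minimal sketch, in Python, of such a virtual comparison using a classic SIR model. All values (transmission rate, recovery rate, population size, the two scenarios) are illustrative assumptions chosen for the example, not figures from the series.

```python
import numpy as np
from scipy.integrate import odeint

def sir(y, t, beta, gamma):
    """Classic SIR model: S susceptible, I infectious, R recovered."""
    S, I, R = y
    N = S + I + R
    dS = -beta * S * I / N
    dI = beta * S * I / N - gamma * I
    dR = gamma * I
    return [dS, dI, dR]

# Illustrative values (assumptions for the sketch, not estimates):
gamma = 1 / 10           # mean infectious period of 10 days
t = np.linspace(0, 180, 181)
y0 = [9999, 1, 0]        # one infectious case in a population of 10,000

# Two virtual "situations", all other things being equal:
for label, beta in [("no control", 0.30), ("contact reduction", 0.15)]:
    S, I, R = odeint(sir, y0, t, args=(beta, gamma)).T
    print(f"{label}: epidemic peak = {I.max():.0f} infectious, "
          f"final size = {R[-1]:.0f} ever infected")
```

Changing only the transmission rate between the two runs is precisely what "all other things being equal" means here: every other assumption stays fixed, which no observational study of the real system can guarantee.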

Continuous back-and-forth between model and knowledge/observational/experimental data

The modelling process

Developing a model is essentially an iterative process, which must incorporate new knowledge and new observational data, but must also take into account how the target questions the model is designed to answer evolve. Classically, there are at least three consecutive phases:

  • Step 1 (blue) - A model cannot answer everything: its intended use must be specified, and the relevance of the modelling approach to the question asked must be checked. As in observational studies, it is necessary to delimit the system to be studied: what are the hypotheses regarding the mechanisms involved? What knowledge (often heterogeneous and diffuse) is available? From this, the conceptual scheme of the model can be built, which lists its state variables, the transitions between states (the model structure), the functions and parameters required, but also the underlying assumptions and what the model will be able to predict (see article #1). This collective step brings together all the disciplines required by the issue. For COVID-19, for example, bringing together epidemiologists, virologists, infectious disease specialists and immunologists will identify the key processes in the dynamics of human infection, the available data and knowledge, as well as the main gaps in knowledge. Controlling the spread of a pathogen is a more complex issue than merely understanding its transmission, with implications beyond biology, and hence involves even more disciplines. Achieving this is an important collective step that should not be overlooked!
  • Step 2 (turquoise) - This conceptual model is then implemented, either as a system of equations if it is simple enough (a mathematical model), or as computer code (a simulation model). The outputs of the model are evaluated by comparing them with expert opinion, or with observations if data are available. The model can be adjusted at this stage, by estimating its most uncertain parameters (see article #8) or by revising some of its assumptions (a minimal calibration sketch is given after this list).
  • Step 3 (orange) - The model is then usable. The sensitivity of the model to variations in its parameters is analyzed to identify the most detrimental gaps in knowledge and the potential levers to control the system. If the predictions of the model are robust to uncertainties and the model is considered relevant (following the above analyses, and also discussions with its end-users, often the managers of the modelled system), it is used via numerical experiments to compare scenarios, identify optimal situations, understand the impact of a change in practices, etc. How the model is used thus depends on its predictive capabilities. The new questions raised by this step, or the production of new knowledge or data, may require the modelling process to be resumed from step 1.
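As an illustration of the adjustment mentioned in step 2, here is a minimal sketch of estimating the most uncertain parameter of a simple SIR model (its transmission rate) by least squares against observed case counts. The "observations" below are invented for the sketch, and real calibrations rely on more elaborate methods (see article #8).

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import minimize_scalar

def sir(y, t, beta, gamma):
    """Classic SIR model dynamics."""
    S, I, R = y
    N = S + I + R
    return [-beta * S * I / N, beta * S * I / N - gamma * I, gamma * I]

gamma = 1 / 10                 # assumed known: 10-day infectious period
y0 = [9999, 1, 0]              # one case in a population of 10,000
t_obs = np.arange(0, 63, 7)    # weekly observations over 9 weeks

# Hypothetical observed infectious counts (invented for the sketch):
obs = np.array([1, 3, 8, 21, 52, 118, 230, 370, 480])

def sse(beta):
    """Sum of squared errors between simulated and observed counts."""
    I_model = odeint(sir, y0, t_obs, args=(beta, gamma))[:, 1]
    return np.sum((I_model - obs) ** 2)

fit = minimize_scalar(sse, bounds=(0.05, 1.0), method="bounded")
print(f"estimated transmission rate beta = {fit.x:.3f}")
```

Once a parameter is estimated this way, the sensitivity analysis of step 3 asks how much the predictions change when this parameter (or another, such as the recovery rate) varies within its remaining uncertainty.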

Building a model: a world of compromise

Under no circumstances can the design of a model be the result of unilateral choices:

  • The model is co-constructed with biologists and end-users: it must therefore be transparent, with no assumptions hidden in the code, to facilitate interactions between disciplines, and it must remain readable by end-users throughout the co-construction process.
  • The relevance of the predictions must be ensured, as well as their domain of validity and their robustness with respect to the uncertainties on the parameters and on the structure of the model; this must be assessed from the standpoint of the various disciplines involved.
  • The more complicated the model, the harder it is to determine its parameters... but often, the more realistic it is, and therefore the more readable by biologists and end-users. Parsimony is required, while maintaining a good level of flexibility and modularity so that the model does not have to be reprogrammed each time its assumptions are revised (see the sketch below). This also makes it easier to update the model when new knowledge is produced. On the other hand, such flexibility often decreases the performance of the code, although technical solutions exist today to remedy this.
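One common way to obtain this modularity, sketched below under our own design assumptions (the function names and parameter values are hypothetical), is to pass the revisable assumption to the model as a function. Here the force of infection is pluggable, so an alternative transmission hypothesis can be swapped in without reprogramming the model itself.

```python
import numpy as np
from scipy.integrate import odeint

def make_sir(force_of_infection):
    """Build an SIR model whose transmission assumption is pluggable."""
    def sir(y, t, gamma):
        S, I, R = y
        new_infections = force_of_infection(S, I, S + I + R)
        return [-new_infections, new_infections - gamma * I, gamma * I]
    return sir

# Two alternative transmission assumptions, swapped without touching the model:
def density_dependent(S, I, N):
    return 3e-5 * S * I        # contacts increase with population density

def frequency_dependent(S, I, N):
    return 0.3 * S * I / N     # contact rate independent of density

t = np.linspace(0, 180, 181)
for name, foi in [("density-dependent", density_dependent),
                  ("frequency-dependent", frequency_dependent)]:
    I = odeint(make_sir(foi), [9999, 1, 0], t, args=(1 / 10,))[:, 1]
    print(f"{name}: epidemic peak = {I.max():.0f} infectious")
```

The design choice is simply to isolate each revisable assumption in one small function: revising the hypothesis then touches a few lines rather than the whole code.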

In article #10, we'll see how to analyze a simulation model.