We assessed public health use of R0, the basic reproduction number, which estimates the speed at which a disease is capable of spreading in a population. These estimates are of great public health interest, as evidenced during the 2009 influenza A (H1N1) virus pandemic.
We reviewed methods commonly used to estimate R0, examined their practical utility, and assessed how estimates of this epidemiological parameter can inform mitigation strategy decisions.
In isolation, R0 is a suboptimal gauge of infectious disease dynamics across populations; other disease parameters may provide more useful information. Nonetheless, estimation of R0 for a particular population is useful for understanding transmission in the study population. Considered in the context of other epidemiologically important parameters, the value of R0 may lie in better understanding an outbreak and in preparing a public health response.
During the spring of 2009, the 2009 H1N1 influenza pandemic began in North America and quickly spread around the world, sparking great interest in potential mitigation strategies for the first influenza pandemic in more than 40 years. Research focused on interventions such as social distancing that could be applied before a specific monovalent H1N1 vaccine became available in the fall of 2009. During the initial wave of the 2009 H1N1 outbreak, teams of modelers from around the world gathered available data from Mexico to estimate several of the novel virus’s characteristics.1,2 Efforts focused on the rapid estimation of the basic reproduction number, or R0, of this virus. R0 is a theoretical parameter that provides some information regarding the speed at which a disease is capable of spreading in a specific population. First estimates were published online by early May 2009.1,2 Estimates of R0 continue to be published from other countries and as more data become available.3–11
As an indicator of the interest in publications concerning R0, an early publication on the pandemic potential of the 2009 H1N1 strain by Fraser et al.1 had garnered 654 citations as of February 7, 2013. Although the influenza pandemic explains much of the recent interest in the basic reproduction number, this interest is not limited to the field of influenza. Web of Science searches on the terms “reproduction number” or “reproductive number” revealed that there have been 710 publications on this topic from 2009 through February 7, 2013, across various disciplines, with most articles being published in journals covering infectious diseases and mathematical modeling. Table A (available as a supplement to this article at http://www.ajph.org) shows the breakdown by journal. If the search is expanded to include data from previous years, it is clear that there has been exponential growth by calendar year in the number of publications on this topic (Figure 1). Why is there such growing interest in R0 among the disciplines interested in the dynamics of infectious diseases? To help better understand the interest in the basic reproduction number among public health officials, infectious disease researchers, and theoretical modelers, we reviewed the derivation of R0 and its history.
We present a basic epidemiological compartmental model (a susceptible–infected–recovered or SIR model with S, I, and R representing the 3 compartments) described by Kermack and McKendrick.12 In this relatively simple model designed to describe epidemics, individuals start as susceptible to a particular pathogen and then progress to the other 2 compartments if infected. The model is defined by a system of 3 ordinary differential equations (ODEs):
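A minimal sketch of these equations follows, assuming the standard frequency-dependent form of the Kermack–McKendrick model, which is consistent with the relations R0 = β/γ and Re = R0 × (S/N) used in this article. The exact notation is an assumption.

```latex
\begin{align}
\frac{dS}{dt} &= -\frac{\beta S I}{N}, \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I
\end{align}
```

Here S, I, and R are the numbers of susceptible, infected, and recovered individuals, N = S + I + R is the population size, β is the transmission rate, and γ is the recovery rate.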
The scientific community largely underappreciated the implications of the Kermack–McKendrick model until the late 1970s, when Anderson and May13 used the model to study strategies for controlling infectious diseases. R0 is a parameter of importance for gauging the disease dynamics because it indicates when an outbreak might happen based on the threshold value of 1.0. More generally, if the effective reproduction number Re = R0 × (S/N) is greater than 1.0, we predict that the disease continues its spread; the effective reproduction number reflects the fact that, as the proportion of susceptible individuals (S/N) decreases, disease transmission slows. From this simple mathematical perspective, epidemiologists frequently consider the basic reproductive number one of the most vital parameters in determining whether an epidemic is “controllable.”14,15 The objective of any public health response during an influenza pandemic, for example, is to slow or stop the spread of the virus by employing mitigation strategies that either (1) reduce R0 by changing the transmission rate (e.g., via school closure) or the duration of infectiousness (e.g., through antiviral use) or (2) reduce Re by reducing the number of susceptible individuals (e.g., by vaccination).
First we considered the information available by estimating R0 in simple models such as the one described by equation 1. Modelers can alter the SIR model by adding or removing additional “compartments.” For example, we can remove the recovered class (R) for diseases in which recovered individuals return to the susceptible class, thus converting it to an SIS model that can be used for diseases such as the common cold.16 We could also add other compartments, such as an “exposed” class (E) if the disease has a significant latent period relative to the infectious period, yielding the SEIR model, which is often used for influenza.17 With additional modifications to the base model, compartmental models can rapidly become complex. We restrict our discussion to SIR or SEIR models because they are useful for demonstrating essential characteristics of R0, and, importantly, for each of these models, R0 = β/γ.
The difference between the equations for SIR and SEIR models is simply the addition of a fourth ODE to those presented in equation 1 that describes the dynamics of the exposed (or latent) class of individuals.17 This ODE adds an additional parameter, ν, that represents the rate at which individuals move from the latent class to the infected class; it is helpful to note that ν is inversely proportional to the latent period of a disease (i.e., for a disease with a long latent period, ν is small). Examples of the disease dynamics produced by SIR and SEIR models with an R0 = 1.5 are shown in Figure 2, illustrating that, even for simple models, the model chosen drives the predicted disease dynamics even when the same basic reproductive number is used. This illustrates that we must understand the compartments in use, the time spent in each compartment, and whether each of these compartments is relevant for the disease in question. Furthermore, another disease characteristic, called the generation time, needs to be known before we can utilize R0 to predict the resulting time dynamics of an outbreak.
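The contrast between the two models can be sketched numerically. The snippet below is an illustration only, not the code behind Figure 2: it integrates frequency-dependent SIR and SEIR models with a simple forward-Euler scheme, and the function name `simulate`, the parameter values (β = 0.75, γ = 0.5, ν = 0.5 per day), the population size, and the single initial case are all assumptions chosen so that R0 = β/γ = 1.5 in both models.

```python
def simulate(beta, gamma, nu=None, n=1_000_000, i0=1.0, days=600, dt=0.05):
    """Forward-Euler integration of an SIR (nu=None) or SEIR model.

    Returns (day of peak prevalence, peak number infectious, attack rate).
    """
    s, e, i = n - i0, 0.0, i0
    peak_day, peak_i = 0.0, i
    for step in range(1, int(days / dt) + 1):
        new_infections = beta * s * i / n * dt
        recoveries = gamma * i * dt
        if nu is None:
            # SIR: newly infected individuals are infectious immediately.
            became_infectious = new_infections
        else:
            # SEIR: newly infected individuals wait in E and leave at rate nu.
            became_infectious = nu * e * dt
            e += new_infections - became_infectious
        s -= new_infections
        i += became_infectious - recoveries
        if i > peak_i:
            peak_day, peak_i = step * dt, i
    return peak_day, peak_i, (n - s) / n


# Hypothetical parameters giving R0 = beta / gamma = 1.5 in both models.
sir = simulate(beta=0.75, gamma=0.5)
seir = simulate(beta=0.75, gamma=0.5, nu=0.5)  # assumed 2-day latent period
for name, (day, peak, attack) in (("SIR", sir), ("SEIR", seir)):
    print(f"{name}: peak on day {day:.0f}, "
          f"peak infectious {peak:.0f}, attack rate {attack:.1%}")
```

Under these assumptions the two runs end with nearly the same attack rate, but the SEIR epidemic peaks later and lower, which is the qualitative point made above: the same R0 does not fix the time course.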
A number of characteristics of an epidemic may be of interest to public health officials and policymakers in formulating possible responses. The overall dynamics of the epidemic is only one set of characteristics. Others may include attack rate, illness duration, generation time, time to peak incidence, and even properties including the phase of an influenza pandemic, as defined by the World Health Organization or a national government.18 Therefore, we asked how much estimates of the basic reproductive number assist public health response initiatives by providing information about timing and severity of a disease outbreak.
To assess the utility of estimates of R0 in controlling influenza pandemics, we solved SIR and SEIR models over a broad range of input parameters. The key parameters in these models were the population size, N; the transmission rate, β; the latency period, ≈ 1/ν; the recovery rate, γ; and the basic reproductive number, R0 = β/γ. Our strategy in these analyses was to fix R0 at some value and analyze how the system dynamics changed when altering the other parameters, providing an opportunity to determine if the information gained from R0 estimates was relevant. We chose R0 = 1.5 as a baseline because it is a value that has been applied to previous influenza pandemics and is also near the mean estimate for the 2009 H1N1 pandemic.
The overall attack rate, the percentage of individuals who will get sick during an outbreak in a given population, may be the one disease characteristic of most interest to public health authorities, and the attack rate is the characteristic that appears to be most plausibly predicted by using estimates of R0. Figure 3 shows attack rates for a given basic reproductive number using SIR and SEIR models; the plotted curve has been formally derived and shown to satisfy a transcendental equation.19 In more complex epidemiological models, however, it is unclear how to estimate R0, and predicting attack rates may no longer be possible using R0 alone.
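For reference, the transcendental relation usually given for the final attack rate z of an SIR or SEIR epidemic in a well-mixed, initially fully susceptible population is shown below. That this is the curve plotted in Figure 3 is an assumption.

```latex
z = 1 - e^{-R_0 z}
```

The relation has no closed-form solution in elementary functions, so z is found numerically. A nonzero root exists only when R0 > 1, and it rises steeply just above that threshold.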
To predict the duration of an epidemic, R0 may also be applied successfully (Figure 4a), as the epidemic duration (measured here as the time between the occurrence of 5% and 95% cumulative incidence) is not dependent on N. However, this relationship does not hold for SEIR models (Figure 4b), as the latency period has dramatic effects on the persistence of the epidemic. As the latency period increases, the generation time increases, protracting the duration of the epidemic. Conversely, as the latency period decreases, the model behaves increasingly like an SIR model.
Moreover, even if the overall generation time, defined here as the sum of the infectious and latent periods (1/γ + 1/ν), is known, the individual values of γ and ν may still be unknown. Without specific values for these individual parameters, it is fair to question the validity of the generation time, which subsequently affects the utility of a given model. A review of the generation times and reproductive number estimates for the 2009 H1N1 pandemic concluded that, when the individual components of the generation time were assessed, researchers underestimated generation time in outbreaks in Canada and Mexico.20 These findings suggest that estimation of these individual parameters is needed before epidemic speed can be accurately calculated.
Likewise, the speed with which an epidemic reaches certain benchmarks is critically dependent on N, β, and γ (Figure 5), not simply on R0. As population size N increases, the time it takes to reach a cumulative incidence rate of, for example, 5%, increases concomitantly. An intuitive explanation is that it takes longer for 5% of a large population to become infected than the same percentage of a small population.
In a similar way, the transmission rate and recovery rate play a crucial role in the overall speed of the epidemic (Figure 5). As the transmission rate increases, the pace of epidemic spread increases dramatically, while the recovery rate must also increase to maintain a fixed R0; increasing the recovery rate corresponds to a decrease in generation time and an increase in wave speed (as long as the latency period is held constant). Focusing on the time to peak incidence, we examined the effect of allowing R0 to change by varying the transmission and recovery rates (Figure 6). We found that, regardless of the value of R0, the time to peak incidence depended on the individual values of β and γ; however, these effects are minimized for small reproductive numbers and maximized for large reproductive numbers.
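The dependence of epidemic timing on N, β, and γ at a fixed R0 can be illustrated with a short sketch. The function name `days_to_cumulative_incidence`, the parameter values, and the single initial case are hypothetical, and the forward-Euler SIR integration is a simplification rather than the method used for Figure 5.

```python
def days_to_cumulative_incidence(beta, gamma, n, target=0.05,
                                 i0=1.0, dt=0.01, max_days=2000):
    """Days until a fraction `target` of an SIR population has been infected.

    Returns None if the target is not reached within max_days.
    """
    s, i = n - i0, i0
    for step in range(int(max_days / dt)):
        if (n - s) / n >= target:
            return step * dt
        new_infections = beta * s * i / n * dt
        s -= new_infections
        i += new_infections - gamma * i * dt
    return None


# All three scenarios share R0 = beta / gamma = 1.5 (assumed values).
base = days_to_cumulative_incidence(beta=0.75, gamma=0.5, n=1e6)
slow = days_to_cumulative_incidence(beta=0.375, gamma=0.25, n=1e6)
large = days_to_cumulative_incidence(beta=0.75, gamma=0.5, n=1e8)
print(f"baseline: {base:.1f} d, halved rates: {slow:.1f} d, "
      f"larger population: {large:.1f} d")
```

In this sketch, halving both β and γ leaves R0 unchanged but roughly doubles the time to 5% cumulative incidence, and enlarging the population also delays that benchmark.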
In the discussed framework of simple SIR and SEIR models, we concluded that the basic reproductive number alone provides little information regarding the duration, generation time, speed of epidemic, and overall timing of an infectious disease such as influenza. Rather, we propose that the values of individual parameters are more critical to understanding the disease dynamics and may be more valuable to policy officials in mounting an effective public health response.
Numerous modifications can be made to the basic SIR and SEIR models discussed thus far. These models assume that all individuals belong to 1 large panmictic (well-mixed) population in which all individuals are equally likely to come into contact with each other.21 Typically, this assumption is not reasonable for most human populations, which are often highly structured, with subgroups of individuals more likely to interact with one another than with those in other subgroups. Thus, epidemiological models often use age-structured populations (for example, see Inaba and Nishiura22). Such models require equations similar to those in equation 1 for each age group describing disease transmission within that age group and among other age groups. Another common method for incorporating population structure is to include variables such as household, workplace, school, and community groups in a model.15
Many models may also include a metapopulation structure (a collection of connected populations)3 to describe the dynamics of a disease in multiple cities, where the metapopulation dynamics explain the transmission of disease from one city to the next. Demographic factors are also frequently added to compartment models to make models more realistic. Typical demographic terms included are birth, death, immigration, and emigration (which obviously occur in nearly every population).
Beyond just mimicking more realistic populations, modelers can introduce additional complexities into these dynamic models. Notably, public health interventions are often included in models to judge the potential impact of a particular intervention or combination of interventions.23,24 Compartmental models can encompass a variety of interventions, including use of antivirals, vaccines, masks, hand washing, school closure, social distancing, isolation, and quarantine. Each of these interventions requires tailoring the sometimes numerous ODEs to incorporate new compartments and parameters. It is also noteworthy that many compartmental models include multiple virus strains.25 A given pathogen often has many different genetic variants (often referred to as strains) circulating in populations. Compartmental models can be modified to include various strains and consequences thereof such as strain facilitation or interference. Virtually any of the aforementioned modifications will change the predicted disease dynamics.
As illustrated previously for simple disease models, estimating R0 does not necessarily allow for useful inference. With increasing complexity, estimating all of the parameters of a model can become overwhelming. Often it is difficult to firmly establish the few parameters that are vital for a basic SEIR model. In practice, parameter values frequently originate from a handful of studies that may not be broadly applicable.26 In light of the difficulty of simply measuring the broad, population-level parameters of a disease, we find it unlikely that age, population, intervention, or strain-specific parameters could be estimated quickly enough to be of use for tailoring specific public health responses. Even in the case of the 2009 H1N1 pandemic, the rapid availability of R0 estimates1–8 was unlikely to have greatly influenced public health response planning, particularly given the variability of these estimates.
Finally, for complex ODE models, and in particular for stochastic simulation models that follow individuals over time, it is not always clear exactly how to calculate the basic reproduction number and how it should be interpreted. For example, a model that includes age structure, population structure, and vaccination status could easily have more than 100 parameters. What does R0 represent in such a model? This issue is an active area of research, with several methods having recently been proposed to address the question (see Heffernan et al.27 for a review of this topic). The salient point is that there are different methods, and that each method can potentially produce a different estimate of R0. As a consequence, using the basic reproduction number to predict an attack rate is dependent on the model employed (Figure 2).
After considering these issues, we believe that the estimation of the basic reproductive number, R0, for a particular disease epidemic has limited practical value outside the population from which the disease data originated. For example, epidemiologists have employed R0 in understanding the 1918 influenza pandemic, making myriad estimates by applying various models, resulting in a broad range of published values.28 This variability highlights the difficulties associated with measuring the basic reproductive number of an epidemic, even when working with a considerable body of epidemiological data from a pandemic that occurred well in the past. These difficulties are amplified in situations such as the 2009 H1N1 pandemic in which data are continually updated and are highly dependent upon the surveillance system implemented. Each surveillance system has unique strengths and weaknesses that must be accounted for, particularly in the context of the country in which the system is employed.
To be more specific, substantial heterogeneity has been observed in R0 values across different regions of the world. There is little evidence to suggest that a reproductive number for one geographic area is applicable to another, and many studies conducted within the same region have yielded a wide range of results, with an even wider range among early estimates. For example, among individual states in India,29 the reproductive number for 2009 H1N1 ranged from 1.03 to 1.75; likewise, estimates in Peru spanned from 1.2 to 2.2 depending on the specific region studied.8,9 Even close geographic neighbors had disparate R0 estimates; investigators in China10 estimated a mean R0 of 1.68, whereas those in Japan11 initially approximated a mean R0 of 2.3, which was later reduced to 1.21 to 1.35. Correspondingly, in Canada5 the mean estimate was 1.31, whereas public health officials in the United States6 initially estimated R0 between 2.2 and 2.3, which was subsequently refined to 1.7 to 1.8 with additional data collection. On the other hand, not all subsequent estimates of R0 were lower than the initial ones. Fraser et al.1 were among the first to estimate the R0 in Mexico, proposing a basic reproductive number of 1.4 to 1.6. Just several months later, another team7 estimated that the R0 was between 2.3 and 2.9.
Statistical realities also hinder the ability to infer overall attack rates with R0 estimates. For example, one widely cited study30 estimated that the 1918–1919 influenza pandemic had an R0 of approximately 2.0. However, after incorporating estimates of variance, the 95% confidence interval ranged from 1.4 to 2.8. Figure 3 shows that this range of R0 predicts attack rates between approximately 51% and 92%. Error around lower estimates of the basic reproductive number produces even more dramatic ranges in attack rates because of the asymptotic behavior of the attack rate as R0 approaches 1.0 (Figure 3). As previously discussed, estimates of R0 for 2009 H1N1 have ranged from as low as 1.03 to more than 2.9,1–11 which roughly corresponds to a range of attack rates between approximately 6% and 93%. Such a broad range of possible attack rates complicates policy decisions and hinders effective public health interventions.
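These ranges can be checked with a short numerical sketch. It assumes the attack rate z is the nonzero root of the standard final-size relation z = 1 − exp(−R0 z) for a well-mixed, initially fully susceptible population; the function name `attack_rate` and the bisection solver are illustrative choices, not the authors' method.

```python
import math


def attack_rate(r0, tol=1e-12):
    """Nonzero root of z = 1 - exp(-r0 * z), found by bisection.

    Returns 0.0 when r0 <= 1, where no epidemic is predicted.
    """
    if r0 <= 1.0:
        return 0.0
    lo, hi = 1e-9, 1.0
    # g(z) = 1 - exp(-r0*z) - z is positive below the root, negative above it.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if -math.expm1(-r0 * mid) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# R0 values quoted in the text: a 95% CI of 1.4 to 2.8, and the
# 1.03 to 2.9 span of 2009 H1N1 estimates.
for r0 in (1.03, 1.4, 2.0, 2.8, 2.9):
    print(f"R0 = {r0}: predicted attack rate {attack_rate(r0):.1%}")
```

Under this assumption the solver gives values close to the approximate percentages quoted above, and it shows how steeply the predicted attack rate climbs as R0 moves away from 1.0.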
Overreliance on early estimates of R0 made in one country can lead to policy decisions in another country that may be suboptimal for that country. Moreover, disparate estimations of R0 may drive inadequately informed policies. For example, the Mexican Ministry of Health implemented a mandatory 18-day school closure in Mexico City on April 24, 2009, which was extended to the remainder of the country on April 27, 2009, with schools reopening on May 11, 2009. Chowell et al. analyzed the effect of this brief but intensive public health intervention, calculating an R0 of 1.8 to 2.1, 1.6 to 1.9, and 1.2 to 1.3 for the spring, summer, and fall waves of the epidemic, respectively. Chowell et al. concluded that the Mexico City intervention may have been responsible for a 29% to 37% decrease in transmission during the closure,31 although cost-effectiveness was not a consideration in this study. Meanwhile, with the high initial R0 estimates from Mexico, the US Centers for Disease Control and Prevention (CDC) initially recommended school closures, particularly in border states, including Texas, which ultimately closed 800 schools, affecting 491 000 students.32 On April 28, 2009, the CDC advised that schools close if even 1 suspected or confirmed case of H1N1 was reported, in hopes of reducing transmission to neighboring communities. However, with the influx of new data indicating a lower risk of severe illness and death, the CDC rescinded its recommendation on May 5, 2009, urging schools to remain open.33 We assert that reliance on early approximations of R0, particularly those calculated in a disparate population, can lead to misinformed policy decisions.
Public health responders in the United Kingdom used approaches based on modeling in near real time to determine appropriate public health interventions during the 2001 foot and mouth disease outbreak.23 Policies enacted on the basis of modeling results were controversial and were the subject of debate in peer-reviewed literature. Uncertainty centered around how beneficial the recommended intervention was and whether less radical strategies—given that the intervention, mass culling, was viewed as economically damaging—might have been equally or even more effective.34–39 Although we can only speculate how alternative interventions would have compared with those instituted, this debate highlights the uncertainty over public acceptance of approaches based on reducing effective reproductive ratios.
Even if accurate attack rates could be gauged from R0 estimates, many of the more critical public health questions would remain unanswered. For example, case fatality, hospitalization, and absenteeism rates are essentially independent of disease dynamics (and thus of what can be derived from R0 estimates), yet they are the key determinants of morbidity and mortality during an infectious disease outbreak. The 2009 H1N1 pandemic aptly illustrates this point: high attack rates did not produce correspondingly high levels of morbidity because of the relatively mild severity of infection and the low attack rates among the populations typically at greatest risk for serious influenza complications, particularly adults older than 65 years.
Another important consideration is the potential evolution of a pathogen during the course of an epidemic. In pandemics in which attack rates are high but adverse outcomes are rare, mitigation strategies must consider the possibility of the virus mutating to a more virulent form. If the 2009 H1N1 virus had genetically changed over time to become more virulent, the impact of the virus would have increased dramatically; yet such evolution is largely ignored in the types of dynamic models most used by modelers.40 For pathogens such as influenza that experience rapid evolution, with unknown portions of the population not susceptible to various circulating strains, estimating a viable R0 is a daunting—if not impossible—task.
Estimation of reproductive ratios can potentially provide valuable insights during epidemics. These reproductive ratios, particularly the effective reproductive number, measure the spread of a disease through a population, with higher values indicative of more rapid circulation. We emphasize that estimation of reproductive ratios from data in a particular population is still useful for that population. This parameter is, in its essence, the exponential growth rate of an ongoing epidemic and thus provides information about the current rate of transmission in the study population. The key issues for the interpretation of R values are the period for which an estimate is valid (e.g., does the estimate need to be updated weekly, based on patterns in surveillance data?), the applications thereof for theoretical practices (e.g., determining the propagation of the disease or the potential intervention impact), and how well an estimate for one population applies to another. For example, can an estimate made in California apply to Nevada? Is an R0 for the United Kingdom relevant to New Zealand? Another critical consideration is the applicability of the estimate to a particular population (e.g., were the data collected representative of an entire metropolitan area or just an unobserved subcommunity?). Other issues, such as acquiring the data necessary to establish consistent estimates with narrow confidence intervals, also need to be resolved before growth rates made with specific R values can be reliably used in public health decision-making.
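The link between an observed growth rate and a reproductive number can be made concrete with a small sketch. It assumes the simple SIR model, in which early case counts grow at rate r = β − γ, so that R0 = 1 + r/γ; other models give different conversions. The function names, the 5-day infectious period, and the synthetic case counts are all hypothetical.

```python
import math


def growth_rate(counts):
    """Least-squares slope of log(count) against day index (per day)."""
    n = len(counts)
    xs = list(range(n))
    ys = [math.log(c) for c in counts]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def r0_from_growth_rate(r, infectious_period_days):
    """SIR early-phase relation: r = beta - gamma, hence R0 = 1 + r / gamma."""
    return 1.0 + r * infectious_period_days


# Synthetic daily counts growing at exactly 10% per day (not real data).
counts = [10.0 * math.exp(0.1 * day) for day in range(14)]
r = growth_rate(counts)
print(f"growth rate {r:.3f}/day, "
      f"implied R0 {r0_from_growth_rate(r, infectious_period_days=5.0):.2f}")
```

The conversion requires a value for the infectious period, so the same observed growth rate implies different reproductive numbers under different assumed recovery rates or model structures.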
If mechanisms existed to estimate real-time or near-real-time values of Re > 1, public health officials could determine if specific intervention strategies—such as school closure or quarantine—were working to alter disease dynamics and whether such interventions should be sustained. Comparison of reproductive numbers before and after public health interventions has been conducted retrospectively for the 1918 influenza pandemic,41 but the effectiveness of an intervention measured from past epidemics is unlikely to apply to a contemporary epidemic (because of changes in social structure, environment, pathogens, etc.). We must recognize that, although the ability to estimate reproductive numbers in real time could be advantageous, the effectiveness of a specific intervention may vary temporally and geographically owing to changes in environment, population structure, viral evolution, and immunity; hence, estimates made in one region may not be applicable to another.
Additional factors essential to the dynamics of infectious diseases, including the transmission rate, are less frequently estimated than R0 (Table 1). Latency period and the period of infectiousness were, however, estimated in some regions for the 2009 H1N1 pandemic.5,42 Tuite et al., in particular, provided a noteworthy example of estimating the individual parameters during the H1N1 pandemic, rather than solely focusing on R0.5 Ideally, those parameters that are the most strain-specific and are direct targets of public health interventions should be the focus of studies conducted during outbreaks with major health and economic implications. Even for these parameters there always exists an interaction between the host and the virus that must be addressed with appropriate statistics.
Parameter | Definition Used Here | Relevance | Public Health Interest | Intrinsic or Specific to |
Transmission rate | The product of the contact rate and the risk of infection; also known as the effective contact rate | Determines peak incidence, time to peak incidence, and duration of an epidemic; low transmission rates reduce overall attack rate | As this is a product of the contact rate and risk of infection, interventions targeted at these 2 parameters affect the transmission rate. | Population × strain × host |
Recovery rate | The reciprocal of the duration of being infectious (i.e., 1/D), thus making this a direct consequence of the “infectious period” | Determines peak incidence, time to peak incidence, and duration of an epidemic; higher recovery rates reduce overall attack rate | Use of antivirals theoretically reduces the infectious period, which in turn lessens disease impact. | Strain × host |
Latent period | Time from being infected until being infectious | Determines peak incidence, time to peak incidence, and duration of an epidemic; long latent periods extend the epidemic duration | For interventions such as quarantine, latent periods play an important role in the duration of quarantine necessary. | Strain × host |
Incubation period | Time from being infected until onset of symptoms | Not relevant to disease dynamics, but is relevant to public health | Incubation period is important for disease surveillance and estimation of when a disease was first introduced to a population. | Strain × host |
Contact rate | The probability of 2 sympatric individuals contacting each other; also known as the total contact rate | In combination with the risk of infection, determines how effectively a disease will be transmitted in a population | Interventions such as school closure and quarantine reduce contact rates. | Population |
Risk of infection | The probability of an infection being transferred to a naive individual; also known as the infectivity or the secondary attack rate | In combination with the contact rate, determines how effectively a disease will be transmitted in a population | Interventions such as hand washing and using face masks reduce infectivity. | Strain × host |
Virulence | The pathogenicity of a disease | Critical to determining the severity of a disease epidemic (e.g., loss of life) | Reducing virulence via use of antivirals lessens disease impact. | Strain × host |
Basic reproductive number | Emergent property of disease models; may be a consequence of all previously listed parameters (and others not listed) depending on the specific model | Broad comparisons of models being used by different modeling groups; if larger than unity, indicates a disease is spreading | Real-time estimation provides the current impetus of the epidemic. In limited circumstances (depending on accuracy), might be useful in predicting overall attack rates. | Model × population × strain × host |
The basic reproductive ratio is a complicated property of an epidemic specific to the underlying model used to estimate it, the population being studied (in terms of contact patterns and demography), the host, the pathogen, and often the specific strain of the pathogen. Thus, although R0 is an intuitive property of an epidemic, it is not especially useful in determining potential utility of control measures. However, when considered as part of a collection of estimated epidemic characteristics, R0 may be useful in making public health decisions.
Although infectious disease modelers appreciate the issues discussed in this article, those who apply the results of mathematical models during public health responses may not have a thorough understanding of such issues. If estimates of R0 are to be used in determining public health responses, the limitations of such estimates need to be clearly communicated to policymakers. Beyond those dedicated to calculating R0, resources must also be devoted to estimating other epidemic parameters such as transmission rates, infectious periods, or latent periods that have more relevance to the public health response to infectious disease outbreaks, including influenza pandemics. The analysis of Tuite et al.5 serves as a paradigm of effective analysis of 2009 H1N1 data by highlighting a variety of transmission parameters in addition to R0. Further population-based evidence on how the basic reproductive number relates to disease dynamics holds great promise for optimizing public health interventions for the study population.
Acknowledgments
B. Ridenhour is funded through the Eck Institute for Global Health at the University of Notre Dame and through a contract from the Centers for Disease Control and Prevention via the Intergovernmental Personnel Act. J. M. Kowalik is funded by the University of Notre Dame. D. K. Shay is funded by the Centers for Disease Control and Prevention.
We would like to thank our colleagues at Notre Dame and the Centers for Disease Control and Prevention, and the anonymous reviewers, all of whom improved this study with their useful comments and suggestions.
Human Participant Protection
Human participant protection was not required because this research did not involve human participants.