The Impact of Social Behavior and Testing Efficacy on the Covid-19 Infection Rate in the US – an Application of Bayes Theorem

*By Nathaniel Mathew*

With 235,000 deaths and 9.6 million cases of infection by early fall of 2020, the US stood disempowered before the rising rates of Covid-19 infection across the nation. Several factors contributed to the rise, including poor preparedness to combat the infection, slow defense strategies, insufficient public education, strained medical resources, and the stormy politics of the nation’s leaders. While the immediate causes may seem many, the underlying pattern that connects them all is social behavior driven by public perception.

Current Infection Trends

Figure 1: Infection Rates in the US till early fall

As evident from Fig 1, the US has seen two distinct waves of coronavirus infection, with a third still expanding as of early fall. The first cases were recorded in early March and the number of cases peaked in early April 2020. A month later, the infection rate dipped, only to rise again by mid-July. The numbers fell to a low of 25,166 by September 7, 2020, and then climbed to their highest level yet by autumn.

As of early fall, the US has the highest number of confirmed infections in the world: over one-fourth of the global count, despite accounting for only 4% of the world’s population.

The Role of Bayes Theorem

In situations where the factors that collectively contribute to an outcome are varied in nature, a simple but powerful statistical equation can shed some light on their impact on infection rates. Rather than dealing in mathematical absolutes, this equation gives the probability of an event A happening given the occurrence of an event B. This fundamental concept was first introduced by Reverend Thomas Bayes in his paper “An Essay towards solving a Problem in the Doctrine of Chances.” The paper was published posthumously, and the result came to be called ‘Bayes Theorem’.

The theorem is stated below:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

This theorem shows that the posterior probability P(A | B) (the left-hand side of the equation) is directly proportional to the likelihood P(B | A) multiplied by the prior probability P(A).
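As a minimal sketch of this proportionality, the Python snippet below multiplies a likelihood by a prior and normalizes over the two competing hypotheses (A and not A). The function name `bayes_posterior` and the example numbers are illustrative assumptions and do not come from this article.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A | B) from the prior P(A) and the two likelihoods.

    The denominator P(B) is expanded with the law of total probability:
    P(B) = P(B | A) P(A) + P(B | not A) P(not A).
    """
    numerator = likelihood_b_given_a * prior_a
    evidence = numerator + likelihood_b_given_not_a * (1.0 - prior_a)
    return numerator / evidence


# Illustrative numbers only: a 10% prior, with B four times as likely under A.
example = bayes_posterior(prior_a=0.10,
                          likelihood_b_given_a=0.80,
                          likelihood_b_given_not_a=0.20)
print(f"P(A | B) = {example:.4f}")
```

Expanding the denominator this way is what allows the theorem to be applied when only the test characteristics and the infection rate are known.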

Detection of Probability of an Individual Testing Positive using Bayes Theorem

At the onset of the outbreak, there were no effective ways to detect infection among the public. However, by mid-March, all fifty states were able to begin testing using kits obtained from the CDC or commercial labs. Because the tests were produced rapidly and without standardization [1], they were subject to inaccuracies and defects, and therefore could not be considered 100% conclusive.

Bayes theorem can be used to determine the probability of an individual not having Covid-19 after testing positive for the virus. This paper studies how variations in each influencing factor affect the probability that an individual in the US is Covid-19 positive given a positive test result. Five independent scenarios are considered. In the Current Scenario, early fall infection rates and test efficacy numbers are used, making this scenario the control against which all other scenarios are compared. In Scenario 1, the average number of infections per day is reduced from 150,000, as in the Current Scenario, to 50,000, reflecting better containment and social distancing measures. In Scenario 2, the rate of false positives is brought down from 0.5% (1 in 200) to 0.1% (1 in 1000), reflecting greater standardization and improved accuracy of tests. In Scenario 3, the rate of false negatives is brought down from 15% to 5%, again reflecting greater standardization and improved accuracy of tests. In Scenario 4, the optimal conditions from Scenarios 1, 2, and 3 are combined: reduced incidence of Covid-19 infection in the population and increased test efficacy, with fewer false positives and false negatives.

Current Scenario: Based on Early Fall Data

The US population stands at approximately 331 million, and an average of 150,000 Covid-19 positive cases are reported daily as of early fall. Assuming that each individual remains infectious for around 14 days, about 2.1 million people (150,000 × 14) can be assumed to be infected at any given time. This means that the percentage of the population infected can be estimated at 0.63%, while the population not infected is at 99.37%. Assuming that the tests detect 85% of infections and that only 0.5% of uninfected individuals test false positive, as outlined in Table 1, the probability of Covid-19 infection in an individual who tests positive can be determined using Bayes theorem.

Table 1: Based on early fall data

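The assumptions can be represented in the following manner, where C denotes being infected with Covid-19 and + denotes a positive test result:

$$P(C) = \frac{2.1}{331} \approx 0.0063, \quad P(\neg C) = \frac{328.9}{331} \approx 0.9937, \quad P(+ \mid C) = 0.85, \quad P(+ \mid \neg C) = 0.005$$

On applying these assumptions to Bayes theorem, the probability of an individual having Covid-19 given their last test is positive can be represented as:

$$P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)}$$

Upon substituting the estimated values into the above equation (with the unrounded population fractions), you get:

$$P(C \mid +) = \frac{0.85 \times \frac{2.1}{331}}{0.85 \times \frac{2.1}{331} + 0.005 \times \frac{328.9}{331}} \approx 0.5205$$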

Based on this application of Bayes theorem, even if a person tests positive, he or she has only a 52.05% chance of being infected by Covid-19.
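The arithmetic behind this figure can be checked with a short Python sketch. The constants are the values quoted in the text; the variable names are illustrative, and the prevalence is kept unrounded (about 0.6344%) rather than rounded to 0.63%.

```python
US_POPULATION = 331_000_000   # from the text
DAILY_CASES = 150_000         # early fall average, from the text
INFECTIOUS_DAYS = 14          # infectious period assumed in the text
SENSITIVITY = 0.85            # P(positive | infected)
FALSE_POSITIVE_RATE = 0.005   # P(positive | not infected)

# Share of the population assumed to be infected at any given time.
prevalence = DAILY_CASES * INFECTIOUS_DAYS / US_POPULATION

# Bayes theorem with the denominator expanded over infected / not infected.
true_pos = SENSITIVITY * prevalence
false_pos = FALSE_POSITIVE_RATE * (1 - prevalence)
p_infected_given_positive = true_pos / (true_pos + false_pos)

print(f"prevalence: {prevalence:.4%}")
print(f"P(infected | positive): {p_infected_given_positive:.2%}")
```

Using the prevalence rounded to 0.63% instead gives a slightly lower result, so the unrounded value is needed to reproduce the 52.05% figure.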

Scenario 1: Based on Hypothetical Reduction in Daily Infection Rate

With the US population at 331 million, if the rate of daily Covid-19 infections can be brought down from 150,000 per day, as in the Current Scenario, to 50,000 per day through better social distancing and hygiene practices, then about 0.7 million people (50,000 × 14) can be assumed to be infected at any given time. This means that the percentage of the population infected can be estimated at 0.21%, while the population not infected is at 99.79%. Assuming that the tests detect 85% of infections and that only 0.5% of uninfected individuals test false positive, as outlined in Table 2, the probability of Covid-19 infection in an individual who tests positive can be determined using Bayes theorem.

Table 2: Based on hypothetical reduction in daily infection rate

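The assumptions can be represented in the following manner, where C denotes being infected with Covid-19 and + denotes a positive test result:

$$P(C) = \frac{0.7}{331} \approx 0.0021, \quad P(\neg C) = \frac{330.3}{331} \approx 0.9979, \quad P(+ \mid C) = 0.85, \quad P(+ \mid \neg C) = 0.005$$

Bayes theorem for determining the probability of infection in an individual who tests positive for Covid-19 can be represented in the following manner:

$$P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)}$$

Upon substituting the estimated values into the above equation, you get:

$$P(C \mid +) = \frac{0.85 \times \frac{0.7}{331}}{0.85 \times \frac{0.7}{331} + 0.005 \times \frac{330.3}{331}} \approx 0.2649$$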

Based on this application of Bayes theorem, even if a person tests positive, he or she has only a 26.49% chance of being infected by Covid-19. This scenario shows that if the infected population is reduced without any change in test efficacy, the probability of being infected after testing positive is roughly halved compared to the Current Scenario.
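To see how strongly the result depends on the infection rate alone, the following sketch sweeps the daily case count while holding the test characteristics fixed. The helper name `ppv` (positive predictive value) and the intermediate value of 100,000 cases per day are assumptions added for illustration; the other numbers come from the text.

```python
POPULATION = 331_000_000
INFECTIOUS_DAYS = 14


def ppv(prevalence, sensitivity=0.85, false_positive_rate=0.005):
    """Probability of infection given a positive test."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# 150,000 and 50,000 are from the text; 100,000 is an illustrative midpoint.
for daily_cases in (150_000, 100_000, 50_000):
    prevalence = daily_cases * INFECTIOUS_DAYS / POPULATION
    print(f"{daily_cases:>7,} cases/day -> prevalence {prevalence:.2%}, "
          f"P(infected | positive) {ppv(prevalence):.2%}")
```

The drop occurs because the number of false positives stays almost constant while the number of true positives shrinks with the infected population.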

Scenario 2: Based on Early Fall Data Considering Reduced Incidence of False Positives

The US population stands at approximately 331 million, and an average of 150,000 Covid-19 positive cases are reported daily. Assuming that each individual remains infectious for around 14 days, about 2.1 million people can be assumed to be infected at any given time. This means that the percentage of the population infected can be estimated at 0.63%, while the population not infected is at 99.37%. Assuming that the rate of false positives has fallen from 0.5% (1 in 200) in the Current Scenario to only 0.1% (1 in 1000), with the tests still detecting 85% of infections, as outlined in Table 3, the probability of Covid-19 infection in an individual who tests positive changes from the value obtained in the Current Scenario.

Table 3: Based on early fall data considering reduced false positive occurrence

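The assumptions can be represented in the following manner, where C denotes being infected with Covid-19 and + denotes a positive test result:

$$P(C) = \frac{2.1}{331} \approx 0.0063, \quad P(\neg C) = \frac{328.9}{331} \approx 0.9937, \quad P(+ \mid C) = 0.85, \quad P(+ \mid \neg C) = 0.001$$

On applying these assumptions to Bayes theorem, you get the following equation:

$$P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)}$$

Upon substituting the estimated values into the above equation, you get:

$$P(C \mid +) = \frac{0.85 \times \frac{2.1}{331}}{0.85 \times \frac{2.1}{331} + 0.001 \times \frac{328.9}{331}} \approx 0.8444, \qquad P(\neg C \mid +) \approx 0.1556$$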

Based on this application of Bayes theorem, if a person tests positive, he or she has an 84.44% chance of being infected and a 15.56% chance of not being infected by Covid-19. Thus, if the rate of false positives is brought down to 1 in 1000, the reliability of a positive result goes up by around 1.6 times compared with the Current Scenario.
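The effect of the false positive rate can be explored with a similar sketch, this time holding the prevalence and the 85% detection rate fixed. The intermediate rate of 0.2% and the helper name `ppv` are assumptions added for illustration.

```python
PREVALENCE = 2_100_000 / 331_000_000   # 150,000 cases/day x 14 days, from the text
SENSITIVITY = 0.85                     # P(positive | infected), from the text


def ppv(false_positive_rate):
    """Probability of infection given a positive test."""
    true_pos = SENSITIVITY * PREVALENCE
    false_pos = false_positive_rate * (1 - PREVALENCE)
    return true_pos / (true_pos + false_pos)


# 0.5% and 0.1% are from the text; 0.2% is an illustrative midpoint.
for fpr in (0.005, 0.002, 0.001):
    infected = ppv(fpr)
    print(f"false positive rate {fpr:.1%}: "
          f"P(infected | positive) = {infected:.2%}, "
          f"P(not infected | positive) = {1 - infected:.2%}")
```

Because uninfected people make up more than 99% of the population, even a small change in the false positive rate changes the number of misleading positives considerably.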

Scenario 3: Based on Early Fall Data Considering 95% Accuracy in Test Results

The US population stands at approximately 331 million, and an average of 150,000 Covid-19 positive cases are reported daily. Assuming that each individual remains infectious for around 14 days, about 2.1 million people can be assumed to be infected at any given time. This means that the percentage of the population infected can be estimated at 0.63%, while the population not infected is at 99.37%. Assuming that the share of infections the tests correctly detect has increased from 85% to 95%, with the false positive rate unchanged at 0.5%, as outlined in Table 4, the probability of Covid-19 infection can be calculated as given below.

Table 4: Based on early fall data considering 95% accuracy in test results

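The assumptions can be represented in the following manner, where C denotes being infected with Covid-19 and + denotes a positive test result:

$$P(C) = \frac{2.1}{331} \approx 0.0063, \quad P(\neg C) = \frac{328.9}{331} \approx 0.9937, \quad P(+ \mid C) = 0.95, \quad P(+ \mid \neg C) = 0.005$$

On applying these assumptions to Bayes theorem, you get the following equation:

$$P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)}$$

Upon substituting the estimated values into the above equation, you get:

$$P(C \mid +) = \frac{0.95 \times \frac{2.1}{331}}{0.95 \times \frac{2.1}{331} + 0.005 \times \frac{328.9}{331}} \approx 0.5482, \qquad P(\neg C \mid +) \approx 0.4518$$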

Based on this application of Bayes theorem, if a person tests positive, he or she has a 54.82% chance of being infected and a 45.18% chance of not being infected by Covid-19. This shows that improving the detection rate of the tests has only a small effect on the probability of Covid-19 infection in an individual who tests positive: a slight improvement of 2.77 percentage points over the Current Scenario value of 52.05%.
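A sweep over the detection rate illustrates why the gain is so small. The sketch below holds the prevalence and the 0.5% false positive rate fixed; the values of 90% and 100% and the helper name `ppv` are assumptions added for illustration.

```python
PREVALENCE = 2_100_000 / 331_000_000   # 150,000 cases/day x 14 days, from the text
FALSE_POSITIVE_RATE = 0.005            # P(positive | not infected), from the text


def ppv(sensitivity):
    """Probability of infection given a positive test."""
    true_pos = sensitivity * PREVALENCE
    false_pos = FALSE_POSITIVE_RATE * (1 - PREVALENCE)
    return true_pos / (true_pos + false_pos)


# 85% and 95% are from the text; 90% and 100% are illustrative.
for sensitivity in (0.85, 0.90, 0.95, 1.00):
    print(f"detection rate {sensitivity:.0%}: "
          f"P(infected | positive) = {ppv(sensitivity):.2%}")
```

Under these assumptions, even a test that detects every infection leaves the result below 57%, because the false positives among the uninfected majority are untouched by better detection.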

Scenario 4: Based on Optimal Conditions

With the US population at 331 million, if the rate of daily Covid-19 infections can be brought down from 150,000, as in the Current Scenario, to 50,000 per day through better social distancing and hygiene practices, then about 0.7 million people can be assumed to be infected at any given time. This means that the percentage of the population infected can be estimated at 0.21%, while the population not infected is at 99.79%. Assuming that the share of infections the tests correctly detect has increased from 85% in the Current Scenario to 95%, and that only 1 in 1000 uninfected individuals, as opposed to 1 in 200 in the Current Scenario, tests false positive, as outlined in Table 5, the probability of Covid-19 infection can be calculated as given below.

Table 5: Based on Optimal Conditions

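The assumptions can be represented in the following manner, where C denotes being infected with Covid-19 and + denotes a positive test result:

$$P(C) = \frac{0.7}{331} \approx 0.0021, \quad P(\neg C) = \frac{330.3}{331} \approx 0.9979, \quad P(+ \mid C) = 0.95, \quad P(+ \mid \neg C) = 0.001$$

On applying these assumptions to Bayes theorem, you get the following equation:

$$P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)}$$

Upon substituting the estimated values into the above equation, you get:

$$P(C \mid +) = \frac{0.95 \times \frac{0.7}{331}}{0.95 \times \frac{0.7}{331} + 0.001 \times \frac{330.3}{331}} \approx 0.6681, \qquad P(\neg C \mid +) \approx 0.3319$$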

Based on this application of Bayes theorem, if a person tests positive, he or she has a 66.81% chance of being infected and a 33.19% chance of not being infected by Covid-19. This scenario shows that when the optimal conditions of a reduced infection rate and increased test efficacy, through fewer false positives and false negatives, are considered together, the accuracy of Covid-19 detection is 1.28 times better than in the Current Scenario.
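For a side-by-side view, the five scenarios can be recomputed in one pass. This is a sketch using the inputs stated in the text for each scenario; the names `SCENARIOS` and `ppv` are illustrative.

```python
POPULATION = 331_000_000
INFECTIOUS_DAYS = 14

# (label, daily cases, detection rate, false positive rate), all from the text.
SCENARIOS = [
    ("Current", 150_000, 0.85, 0.005),
    ("Scenario 1", 50_000, 0.85, 0.005),
    ("Scenario 2", 150_000, 0.85, 0.001),
    ("Scenario 3", 150_000, 0.95, 0.005),
    ("Scenario 4", 50_000, 0.95, 0.001),
]


def ppv(daily_cases, sensitivity, false_positive_rate):
    """Probability of infection given a positive test."""
    prevalence = daily_cases * INFECTIOUS_DAYS / POPULATION
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


results = {label: ppv(*params) for label, *params in SCENARIOS}
baseline = results["Current"]

for label, value in results.items():
    print(f"{label:<11} P(infected | positive) = {value:6.2%}   "
          f"x{value / baseline:.2f} vs Current")
```

Seen together, the results show that lowering the false positive rate (Scenario 2) improves the reliability of a positive result the most, while lowering the infection rate alone (Scenario 1) makes a positive result less reliable.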

What Makes Covid-19 so Deadly?

Covid-19 is caused by SARS-CoV-2, a strain of coronavirus that had never previously been identified in humans [2]. It is deadly for three reasons. First, SARS-CoV-2 is capable of attacking several points in a cell to gain entry. Once it enters, it converts the cell into a virus manufacturing machine to produce more viruses. Second, it has a powerful mutation correction system that prevents weakness in the virus. This is also one reason why existing antiviral drugs are not effective against it. Third, the virus easily infects the nose, throat and lungs of a victim, making it possible for the virus to spread through saliva droplets even before the victim starts showing symptoms [3].

On January 23, 2020, the WHO presented an estimated R0 of 1.8 to 3.5 for the novel coronavirus [4]. In simple terms, R0 refers to the average number of people that a single infected person is expected to infect. This depends on the biology of the virus, social interactions between people, socioeconomic class, air conditioning, weather, and many more factors. If R0 is less than 1, the infection will gradually disappear on its own. However, the further its value rises above 1, the faster infections grow exponentially [5].
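The threshold at R0 = 1 can be illustrated with a deliberately simplified sketch in which every infected person infects exactly R0 others in each generation. A constant R0, an unlimited supply of susceptible people, the function name, and the value 0.9 are all simplifying assumptions; 1.8 and 3.5 are the bounds quoted above.

```python
def cases_per_generation(r0, generations, initial_cases=1.0):
    """Expected new cases in each generation under a constant reproduction number."""
    cases = [initial_cases]
    for _ in range(generations):
        # Each case in one generation produces r0 cases in the next.
        cases.append(cases[-1] * r0)
    return cases


# 0.9 is a hypothetical value below 1; 1.8 and 3.5 are the quoted estimates.
for r0 in (0.9, 1.8, 3.5):
    final = cases_per_generation(r0, generations=10)[-1]
    print(f"R0 = {r0}: about {final:,.1f} new cases in generation 10 "
          f"per initial case")
```

Real outbreaks deviate from this pattern as behavior changes and immunity builds, which is part of why R0 alone is a limited predictor.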

Factors that Accelerated the Infection Rate

Following the rise in Covid-19 positive cases in the US in March, the lockdown helped contain much of the spread and brought the numbers down from 35,000 to 18,744. However, as states started relaxing lockdown measures toward the end of April and in early May, countrywide cases began to rise again.

One specific factor that has contributed to the spread is the reopening of schools. A recent study by the CDC shows that almost 100,000 cases were reported between August and September, around the time that college students started returning to campus. In fact, weekly cases among 18-22-year-olds increased by as much as 55% [6].

According to early fall data, Texas, Florida, Arizona and California are the country’s latest epicenters for virus infections. The sudden lifting of lockdown measures, coupled with the simultaneous opening of bars and restaurants, gyms, beaches, and amusement parks, drew young people in large numbers. This trend is in line with the sudden increase in infections among the younger population [6].

Of the 50 US states, only 11 made the use of masks mandatory. In the other states, the public was not educated adequately about the need to wear masks, use sanitizers, or maintain social distancing. Another factor that contributed to the sudden rise was the outbreak of the virus in institutions such as nursing homes and prisons. San Quentin Prison in the San Francisco Bay Area saw a massive outbreak with more than 1,000 infected [6].

Big cities across the nation also witnessed mass protests against the injustice of George Floyd’s death. The data on infection spread due to these protests is, however, insufficient [6].

An upcoming cause for concern is the onset of the winter season, when people tend to spend more time indoors. Spending time in poorly ventilated spaces can result in a greater spread of the virus, especially since the virus survives better in cold conditions. There is further fear that this will coincide with the US influenza season, causing more damage [7].

Though some attribute the rising case counts to increased testing, the counterargument is that if infections were actually going down, an increasing proportion of tests would have come back negative. According to the WHO, a state should have a test positivity rate of 5% or below before it relaxes lockdown measures [8].

More than anything else, the gross downplaying of the severity of the situation by politicians and key government officials, which created a false sense of security and complacent behavior, is one of the major reasons behind the uncontrolled rise of Covid-19 cases in the US [9].

The Effect of Public Perception on Social Behavior

As the rate of infection continues to rise unchecked and the prospect of greater spread looms over the upcoming winter, the question of who or what was responsible for the uncontrolled outbreak surfaces with pressing importance. To what extent did the change in public perception result in the current dynamic increase in the number of cases as opposed to the first wave? Is it possible that after the first wave of fear subsided and more information about methods of spread and age-associated fatality became available, people lowered their guard and disregarded social distancing protocols? Or maybe they mimicked top politicians in their disregard for using masks as tools of protection and containment? Could it be that the heat drove people in masses to engage in outdoor activities once the lockdown was withdrawn? Or perhaps it was isolation fatigue that drove people to gather in crowds and socialize? Maybe all of these reasons contributed to some degree. What can be agreed upon is that social behavior influenced by changing public perception played a major role in the rapid escalation of infections.

Proposed Bayes Theorem Application in Predicting Covid-19 Infection Across a Larger Population

Bayes theorem can be further applied to determine the rate of infection in specific communities by considering the various factors that increase chances of infection in the population.

The rate of Covid-19 infection is not only dependent on viral susceptibility but is also affected by social interactions, culture, socioeconomic class, weather, use of protection such as masks and sanitizers, and similar factors. These can all be treated as factors that determine the likelihood of infection.

If data can be collected within a control group of a certain locality, culture, economic class, political affinity, and age group of people based on their level of outdoor activities and adherence to social distancing protocol, the likelihood of infection among the control group can be determined. This information when combined with existing rates of infection can be used to make projections about the updated rate of infection within a larger population that is a scaled-up version of the control group. Using the new and updated infection rate, steps can be taken to prepare for and contain the upcoming spread within the target community.
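One way such an update could be sketched, as an assumption rather than a method specified in this article, is a Beta-Binomial model: the existing infection rate acts as the prior, and a survey of the control group supplies the new evidence. The survey counts and the prior strength below are hypothetical numbers invented for illustration, and the sketch ignores test error.

```python
# Prior belief about the community infection rate, centred on the national
# estimate used in the text (2.1 million infected out of 331 million).
PRIOR_MEAN = 2_100_000 / 331_000_000
PRIOR_STRENGTH = 1_000   # hypothetical: the prior counts like 1,000 observations

# Hypothetical survey of a local control group (invented numbers).
SURVEYED = 500
INFECTED = 9

# Express the prior as Beta(alpha, beta) pseudo-counts.
alpha_prior = PRIOR_MEAN * PRIOR_STRENGTH
beta_prior = (1 - PRIOR_MEAN) * PRIOR_STRENGTH

# Conjugate update: add observed infected and uninfected counts.
alpha_post = alpha_prior + INFECTED
beta_post = beta_prior + (SURVEYED - INFECTED)
posterior_mean = alpha_post / (alpha_post + beta_post)

print(f"prior infection rate:   {PRIOR_MEAN:.2%}")
print(f"survey infection rate:  {INFECTED / SURVEYED:.2%}")
print(f"updated infection rate: {posterior_mean:.2%}")
```

The updated rate falls between the prior and the survey rate, weighted by how much data each represents, which is the sense in which new local information refines an existing estimate.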

Conclusion

Bayesian statistics is a melody of hierarchical probability changes in which each variable is influenced by parameters that fluctuate over time. Thus, each variation can influence the end result. What makes Bayesian statistics well suited to this pandemic is its capacity to accommodate the randomness and multitude of factors influencing the outcome, and to provide a 95% probability window to work with until new data surfaces.

Bayes theorem has varied applications. It has been used to generate models for epidemiology, conduct search-and-rescue operations, locate sunken ships and missing aircraft, identify the author of a literary work, and even to win wars. A form of Bayes theorem was used to crack the Enigma code used by the Germans during World War II, helping to shorten the war and save many lives. Its potential role in bringing the current pandemic under control is monumental.

References

1. Shmerling, Robert H. Which test is best for COVID-19? Harvard Health Publishing. [Online] August 10, 2020. https://www.health.harvard.edu/blog/which-test-is-best-for-covid-19-2020081020734

2. About Covid-19. World Health Organization. [Online] 2020. http://www.emro.who.int/health-topics/corona-virus/about-covid-19.html

3. Castelli, Paolo Rossi. The reason why the virus that causes Covid-19 is so infectious. IBSA Foundation. [Online] May 28, 2020. https://www.ibsafoundation.org/the-reason-why-the-virus-that-causes-covid-19-is-so-infectious/

4. Viceconte, Giulio & Petrosillo, Nicola. COVID-19 R0: Magic number or conundrum? National Institutes of Health. [Online] Feb 24, 2020. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7073717/

5. Zimmer, Katarina. Why R0 Is Problematic for Predicting COVID-19 Spread. The Scientist. [Online] July 13, 2020. https://www.the-scientist.com/features/why-r0-is-problematic-for-predicting-covid-19-spread-67690

6. Coronavirus: Why are infections rising again in US? BBC News. [Online] October 8, 2020. https://www.bbc.com/news/election-us-2020-54423928

7. Freedman, David H. Winter will make the pandemic worse. Here’s what you need to know. MIT Technology Review. [Online] October 8, 2020. https://www.technologyreview.com/2020/10/08/1009650/winter-will-make-the-pandemic-worse/

8. Coronavirus: What’s behind alarming new US outbreaks? BBC News. [Online] June 30, 2020. https://www.bbc.com/news/world-us-canada-53228134

9. Summers, Juana. Timeline: How Trump Has Downplayed The Coronavirus Pandemic. NPR News. [Online] October 2, 2020. https://www.npr.org/sections/latest-updates-trump-covid-19-results/2020/10/02/919432383/how-trump-has-downplayed-the-coronavirus-pandemic
