6/6/2020: My model is underpredicting deaths by quite a bit. My conclusion: infection rates are underreported by a factor of 4x - 8x.

3/28/2020: Added a note about methodology and some analysis of how well my model is working against real data.

In the San Francisco Bay Area, we have been ordered to "shelter in place": we are staying at home and limiting contact with other people. This drastic measure is an effort to "flatten the curve" of the spread of the virus. The idea is that if we can slow down the transmission rate, we won't overwhelm our medical system with patients needing critical care all at once. Instead of peaking early and high, the curve will flatten out.

Flatten the Curve

Image: Drew Harris

Worst Case Analysis

I wondered what the "Without Protective Measures" curve might mean in terms of real numbers, so I started researching. I combined three inputs: studies giving ranges for the percentage of people in certain age groups that would die or need hospitalization because of COVID19, population stats for the US, and best and worst case estimates of the number of people that would get the disease. From these I came up with the following ranges.

  • Range of number of people needing to be hospitalized: 4 - 16 million
  • Range of number of fatalities: 1 - 5 million

Here is my raw data:


My methodology is simple: given the percentage of the US population in each age cohort and ranges for the death and hospitalization rates due to COVID19 in those same cohorts, we can predict the ranges of both deaths and hospitalizations in each age cohort. Sum them up and you get the worst case, in which everybody gets the disease. I then factored in some predictions of the minimum and maximum percentage of the total population that would get COVID19, assuming this range applies equally to all age cohorts.
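The calculation described above can be sketched in a few lines of Python. This is only a minimal sketch: the `Cohort` class, the `project_ranges` function, and every number in the example call are made-up placeholders to show the shape of the calculation, not my raw data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Range = Tuple[float, float]  # (low, high)


@dataclass
class Cohort:
    name: str
    population_share: float  # fraction of the total population in this cohort
    death_rate: Range        # (low, high) fraction of infected who die
    hosp_rate: Range         # (low, high) fraction of infected who are hospitalized


def project_ranges(cohorts: List[Cohort], total_population: float,
                   infected_share: Range) -> Dict[str, Range]:
    """Sum per-cohort low/high deaths and hospitalizations.

    infected_share is the (min, max) fraction of the population that gets
    the disease, applied equally to every cohort.
    """
    low_inf, high_inf = infected_share
    deaths = [0.0, 0.0]
    hosp = [0.0, 0.0]
    for c in cohorts:
        people = total_population * c.population_share
        deaths[0] += people * low_inf * c.death_rate[0]
        deaths[1] += people * high_inf * c.death_rate[1]
        hosp[0] += people * low_inf * c.hosp_rate[0]
        hosp[1] += people * high_inf * c.hosp_rate[1]
    return {"deaths": (deaths[0], deaths[1]),
            "hospitalizations": (hosp[0], hosp[1])}


# Placeholder inputs, NOT real data: two invented cohorts and rates.
example = project_ranges(
    [Cohort("younger", 0.5, (0.01, 0.02), (0.05, 0.10)),
     Cohort("older", 0.5, (0.10, 0.20), (0.30, 0.60))],
    total_population=1000,
    infected_share=(0.5, 1.0),
)
print(example)
```

Passing `infected_share=(1.0, 1.0)` gives the "everybody gets the disease" case; the minimum and maximum infection percentages then scale the low and high ends separately.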

How Good Is This Model? (Updated 6/6/2020)

As of June 6, 2020, my model predicts between 12K and 25K deaths if the current total number of infections is correct. However, at this time, we have had 109K deaths from 1.9M reported infections. I therefore conclude that we are underreporting the number of cases. In order for my model to get close to predicting the actual 109K deaths, the actual number of infections must be between 8M and 17M.
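The 8M to 17M figure can be reproduced with simple proportional scaling. This sketch assumes predicted deaths scale linearly with the number of infections, which holds for a model that multiplies fixed rates by an infection count; the function name `implied_infections` is just for illustration.

```python
def implied_infections(reported_infections, predicted_deaths, actual_deaths):
    """Scale the reported infection count so the model's predicted
    death range would reach the actual death count.

    predicted_deaths is the (low, high) range the model gives for
    reported_infections. The high prediction needs the least scaling,
    so it yields the low end of the implied range.
    """
    low_pred, high_pred = predicted_deaths
    low = reported_infections * actual_deaths / high_pred
    high = reported_infections * actual_deaths / low_pred
    return low, high


# Figures from the paragraph above: 1.9M reported infections,
# 12K - 25K predicted deaths, 109K actual deaths.
low, high = implied_infections(1_900_000, (12_000, 25_000), 109_000)
print(f"{low / 1e6:.1f}M - {high / 1e6:.1f}M")  # prints "8.3M - 17.3M"
```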

The current infection rate (as of 3/2020) in the US is 0.04%. Applying this percentage to the age cohort data gives a range of 720 - 2,906 deaths. The current actual number of deaths is 2,026, which falls inside that range, so my model is working. When this all plays out, the true number of deaths might be closer to the high end (5 million) if we do nothing to bring the rates down …

Going Beyond Worst Case

The above is absolutely worst case, and I don't have a lot of confidence in any of the factors; say I am maybe 50% confident. So maybe the real number of fatalities will range from 500K - 10M and hospitalizations from 2M - 32M.
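One simple way to arrive at these widened ranges is to multiply the low end by the confidence level and divide the high end by it. At 50% confidence that halves the low end and doubles the high end. The `widen_range` helper below is a hypothetical illustration of that rule of thumb, not a rigorous statistical method.

```python
def widen_range(low, high, confidence):
    """Widen a (low, high) range by a confidence factor between 0 and 1.

    The low end is multiplied by the factor and the high end is divided
    by it, so lower confidence gives a wider range.
    """
    if not 0 < confidence <= 1:
        raise ValueError("confidence must be in (0, 1]")
    return low * confidence, high / confidence


# Ranges from the worst case analysis, widened at 50% confidence.
print(widen_range(1_000_000, 5_000_000, 0.5))    # fatalities
print(widen_range(4_000_000, 16_000_000, 0.5))   # hospitalizations
```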

At any rate, lots of people will die from coronavirus, but that isn't what scares me about COVID19. What scares me is the number of people needing treatment in hospitals. If we hit the high end of 32M, we don't have enough hospital beds or staff to handle that all at once. So it is imperative to spread out, or "flatten the curve" of, the COVID19 transmission rates. That means measures like those taken in San Francisco should be taken in all big metro areas.

However, I don't think these measures should be permanent. The damage to jobs and the economy is not worth it. We may need to cycle the "shelter in place" rules on some schedule. A study by Imperial College London says it might have to look something like this:

A graph of weekly ICU cases over time.

Periodic bouts of social distancing keep the pandemic in check.

Even that looks awful, though. Months of being shut in followed by a few brief weeks of freedom is not my idea of life. However, it might not end up being so bleak, according to this analysis:

The Hammer and the Dance.

I highly recommend the above.