In a yet-to-be-published interview with an overseas media outlet, one of my data scientists touches on the shortcomings of the theoretical approaches epidemiologists use to predict COVID-19 outcomes, and why our more parsimonious empirical framework appeared to do a better job:
"In all modelling, there is usually a tension between model simplicity and its faithfulness to the truth. Too simple and the wrong conclusion may be reached. Too faithful and the model may be so complex that no one can understand it, and hence it can be misused.
From our admittedly limited exposure to epidemiology, we find that academic researchers in this field can be quite caught up in building models that are grounded in theory, resulting in models with so many abstract parameters, such as R0, incubation periods etc., that they can quickly overwhelm non-domain experts.
As a result, non-experts, including the public, financial market participants, and policy makers, are left to the mercy of epidemiologists and their apparent black box models. Most sophisticated people do not trust black boxes they do not understand.
A simpler, parsimonious alternative must be found that also does not sacrifice too much on faithfulness. That is where quants and data scientists can contribute, as we often have to condense complex situations into simple yet useful model approximations that non domain-expert decision makers can understand, walking that fine line between model simplicity and faithfulness.
In Coolabah’s COVID forecasting model, we replaced abstract theoretical assumptions such as R0 with a simple question: if the US implements containment interventions with the same efficacy as Italy, China or South Korea, or with 75%, 50% or 25% of that efficacy, when will its daily cases peak?
The choice of substitute country and its efficacy haircut is left to the qualitative judgement of sophisticated users, or can be chosen quantitatively by the model itself as a default. Such a simple empirical approach was also surprisingly faithful, as discussed in our paper published on SSRN.
Our method allows almost anyone to produce an informed forecast, or at least a reasonable range of forecasts, bounding the date of US peak case count to be between early and mid-April.
This bounded range was first published by Coolabah on March 23rd, at a time when few had any idea when the US infections would peak. The consensus was that the peak was months away. Emotionally for any financial market participant, the contrast between having a partial but informed bound on the situation, versus the panic of having no bound at all is night and day."
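To make the "efficacy haircut" idea in the quote concrete, here is a deliberately toy sketch in Python. It is not Coolabah's model, which is described in the SSRN paper. It simply assumes that a target country running at a fraction of a reference country's containment efficacy takes proportionally longer to reach its peak. The function name, the stretching rule and all of the inputs are hypothetical placeholders chosen for illustration.

```python
from datetime import date, timedelta


def projected_peak_date(intervention_date: date,
                        reference_days_to_peak: int,
                        efficacy: float) -> date:
    """Toy projection of the peak in daily new cases.

    ASSUMPTION (not from the source): the lag between intervention and peak
    observed in the reference country is stretched by 1 / efficacy, so a
    target country at 50% efficacy takes twice as long to peak.
    """
    if not 0 < efficacy <= 1:
        raise ValueError("efficacy must be in the interval (0, 1]")
    lag_days = round(reference_days_to_peak / efficacy)
    return intervention_date + timedelta(days=lag_days)


# Placeholder inputs only; these are not real dates or lags for any country.
intervention = date(2020, 1, 1)
reference_lag = 10  # hypothetical days from intervention to peak

# Sweep the haircut levels mentioned in the quote to get a range of peak dates.
for efficacy in (1.0, 0.75, 0.5, 0.25):
    peak = projected_peak_date(intervention, reference_lag, efficacy)
    print(f"efficacy {efficacy:.0%}: projected peak {peak.isoformat()}")
```

Sweeping the haircut produces a band of dates rather than a single point estimate, which is the sense in which such a framework gives a bounded range.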
Writing in the AFR today I explain in simpler terms why many COVID-19 forecasts, to the extent that they actually existed, were so far wide of the mark. Excerpt enclosed:
On March 23 we published one of the most aggressive forecasts globally for the peak in new COVID-19 infections, which was dismissed by some as wildly optimistic. One crucial difference between our forecasting methodology and those used by the epidemiologists advising governments was that we had adopted an empirical approach whereas most epidemiological projections were based on theoretical simulations.
We started building real-time infection and fatality data tracking systems for every country in the world in February 2020, which updated automatically on a 15-minute basis. This allowed us to carefully monitor the evolution of the pandemic.
It was clear by late February that it was going global and markets would start failing due to their inability to properly price the distribution of risks surrounding this one-in-100-year shock. That is to say, we were going to see liquidity and solvency crises that could only be cauterised by extreme monetary and fiscal policy support, which unfortunately arrived three to five weeks late.
We then moved on to construct a novel forecasting framework that combined the evolution of real-time infections in the target country with assumptions about how effective that government’s containment regime would be. More precisely, the system allowed us to superimpose the average containment experience of all nations globally, or of any specific country that had experienced an earlier outbreak, with a pre-selected haircut to proxy for the target country’s relative containment efficacy.
In the case of Australia and the US, we could, for example, model their future infection paths with reference to both their historic trajectory and then an assumption that they would be, say, 50 per cent to 75 per cent as effective as, say, China, South Korea or Italy with their containment policy.
In March, this allowed us to conclude that new infection cases would peak in Australia and the US in early-to-mid April, months ahead of what many epidemiologists were claiming. It turned out that Australia peaked on March 28 or 29, while the US reached its zenith on April 10 or 11. While almost everyone had an opinion on how the pandemic unfolded, it was surprising to us that few – if any – investors had developed formal forecasting capabilities.
We hypothesised that markets would respond very positively to being able to look through these peaks to the other side of the liquidity and economic “bridge” that was being furnished by fiscal and monetary policy until a vaccine was available.
Australia and New Zealand are leading the world with the flattening of both their infection and death curves. It is imperative that we do likewise with an expeditious return to work for those sub-50-year-old members of our population who face effectively no risks while ramping up testing, contact tracing and protections for the vulnerable, 65-plus-years cohort.
Successful exits from containment are likely to be the next big regime change for markets that allows them to look through the horrific short-term economic data releases that would otherwise suggest we are about to endure a depression. Turning this into a one-to-two month national holiday and then reversion to a virus-mitigating “new normal” is the key policy goal.
Kudos for predicting a key turning point for the US, and I guess your high-level goal was actually predicting that rather than the literal peak in daily reported new cases. But I'm not sure your criticism of other predictions is fully valid, since the actual peak happened two weeks after your prediction, on April 24th, according to the following sources: https://coronavirus.jhu.edu/data/new-cases https://www.worldometers.info/coronavirus/country/us/ Also, we can't rule out the possibility that the actual peak is still coming. Testing seems likely to increase, and that's clearly a major limiting factor in reported cases in the US (unlike Australia). I'm probably nitpicking from your point of view. But forecasts based on other techniques may actually have been more accurate when considering that they had slightly different goals to yours.