Did 1.27M people die from AMR in 2019?

John Lees included in reviews science statistics

2022-02-07

I would answer ‘I don’t know’. If I was being less trite, I would add that I’m more confident saying that it was between 100k and 10M – whichever way you look at it, vast numbers that are growing larger, and which require action on multiple fronts.

The authors of the study ‘Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis’ (also called ‘the GRAM study’) have actually attempted to estimate this. Their paper was published a couple of weeks ago, and you can read it here: https://doi.org/10.1016/S0140-6736(21)02724-0.

The authors of this study combine a lot of data from many sources (which I am sure was very difficult to collect); run a complex series of regression models, chaining their outputs together; predict DALYs lost; and run counterfactuals vs no-AMR to make predictions of the global burden of AMR. Ultimately they estimated ‘1·27 million (95% UI 0·911–1·71) deaths attributable to bacterial AMR’ in 2019, which is unfortunately at the upper end of extrapolations from the 2016 O’Neill report.

I really want to make one criticism here: there is no way it was possible to estimate this to three significant figures, and the uncertainty interval is almost certainly too narrow. These UIs are generated from the model used, so they only reflect uncertainty from the data, conditional on the given model being true (which probably explains the tight and smooth UIs in figure 2).

The authors themselves offer reflection on uncertainties, assumptions and approximations used in the discussion, and have a whole table in the appendix listing known modelling limitations. It’s easy to identify more than are explicitly addressed (e.g. not modelling vaccine replacement in S. pneumoniae) – some of which would suggest an overestimate, some of which would suggest an underestimate. This analysis is incredibly complex and tries to account for so many factors that I’m at a loss to guess whether this is more likely over- or under-estimated – but I am confident in saying the result is more uncertain than reported.

A related gripe is that reporting this level of precision makes the study look more accurate than it actually was. Credulous press coverage then reports only the central estimate with no UIs at all, which really does make it look like someone counted all of these deaths, and got exactly 1.27M (which was then rounded down to 3sf).

Perhaps just saying about 1-2 million deaths would have been clearer?
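
To make the rounding concrete, here is a minimal Python sketch that rounds the reported central estimate and interval bounds to a chosen number of significant figures. The helper name `round_sig` is made up for illustration; the three inputs are the figures quoted above, and nothing else here comes from the paper.

```python
from math import floor, log10

def round_sig(x, sig=1):
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimal places to keep.
    return round(x, sig - 1 - floor(log10(abs(x))))

# Figures quoted from the paper: central estimate and 95% UI bounds.
central, lower, upper = 1.27e6, 0.911e6, 1.71e6

for sig in (3, 2, 1):
    rounded = [round_sig(v, sig) for v in (central, lower, upper)]
    print(sig, "sf:", rounded)
```

At one significant figure the interval becomes 0.9 to 2 million, which is close to the ‘about 1-2 million’ phrasing suggested above.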

Ok, maybe more than one criticism

I’m an outsider to global burden of disease estimates, and economic modelling in general, so I realise that I might be missing the point with some of these questions, but I also wondered:

Won’t the errors from each source/regression, and especially by chaining results together, compound? Was this taken into account?
Will this result and model’s accuracy ever be checked retrospectively (e.g. when we have better data)? Hopefully at least for some regions, and then we could see what the major sources of error in the model were. But, could we do this now for a region with lots of data, by adding errors and sparsity into it? Or for one species/disease?
The sensitivity analysis (section 4.7 of the appendix, ‘model validation’) reports 0.7-1 AUC, which isn’t exactly stunning, but it also looked like there wasn’t any real validation set used, just a random 20% subsample of all of the data.
One press release I saw mentioned ‘celebrating the global collaboration and 100’s of data partnerships that made this study possible’ which I’m sure is a great thing to come out of this.
But, where is the data? Can anyone else use it? Has any effort been made to provide it to researchers through an access committee?
Relatedly, where is the model code? After a bit of searching I found this repo, but I’m not sure it’s a) the right one or b) that it’s able to be reused at all.
This is very minor, but what’s the ‘grand total’ in table 1? How does it make sense to sum sample sizes and study years from totally different sources? Seems like it’s to impress us with the figure 471300319 (which is only useful for estimating how much memory you’re likely to need to load the data).
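
The first question in the list above, about errors compounding along a chain of models, can be illustrated with a toy simulation. This is a sketch under made-up assumptions, not the GRAM study's method: three hypothetical stages, each contributing an independent multiplicative log-normal error with a log-scale standard deviation of 0.15. It compares the spread of the final product when every stage's error is propagated against the spread when earlier stages are treated as exact.

```python
import random

def interval_ratio(samples, lower=0.025, upper=0.975):
    """Ratio of the upper to the lower empirical quantile (1.0 means no spread)."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered[int(upper * n) - 1] / ordered[int(lower * n)]

def simulate_chain(n_stages=3, log_sd=0.15, n_draws=20000, seed=1):
    """Compare intervals when all stage errors are propagated vs only the last."""
    rng = random.Random(seed)
    full, last_only = [], []
    for _ in range(n_draws):
        # One multiplicative error per stage, median 1 (made-up log-normal errors).
        errors = [rng.lognormvariate(0.0, log_sd) for _ in range(n_stages)]
        product = 1.0
        for e in errors:
            product *= e
        full.append(product)          # every stage treated as uncertain
        last_only.append(errors[-1])  # earlier stages plugged in as point estimates
    return interval_ratio(full), interval_ratio(last_only)

full_ratio, last_ratio = simulate_chain()
print(f"upper/lower bound, all stages propagated: {full_ratio:.2f}")
print(f"upper/lower bound, last stage only:       {last_ratio:.2f}")
```

With independent errors the log-scale variances add, so the interval widens with every link in the chain; correlated errors could make this better or worse. Whether and how the paper carries uncertainty from one stage to the next is exactly the question raised above.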