John Lees' blog

Pathogens, informatics and modelling at EMBL-EBI

Traitors null hypothesis

The rest of the UK and I have spent the past few weeks watching The Celebrity Traitors, which is basically Werewolf.

The game starts with three traitors and sixteen faithful. Each night the traitors secretly choose a faithful to kill (who does not appear the next morning). There is a round table where the group as a whole vote for who they think is a traitor, and by majority vote that person is banished, whereupon they reveal whether they were a traitor or a faithful. If at the end, a single traitor remains, the traitors win.

Lessons from a Nobel Laureate (on deep learning and academia/industry)

Last week I saw John Jumper give a presentation on his team’s development of AlphaFold 1/2/3. He started by explaining how AlphaFold1 worked (essentially a convolutional neural network). He spent most of the talk explaining the many changes required to produce AlphaFold2, which went from mediocre predictive accuracy to changing the whole field.

I took away two main points.

There was no single trick that led to the big performance gain of AF2 over AF1. No single bit of intuition that led to such a huge shift in performance. Instead, lots of good ideas, each of which increased performance slightly.

Using fixed parameters/constants in mcstate likelihoods

NB: mcstate is being replaced by monty which will make this easier

How do you use fixed parameter values during the pMCMC with mcstate and an odin model?

I was first asked this question in 2021 and I don’t think we’ve ever managed to document it more clearly, but the relevant part of the documentation is https://mrc-ide.github.io/mcstate/reference/pmcmc_parameters.html.

To get this to work you need to make your transform function a closure. The transform function always needs to take as input a list of the pmcmc_parameters() being inferred by mcstate, and output a list of pars as used by the odin.dust model, and can’t take any other options. But, one can use a closure to bind other arguments of a more general function to fixed values.
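The closure pattern itself can be sketched in a few lines. mcstate is an R package, so the Python below is not its API; it only illustrates the idea of binding fixed values into a function that then accepts nothing but the inferred parameters. The names `make_transform`, `beta`, `gamma` and `population` are hypothetical, chosen for illustration.

```python
from typing import Callable, Dict

Params = Dict[str, float]


def make_transform(fixed: Params) -> Callable[[Params], Params]:
    """Build a transform that takes only the inferred parameters.

    `fixed` is captured by the inner function (the closure), so the
    returned transform needs no extra arguments when it is called.
    """

    def transform(theta: Params) -> Params:
        # Merge the fixed values with the sampled ones; if a name appears
        # in both, the sampled value in `theta` takes precedence.
        return {**fixed, **theta}

    return transform


# Hypothetical example: gamma and population are held fixed, beta is inferred.
transform = make_transform({"gamma": 0.1, "population": 1000.0})
pars = transform({"beta": 0.3})
print(pars)
```

The sampler only ever sees `transform`, which has the single-argument signature it expects, while the fixed values travel along inside the closure.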

Dogs and the central limit theorem

At school/University I had vaguely heard of the Central Limit Theorem (CLT) but never properly understood it or looked it up. My understanding was pretty much along the lines of ‘most measurements are normally distributed’.

This isn’t quite right (although as we’ll see not totally wrong). The CLT actually states that sample means taken from a distribution of measurements are approximately normally distributed, irrespective of how the full distribution of all the measurements is shaped (as long as it has a finite variance). So if we repeatedly take large enough batches of samples of a measurement and calculate their average, those averages will be approximately normally distributed (with the same mean as the full distribution, and a variance equal to the full distribution’s variance divided by the batch size).
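A quick simulation makes this concrete. The sketch below is an illustration under assumed settings (an exponential distribution with mean 1 and variance 1, batches of 50, 5000 batches, a fixed seed), none of which come from the post. It draws from a strongly skewed distribution, averages each batch, and then summarises the averages: they should centre on the full distribution's mean, have a variance close to the full variance divided by the batch size, and put roughly 68% of their mass within one standard deviation, as a normal distribution would.

```python
import random
import statistics


def sample_means(n_batches: int, batch_size: int, rng: random.Random) -> list:
    """Return the mean of each batch of exponential(rate=1) draws."""
    means = []
    for _ in range(n_batches):
        batch = [rng.expovariate(1.0) for _ in range(batch_size)]
        means.append(statistics.fmean(batch))
    return means


rng = random.Random(42)  # fixed seed so the run is repeatable
batch_size = 50
means = sample_means(5000, batch_size, rng)

mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)
sd_of_means = var_of_means ** 0.5

# Fraction of batch means within one standard deviation of their centre;
# for a normal distribution this is about 0.68.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in means
) / len(means)

print(f"mean of batch means:     {mean_of_means:.3f} (full distribution: 1)")
print(f"variance of batch means: {var_of_means:.4f} (1 / batch_size = {1 / batch_size:.4f})")
print(f"fraction within 1 sd:    {within_one_sd:.3f}")
```

Changing `batch_size` shows the spread of the averages shrinking as the batches get larger, even though the underlying measurements stay just as skewed.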

Some of my petty pedantry and why it doesn't matter

I already have at least one of these on this blog (which incidentally is my most popular post).

Vein meme
When p < precision of a double

Why it doesn’t really matter: p<2.2e-16 is pretty significant!
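For context, 2.2e-16 is the machine epsilon of a double: the gap between 1.0 and the next representable number. As far as I know, R prints `p < 2.2e-16` because its p-value formatting uses that epsilon as a display cutoff. The Python sketch below (standard library only) shows that the figure is a relative precision rather than the smallest value a double can hold.

```python
import sys

# Machine epsilon for a double: 2**-52, about 2.22e-16.
eps = sys.float_info.epsilon
print(eps)

# Adding eps to 1.0 gives a different number...
distinguishable = (1.0 + eps) > 1.0
# ...but adding half of eps rounds straight back to 1.0.
lost_in_rounding = (1.0 + eps / 2) == 1.0

# Doubles can still store far smaller values than eps on their own,
# so a tiny p-value is representable even if it is not displayed.
tiny_p = 1e-300
still_positive = tiny_p > 0.0

print(distinguishable, lost_in_rounding, still_positive)
```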

Using a hyphen between clauses-like this-looks tacky. Use these resplendent em-dashes–like this–instead.

It’s so easy to do, just type two hyphens -- or Alt-hyphen.

Reflections and rants after three years of being a journal editor

Over the past three years I’ve served as an Editor for the journal Microbial Genomics. Now that my term has ended, here are some reflections on what I’ve learnt about the publishing process from the editorial side during my tenure. I hope this might be useful for authors and reviewers, especially those earlier in their careers.

Microbial Genomics (‘MGen’) is run by the UK Microbiology Society. It started in 2014 and, I believe, is doing rather well in terms of submissions.