
John Lees' blog

Pathogens, informatics and modelling at EMBL-EBI

Dogs and the central limit theorem

At school/University I had vaguely heard of the Central Limit Theorem (CLT) but never properly understood it or looked it up. My understanding was pretty much along the lines of ‘most measurements are normally distributed’.

This isn’t quite right (although as we’ll see not totally wrong). The CLT actually states that sample means taken from a distribution of measurements are approximately normally distributed, irrespective of the shape of the full distribution of all the measurements (as long as it has a finite variance). So if we repeatedly take large enough batches of samples of a measurement and calculate their average, those averages will be normally distributed, with the same mean as the full distribution and a variance equal to the full distribution’s variance divided by the batch size.
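A quick simulation makes this concrete. This is only a sketch under my own assumptions: the measurements come from an exponential distribution with mean 1 and variance 1 (chosen because it is clearly not normal), and the batch size of 50, the 5000 batches and the function name `batch_means` are arbitrary illustrative choices.

```python
import random
import statistics


def batch_means(n_batches, batch_size, rng):
    """Return the mean of each batch drawn from an Exponential(1) distribution."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(batch_size)])
        for _ in range(n_batches)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
batch_size = 50
means = batch_means(5000, batch_size, rng)

centre = statistics.fmean(means)      # should be close to the full mean, 1
spread = statistics.pvariance(means)  # should be close to 1 / batch_size
# For a normal distribution, about 68% of values lie within one standard deviation.
within_one_sd = sum(abs(m - centre) <= spread ** 0.5 for m in means) / len(means)

print(f"mean of batch means:     {centre:.3f}")
print(f"variance of batch means: {spread:.4f} (1 / batch_size = {1 / batch_size:.4f})")
print(f"fraction within one sd:  {within_one_sd:.3f}")
```

The individual measurements are heavily skewed, but the batch means should cluster symmetrically around 1 with a variance near 1/50.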

Some of my petty pedantry and why it doesn't matter

I already have at least one of these on this blog (which incidentally is my most popular post).

Vein meme
When p < precision of a double

Why it doesn’t really matter: p<2.2e-16 is pretty significant!
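As a small illustration of where the 2.2e-16 figure presumably comes from, the sketch below uses only Python's standard library and assumes the value in question is the machine epsilon of a double (the gap between 1.0 and the next representable number), which is not the same thing as the smallest value a double can hold.

```python
import sys

# Machine epsilon: the gap between 1.0 and the next representable double.
eps = sys.float_info.epsilon
print(eps)

# Adding eps to 1.0 changes it; adding half of eps is rounded away.
print(1.0 + eps != 1.0)
print(1.0 + eps / 2 == 1.0)

# Doubles can still represent values far smaller than eps.
smallest_normal = sys.float_info.min
print(smallest_normal < 1e-300 < eps)
```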

Using a hyphen between clauses-like this-looks tacky. Use these resplendent em-dashes–like this–instead.

It’s so easy to do, just type two hyphens -- or Alt-hyphen.

Reflections and rants after three years of being a journal editor

Over the past three years I’ve served as an Editor for the journal Microbial Genomics. Now that my term has ended, here are some reflections on what I’ve learnt about the publishing process from the editorial side during my tenure. I hope this might be useful for authors and reviewers, especially those earlier in their careers.

Microbial Genomics (‘MGen’) is run by the UK Microbiology Society, started in 2014, and I believe is doing rather well in terms of submissions.

Video games in 2024 -- a good year

The games I played and liked last year were almost all really good. This time I’ve given points out of five for fun and memorableness/creativeness, which I think are the two main things I care about, and an overall score.

Perfect Tides screenshot
Welcome to Fire Island/Perfect Tides. (Perfect Tides)

Perfect Tides is a coming-of-age story set in the year 2000, on a seasonal island in the US. This is a good setting that nicely represents the boredom/lameness of many teen hometowns. You play as Mara, a teenager struggling through high school.

Mini-review of 'Compressive Pangenomics Using Mutation-Annotated Networks' (PanMAN)

This is a mini-review (just highlighting some initial thoughts) of this preprint:

Compressive Pangenomics Using Mutation-Annotated Networks
Sumit Walia, Harsh Motwani, Kyle Smith, Russell Corbett-Detig, Yatish Turakhia
https://www.biorxiv.org/content/10.1101/2024.07.02.601807v2

This is an extension of the idea of mutation-annotated trees, which were successfully applied to SARS-CoV-2 in the UShER (and related) software. Phylogenies are stored as a sequence of mutations on each branch from the root, rather than keeping all sequences. This is not unlike the use of ancestral recombination graphs to represent genotypes and evolutionary history in e.g. human genomes.
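To illustrate the general idea, here is a toy sketch of a mutation-annotated tree. It is not the UShER or PanMAN data format: the `Node` class, the `reconstruct` function and the example sequences are all hypothetical, and it handles substitutions only (no indels, and a tree rather than a network). Each node stores only the changes on the branch leading to it, and a sample's sequence is recovered by replaying those changes from the root.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A tree node storing only the substitutions on the branch above it."""
    name: str
    parent: Optional["Node"] = None
    # Each mutation is (0-based position, new base) relative to the parent.
    mutations: List[Tuple[int, str]] = field(default_factory=list)


def reconstruct(node: Node, root_sequence: str) -> str:
    """Rebuild a node's sequence by replaying mutations from the root down."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current)
        current = current.parent

    sequence = list(root_sequence)
    for step in reversed(path):  # root first, then each descendant in turn
        for position, base in step.mutations:
            sequence[position] = base
    return "".join(sequence)


# Only the root sequence is stored in full; the other nodes store differences.
root = Node("root")
internal = Node("internal", parent=root, mutations=[(1, "T")])
sample = Node("sample", parent=internal, mutations=[(3, "G")])

print(reconstruct(sample, "ACGT"))  # ATGG
```

The storage saving comes from closely related samples sharing most of their path from the root, so shared mutations are written down once.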

Powerpoint, Biorender and AI

This illuminating tweet from Michael Baym on how restrictive the Biorender license is reminded me of a little post I’d meant to write.

You’ve probably noticed a proliferation of Biorender images in the talks, papers, grants and especially graphical abstracts you’ve seen in recent years.

I think I’ve seen these really increasing in prevalence over the past couple of years, maybe because of institutional subscriptions, maybe because we have been seeing increasing numbers of our peers using it and don’t want to be left out.