John Lees' blog

Pathogens, informatics and modelling at EMBL-EBI

Host/pathogen data for 'Joint sequencing of human and pathogen genomes reveals the genetics of pneumococcal meningitis' available on EGA

I have recently gotten round to adding the human data (and links to pathogen data, which has been available on the ENA since publication) to the managed access European Genome-Phenome Archive. The sharing of human genotype data is a little more fraught than bacterial genome data due to patient ethics and other issues, but the EGA offers a good solution for protecting these data while making them as open as possible.

WE, Arcade Fire (2022) – A reverse in trajectory?

The first three (Funeral, Neon Bible, The Suburbs)

I first saw Arcade Fire performing Neon Bible at Glastonbury (though sadly only on the BBC broadcast). At the time it was released, I was working at the local supermarket at the weekends pushing trolleys around in the car park. I had a cheap – iRiver if I recall – MP3 player with space for a handful of albums that I would illicitly listen to.

Using the new Microreact API

(the excellent) Microreact has recently had a major new release which has a few breaking changes. One that hit me is that the API has changed. The previous API was pretty simple, and allowed anonymous POST requests with a blob of CSV, tree and optionally network to return a stable URL. The new API requires a token for authorisation and addition to your account (which seems sensible), and also adds deletion and updating of instances (which is also useful).
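As a rough illustration of the token-based flow described above, here is a minimal sketch that only builds the request and does not send it. The endpoint URL and the `Access-Token` header name are assumptions that should be checked against the current Microreact API documentation, `build_create_request` is a hypothetical helper, and the project document is passed through untouched because its schema is not covered here.

```python
import json
import urllib.request

# Assumed endpoint; verify against the Microreact API docs before relying on it.
CREATE_URL = "https://microreact.org/api/projects/create"


def build_create_request(project: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) an authorised POST request for a new project.

    `project` is treated as an opaque JSON document; `token` is the API
    access token tied to your Microreact account.
    """
    body = json.dumps(project).encode("utf-8")
    return urllib.request.Request(
        CREATE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json; charset=UTF-8",
            "Access-Token": token,  # assumed header name
        },
    )


# To actually submit the project, pass the request to urllib.request.urlopen:
# with urllib.request.urlopen(build_create_request(project, token)) as resp:
#     print(resp.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before using a real token.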

Annual conference, Microbiology Society (2022, Belfast)

I recently attended the Annual Conference of the Microbiology Society, which was held in Belfast. This was my first time attending this meeting, and I was a bit nervous that as a genomics researcher/someone who wouldn’t know a colony from his elbow I might not be able to follow much. This proved to be unfounded, and I was really happy to see that genomics is becoming a routine part of many microbiology studies, rather than a separate area (machine learning if anything seems to be the new bogeyman – I look forward to the hype settling down).

Model flexibility and number of parameters

This post is some thoughts I had after reading ‘Real numbers, data science and chaos: How to fit any dataset with a single parameter’ by Laurent Boué (arXiv:1904.12320).

The paper shows that any dataset can be approximated by the following single-parameter function:

$\sin^2 (2^{x \tau} \arcsin \sqrt{\alpha})$

where $x$ is an integer, $\tau$ is a constant which controls the level of accuracy, and $\alpha$ is a real-valued parameter which is fit to the dataset in question.
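To make the trick concrete, here is a minimal sketch of how such a fit can work, assuming the data have been rescaled to $[0, 1]$. It writes $\alpha = \sin^2(2\pi z_0)$ and stores $z_0$ as an exact Python integer of binary digits rather than as a float, since a 64-bit float cannot hold enough precision. The helper names `encode` and `decode` are illustrative, not taken from the paper.

```python
import math


def encode(ys, tau):
    """Pack values in [0, 1] into one big integer, tau bits per value.

    The result n represents z0 = n / 2**total_bits, and the single
    parameter is alpha = sin^2(2*pi*z0).
    """
    n = 0
    for y in ys:
        z = math.asin(math.sqrt(y)) / (2 * math.pi)  # z lies in [0, 0.25]
        bits = int(z * (1 << tau))                   # first tau binary digits of z
        n = (n << tau) | bits
    return n, len(ys) * tau


def decode(n, total_bits, tau, x):
    """Evaluate sin^2(2^(x*tau) * arcsin(sqrt(alpha))) at integer x."""
    # Multiplying z0 by 2^(x*tau) and discarding the integer part (which
    # sin^2 ignores) is a left shift followed by a mask.
    shifted = (n << (x * tau)) & ((1 << total_bits) - 1)
    frac = shifted / (1 << total_bits)
    return math.sin(2 * math.pi * frac) ** 2


ys = [0.1, 0.5, 0.9, 0.25, 1.0, 0.0]
tau = 16
n, total_bits = encode(ys, tau)
fitted = [decode(n, total_bits, tau, x) for x in range(len(ys))]
```

Each fitted value differs from the original by at most $2\pi / 2^{\tau}$, so increasing $\tau$ buys accuracy at the cost of more digits in $\alpha$. The "single parameter" is really a container for the whole dataset.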

Quantify everything, all of the time

I recently read the article by Wu et al. in Nature Biotechnology (you can also find similar articles in pretty much all of the Nature journals) which analysed data on participants at some virtual meetings over the past couple of years, and came to the conclusion that ‘Virtual meetings promise to eliminate geographical and administrative barriers and increase accessibility, diversity and inclusivity’. Which sounds great! Of course there are certainly some good things to come out of virtual meetings, and many unresolved issues with in-person conferences.