John Lees' blog

Pathogens, informatics and modelling at EMBL-EBI

Expression modules in S. pneumoniae

John Lees published on 2018-04-24 included in Bioinformatics Pneumococcus

I recently read a pre-print from the Veening lab in which they recreated 22 physiological conditions in vitro and then measured expression levels with RNA-seq. I thought it was a great bit of research, and would encourage you to read it if you’re interested: https://doi.org/10.1101/283739

They’ve also done a really good job with data availability: they have released a browser for their data (PneumoExpress) and put their raw data on Zenodo.

Installing phyx without sudo

John Lees published on 2017-11-24 included in Bioinformatics Linux Phylogenetics

I saw this phylogenetics package today, phyx: https://github.com/FePhyFoFum/phyx

To install without admin rights/sudo I needed to do the following (my software is installed in my home ~/software, rather than e.g. /usr, /usr/local):

Compile armadillo as follows:

cmake -DCMAKE_INSTALL_PREFIX=$HOME/software .
make
make install

Compile nlopt as follows:

./configure --with-cxx --without-octave --without-matlab --prefix=$HOME/software
make
make install

Compile phyx as follows (slightly hacky, maybe there’s a ‘proper’ way)

setup.py not found using pip install

John Lees published on 2017-11-20 included in Bioinformatics Python

Trying to install PyVCF under a python (3) virtual environment gave me the following error:

(venv)johnlees@hpc:~$ pip install pyvcf
 Downloading/unpacking pyvcf
 Downloading PyVCF-0.6.8.linux-x86_64.tar.gz (1.1MB): 1.1MB downloaded
 Saved /tmp/downloadcache/PyVCF-0.6.8.linux-x86_64.tar.gz
 Running setup.py egg_info for package pyvcf
 Traceback (most recent call last):
 File "<string>", line 16, in <module>
 FileNotFoundError: [Errno 2] No such file or directory: '~/venv/build/pyvcf/setup.py'
 Complete output from command python setup.py egg_info:
 Traceback (most recent call last):
 File "<string>", line 16, in <module>
 FileNotFoundError: [Errno 2] No such file or directory: '~/venv/build/pyvcf/setup.py'

The solution was to upgrade setuptools inside the virtual environment, e.g. with pip install --upgrade setuptools.

Firth regression in python

John Lees published on 2017-11-08 included in Scripts Statistics

Marco Galardini and I have recently reimplemented the bacterial GWAS software SEER in python. As part of this I rewrote my C++ code for Firth regression in python. Firth regression gives better estimates when data in logistic regression is separable or close to separable (when a chi-squared contingency table has small entries).

Although there is an R implementation, logistf, I couldn’t find an equivalent in another language, or in python’s statsmodels. Here is a gist with my python functions and a skeleton of how to use them and calculate p-values, in case anyone would like to use this in future without having to write the optimiser themselves.
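To give a flavour of what the optimiser involves, here is a minimal numpy sketch of Firth's penalised-likelihood fit. It is not the code from the gist: the function name firth_logistic, its arguments and the convergence settings are assumptions made for illustration. It uses the standard modified score, X^T (y - p + h (0.5 - p)), where h holds the diagonal of the hat matrix, and takes Fisher-scoring steps with no step-halving.

```python
import numpy as np


def firth_logistic(X, y, max_iter=50, tol=1e-8):
    """Fit logistic regression with Firth's bias-reducing penalty.

    X is an (n, k) design matrix that already includes an intercept column,
    y is a length-n vector of 0/1 outcomes. Returns the coefficient vector.
    Illustrative sketch only: no step-halving or convergence diagnostics.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])

    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
        w = p * (1.0 - p)                         # logistic variance weights
        info = X.T @ (w[:, None] * X)             # Fisher information X'WX
        info_inv = np.linalg.inv(info)

        # Diagonal of the hat matrix W^(1/2) X (X'WX)^-1 X' W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)

        # Firth's modified score: the usual score plus the penalty term
        score = X.T @ (y - p + h * (0.5 - p))

        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break

    return beta


# Quasi-separable toy data: every sample with x = 0 has y = 0, so ordinary
# maximum likelihood would push the intercept towards minus infinity.
x = np.r_[np.zeros(5), np.ones(5)]
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 0, 0])
X = np.column_stack([np.ones_like(x), x])
print(firth_logistic(X, y))
```

For a single binary covariate like this, the Firth estimate matches the log odds obtained after adding 0.5 to each cell of the 2x2 table, which makes a handy sanity check. P-values would normally come from a likelihood ratio test on the penalised likelihood, which this sketch leaves out.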

Running BSLMM in gemma

John Lees published on 2017-08-23 included in Bioinformatics

In GWAS the Bayesian Sparse Linear Mixed Model (BSLMM) is a hybrid of the LMM, which assumes all SNPs have an effect size drawn from a normal distribution (closer to ridge regression), and sparse regression which finds a few SNPs with non-zero effect sizes.

In their paper on this model, Zhou et al. show that this hybrid method can have better prediction accuracy than either individual model on its own (both are special cases of their model), and can also estimate the proportion of variance explained by polygenic and sparse effects.
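To make the hybrid idea concrete, here is a small simulation sketch in numpy. It does not run gemma or fit a BSLMM; it only generates a phenotype in which every SNP has a small normally distributed effect and a handful of SNPs carry an additional larger effect. The function name simulate_hybrid_phenotype and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_hybrid_phenotype(n_samples=200, n_snps=500, n_sparse=5,
                              sd_poly=0.05, sd_sparse=1.0, seed=0):
    """Simulate a phenotype with both polygenic and sparse SNP effects.

    Returns the phenotype, the polygenic and sparse genetic components,
    and the vector of sparse effect sizes (zero for most SNPs).
    """
    rng = np.random.default_rng(seed)

    # Genotypes coded 0/1/2 with a random allele frequency per SNP, then centred
    maf = rng.uniform(0.05, 0.5, size=n_snps)
    genotypes = rng.binomial(2, maf, size=(n_samples, n_snps)).astype(float)
    genotypes -= genotypes.mean(axis=0)

    # LMM-like part: every SNP gets a small effect from a normal distribution
    poly_effects = rng.normal(0.0, sd_poly, size=n_snps)

    # Sparse part: only a few SNPs get an extra, larger effect
    sparse_effects = np.zeros(n_snps)
    causal = rng.choice(n_snps, size=n_sparse, replace=False)
    sparse_effects[causal] = rng.normal(0.0, sd_sparse, size=n_sparse)

    g_poly = genotypes @ poly_effects
    g_sparse = genotypes @ sparse_effects
    phenotype = g_poly + g_sparse + rng.normal(0.0, 1.0, size=n_samples)
    return phenotype, g_poly, g_sparse, sparse_effects


phenotype, g_poly, g_sparse, sparse_effects = simulate_hybrid_phenotype()
print("variance of polygenic component:", g_poly.var())
print("variance of sparse component:", g_sparse.var())
print("phenotypic variance:", phenotype.var())
```

Setting n_sparse to zero gives data that suit a plain LMM, while setting sd_poly to zero gives data that suit sparse regression, which mirrors the two special cases mentioned above.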

Tanglegrams can be misleading

John Lees published on 2017-02-22 included in Bioinformatics Phylogenetics

Tanglegrams are a visual method to compare two phylogenetic trees with the same set of tip labels. This can be useful for comparing trees produced by different methods on the same alignment, or on different alignments of the same sample set. Tanglegrams work by connecting the matching tips of the trees, then rotating subtrees to minimise the number of crossings. The algorithm was published in 2011, and continues to be used in a range of publications (for example in genomic epidemiology).
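The quantity being minimised is easy to compute for any given layout. The sketch below counts crossings between two tip orderings (top to bottom on the left and right trees); it is only the scoring step, not the published optimisation algorithm, and the function name count_crossings is an assumption for illustration.

```python
def count_crossings(left_order, right_order):
    """Count crossing connector lines between two tip orderings.

    Two connectors cross when a pair of tips appears in opposite relative
    order on the two sides, so this is the number of inversions.
    Simple O(n^2) version, fine for small trees.
    """
    position = {tip: i for i, tip in enumerate(right_order)}
    ranks = [position[tip] for tip in left_order]
    return sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )


left = ["A", "B", "C", "D"]
right = ["A", "C", "B", "D"]
print(count_crossings(left, right))  # 1: only the B and C connectors cross
```

Rotating the subtree containing B and C on either side would bring this to zero. Note that a layout with few crossings says nothing by itself about how similar the two trees are, since only rotations allowed by each tree's topology can be used.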