/images/jl11_lots.jpg

John Lees' blog

Pathogens, informatics and modelling at EMBL-EBI

Creating a conda package with compilation and dependencies

I’ve just finished, what was for me, a difficult compiler/packaging attempt – creating a working bioconda package for seer. You can look at the pull request to see how many times I failed: https://github.com/bioconda/bioconda-recipes/pull/11263

(I would note I have made this package for legacy users. I would direct anyone interested in the software itself to the reimplementation pyseer)

The reason this was difficult was due to my own inclusion of a number of packages, all of which also need compilation, further adding to the complexity. While most of them were already available in some form in anaconda it turned out there were some issues with using the defaults. I want to note some of the things I had to modify here.

conda build: libarchive: cannot open shared object file: No such file or directory

I was getting the following error, attempting to run conda-build on a package, using a conda env:

Traceback (most recent call last):
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/bin/conda-build", line 7, in <module>
from conda_build.cli.main_build import main
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/conda_build/cli/main_build.py", line 18, in <module>
import conda_build.api as api
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/conda_build/api.py", line 22, in <module>
from conda_build.config import Config, get_or_merge_config, DEFAULT_PREFIX_LENGTH as _prefix_length
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/conda_build/config.py", line 17, in <module>
from .variants import get_default_variant
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/conda_build/variants.py", line 15, in <module>
from conda_build.utils import ensure_list, trim_empty_keys, get_logger
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/conda_build/utils.py", line 10, in <module>
import libarchive
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/libarchive/__init__.py", line 1, in <module>
from .entry import ArchiveEntry
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/libarchive/entry.py", line 6, in <module>
from . import ffi
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/site-packages/libarchive/ffi.py", line 27, in <module>
libarchive = ctypes.cdll.LoadLibrary(libarchive_path)
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/ctypes/__init__.py", line 426, in LoadLibrary
return self._dlltype(name)
File "/nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /nfs/users/nfs_j/jl11/pathogen_nfs/large_installations/miniconda3/envs/conda_py36/lib/libarchive: cannot open shared object file: No such file or directory

The /lib directory does contain libarchive, both as a dynamic (.so) and static (.a) library. There turned out to be two relevant environment variables:

Linear scaling of covariances

In our software PopPUNK we draw a plot of a Gaussian mixture model that uses both the implementation and the excellent example in the scikit-learn documentation:

GMM ellipse example
Gaussian mixture model with mixture components plotted as ellipses

My input is 2D distance, which I first use StandardScaler to normalise each axes between 0 and 1, which helps standardise methods across various parts of the code. This is fine if you then create these plots in the scaled space, and as it is a simple linear scaling it is generally trivial to convert back into the original co-ordinates:

Expression modules in S. pneumoniae

I recently read a pre-print from the Veening lab where they had reconstructed various (22 total) physiological conditions in vitro and then measured expression levels with RNA-seq. I thought it was a great bit of research, and would encourage you to read it here if you’re interested: https://doi.org/10.1101/283739

They’ve also done a really good job with data availability, having released a browser for their data (PneumoExpress), and they have put their raw data on zenodo.

Installing phyx without sudo

I saw this phylogenetics package today, phyx: https://github.com/FePhyFoFum/phyx

To install without admin rights/sudo I needed to do the following (my software is installed in my home ~/software, rather than e.g. /usr, /usr/local):

Compile armadillo as follows

cmake -DINSTALL_PREFIX=$(HOME)/software
make
make install

Compile nlopt as follows

./configure --with-cxx --without-octave --without-matlab --prefix=$(HOME)/software
make
make install

Compile phyx as follows (slightly hacky, maybe there’s a ‘proper’ way)

setup.py not found using pip install

Trying to install PyVCF under a python (3) virtual environment gave me the following error:

(venv)johnlees@hpc:~$ pip install pyvcf
 Downloading/unpacking pyvcf
 Downloading PyVCF-0.6.8.linux-x86_64.tar.gz (1.1MB): 1.1MB downloaded
 Saved /tmp/downloadcache/PyVCF-0.6.8.linux-x86_64.tar.gz
 Running setup.py egg_info for package pyvcf
 Traceback (most recent call last):
 File "", line 16, in 
 FileNotFoundError: \[Errno 2\] No such file or directory: '~/venv/build/pyvcf/setup.py'
 Complete output from command python setup.py egg_info:
 Traceback (most recent call last):

File "", line 16, in

FileNotFoundError: \[Errno 2\] No such file or directory: '~/venv/build/pyvcf/setup.py'

The solution was to upgrade setuptools: