
John Lees' blog

Pathogens, informatics and modelling at EMBL-EBI

Threads, vectors and references in C++11 on OS X

I was trying to compile some C++ of the form

std::vector<std::thread> threads;
for (int i = 0; i<num_threads; ++i) {
   threads.push_back(std::thread(logisticTest, kmer_lines[i], samples));
}

with function prototype

void logisticTest(Kmer& k, const std::vector<Sample>& samples);

on OS X 10.10 with clang++ - Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
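
One thing to check with code of this shape: std::thread copies its arguments before calling the function, so an argument destined for a non-const reference parameter such as Kmer& will not compile unless it is wrapped in std::ref (or std::cref for a const reference). The following is a minimal sketch of that pattern, not necessarily the fix this post arrived at. The Kmer and Sample definitions, the body of logisticTest and the helper testAllKmers are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <functional>  // std::ref, std::cref
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the real Kmer and Sample types.
struct Sample { int phenotype; };
struct Kmer { std::string sequence; double stat; };

// Same prototype as above; the body is a placeholder, not the real test.
void logisticTest(Kmer& k, const std::vector<Sample>& samples) {
    double total = 0.0;
    for (const Sample& s : samples) total += s.phenotype;
    k.stat = total;  // written through the reference, so the caller sees it
}

// Hypothetical helper: starts one thread per k-mer, then waits for all of them.
void testAllKmers(std::vector<Kmer>& kmer_lines, const std::vector<Sample>& samples) {
    std::vector<std::thread> threads;
    threads.reserve(kmer_lines.size());
    for (std::size_t i = 0; i < kmer_lines.size(); ++i) {
        // std::ref / std::cref wrap the arguments so that the thread
        // function receives references rather than copies.
        threads.emplace_back(logisticTest, std::ref(kmer_lines[i]), std::cref(samples));
    }
    for (std::thread& t : threads) t.join();
}
```

Each thread writes to its own Kmer and only reads the shared samples vector, so no locking is needed in this sketch. On Linux the build would typically need -std=c++11 and -pthread.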

Upgrade OS X 10.8 -> 10.10 (Yosemite) breaks perl, homebrew, gvim

After upgrading from OS X 10.8 (Mountain Lion) to 10.10 (Yosemite), I found that gvim no longer worked and exited with a cryptic dyld message similar to "dyld: Symbol not found:".

The first thing I tried was uninstalling it with homebrew, then reinstalling:

brew uninstall macvim
brew install macvim

But I got a Trace/BPT trap: 5 during make. Trying to fix this by doing the things suggested by brew doctor and installing openssl gave me essentially the same errors.

A hierarchical Bayesian model using multinomial and Dirichlet distributions in JAGS

I am currently trying to model the state of a genetic locus in bacteria (which may be one of six values) using a hierarchical Bayesian model. This allows me to account for the fact that within a sample there is heterogeneity, as well as there being heterogeneity within a tissue type.

This is also good because:

  • I am able to incorporate results from existing papers as priors.
  • From the output I can quantify all the uncertainties within samples and tissues, and check for significantly different distributions between condition types.

I think what I have ended up working with is probably something that could also be called a mixture model.
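
One way a model of this kind could be written down is sketched below. This is an illustration under assumed notation, not necessarily the model fitted in the post: y_s is the vector of counts of the six locus states observed in sample s, n_s is the total count for that sample, π_s is the sample-level state proportions, μ_t is the mean proportions for tissue type t, t(s) is the tissue that sample s came from, κ is a concentration parameter controlling how tightly samples cluster around their tissue mean, and α is a vector of prior pseudo-counts (which is where results from existing papers could enter).

```latex
\begin{align}
\mu_t &\sim \mathrm{Dirichlet}(\alpha_1, \dots, \alpha_6) \\
\pi_s \mid \mu_{t(s)} &\sim \mathrm{Dirichlet}\left(\kappa \, \mu_{t(s)}\right) \\
y_s \mid \pi_s &\sim \mathrm{Multinomial}(n_s, \pi_s)
\end{align}
```

In this sketch the spread of π_s around μ_t captures heterogeneity within a tissue type, while the multinomial layer captures heterogeneity within a sample.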

Installing Cactus (progressiveCactus)

When installing Cactus (using the progressiveCactus repository) I encountered the following issues during compiling:

  1. easy_install not found. Solution: remove my ~/.pydistutils.cfg

  2. Dependencies/includes not being found. Solution: add CXXFLAGS=-I"<install_location>/include/" and CFLAGS=-I"<install_location>/include/" to <install_location>/share/config.site

  3. kyototycoon not compiling as kyotocabinet functions not found (as in issue 27). Solution (as in my comment to the issue): enter the kyototycoon directory and run configure with different flags, then make:

    ./configure --prefix=~/software --with-kc=~/software
    
    make

    where ~/software is the prefix I am installing to (with subdirs bin, lib, include, man etc)

Diagnosing results/status of lots of LSF jobs

Over the past few months I’ve found myself running large numbers of jobs over an LSF system, for example assembling and annotating thousands of bacterial genomes or imputing thousands of human genomes in 5Mb chunks.

Inevitably some of these jobs fail, often for a number of different reasons. I thought it might be helpful to share some of the commands I’ve found useful for diagnosing the jobs that have finished. The commands apply to IBM Platform LSF (bsub), but I imagine they have slightly wider applicability.

Parallel MCMC

On github: https://github.com/johnlees/pMCMC

Parallel implementation of MCMC using MPI - coded by Hákon Jónsson, John Lees and Tobias Madsen

Code is available as C++. Under testing, the implementations in R and Perl do not provide speedups due to execution overheads, but they are included as easier-to-read ‘pseudocode’ if required.

Details can be found in this draft paper: pMCMC

Acknowledgements

This work was completed for the Oxford Summer School in Computational Biology 2012 (http://www.stats.ox.ac.uk/research/genome/summer_school_2013/osscb12), and was supervised by Geoff Nicholls, Joe Herman and Jotun Hein.