Running BSLMM in gemma

2017-08-23 329 words 2 minutes

Contents

In GWAS the Bayesian Sparse Linear Mixed Model (BSLMM) is a hybrid of the LMM, which assumes all SNPs have an effect size drawn from a normal distribution (closer to ridge regression), and sparse regression which finds a few SNPs with non-zero effect sizes.

In their paper on this model Zhou et al show that this hybrid method can have better prediction accuracy than either individual model on its own (which are special cases in their model), and can also estimate the proportion of variance explained by polygenic and sparse effects.

They implement this in their software gemma, which I (and many others) have used to perform GWAS using a LMM. It’s easy to use once you’ve got the data formatted, though memory usage is quite high (>50Gb for N =~ 5000; M =~ 600 000. Perhaps I should be LD-pruning?)

gemma -bfile data_in -bslmm 1 -o data_out_bslmm_1

However, when running this command I got one of two errors at either the Eigen-decomposition stage, or the UtX stage. ‘Segmentation fault’ or ‘Invalid instruction’. I’ve used the gemma static executable (the default download) before without any issues, so it looked to me like it was probably something with the linear algebra libraries.

Downloading the source, it turns out gemma relies on blas, lapack, gsl and atlas (and optionally arpack). When I tried to compile using the default Makefile I got the following errors at the linking stage:

/software/hgi/pkglocal/gcc-4.9.1/lib/gcc/x86_64-unknown-linux-gnu/4.9.1/../../../../lib64/libgfortran.a(write.o): In function `write_float'
:
/nfs/humgen01/teams/hgi/software/src/gcc-4.9.1/x86_64-unknown-linux-gnu/libgfortran/../.././libgfortran/io/write_float.def:1300: undefined
reference to `signbitq'
/nfs/humgen01/teams/hgi/software/src/gcc-4.9.1/x86_64-unknown-linux-gnu/libgfortran/../.././libgfortran/io/write_float.def:1300: undefined
reference to `finiteq'
/software/hgi/pkglocal/gcc-4.9.1/lib/gcc/x86_64-unknown-linux-gnu/4.9.1/../../../../lib64/libgfortran.a(write.o): In function `determine_en
_precision':
/nfs/humgen01/teams/hgi/software/src/gcc-4.9.1/x86_64-unknown-linux-gnu/libgfortran/../.././libgfortran/io/write_float.def:1213: undefined
reference to `finiteq'
/software/hgi/pkglocal/gcc-4.9.1/lib/gcc/x86_64-unknown-linux-gnu/4.9.1/../../../../lib64/libgfortran.a(write.o): In function `write_float'
:
/nfs/humgen01/teams/hgi/software/src/gcc-4.9.1/x86_64-unknown-linux-gnu/libgfortran/../.././libgfortran/io/write_float.def:1300: undefined
reference to `isnanq'
collect2: error: ld returned 1 exit status
make: *** [bin/gemma] Error 1

Fortunately these scary errors were familiar to me from wrangling with statically compiling seer. To fix this, compile gemma from source and run the BSLMM without errors add -lquadmath to the following line in the Makefile

LIBS_LNX_S_LAPACK = /usr/lib/lapack/liblapack.a -lgfortran -lquadmath /usr/lib/atlas-base/libatlas.a /usr/lib/libblas/libblas.a -Wl,--allow-multiple-definition

Happy modelling