Compiling and installing MaSuRCA/MSRCA assembler

After reading GAGE-B (dx.doi.org/10.1093/bioinformatics/btt273), an evaluation of how various de novo assemblers perform, I was convinced to try to get MaSuRCA (http://www.genome.umd.edu/masurca.html) working, even if it took a lot of effort, as the results looked very promising.

The compilation didn’t work for me: the automatically generated Makefiles contained an error where the compiler name was missing from the executed command. This proved too complex for me to fix quickly, so instead I went with the following solution (EDIT 4/8/14: this solution is unlikely to work; see the bottom of the article for why):

  1. Download and compile the Celera Assembler separately (http://wgs-assembler.sourceforge.net/), following the instructions provided. You can probably even just download a pre-compiled binary.
  2. Copy the contents of wgs- over the CA directory in the MaSuRCA install. If you downloaded just the binary, the important part is the Linux-amd64/bin/ directory (or whatever it is for your architecture).
  3. Comment out or delete line 29 of the install.sh file: (cd CA/src && make LD_RUN_PATH="$LIBDIR")
  4. Run the install.sh script

The install should then work, since the Makefile problem only affects the CA/src directory.
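As a rough sketch of step 3, the build line could be commented out with sed instead of by hand. The function name comment_out_ca_build is made up for illustration, and matching on the text of the command rather than on "line 29" is my own assumption, in case the line number differs between releases. A backup of the original script is kept alongside it.

```shell
#!/bin/sh
# Hypothetical helper (not part of MaSuRCA): comment out the Celera Assembler
# build step in an install script. It matches the command text rather than a
# fixed line number, which is an assumption made for this sketch.
comment_out_ca_build() {
    script="$1"
    # Keep a copy of the untouched script next to the original.
    cp "$script" "$script.orig" || return 1
    # Prefix lines that start with "(cd CA/src && make" with "# ".
    # In the replacement, "&" must be escaped because sed treats a bare "&"
    # as "the matched text".
    sed 's|^\([[:space:]]*\)(cd CA/src && make|\1# (cd CA/src \&\& make|' \
        "$script.orig" > "$script"
}
```

Calling comment_out_ca_build install.sh from the MaSuRCA source directory should leave the original in install.sh.orig and the edited copy in install.sh, after which step 4 can be run as normal.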

When running the assembler, you’ll need the CA binaries installed in bin/masurca/../CA/Linux-amd64/bin, where bin/masurca is the directory the MaSuRCA binaries were installed to. This is the default provided by the script, so in most cases it shouldn’t be a problem.
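A quick way to confirm the layout described above is to check for the CA bin directory relative to wherever the MaSuRCA binaries ended up. This is only a sketch: check_ca_bin is a hypothetical helper name, and the Linux-amd64 default is taken from my own setup, so pass a different second argument if your architecture directory is named differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of MaSuRCA): check that the Celera Assembler
# binaries sit where the assembler is expected to look for them.
#   $1 = directory the MaSuRCA binaries were installed to
#   $2 = architecture directory name (assumed default: Linux-amd64)
check_ca_bin() {
    masurca_bin_dir="$1"
    arch_dir="${2:-Linux-amd64}"
    ca_bin="$masurca_bin_dir/../CA/$arch_dir/bin"
    if [ -d "$ca_bin" ]; then
        echo "found: $ca_bin"
    else
        echo "missing: $ca_bin" >&2
        return 1
    fi
}
```

For example, check_ca_bin bin/masurca reports whether bin/masurca/../CA/Linux-amd64/bin exists and returns a non-zero status if it does not.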

EDIT: MaSuRCA v2.2.1 comes with CA v6.1, so download that version of the binary/source rather than the newest one (v8 onwards). Some of the command-line options are different in the newer versions, and the assemble.sh script won’t work with them. I’m going to try to fix this, and if I have any success I will post the result here.

EDIT 4/8/14: After reading the MaSuRCA v2 paper (dx.doi.org/10.1093/bioinformatics/btt476), I see that the authors modify the CA code slightly as part of their assembler. Without this modified version the assemble.sh script won’t run, as the overlaps are not built correctly using the super-reads. The code therefore needs to be compiled by the provided method, but as mentioned, the Makefiles don’t work. I did manage to find a pre-compiled binary on their FTP site that works on my system, which I would recommend as the best solution for now.