Display env variable, tmux and zsh over ssh

I have been using zsh within tmux, and found that after reattaching to a tmux session X forwarding no longer worked. For example, when trying to launch gvim I’d get the error:

E233: cannot open display

The problem, a quick google determined, is that each time I ssh into my server a new $DISPLAY environment variable is set. When I run ‘tmux attach’ the new $DISPLAY value is passed through to the session (see http://stackoverflow.com/questions/8645053/how-do-i-start-tmux-with-my-current-environment), so any new windows within tmux will have the correct environment. However, the environment of shells already running in existing windows can’t be changed from outside, so they keep the stale value, causing the problem.

The best solution I found was proposed by Alex Teichman here: http://alexteichman.com/octo/blog/2014/01/01/x11-forwarding-and-terminal-multiplexers/
However I had two problems:

  1. It doesn’t seem to work with zsh rather than bash. I guess this is due to preexec() behaving differently in zsh, but I couldn’t quickly work this out from the zsh manual.
  2. It felt slightly inelegant to update $DISPLAY every single time a command is run.

My solution is pretty similar. I add the following to ~/.zshrc:

echo "$DISPLAY" > ~/.display.txt
alias up_disp='export DISPLAY="$(cat ~/.display.txt)"'

This writes the correct $DISPLAY variable to a hidden file when a session is started (i.e. when I connect to the server). When I find forwarding isn’t working, I just run up_disp in that window.
Not the perfect solution, but it works ok for me
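
An alternative that avoids the hidden file is to ask tmux itself for the value it picked up on the last attach. This is only a sketch: the function name refresh_display is made up for illustration, and it assumes it is run from a shell inside tmux after ‘tmux attach’ has updated the session environment.

```shell
# Hypothetical helper (not from the original setup): copy the session's
# DISPLAY into the current shell. Assumes it is run inside tmux.
refresh_display() {
  # 'tmux show-environment DISPLAY' prints "DISPLAY=value" when the variable
  # is set in the session, or "-DISPLAY" when it is marked for removal.
  _rd_line="$(tmux show-environment DISPLAY 2>/dev/null)" || return 1
  case "$_rd_line" in
    DISPLAY=*) export DISPLAY="${_rd_line#DISPLAY=}" ;;
    *) return 1 ;;
  esac
}
```

Like up_disp, this still has to be run by hand in each window where forwarding has stopped working.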

Compiling Stampy v1.0.23 for use with cortex – error: unrecognized command line option ‘-Wl’

I am currently trialling cortex for assembling Illumina sequence data. Its Perl script, which automates the pipeline from reads in to variant calls out, requires vcftools and stampy to be installed, and you provide their installation paths as input to the script.

However when running make using the default downloaded stampy makefile I got the following error from g++ (v4.8.1):

g++ `python2.7-config --ldflags` -pthread -shared -Wl build/linux-x86_64-2.7-ucs4/pyx/maptools.o build/linux-x86_64-2.7-ucs4/c/maputils.o build/linux-x86_64-2.7-ucs4/c/alignutils.o build/linux-x86_64-2.7-ucs4/readalign.o build/linux-x86_64-2.7-ucs4/algebras.o build/linux-x86_64-2.7-ucs4/frontend.o -o maptools.so
g++: error: unrecognized command line option ‘-Wl’

The solution was straightforward to find, as ever thanks to stackoverflow: http://stackoverflow.com/questions/21305309/g-doesnt-recognize-the-option-wl
All you need to do is edit lines 44 and 46 in the makefile, replacing the space after -Wl with a comma:

 43 ifeq ($(platform),linux-x86_64)
 44    g++ `$(python)-config --ldflags` -pthread -shared -Wl,$(objs) -o maptools.so
 45 else
 46    g++ `$(python)-config --ldflags` -pthread -dynamiclib -Wl,$(objs) -o maptools.so
 47 endif

As you can see from the surrounding if statement, this is only an issue on 64-bit linux platforms
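
If you would rather not edit the file by hand, the same change can be sketched with sed. This assumes the unpatched link lines contain the literal text "-Wl $(objs)" (inferred from the fixed lines above, not checked against every stampy release); the function name patch_wl is just for illustration.

```shell
# Hypothetical helper: replace "-Wl $(objs)" with "-Wl,$(objs)" in a makefile.
# A backup of the original is kept alongside it with a .bak suffix.
patch_wl() {
  sed -i.bak 's/-Wl \$(objs)/-Wl,$(objs)/' "$1"
}
# Usage (from the stampy source directory): patch_wl makefile
```

Compare the patched file against the .bak copy before running make, in case your makefile differs from the one shown here.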

I also tried compiling cortex with icc, but the compilation failed after a lot of errors. Rather than pursuing this further, I used gcc and only got warnings of unused variables in compilation

Impute your whole genome from 23andme data

23andme is a service which types 602352 sites on your chromosomal DNA and your mtDNA. It is possible, by comparing to a reference panel in which all sites have been typed, to impute (fill in statistically) the missing sites and thus get an ‘estimation’ of your whole genome.

The piece of software impute2 written by B. N. Howie, P. Donnelly, and J. Marchini gives good accuracy when using the 1000 Genome Project as a reference. However, there is some difficulty in providing the data in the right input format, using all the correct options and interpreting the output from this piece of software.

EDIT: As pointed out by lassefolkersen in the comments, this has now been nicely implemented at impute.me

I have written a tool to allow people with a small amount of computational experience (but not necessarily any biological/bioinformatics knowledge) to run impute2 on their 23andme data and get their whole genome as output. It can be found on my github: https://github.com/johnlees/23andme-impute

To use this tool, you will need to do the following steps:

  1. Download your ‘raw data’ from the 23andme site. This is a file named something like genome_name_full
  2. Download the impute2 software from https://mathgen.stats.ox.ac.uk/impute/impute_v2.html#download and follow their instructions to install it
  3. Put impute2 on the path, i.e. run (with the correct path for where you extracted impute2):
    echo 'export PATH=$PATH:/path/to/impute2' >> ~/.bashrc
  4. Download the 1000 Genomes reference data, which can be found on the impute2 website here:
    https://mathgen.stats.ox.ac.uk/impute/data_download_1000G_phase1_integrated.html
  5. Extract this data by running:
    gunzip ALL_1000G_phase1integrated_v3_impute.tgz
    tar xf ALL_1000G_phase1integrated_v3_impute.tar
    (you will then probably want to delete the original, unextracted archive file as it is quite large)
  6. Download my code by running:
    git clone https://github.com/johnlees/23andme-impute
  7. Run ./impute_genome.pl to impute your whole genome!
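
Before step 7 it can save some time to check that everything the steps above rely on is actually on the path. A minimal sketch follows; the function name check_tools and the list of tools in the usage line are my assumptions from the steps above, not something the script itself requires you to run.

```shell
# Hypothetical pre-flight check: report any named command that is not on PATH.
# Returns 0 if all were found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
# Usage: check_tools impute2 git tar gunzip perl
```

If impute2 is reported missing, open a new terminal (or source ~/.bashrc) so that the PATH change from step 3 takes effect.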

The options required as input for impute_genome.pl should be reasonably straightforward. Run it with -h to see them, or look at the README.md on github.

As the analysis will take a lot of resources, I recommend against using the run command. I think --print or --write will be best for most people, and you can then run each job one at a time, or in parallel if you have access to a cluster.

If you have any problems with this, please leave a message in the comments and I’ll try my best to get back to you.

A new direction for leesjohn

Since October 2013 I have stopped using Fedora, and instead use machines running Ubuntu 12.04/13.10, Windows 8 and OS X 10.8.5. As these OSs have a larger user base than Fedora, many of the issues I encounter are well documented and easy to fix (i.e. there is a stackexchange post as one of the top three google results), hence there haven’t been many things for me to post under the original remit of this blog.

Of course, when I do encounter an undocumented OS based issue as I go about my business I’ll still try and post it on leesjohn. However I expect this to be much less common than previously, and the new computing based issues I find myself having to deal with are:
  • Interactions and differences between OS X and Ubuntu when working with them simultaneously
  • Working with Ubuntu without a sudo account (e.g. installing software, using custom libraries)
  • Use of radio software (e.g. Rivendell, Cuedex, Jack)

I have now changed area from physics to bioinformatics, and think there is scope to share many of the scripts and programs I write for this, as well as solutions to issues I encounter in the area. So I have finally gotten round to setting up a github account (https://github.com/johnlees) to share as much of the code I write as possible.

From now on leesjohn will primarily be to document the scripts in these repositories, and to share some original tools. I’ve already committed some things, which you can see at:
https://github.com/johnlees/bioinformatics (for bioinformatics tools)
https://github.com/johnlees/config (for software related configuration)
Hopefully this will make it very easy if someone ever does want to use some of the stuff I’ve written.

In the next few weeks I am hoping to write some posts about some of the more useful/general things in these repositories. I am also planning on making a wrapper script to allow you to use impute2 (http://mathgen.stats.ox.ac.uk/impute/impute_v2.html) to infer your whole genome from the ‘raw data’ you get if you have had a 23andme done (https://www.23andme.com/) – which as far as I can tell is not something yet available in the public sphere, but something I think many of 23andme’s clients could be interested in.

A latex bibliography style I like (Nature style in biblatex)

This isn’t really a specific question, but I needed to make a bibliography in latex with the following requirements:

  • The citations take up as little space as possible, so should probably be superscript
  • The citations should be correctly grouped (i.e. 1-3, 6 not 6, 2, 3, 1)
  • The bibliography can take up any amount of space
  • The citations should be linked to their bibliography entry (i.e. hyperref compatible)
  • bib entries contain unicode characters
  • I want the entries to look like the Elsevier standard, though Nature is also fine
  • I want DOIs, properly displayed and hyperlinked, not monospaced

The style=nature option supplied to biblatex in the preamble (see http://ctan.org/pkg/biblatex-nature) achieves most of this, but you don’t get DOIs, there seemed to be some problems with unicode characters (particularly Polish names, see http://www.terminally-incoherent.com/blog/reference/latex-reference/), and URLs weren’t displayed well.

Rather than trying to hack together a biblatex.cfg based on the nature style, which I didn’t understand and couldn’t be bothered to read through, I was able to use a standard biblatex style with some options when loading the package:

\usepackage[style=numeric-comp,
maxcitenames=2,
maxnames = 5,
firstinits=true,
uniquename=init,
sorting=none,
url=false,
isbn=false,
eprint=false,
texencoding=utf8,
bibencoding=utf8,
autocite=superscript,
backend=biber
]{biblatex}

This gets pretty close, but I also needed to use the following biblatex.cfg (create this file in the same directory as the .tex file):

% Number in parenthesis
\renewbibmacro*{volume+number+eid}{%
%  \setunit*{\addcomma\space}% NEW
  \printfield{volume}%
%  \setunit*{\adddot}% DELETED
%  \setunit*{\addcomma\space}% NEW
  \iffieldundef{number}
    {}
    {\bibopenparen
     \printfield{number}%
     \bibcloseparen}
  \setunit{\addcomma\space}%
  \printfield{eid}}

% Field formats for the bibliography environment (get rid of square brackets)
\DeclareFieldFormat{labelnumberwidth}{#1\adddot}

%Get rid of in:
\renewbibmacro{in:}{}

%Get rid of pp.
\DeclareFieldFormat[article,inproceedings,incollection]{pages}{#1}

%Make volume number emboldened
\DeclareFieldFormat[article,inproceedings,incollection]{volume}{\textbf{#1}}

%Journal name in non-italics
%\DeclareFieldFormat[article,inbook,incollection,inproceedings,patent,thesis,unpublished]{journaltitle}{#1}

%No quotes around article name
\DeclareFieldFormat
  [article,inbook,incollection,inproceedings,patent,thesis,unpublished]
  {title}{#1\isdot}

%Bibliography in smaller font size, and unjustified
\renewcommand{\bibfont}{\normalfont\small\raggedright}

%Hyperlinks in serif font
\def\UrlFont{\normalfont}

%DOI lower case, normal font
\renewcommand*{\mkbibacro}[1]{%
  \ifcsundef{\f@encoding/\f@family/\f@series/sc}
    {#1}
    {\MakeLowercase{#1}}}

%Colon after author names
\renewcommand{\labelnamepunct}{\addcolon\space}

Which got me what I wanted:

bibliography
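
Because the package options above set backend=biber, the document has to be compiled with biber rather than bibtex. The usual sequence can be sketched as below; this assumes pdflatex and biber are both on the path, and the function name build_with_biber and the file name in the usage line are purely illustrative.

```shell
# Hypothetical helper: compile a biblatex/biber document.
# Takes the main file name, with or without the .tex extension.
build_with_biber() {
  base="${1%.tex}"
  # First pass writes the citation data, biber builds the bibliography,
  # and the final two passes resolve the citations and links.
  pdflatex -interaction=nonstopmode "$base" &&
    biber "$base" &&
    pdflatex -interaction=nonstopmode "$base" &&
    pdflatex -interaction=nonstopmode "$base"
}
# Usage: build_with_biber thesis.tex
```

The chain stops at the first step that fails, so an error from biber (for example over an encoding problem in the .bib file) is not hidden by later pdflatex output.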

DPPC (Dipalmitoylphosphatidylcholine) DSPC and DMPC in Latex

Using chemfig I was able to represent DPPC (Dipalmitoylphosphatidylcholine) and other lipids in Latex with the following code:

\newcommand\setpolymerdelim[2]{\def\delimleft{#1}\def\delimright{#2}}
\def\makebraces[#1,#2]#3#4#5{%
\edef\delimhalfdim{\the\dimexpr(#1+#2)/2}%
\edef\delimvshift{\the\dimexpr(#1-#2)/2}%
\chemmove{%
\node[at=(#4),yshift=(\delimvshift)]
{$\left\delimleft\vrule height\delimhalfdim depth\delimhalfdim
width0pt\right.$};%
\node[at=(#5),yshift=(\delimvshift)]
{$\left.\vrule height\delimhalfdim depth\delimhalfdim
width0pt\right\delimright_{\rlap{$\scriptstyle#3$}}$};}}
\setpolymerdelim[]


\begin{figure}
\small
\setatomsep{1.5em}
\chemfig{N^+(-[:180,1.1]H_3C)(-[:90,1.3]CH_3)(-[:270,1.3]CH_3)(-[:-30]-[:30]-[:-30]O-[:30,1.3]P^+(<[:50,1.5]O\rlap{${}^-$})(<:[:130,1.5]O\rlap{${}^-$})(-#(1pt,)[:330,1.3]O-[:30]-[:-30](-[:270]O-[:-30](=[:270]O)(-[@{downleft,0.8}:30]CH_2-#(1pt,1pt)[@{downright,0.3}:-30,1.2]CH_3))(-[:30]-[:-30]O-[:30](=[:90]O)(-[@{upleft,0.8}:-30]CH_2-#(1pt,1pt)[@{upright,0.3}:30,1.2]CH_3))))}
\makebraces[10pt,13pt]{n}{downleft}{downright}
\makebraces[6pt,15pt]{n}{upleft}{upright}
\label{fig:lipids}
\end{figure}

The crucial line is:
\chemfig{N^+(-[:180,1.1]H_3C)(-[:90,1.3]CH_3)(-[:270,1.3]CH_3)(-[:-30]-[:30]-[:-30]O-[:30,1.3]P^+(<[:50,1.5]O\rlap{${}^-$})(<:[:130,1.5]O\rlap{${}^-$})(-#(1pt,)[:330,1.3]O-[:30]-[:-30](-[:270]O-[:-30](=[:270]O)(-[@{downleft,0.8}:30]CH_2-#(1pt,1pt)[@{downright,0.3}:-30,1.2]CH_3))(-[:30]-[:-30]O-[:30](=[:90]O)(-[@{upleft,0.8}:-30]CH_2-#(1pt,1pt)[@{upright,0.3}:30,1.2]CH_3))))}

You’ll also need to include the following in the preamble

\usepackage{chemfig}

Which produces something that looks like this:

Image

Samsung Galaxy S3 with Fedora

The new Android phones no longer work as USB mass storage devices, and instead use MTP. Not that I really know what this is, or its advantages over the previous system.

Fortunately a very helpful blog post at http://tacticalvim.wordpress.com/2012/12/08/mounting-nexus-4-via-mtp-in-fedora-17/ guided me most of the way (the nexus 4 and galaxy S3 are very similar).
I had to make a couple of changes as the device ids were different, but the instructions are essentially the same, so have a look there first.

Firstly install simple-mtpfs:

sudo yum -y install fuse fuse-libs libmtp simple-mtpfs

Check it’s worked with:

ls -l /dev/libmtp*

Which should return a link between libmtp and somewhere in bus/usb. Then create /etc/udev/rules.d/99-galaxyS3.rules, with the following content:

ACTION!="add", GOTO="galaxyS3_rules_end"
ENV{MAJOR}!="?*", GOTO="galaxyS3_rules_end"
SUBSYSTEM=="usb", GOTO="galaxyS3_usb_rules"
GOTO="galaxyS3_rules_end"

LABEL="galaxyS3_usb_rules"

# Galaxy SIII I-9300
ATTR{idVendor}=="04e8", ATTR{idProduct}=="6860", SYMLINK+="libmtp-%k", ENV{ID_MTP_DEVICE}="1", ENV{ID_MEDIA_PLAYER}="1"

LABEL="galaxyS3_rules_end"

Get the correct idVendor (VID) and idProduct (PID) by running simple-mtpfs -l

You can then set up commands to mount and unmount by adding the following to your ~/.bashrc, where ~/mnt/galaxyS3 is the directory your phone’s storage will be mounted to (create it first if it doesn’t exist):

alias S3mount="simple-mtpfs ~/mnt/galaxyS3"
alias S3umount="fusermount -u ~/mnt/galaxyS3"

The commands on the right-hand side of each alias can of course also be run directly to mount and unmount.

You’ll need to reboot to get it to work. I had to unplug and plug the phone too. troyengel reports on his tacticalvim blog that he had to run the S3mount command 2-3 times to get it to work
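
If the mount only succeeds after a few attempts, the retrying can be wrapped up in a small function. This is a sketch only: the name s3mount_retry, the three attempts and the one second pause are arbitrary choices of mine, and it assumes simple-mtpfs exits with a non-zero status when the mount fails.

```shell
# Hypothetical helper: try to mount the phone up to three times.
# The mount point defaults to ~/mnt/galaxyS3 and is created if missing.
s3mount_retry() {
  mnt="${1:-$HOME/mnt/galaxyS3}"
  mkdir -p "$mnt"
  for attempt in 1 2 3; do
    if simple-mtpfs "$mnt"; then
      return 0
    fi
    echo "mount attempt $attempt failed" >&2
    sleep 1
  done
  return 1
}
```

Unmounting is unchanged: fusermount -u on the same directory.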

If you’re having trouble I’d recommend looking at the simple-mtpfs documentation

Installing Times New Roman in fedora

I imported a pdf I had made in gnuplot into inkscape. The pdf used Times New Roman, but Times wasn’t installed, so inkscape substituted a sans font.
I have a pretty shaky knowledge of how fonts work, and of why Times worked in gnuplot but isn’t available to other programs, but my solution was as follows:

Follow the instructions to use the script to install all msttcorefonts at:
http://blog.andreas-haerter.com/2011/07/01/install-msttcorefonts-fedora.sh

wget "http://blog.andreas-haerter.com/_export/code/2011/07/01/install-msttcorefonts-fedora.sh?codeblock=1" -O "/tmp/install-msttcorefonts-fedora.sh"
chmod a+rx "/tmp/install-msttcorefonts-fedora.sh"
su -c "/tmp/install-msttcorefonts-fedora.sh"

After rebooting this worked, but some of the fonts used by webpages had screwed up (which I believe were trebuchet and verdana)

I fixed this by (as root) navigating to /usr/share/fonts, ensuring out of the newly installed fonts only the times ones are available and refreshing the font cache:

cd /usr/share/fonts
mkdir mstt-times
cp msttcorefonts/times* mstt-times
mv msttcorefonts .msttcorefonts
fc-cache -v

This made a lot of text unrendered in the browser, but after rebooting everything worked as I wanted it to, and I was able to automatically use Times correctly in inkscape
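
To check whether a font is visible to fontconfig without opening inkscape, something like the following can be used. It is a minimal sketch; the function name has_font is made up, and it simply searches the family names that fc-list reports.

```shell
# Hypothetical helper: succeed if any installed font family matches the
# given name (case-insensitive substring match on fc-list output).
has_font() {
  fc-list : family | grep -qi -- "$1"
}
# Usage: has_font "Times New Roman" && echo "Times is visible to fontconfig"
```

Running this before and after fc-cache -v shows whether the copied times files have been picked up.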

I also found the following page useful: http://www.pwsdb.com/pgm/?p=172

Using custom text when using \ref in latex

I couldn’t find how to do this easily, but perhaps this is because I used rubbish search terms.

I eventually found my answer on http://en.wikibooks.org/wiki/LaTeX/Labels_and_Cross-referencing (which ended up telling me lots of useful things about the hyperref package I didn’t know)

First load the hyperref package in the preamble:

\usepackage{hyperref}

You’ll probably want to provide some options to make it look nicer. See the manual linked from the ctan page: http://www.ctan.org/pkg/hyperref

You can then add references choosing the text yourself with a command of the format

\hyperref[label-name]{link-text}

It helps to illustrate this with an example. In my case I have a figure 4, composed of 3 sub-figures 4a, 4b and 4c (though these are simply part of the same image, not specified as separate figures in latex). My figure is labelled ‘SEM’ and I want to reference figure 4c including a hyperlink to the figure it appears in. I can do this using:

\hyperref[fig:SEM]{\ref*{fig:SEM}c}

This sends the link to the SEM figure, and puts as the hyperlinked text ‘4c’. Using \ref in the curly brackets ensures the figure number is updated if it changes from 4, which is the usual behaviour we desire.

Another thing I came across on the wikibooks page was the \autoref command provided by hyperref. This looks like a better idea than using \ref and constantly typing figure, and could straightforwardly be included in the above example by changing \ref* to \autoref*.