Docker Tutorial 1: Playing with Docker to deploy genetics software
This is a tutorial on using Docker to set up bioinformatics and genetics analysis tools.
1. First, let's install Docker
I am running Ubuntu 14.04.3 LTS server in VMware Fusion Pro. First confirm the Linux kernel version with uname -r. On my machine it returns 3.19.0-25-generic, which is fine because Docker requires kernel version 3.10 or later.
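The kernel check can also be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name kernel_ok is hypothetical and is not part of Docker or Ubuntu.

```shell
# kernel_ok VERSION: succeed when VERSION (as printed by `uname -r`)
# is 3.10 or later. Hypothetical helper for illustration.
kernel_ok() {
    version=$1
    major=${version%%.*}        # text before the first dot, e.g. 3
    rest=${version#*.}          # text after the first dot, e.g. 19.0-25-generic
    minor=${rest%%[!0-9]*}      # leading digits of the remainder, e.g. 19
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 10 ]; }
}

if kernel_ok "$(uname -r)"; then
    echo "kernel is new enough for Docker"
else
    echo "kernel is older than 3.10"
fi
```

Only the major and minor numbers are compared, which is enough for the 3.10 threshold.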
curl is preinstalled on this system. If it is missing on yours, install it with
sudo apt-get install curl
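To find out whether curl needs installing at all, a small check like the one below may help. It is only a sketch: missing_tools is a hypothetical helper name, and the script prints a suggestion rather than installing anything itself.

```shell
# missing_tools NAME...: print each named command that is not on the PATH.
# Hypothetical helper for illustration.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools curl)
if [ -n "$missing" ]; then
    echo "install first: sudo apt-get install $missing"
else
    echo "curl is already available"
fi
```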
Docker itself is installed in one step:
curl -sSL https://get.docker.com/ | sh
Verify the installation from the log output:
...
cgroup-lite start/running
Setting up docker-engine (1.8.2-0~trusty) ...
docker start/running, process 3512
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...
Processing triggers for ureadahead (0.100.0-16) ...
+ sudo -E sh -c docker version
Client:
Version: 1.8.2
API version: 1.20
Go version: go1.4.2
Git commit: 0a8c2e3
Built: Thu Sep 10 19:19:00 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.8.2
API version: 1.20
Go version: go1.4.2
Git commit: 0a8c2e3
Built: Thu Sep 10 19:19:00 UTC 2015
OS/Arch: linux/amd64
If you would like to use Docker as a non-root user, you should now consider
adding your user to the "docker" group with something like:
sudo usermod -aG docker psytky03
Remember that you will have to log out and back in for this to take effect!
Add your user to the docker group, as the installer suggests, so that Docker can be used without sudo:
sudo usermod -aG docker psytky03
Log out and log back in for the change to take effect.
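After logging back in, the group membership can be confirmed from the shell. This is a minimal sketch; has_group is a hypothetical helper. It relies on id -nG, which, when given no user name, lists the groups of the current session. That is why docker only shows up after a fresh login.

```shell
# has_group GROUP NAME...: succeed when GROUP appears among the remaining
# arguments. Hypothetical helper for illustration.
has_group() {
    wanted=$1
    shift
    for g in "$@"; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# $(id -nG) is left unquoted on purpose so each group name becomes one argument.
if has_group docker $(id -nG); then
    echo "this session is in the docker group"
else
    echo "not in the docker group yet (log out and back in after usermod)"
fi
```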
Run the hello-world test image:
docker run hello-world
Here is the output:
Hello from Docker.
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker Hub account:
https://hub.docker.com
For more examples and ideas, visit:
https://docs.docker.com/userguide/
To try the Ubuntu container suggested in the message above:
docker run -it ubuntu bash
2. Sign up for a Docker Hub account
Docker Hub plays the same role for Docker images that GitHub plays for code: it is the place to push images to and pull them from.
https://hub.docker.com/login/
My username there is "psytky03".
3. Write the first Dockerfile
A Dockerfile is the blueprint that tells Docker how to create an image. It contains the basic information: which base image to start from, the full set of commands that install the software, the system path, etc.
Here I am going to install two tools: the latest Eigensoft (ver 6.01) for principal component analysis (PCA) and Plink ver 1.9.
The contents of this Dockerfile look like this:
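# The package names below match Ubuntu 14.04; pin the tag (ubuntu:14.04) if a newer base breaks them
FROM ubuntu
MAINTAINER Psytky03
# RUN steps already execute as root, so sudo is not needed.
# update and install share one RUN so a cached package index is never reused.
RUN apt-get update && apt-get -y install wget git unzip python libgsl0ldbl gfortran-4.4
# Eigensoft (its bin directory is added to PATH below)
RUN git clone https://github.com/DReichLab/EIG.git
# Plink 1.9
RUN wget https://www.cog-genomics.org/static/bin/plink150903/plink_linux_x86_64.zip
RUN unzip plink_linux_x86_64.zip -d plinkbin
ENV PATH $PATH:/EIG/bin:/plinkbin
# Mount point for data shared from the host
RUN mkdir data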
4. Build the Docker image
mkdir my_first_docker_image
cd my_first_docker_image/
nano Dockerfile
docker build -t psytky03/eigandplink .
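As an alternative to editing the file in nano, a Dockerfile can be written from a script with a here-document. The sketch below uses a shortened Dockerfile with only the Plink steps, and the directory name plink_only_image is hypothetical, chosen so the example does not overwrite the full Dockerfile above.

```shell
build_dir=plink_only_image   # hypothetical directory name
mkdir -p "$build_dir"

# The quoted delimiter ('EOF') stops the shell from expanding $PATH,
# so the literal text "$PATH" reaches the Dockerfile for Docker to expand.
cat > "$build_dir/Dockerfile" <<'EOF'
FROM ubuntu
RUN apt-get update && apt-get -y install wget unzip
RUN wget https://www.cog-genomics.org/static/bin/plink150903/plink_linux_x86_64.zip
RUN unzip plink_linux_x86_64.zip -d plinkbin
ENV PATH $PATH:/plinkbin
EOF

echo "wrote $build_dir/Dockerfile"
# The image could then be built by passing the directory to docker build -t.
```

With an unquoted delimiter the host's own PATH would be baked into the ENV line, which is rarely what is wanted.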
The build takes a while. When it finishes, the log reports that the image was built without error:
Successfully built 87f7aefa7968
Check the built image with the docker images command:
docker images
REPOSITORY             TAG      IMAGE ID       CREATED         VIRTUAL SIZE
psytky03/eigandplink   latest   6fd94c3b6e6e   3 minutes ago   392.7 MB
ubuntu                 latest   91e54dfb1179   5 weeks ago     188.4 MB
hello-world            latest   af340544ed62   7 weeks ago     960 B
The size of this image is about 393 MB.
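The VIRTUAL SIZE column includes the layers inherited from the base image, so the tools themselves account for only part of that figure. As a rough sketch, the difference can be computed from the listing above with awk. The listing is pasted into a variable here so the example runs without Docker.

```shell
# Two rows copied from the `docker images` output shown above.
images_output='REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
psytky03/eigandplink latest 6fd94c3b6e6e 3 minutes ago 392.7 MB
ubuntu latest 91e54dfb1179 5 weeks ago 188.4 MB'

# The size is the second-to-last field because CREATED contains spaces.
added=$(printf '%s\n' "$images_output" | LC_ALL=C awk '
    $1 == "psytky03/eigandplink" { image = $(NF-1) }
    $1 == "ubuntu"               { base  = $(NF-1) }
    END { printf "%.1f", image - base }')

echo "The tools add about $added MB on top of the ubuntu base image"
```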
Verify that Eigensoft and Plink are available in the image:
docker run psytky03/eigandplink plink --help
docker run psytky03/eigandplink eigenstrat
docker run -it psytky03/eigandplink bash
docker ps -a
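Run Plink with data on the host machine
The Plink archive ships with a small toy dataset (toy.ped and toy.map). Download and unpack it on the host: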
wget https://www.cog-genomics.org/static/bin/plink150903/plink_linux_x86_64.zip
unzip plink_linux_x86_64.zip -d plink1.9
Now we use docker run -v to map the folder on the host machine to the data folder in the container:
docker run -v /home/psytky03/plink1.9:/data psytky03/eigandplink \
plink --file data/toy --make-bed --out data/test
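Docker expects the host side of -v to be an absolute path, which is easy to get wrong when working from a relative directory. The sketch below builds the mapping from any directory. volume_arg is a hypothetical helper, and the --rm flag in the usage comment (which removes the container after it exits) is an addition that is not in the command above.

```shell
# volume_arg DIR: print a value for -v that maps DIR (made absolute)
# to /data in the container. Hypothetical helper for illustration.
volume_arg() {
    abs=$(cd "$1" && pwd) || return 1
    printf '%s:/data\n' "$abs"
}

# Show the mapping for the current directory.
volume_arg .

# Usage (requires Docker and the image built above):
#   docker run --rm -v "$(volume_arg ~/plink1.9)" psytky03/eigandplink \
#       plink --file data/toy --make-bed --out data/test
```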
Check the plink1.9 folder; you should see the files test.bed, test.bim, and test.fam.
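The same check can be scripted. This is a minimal sketch with a hypothetical helper, missing_outputs, that lists whichever of the three files is absent; the path is the host folder used above.

```shell
# missing_outputs DIR PREFIX: print each of PREFIX.bed, PREFIX.bim and
# PREFIX.fam that does not exist in DIR. Hypothetical helper for illustration.
missing_outputs() {
    for ext in bed bim fam; do
        [ -f "$1/$2.$ext" ] || printf '%s\n' "$1/$2.$ext"
    done
}

missing=$(missing_outputs /home/psytky03/plink1.9 test)
if [ -z "$missing" ]; then
    echo "test.bed, test.bim and test.fam are all present"
else
    echo "missing files:"
    echo "$missing"
fi
```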
5. Push the image to DockerHub
docker login
docker push psytky03/eigandplink
6. Pull the image on another Linux machine
docker pull psytky03/eigandplink