Containers and Workflow Pipelines for reproducible and automated data analysis

advanced bioinformatics
programming
online
ELIXIR

Location: Online

Start date: 20 May 2021

General context

The first day is dedicated to containers (Docker and Singularity), which are well suited to making code portable and analyses reproducible. You will learn how to use containers, how to build one from scratch and share it with others, and how to re-use and modify existing containers. Most of the day covers Docker in depth; Singularity is introduced at the end of the day.

On the second day, you will learn how to use Nextflow to build scalable and reproducible bioinformatics pipelines and to run them on a personal computer, a cluster or the cloud. Starting from the basic concepts, we will build our own simple pipeline and add new features at every step, all in the new DSL2 syntax.

 

Objectives

Containers

  • Understand the concepts behind Docker and Singularity containers and how the two differ
  • Write a Docker recipe, build a Docker image and run containers from it
  • Pull Docker images from and push them to Docker Hub
  • Dockerfiles and image layers; Docker caching
  • Working with volumes
  • Pull Docker images as Singularity images

Pipelines

  • Understand Nextflow's basic concepts: channels, processes, modules, workflows, etc. 
  • Write and run a Nextflow pipeline 
  • Write and modify config files for storing parameters related to computing hardware as well as pipeline-dependent parameters

Event intended for

Bioinformaticians with little or no knowledge of containers or workflow pipelines.

Required skills

You are familiar with doing bioinformatics on the command line.

Trainers

Alexander Botzki

Alexander Botzki is head of the VIB Bioinformatics Core.

Tuur Muyldermans

Tuur Muyldermans is a bioinformatics trainer at the VIB Bioinformatics Core and ELIXIR Belgium. 

Program

This workshop will be held online over two Thursdays, 20 and 27 May 2021.

 
Introduction to containers

  • History of containers, what are containers and why should we use them? 
  • Containers vs. virtual machines

Docker containers

  • Docker architecture
  • Dockerfiles, Docker image layers, Docker caching
  • Docker Hub: pull images from it and push images to it
  • Build and run a Docker image (interactively) from an existing recipe
  • Docker “recipes”:
    • Understand the main instructions: FROM, RUN, ADD, ENV, COPY
    • Write, build and run a basic recipe
  • Working with volumes and tags

Singularity containers

  • Differences between Singularity and Docker: why and when to use one or the other. Pros and cons.
  • Singularity recipes
  • Pull an image from Docker Hub and run it with Singularity
  • Volumes in Singularity
  • Use a Singularity image interactively

Nextflow workflow pipelines

  • Theoretical introduction to the basics of Nextflow: processes, channels, operators, modules and workflows
  • Run a simple Nextflow pipeline
  • Modify a pipeline and rerun processes
  • Obtain a thorough understanding of config files
  • Write a simple Nextflow pipeline (example: RNA-seq)
  • Include environment managers in Nextflow pipelines (Conda, Docker & Singularity)
