Containers and Workflow Pipelines for reproducible and automated data analysis

advanced bioinformatics
programming
online
ELIXIR
live training

Target Audience: All scientists
Location: Online
Duration: 20 and 27 May

General context

The first day is dedicated to containers (Docker & Singularity), which make code portable and analyses reproducible. You will learn how to use containers, how to build a container from scratch and share it with others, and how to re-use and modify existing containers. Most of the day covers Docker in depth; Singularity is introduced at the end of the day.

On the second day, you will learn how to use Nextflow to build scalable and reproducible bioinformatics pipelines and run them on a personal computer, a cluster or the cloud. Starting from the basic concepts, we will build our own simple pipeline and add new features with every step, all using the DSL2 syntax.

Objectives

Containers

  • Learn what containers are and how Docker and Singularity differ
  • Write a Docker recipe, build a Docker image and run containers from it
  • Pull Docker images from and push them to Docker Hub
  • Dockerfiles and layers; Docker caching
  • Working with volumes
  • Pull Docker images as Singularity images
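
To give a feel for the layers and caching objective, here is a minimal Python sketch of the idea, not Docker's actual implementation. It assumes that each layer's cache key depends only on the previous layer's key and the instruction text; real Docker also takes the contents of copied files into account, among other things. The recipe lines, the package name and the helper names `layer_keys` and `reused_layers` are hypothetical and chosen purely for illustration.

```python
import hashlib


def layer_keys(instructions, base="scratch"):
    """Return one cache key per instruction; each key depends on its parent key."""
    keys = []
    parent = base
    for instruction in instructions:
        digest = hashlib.sha256(f"{parent}\n{instruction}".encode()).hexdigest()[:12]
        keys.append(digest)
        parent = digest  # the next layer is built on top of this one
    return keys


def reused_layers(old, new):
    """Count the leading layers of `new` that could be taken from the cache of `old`."""
    count = 0
    for old_key, new_key in zip(layer_keys(old), layer_keys(new)):
        if old_key != new_key:
            break  # first cache miss: every later layer must be rebuilt too
        count += 1
    return count


# Hypothetical recipe, for illustration only.
recipe_v1 = [
    "FROM ubuntu:20.04",
    "RUN apt-get update && apt-get install -y samtools",
    "ENV LC_ALL=C",
    "COPY script.sh /opt/script.sh",
]

# Same recipe with only the third instruction edited.
recipe_v2 = list(recipe_v1)
recipe_v2[2] = "ENV LC_ALL=C.UTF-8"

print(reused_layers(recipe_v1, recipe_v2))  # prints 2
```

Only the two layers before the edited line are reused; the final COPY layer is rebuilt even though its text is unchanged, because its parent changed. This is why slow, rarely changing steps are usually placed near the top of a recipe.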

Pipelines

  • Understand Nextflow's basic concepts: channels, processes, modules, workflows, etc. 
  • Write and run a Nextflow pipeline 
  • Write and modify config files that store settings for the computing hardware as well as pipeline-specific parameters

Event intended for

Bioinformaticians with little or no knowledge of containers or workflow pipelines.

Required skills

You are comfortable doing bioinformatics on the command line.

Trainers

Alexander Botzki

Alexander Botzki heads the Technology Training unit at VIB, the Flemish Institute of Biotechnology, Belgium. The main mission of this unit is to provide technology training in the domains of VIB Technologies, Bioinformatics & AI, Software Development, and Research Data Management. Between 2014 and 2022, he was head of the VIB Bioinformatics Core. From September 2009 to July 2014, he was responsible for the roll-out of E-Notebook (an electronic lab notebook) to VIB's researchers in 75 research groups.

Before joining VIB, Alexander worked on various computational biology projects for Algonomics (bought by Lonza, 2008-2009) and DevGen (now Syngenta, 2006-2008). During his postdoc at Sanofi Aventis in Strasbourg, he ran various virtual screening campaigns on the compound selection of the merged enterprise. He received his doctoral degree in the group of Prof. Dr. Armin Buschauer (University of Regensburg, Germany) on 'Structure-based design of hyaluronidase inhibitors'.

Program

This workshop will be organised online on Thursday 20 May and Thursday 27 May 2021.

Introduction to containers

  • History of containers, what are containers and why should we use them?
  • Containers vs. virtual machines

Docker Containers

  • Docker architecture
  • Dockerfiles, Docker image layers, Docker caching
  • Docker Hub: pull images from and push images to the registry
  • Build and run a Docker image (interactively) from an existing recipe
  • Docker “recipes”:
    • Understand the main instructions: FROM, RUN, ADD, ENV, COPY
    • Write, build and run a basic recipe
  • Working with volumes and tags

Singularity Containers

  • Differences between Singularity and Docker: why and when to use one or the other. Pros and cons.
  • Singularity recipes
  • Pull and run an image with Singularity from Docker hub
  • Volumes in Singularity
  • Use a Singularity image interactively

Nextflow workflow pipelines

  • The basics of Nextflow: the theory behind processes, channels, operators, modules and workflows
  • Run a simple Nextflow pipeline 
  • Modify a pipeline and rerun processes
  • Obtain a thorough understanding of config files
  • Write a simple Nextflow pipeline (example: RNA-seq)
  • Manage software dependencies in Nextflow pipelines with Conda, Docker and Singularity
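
As a rough preview of the dataflow idea behind channels and processes, the following Python sketch models a channel as a queue and a process as a function that consumes one channel and emits another. This is only an analogy under simplifying assumptions: it is not Nextflow code, it runs strictly in order on one thread (Nextflow runs tasks in parallel and does not guarantee output order by default), and the names `channel_of`, `process`, `collect`, `trim` and `align` are hypothetical.

```python
from queue import Queue

DONE = object()  # sentinel marking the end of a channel


def channel_of(*items):
    """Create a channel (a queue) that emits the given items and then closes."""
    ch = Queue()
    for item in items:
        ch.put(item)
    ch.put(DONE)
    return ch


def process(task, inbox):
    """Apply `task` to every item arriving on `inbox`; emit results on a new channel."""
    outbox = Queue()
    while True:
        item = inbox.get()
        if item is DONE:
            break
        outbox.put(task(item))
    outbox.put(DONE)
    return outbox


def collect(ch):
    """Drain a channel into a list."""
    items = []
    while True:
        item = ch.get()
        if item is DONE:
            return items
        items.append(item)


# Stand-ins for real tools: they only rename the sample.
def trim(sample):
    return f"{sample}.trimmed"


def align(sample):
    return f"{sample}.bam"


def workflow(samples):
    """Chain two processes: the output channel of one is the input of the next."""
    reads = channel_of(*samples)
    trimmed = process(trim, reads)
    aligned = process(align, trimmed)
    return collect(aligned)


print(workflow(["sampleA", "sampleB"]))
# prints ['sampleA.trimmed.bam', 'sampleB.trimmed.bam']
```

The point of the analogy is that processes never call each other directly; they are connected only through the channels that carry data between them, which is what lets a workflow engine schedule them independently.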
