Reproducible data science with Snakemake


Start date: 11 January 2021

General context

The Snakemake workflow management system is a tool for creating reproducible and scalable data analyses. Workflows are described in a human-readable, Python-based language. They scale to server, cluster, grid and cloud environments without any change to the workflow definition. Snakemake workflows can also include a description of the required software, which is deployed automatically to any execution environment.
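To give a feel for the Python-based workflow language, here is a minimal sketch of a plain Python script that writes a tiny demo project to a scratch directory: a Snakefile with two rules and two small input files. The rule names (`all`, `count_lines`), the file paths, the sample names and the helper `write_demo_project` are all hypothetical and chosen only for illustration; they are not course material.

```python
import tempfile
from pathlib import Path

# Hypothetical two-rule workflow. Rule names, paths and sample names are
# illustrative assumptions, not part of the course material.
SNAKEFILE = '''\
SAMPLES = ["a", "b"]

# Target rule: asks for one line-count file per sample.
rule all:
    input:
        expand("results/{sample}.count", sample=SAMPLES)

# Generic rule: the {sample} wildcard is inferred from the requested output.
rule count_lines:
    input:
        "data/{sample}.txt"
    output:
        "results/{sample}.count"
    shell:
        "wc -l < {input} > {output}"
'''


def write_demo_project(root: Path) -> Path:
    """Create a Snakefile and two small input files under root."""
    data_dir = root / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / "a.txt").write_text("one\ntwo\n")
    (data_dir / "b.txt").write_text("one\ntwo\nthree\n")
    snakefile = root / "Snakefile"
    snakefile.write_text(SNAKEFILE)
    return snakefile


if __name__ == "__main__":
    project = Path(tempfile.mkdtemp(prefix="snakemake_demo_"))
    print(write_demo_project(project))
```

Assuming Snakemake is installed, running `snakemake --cores 1` inside the printed directory should build both files under `results/`. Running the same command again should report that there is nothing left to do, because the outputs already exist and are newer than their inputs.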

With over 200k downloads on Bioconda and, on average, more than 5 new citations per week, Snakemake is a widely used and accepted standard for reproducible data science that has powered numerous high-impact publications.

Required skills
  • Basic experience in Python programming
  • A laptop running Linux, macOS, or Windows with WSL



9h00-9h45: Intro

9h45-12h00: Practical session with exercises

12h00-13h00: Lunch break

13h00-13h45: Advanced talk

13h45-17h00: Practical session with exercises