Bioinformatics workflows for life sciences using Snakemake

Ratings 3.27 / 5.00

What You Will Learn!

What Snakemake is
How to install Snakemake
How to build a basic Snakemake workflow
How a Snakemake rule is structured (input, output, shell, script)

Description

This course introduces Snakemake, a modern workflow language that is widely used in academia and industry to build reproducible, legible, portable, interoperable and efficient pipelines in bioinformatics and beyond. The course closely follows the basic bioinformatics workflow described in the official Snakemake tutorial, but it takes a step-by-step approach and examines each feature it introduces in depth. It covers:

- installation

- Snakefile

- rules

- directives: input, output, shell, script

- target files

- creation of a directed acyclic graph
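To make the link between rules, target files and the directed acyclic graph more concrete, here is a toy sketch in plain Python. It is not Snakemake code and not how Snakemake is implemented; it only illustrates the general idea that each rule declares input and output files, and that asking for a target file determines which rules must run and in what order. The rule names, file paths and the `plan` function are hypothetical, chosen for illustration.

```python
# Toy model of resolving a target file into an ordered list of jobs.
# NOT Snakemake's implementation; rule and file names are illustrative only.

RULES = {
    "bwa_map": {
        "input": ["data/genome.fa", "data/samples/A.fastq"],
        "output": ["mapped_reads/A.bam"],
    },
    "samtools_sort": {
        "input": ["mapped_reads/A.bam"],
        "output": ["sorted_reads/A.bam"],
    },
    "call_variants": {
        "input": ["data/genome.fa", "sorted_reads/A.bam"],
        "output": ["calls/all.vcf"],
    },
}


def plan(target, rules, existing):
    """Return the rule names, in execution order, needed to produce `target`.

    `existing` is the set of files assumed to be present already.
    """
    # Map every output file to the rule that produces it.
    producers = {out: name for name, rule in rules.items() for out in rule["output"]}
    order = []        # rules in the order they must run
    visiting = set()  # rules on the current path, used to detect cycles

    def visit(path):
        if path in existing:
            return  # nothing to do, the file is already there
        if path not in producers:
            raise FileNotFoundError(f"no rule produces {path!r}")
        name = producers[path]
        if name in order:
            return  # this rule is already scheduled
        if name in visiting:
            raise ValueError(f"cycle detected at rule {name!r}")
        visiting.add(name)
        for dep in rules[name]["input"]:
            visit(dep)  # schedule upstream rules first
        visiting.discard(name)
        order.append(name)

    visit(target)
    return order


if __name__ == "__main__":
    source_files = {"data/genome.fa", "data/samples/A.fastq"}
    print(plan("calls/all.vcf", RULES, source_files))
    # prints ['bwa_map', 'samtools_sort', 'call_variants']
```

The sketch walks backwards from the requested target to the files that already exist, which is why only the final file needs to be named. A real workflow engine does considerably more, for example pattern matching on file names and checking whether existing outputs are up to date.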

This course does not cover:

- benchmarking

- conda directive

- Snakemake profiles for cluster computers

- temporary files

- parameters

- resources

By the end of this course, you will be able to build a basic bioinformatics pipeline with Snakemake. That is enough to make a practical difference in your day-to-day work as a bioinformatician, and it will prepare you for my advanced course on Snakemake.

The course is primarily intended for bioinformaticians, but it can also be useful for people from other fields who want to build pipelines.

The course can also serve as an introduction to the field of bioinformatics. It uses concepts such as reads, alignment, BAM files, VCF files and variant calls. Note, however, that I do not spend much time explaining these concepts and focus primarily on the Snakemake language.