Introduction

Last updated on 2023-03-22 | Edit this page

Overview

Questions

  • Why not use a general purpose language like bash or python for writing workflows?
  • What problems do workflow languages solve?
  • What are the strengths and weaknesses of Snakemake?
  • What will be covered in this workshop?

Objectives

  • Understand why one would use a workflow language like Snakemake
  • Understand the goals of the workshop

Workflow Challenges


Using a general purpose language like bash or python can be challenging:

  • Not re-running the whole pipeline every time
  • Adapting the pipeline to run in different environments
  • Providing dependencies for tools
  • Tracking progress of the workflow

Workflow language features


  • Portable
  • Reproducible
  • Scalable
  • Reusable

List of workflow languages: https://workflows.community/systems

Strengths of Snakemake


  • Readability
  • Only creates missing or out of date files
  • Flexible control over which files are created
  • Is python with some additional rule syntax
  • Dynamic branching
    • Workflow isn’t fixed at start up.
    • Outputs of commands can be used determined what happens next.

Weaknesses of Snakemake


  • Requires learning rule based logic instead of procedural logic
  • Requires some python code/knowledge for typical workflows

Class Plan


  • Create Snakemake Workflow that
    • Runs R scripts for filtering and final analysis
    • Runs Machine Learning Components
    • Efficiently processes many files at the same time
    • Reuses an existing Snakemake workflow