Project Setup

Last updated on 2023-03-21

Estimated time: 6 minutes

Overview

Questions

  • How do I create a directory with the necessary files for this class?

Objectives

  • Connect to OSC
  • Create a directory
  • Open a Web Terminal in that Directory
  • Copy lesson files into place
  • Copy cached singularity images into place

Log into OSC


  • Visit https://ondemand.osc.edu/.
  • Log in with your credentials.
  • From the top menu select Files -> Home Directory
  • Click the New Directory button
    • Enter SnakemakeWorkflow for the directory name
  • Click “SnakemakeWorkflow” in the Name column of the list of files and folders
  • Click “Open in Terminal”
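
If you already have a shell open on OSC, the directory steps above can be approximated from the command line. This is a minimal sketch, not part of the lesson's click-through instructions; it assumes the directory should live directly under your home directory, as the steps above describe.

```shell
# Create the lesson directory under your home directory
# (-p means no error is raised if it already exists)
mkdir -p "$HOME/SnakemakeWorkflow"

# Move into it so that later commands run in the right place
cd "$HOME/SnakemakeWorkflow"
```

Either route should leave you with a terminal whose working directory is SnakemakeWorkflow.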

Copy Lesson Files


Some files are provided for this lesson. These files need to be copied into the “SnakemakeWorkflow” subdirectory within your home directory. You should be within the SnakemakeWorkflow directory before running this step.

Verify Directory

Run the following command to ensure you are in the appropriate directory:

BASH

pwd

The output should look similar to the following:

OUTPUT

/users/PAS2136/jbradley/SnakemakeWorkflow
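
The check above can also be scripted. The following is a minimal sketch: the function name `in_workflow_dir` is hypothetical (it is not provided by the lesson files), and it only compares the last component of the working directory against SnakemakeWorkflow, since the leading part of the path differs per user.

```shell
# Hypothetical helper: succeeds only when the current directory
# is named SnakemakeWorkflow.
in_workflow_dir() {
    # basename strips the leading path, leaving the final directory name
    [ "$(basename "$PWD")" = "SnakemakeWorkflow" ]
}

if in_workflow_dir; then
    echo "OK: in SnakemakeWorkflow"
else
    echo "Not in SnakemakeWorkflow; currently in $PWD"
fi
```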

Copy Files

Run the following command to copy these files into your current directory (SnakemakeWorkflow):

BASH

cp -r /fs/ess/PAS2136/Workshops/Snakemake/files/* .

Next run the ls command to ensure you have all the needed files:

BASH

ls

Expected Output:

OUTPUT

multimedia.csv  run-workflow.sh  Scripts  slurm
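
Rather than comparing the listing by eye, the four expected entries can be checked in a loop. This is a sketch under stated assumptions: `check_lesson_files` is a hypothetical helper name, and the list of names is taken from the expected output above.

```shell
# Hypothetical helper: report any expected lesson entry missing from a directory.
# Usage: check_lesson_files [directory]   (defaults to the current directory)
check_lesson_files() {
    dir="${1:-.}"
    missing=0
    for name in multimedia.csv run-workflow.sh Scripts slurm; do
        # -e is true for both files and directories
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    # Return 0 when everything was found, 1 otherwise
    return "$missing"
}

if check_lesson_files .; then
    echo "All lesson files are present"
fi
```

If any name is reported as missing, re-running the copy command from within SnakemakeWorkflow is a reasonable first step.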

Lesson Files

  • multimedia.csv - Main input file of fish images used by the workflow
  • Scripts/
    • setup_env.sh - Activates the snakemake conda environment and other utilities
    • FilterImagesHardCoded.R - R script that filters a CSV for a target species with hard-coded filenames
    • FilterImages.R - R script that filters a CSV for a target species
    • SummaryReport.R - R script that builds a summary report of the workflow outputs
    • Summary.Rmd - R markdown script used by SummaryReport.R to create a report
  • run-workflow.sh - sbatch script used to run the workflow using SLURM
  • slurm/ - Directory containing a config file used by Snakemake to run SLURM jobs

The main input file (multimedia.csv) was downloaded from https://bgnn.tulane.edu.

Setup Snakemake Singularity Cache


To avoid waiting for singularity containers to pull, we will copy cached containers into place.

BASH

mkdir -p .snakemake/singularity
cp /fs/ess/PAS2136/Workshops/Snakemake/singularity_images/* .snakemake/singularity/.
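
To confirm the copy did something, you can count the files that landed in the cache directory. This is a minimal sketch: `count_cached_images` is a hypothetical helper, and it makes no assumption about how many images the workshop provides or what they are named.

```shell
# Hypothetical helper: count the regular files in a Snakemake singularity cache.
# Usage: count_cached_images [directory]   (defaults to .snakemake/singularity)
count_cached_images() {
    cache_dir="${1:-.snakemake/singularity}"
    count=0
    for f in "$cache_dir"/*; do
        # Skip the unexpanded pattern (empty or missing directory) and subdirectories
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

echo "Cached images found: $(count_cached_images)"
```

A count of 0 suggests the copy did not run from within SnakemakeWorkflow or the source path was not readable.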