How to run a Snakemake workflow

This guide describes how to fetch and run Snakemake workflows registered as FAIR Digital Objects (FDOs) on the MaRDI portal. The process is split into two main parts:

  • System preparation - a one-time setup of Mamba/Conda and Snakemake on Windows, Linux, or macOS
  • Executing a workflow - downloading an RO-Crate from the portal and running the workflow via Snakemake

Why Conda? This approach relies on Conda environment management to ensure reproducibility across platforms without needing Docker or virtualization. The recommended distribution is Miniforge: it defaults to the community-maintained conda-forge channel, ships with Mamba (a significantly faster dependency resolver) out of the box, and carries no licensing restrictions.

When running a workflow, the --use-conda flag instructs Snakemake to read the environment.yaml files bundled in the RO-Crate and let Mamba create isolated environments with exact dependency versions built for your OS. As a result, the workflow should behave the same on Windows, Linux, and macOS.
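
To make this concrete, the sketch below writes a hypothetical rule and the environment file it points to into a temporary directory. The file names, the rule name, and the pinned package are illustrative assumptions and are not taken from any particular RO-Crate.

```shell
# Hypothetical layout; real RO-Crates may name and place these files differently.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/envs"

# A conda environment file pinning an exact version (package chosen for illustration).
cat > "$demo_dir/envs/environment.yaml" <<'EOF'
channels:
  - conda-forge
dependencies:
  - numpy=1.26.4
EOF

# A Snakemake rule that refers to that file through its `conda:` directive.
cat > "$demo_dir/my_workflow.smk" <<'EOF'
rule compute:
    output:
        "result.txt"
    conda:
        "envs/environment.yaml"
    shell:
        "python -c 'import numpy; print(numpy.__version__)' > {output}"
EOF

# Show the two files that were written.
ls -R "$demo_dir"
```

With --use-conda, Snakemake would build the environment described in envs/environment.yaml before running the rule's shell command inside it.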


Step 1: Install Snakemake - WINDOWS

Ensure the following are installed before proceeding:

  • Mamba/Conda: Install Miniforge (recommended):
    • Download from conda-forge.github.io/miniforge
    • Run the .exe installer and select "Add to PATH" during installation
    • Verify: Open PowerShell or CMD and run mamba --version
  • Python: Comes bundled with Miniforge (Python 3.9+ recommended).
    • Verify: python --version
  • Snakemake: Install the Snakemake engine in a dedicated environment:

mamba create -n snakemake -c conda-forge -c bioconda snakemake
mamba activate snakemake

Step 1: Install Snakemake - LINUX

Ensure the following are installed before proceeding:

  • Mamba/Conda: Install Miniforge (recommended):

# Installer for x86_64 machines; on ARM systems use Miniforge3-Linux-aarch64.sh instead
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh
# Follow the prompts, then reload the shell:
source ~/.bashrc

    • Verify: mamba --version
  • Python: Comes bundled with Miniforge (Python 3.9+ recommended).
    • Verify: python --version
  • Snakemake: Install the Snakemake engine in a dedicated environment:

mamba create -n snakemake -c conda-forge -c bioconda snakemake
mamba activate snakemake

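
On Linux and macOS, the installer name can also be derived from the running system instead of being typed by hand. The sketch below only prints the resulting URL. It assumes the release assets follow the Miniforge3-<OS>-<arch>.sh naming pattern, where <OS> and <arch> are whatever uname reports on your machine.

```shell
# Build the installer file name from the OS name and CPU architecture.
installer="Miniforge3-$(uname)-$(uname -m).sh"
url="https://github.com/conda-forge/miniforge/releases/latest/download/${installer}"

echo "Installer for this machine: ${url}"

# To download and run it, uncomment the next line:
# wget "${url}" && bash "${installer}"
```

Check the printed file name against the Miniforge release page before downloading, since the naming pattern is an assumption here.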
Step 1: Install Snakemake - MACOS

Ensure the following are installed before proceeding:

  • Mamba/Conda: Install Miniforge (recommended):
    • For Intel Macs: Download Miniforge3-MacOSX-x86_64.sh
    • For Apple Silicon (M1/M2/M3): Download Miniforge3-MacOSX-arm64.sh
    • Both available at conda-forge.github.io/miniforge
    • Run: bash Miniforge3-MacOSX-*.sh and follow prompts
    • Reload shell: source ~/.zshrc (or ~/.bash_profile for older macOS)
    • Verify: mamba --version
  • Python: Comes bundled with Miniforge (Python 3.9+ recommended).
    • Verify: python --version
  • Snakemake: Install the Snakemake engine in a dedicated environment:

mamba create -n snakemake -c conda-forge -c bioconda snakemake
mamba activate snakemake

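
On Linux or macOS, the separate Verify steps can be bundled into a single check. This is a minimal sketch; check_tool is a hypothetical helper name and is not part of Mamba or Snakemake.

```shell
# check_tool is a hypothetical helper: reports whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: MISSING"
    return 1
  fi
}

# Check everything Step 1 installs; "|| true" keeps the loop going after a miss.
for tool in mamba python snakemake; do
  check_tool "$tool" || true
done
```

The snakemake command is only expected to be found while the snakemake environment is activated.
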
Step 2: Download RO-Crate

The workflow is distributed as an RO-Crate archive, downloadable from the MaRDI portal.

  • Navigate to the workflow page (click here for an example workflow) and click the Download ROCrate button.
  • Create a new empty directory for the workflow
  • Extract the downloaded .zip archive into that new directory

All subsequent steps must be run from inside this new directory.
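
On Linux or macOS, the extraction steps above might look like the following sketch. The archive name rocrate.zip and the directory name my_workflow are placeholders, extract_crate is a hypothetical helper, and the unzip tool is assumed to be installed.

```shell
# extract_crate is a hypothetical helper: unpack an RO-Crate zip into a directory.
extract_crate() {
  archive="$1"
  target="$2"
  # Stop early with a clear message if the download is not where we expect it.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # Create the target directory if needed and unpack the archive into it.
  mkdir -p "$target" && unzip -q "$archive" -d "$target"
}

# Example with placeholder names, then move into the directory for Step 3:
# extract_crate ~/Downloads/rocrate.zip my_workflow && cd my_workflow
```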

Step 3: Run workflow

Make sure you are inside the new workflow directory created in Step 2 and that the snakemake environment is activated:

mamba activate snakemake

Then run the workflow, replacing my_workflow.smk with the actual .smk filename found in your extracted directory:

snakemake -s my_workflow.smk --cores 1 --use-conda --verbose
  • --cores 1: number of CPU cores to use; increase it to let independent jobs run in parallel (e.g. --cores 4)
  • --use-conda: automatically installs and uses per-rule conda environments defined in the workflow
  • --verbose: prints detailed logging output
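
Before a full run, it can help to locate the .smk file and preview the planned jobs. The sketch below assumes a POSIX shell on Linux or macOS; find_workflow is a hypothetical helper, and --dry-run asks Snakemake to list the jobs without executing them.

```shell
# find_workflow is a hypothetical helper: prints the first .smk file found
# in the given directory (default: current directory), up to one level down.
find_workflow() {
  find "${1:-.}" -maxdepth 2 -name '*.smk' | sort | head -n 1
}

workflow="$(find_workflow .)"

if [ -n "$workflow" ] && command -v snakemake >/dev/null 2>&1; then
  # Preview the jobs; drop --dry-run to execute the workflow for real.
  snakemake -s "$workflow" --cores 1 --use-conda --dry-run
else
  echo "No .smk file found here, or snakemake is not on PATH"
fi
```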