Snakedeploy¶
Snakedeploy is a command-line tool and Python library for deploying Snakemake pipelines from version control hosts such as GitHub. To learn more about Snakemake, visit the official documentation.
Support¶
- In case of questions, please post on Stack Overflow.
- For bugs and feature requests, please use the issue tracker.
- For contributions, visit snakedeploy on GitHub.
Installation¶
Snakedeploy can be installed from PyPI or from source.
Install via pip¶
The released package can be installed from PyPI with pip:
$ pip install snakedeploy
Once it’s installed, you should be able to inspect the client with:
$ snakedeploy --help
Install from source¶
$ git clone git@github.com:snakemake/snakedeploy.git
$ cd snakedeploy
$ pip install .
Alternatively, for an editable (development) install, replace the last command with:
$ pip install -e .
Tools¶
Snakedeploy provides several command line tools for preparing or otherwise interacting with your data or workflow.
Deploying workflows¶
Snakedeploy enables you to automatically deploy a workflow from a public git repository to your local machine, by using Snakemake’s module system.
Command Line¶
Via the command line, deployment works as follows:
$ snakedeploy deploy-workflow https://github.com/snakemake-workflows/dna-seq-varlociraptor /tmp/dest --tag v1.0.0
Snakedeploy will then generate a workflow definition, /tmp/dest/workflow/Snakefile, that declares the given workflow as a module.
For the example above, it will have the following content:
configfile: "config/config.yaml"

# declare https://github.com/snakemake-workflows/dna-seq-varlociraptor as a module
module dna_seq:
    snakefile:
        "https://github.com/snakemake-workflows/dna-seq-varlociraptor/raw/v1.0.0/workflow/Snakefile"
    config:
        config

# use all rules from https://github.com/snakemake-workflows/dna-seq-varlociraptor
use rule * from dna_seq
In addition, it will copy over the contents of the config directory of the given repository into /tmp/dest/config.
These files should be seen as a template and can be modified according to your needs.
Further, the workflow definition Snakefile can be arbitrarily extended and modified, thereby making any changes to the used workflow transparent (also see the Snakemake module documentation).
It is highly advisable to put the deployed workflow into a new (perhaps private) git repository (e.g., see here for instructions on how to do that with GitHub).
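To make the structure of the generated workflow definition concrete, the following minimal sketch renders the same kind of text from a repository URL, a module name, and a tag. The helper `render_module_snakefile` is hypothetical and not part of Snakedeploy, and the raw-file URL layout (`<repo>/raw/<tag>/workflow/Snakefile`) is an assumption based on the example in this section.

```python
def render_module_snakefile(repo_url: str, name: str, tag: str) -> str:
    """Render a Snakefile that declares repo_url as a Snakemake module.

    Hypothetical helper for illustration; not Snakedeploy's implementation.
    """
    repo_url = repo_url.rstrip("/")
    # Assumption: the workflow's Snakefile is reachable at
    # <repo>/raw/<tag>/workflow/Snakefile.
    snakefile_url = f"{repo_url}/raw/{tag}/workflow/Snakefile"
    return (
        'configfile: "config/config.yaml"\n'
        "\n"
        f"# declare {repo_url} as a module\n"
        f"module {name}:\n"
        "    snakefile:\n"
        f'        "{snakefile_url}"\n'
        "    config:\n"
        "        config\n"
        "\n"
        f"# use all rules from {repo_url}\n"
        f"use rule * from {name}\n"
    )


if __name__ == "__main__":
    print(
        render_module_snakefile(
            "https://github.com/snakemake-workflows/dna-seq-varlociraptor",
            name="dna_seq",
            tag="v1.0.0",
        )
    )
```

Because the result is an ordinary Snakefile, further rules or overrides can be appended after the `use rule` statement without touching the upstream workflow.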
Python¶
The same deployment can be performed from within Python:
>>> from snakedeploy.deploy import deploy
>>> deploy("https://github.com/snakemake-workflows/dna-seq-varlociraptor", dest_path="/tmp/dest", name="dna_seq", tag="v1.0.0", force=True)
Also see The Snakedeploy API for details.
Collecting Files¶
In addition to deploying workflows, Snakedeploy helps with generating sample/unit sheets from files on the filesystem.
These can then be used to configure a Snakemake workflow.
Let’s say that we have a tab-separated sheet of inputs called unit-patterns.tsv:
S743_Nr(?P<nr>[0-9]+) S743_1/01_fastq/S743Nr{nr}.*.fastq.gz
S839_Nr(?P<nr>[0-9]+) S839_*/01_fastq/S839Nr{nr}.*.fastq.gz
S888_Nr(?P<nr>[0-9]+) S888/S888_1/01_fastq/S888Nr{nr}.*.fastq.gz
Suppose we also have a sample sheet, samples.tsv, whose first column contains the sample ids. To collect the files on disk that match each sample's glob pattern and print them to STDOUT (along with the sample id), we can run the following command, where tail -n+2 skips the header line of the sample sheet:
$ cut -f1 samples.tsv | tail -n+2 | snakedeploy collect-files --config unit-patterns.tsv
More specifically, each row of the config sheet pairs a regular expression for sample ids with a glob pattern. For every sample id passed in on STDIN, the glob pattern of the matching row is selected, with captured groups such as {nr} filled in, and used to find the corresponding files on disk. Each file is then printed to STDOUT, tab separated, along with the sample id. This allows us to obtain the paths to the raw data of the given samples.
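The matching logic described above can be sketched in a few lines of Python. This is an illustrative approximation rather than Snakedeploy's actual implementation: the functions `load_patterns` and `collect_files` are hypothetical, and the sketch assumes that a sample id must match the regular expression in full and that only the first matching row is used.

```python
import csv
import glob
import re


def load_patterns(config_path):
    """Read (compiled regex, glob template) pairs from a two-column TSV sheet."""
    with open(config_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [(re.compile(row[0]), row[1]) for row in reader if len(row) >= 2]


def collect_files(sample_ids, patterns):
    """Yield (sample_id, path) for every file matching the sample's glob."""
    for sample_id in sample_ids:
        for regex, template in patterns:
            match = regex.fullmatch(sample_id)
            if match is None:
                continue
            # Fill named groups such as {nr} into the glob template.
            pattern = template.format(**match.groupdict())
            for path in sorted(glob.glob(pattern)):
                yield sample_id, path
            break  # assumption: only the first matching row is used
```

Printing each pair as `f"{sample_id}\t{path}"` would reproduce the tab-separated output format described in this section.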
The Snakedeploy API¶
These sections detail the internal functions for Snakedeploy.
snakedeploy.deploy module¶
snakedeploy.collect_files module¶
Internal API¶
These pages document the entire internal API of Snakedeploy.
snakedeploy package¶
Submodules¶
snakedeploy.client module¶
snakedeploy.providers module¶
snakedeploy.logger module¶
class snakedeploy.logger.ColorizingStreamHandler(nocolor=False, stream=<_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>)
Bases: logging.StreamHandler
- BLACK = 0
- BLUE = 4
- BOLD_SEQ = '\x1b[1m'
- COLOR_SEQ = '\x1b[%dm'
- CYAN = 6
- GREEN = 2
- MAGENTA = 5
- RED = 1
- RESET_SEQ = '\x1b[0m'
- WHITE = 7
- YELLOW = 3
- colors = {'CRITICAL': 1, 'DEBUG': 4, 'ERROR': 1, 'INFO': 2, 'WARNING': 3}
- emit(record)
  Emit a record.
  If a formatter is specified, it is used to format the record. The record is then written to the stream with a trailing newline. If exception information is present, it is formatted using traceback.print_exception and appended to the stream. If the stream has an ‘encoding’ attribute, it is used to determine how to do the output to the stream.
- property is_tty
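As a rough illustration of how the class attributes listed above could combine, the sketch below wraps a message in ANSI escape sequences chosen by log level. The function `colorize` is hypothetical and not part of Snakedeploy. Adding 30 to a color index yields the standard ANSI foreground code, but whether the handler composes its output in exactly this way is an assumption.

```python
# Values copied from the ColorizingStreamHandler attribute listing.
COLOR_SEQ = "\x1b[%dm"
RESET_SEQ = "\x1b[0m"
COLORS = {"CRITICAL": 1, "DEBUG": 4, "ERROR": 1, "INFO": 2, "WARNING": 3}


def colorize(levelname: str, message: str, nocolor: bool = False) -> str:
    """Wrap message in the ANSI color associated with levelname.

    Hypothetical helper for illustration only.
    """
    if nocolor or levelname not in COLORS:
        return message
    # ANSI foreground colors occupy codes 30-37, so the index is offset by 30.
    return COLOR_SEQ % (30 + COLORS[levelname]) + message + RESET_SEQ


if __name__ == "__main__":
    print(colorize("WARNING", "config file not found"))
```

Returning the message unchanged when `nocolor` is set mirrors the constructor's `nocolor` parameter, which presumably exists for output that is not a terminal.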