16SMaRT is a bioinformatics analysis pipeline for 16S rRNA gene sequencing data. It is a “one-click” solution for microbial community analysis of amplicon sequencing data and aims to be your go-to solution for your next microbiome/metagenomics project. The primary objective of a 16SMaRT analysis is to determine which microorganisms are present, and in what proportions, across a range of samples. It currently supports single-end and paired-end Illumina MiSeq data.
16SMaRT is written in Python using boilpy’s data-pipeline boilerplate. Because it is built on top of a considerable number of dependencies, the recommended way to install it is with Docker, which makes installation “one-click” and results reproducible. 16SMaRT is designed to make full use of the available computational resources, so it runs quickly even on a local machine for a moderate number of samples. For a large number of studies, it is recommended to run 16SMaRT on a High-Performance Computing (HPC) system using Singularity.
Table of Contents
Features
Quick Start
Using Docker
First, install Docker on your system (see Docker’s documentation for instructions).
Then, run 16SMaRT with the following command:
docker run \
    --rm -it \
    -v "<HOST_MACHINE_PATH_DATA>:/data" \
    -v "<HOST_MACHINE_PATH_CONFIG>:/root/.config/s3mart" \
    -v "<HOST_MACHINE_PATH_WORKSPACE>:/work" \
    ghcr.io/achillesrasquinha/s3mart \
    bpyutils --run-ml s3mart -p "data_dir=/data" --verbose
where <HOST_MACHINE_PATH_DATA> is the path on your host machine where pipeline data is stored, <HOST_MACHINE_PATH_CONFIG> is the path where 16SMaRT stores its configuration and intermediate data, and <HOST_MACHINE_PATH_WORKSPACE> is a workspace directory for files of your own that 16SMaRT can use (e.g. input files).
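If you prefer to launch the container from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of 16SMaRT itself: the function name `build_docker_command` and the example host paths are hypothetical, while the image name, mount targets, and pipeline arguments are copied from the command above. Host paths are converted to absolute paths because Docker treats a relative `-v` source as a named volume rather than a bind mount.

```python
import os
import shlex

# Image name taken from the docker command in this README.
IMAGE = "ghcr.io/achillesrasquinha/s3mart"


def build_docker_command(data_dir, config_dir, work_dir):
    """Return the `docker run` argument list for the three host directories.

    Hypothetical helper for illustration; it only builds the command and
    does not execute anything.
    """
    # (host path, container path) pairs, as used in the README command.
    mounts = [
        (data_dir, "/data"),
        (config_dir, "/root/.config/s3mart"),
        (work_dir, "/work"),
    ]
    cmd = ["docker", "run", "--rm", "-it"]
    for host_path, container_path in mounts:
        # Bind mounts need an absolute host path.
        cmd += ["-v", f"{os.path.abspath(host_path)}:{container_path}"]
    cmd += [IMAGE, "bpyutils", "--run-ml", "s3mart",
            "-p", "data_dir=/data", "--verbose"]
    return cmd


if __name__ == "__main__":
    # Example host directories (assumptions; replace with your own).
    command = build_docker_command("./data", "./config", "./work")
    # Print a copy-pasteable shell command; pass `command` to
    # subprocess.run(command, check=True) to actually start the container.
    print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting problems when a host path contains spaces.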
Running on HPC systems using Singularity
Singularity is the most widely used container system for HPC (High-Performance Computing) systems. To run your analysis on an HPC system, run the following command:
singularity run \
    --home $HOME \
    --cleanenv \
    -B <HOST_MACHINE_PATH_DATA>:/data \
    -B <HOST_MACHINE_PATH_CONFIG>:/root/.config/s3mart \
    -B <HOST_MACHINE_PATH_WORKSPACE>:/work \
    oras://ghcr.io/achillesrasquinha/s3mart:singularity \
    bpyutils --run-ml s3mart -p "data_dir=/data" --verbose
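On a cluster, the Singularity command is often generated by a job script rather than typed by hand. The sketch below shows one way to build it in Python. It is an illustration under assumptions: the helper name `build_singularity_command` and the example paths are hypothetical, and how the command is submitted (for example through your site's job scheduler) is left out because it depends on the cluster. The image URI, bind targets, and flags are copied from the command above.

```python
import os
import shlex

# Image URI taken from the singularity command in this README.
IMAGE = "oras://ghcr.io/achillesrasquinha/s3mart:singularity"


def build_singularity_command(data_dir, config_dir, work_dir, home=None):
    """Return the `singularity run` argument list.

    Hypothetical helper for illustration; it only builds the command.
    """
    # Default to the current user's home directory, mirroring `--home $HOME`.
    home = home or os.path.expanduser("~")
    # (host path, container path) pairs, as used in the README command.
    binds = [
        (data_dir, "/data"),
        (config_dir, "/root/.config/s3mart"),
        (work_dir, "/work"),
    ]
    cmd = ["singularity", "run", "--home", home, "--cleanenv"]
    for host_path, container_path in binds:
        cmd += ["-B", f"{os.path.abspath(host_path)}:{container_path}"]
    cmd += [IMAGE, "bpyutils", "--run-ml", "s3mart",
            "-p", "data_dir=/data", "--verbose"]
    return cmd


if __name__ == "__main__":
    # Example directories (assumptions; replace with paths on your cluster).
    command = build_singularity_command("./data", "./config", "./work")
    # Print the command so it can be pasted into a job script.
    print(shlex.join(command))
```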
Usage
Basic Usage
- Path to an input CSV file, a data directory of FASTQ files, or a URL to a CSV file.
- Run FastQC after downloading SRAs (boolean, default: True).
- Run MultiQC after performing FastQC (boolean, default: True).
Check out the docs page to understand how to use this pipeline.
Support
Have any queries? Post an issue on the GitHub Issue Tracker.
Citation
If you use this software in your work, please cite it using the following:
Furbeck, R., & Rasquinha, A. (2021). 16SMaRT – 16s rRNA Sequencing Meta-analysis Reconstruction Tool. (Version 0.1.0) [Computer software]. github.com/achillesrasquinha/16SMaRT
A comprehensive list of references for the tools used is listed here.
License
This repository has been released under the MIT License.