yanhui09/nart

A tool for Nanopore Amplicon Real-Time (NART) analysis


NART is designed for mapping-based Nanopore Amplicon (Real-Time) analysis, e.g., of the 16S rRNA gene. The NART utilities consist of nart (Nanopore Amplicon Real-Time entry) and nawf (Nanopore Amplicon snakemake WorkFlow entry) in one Python package. NART provides a (real-time) end-to-end solution from basecalled reads to the final count matrix through a mapping-based strategy.

Important: NART is under development and is released here as a preview. NART has only been tested on Linux systems, specifically Ubuntu.

Demo video on YouTube

DAG workflow

nawf provides three options (emu, minimap2lca and blast2lca) to determine microbial composition.

Docker image

The easiest way to use NART is to pull the docker image from Docker Hub for cross-platform support.

docker pull yanhui09/nart 

To use the Docker image, you need to mount your data directory (e.g., the current directory, pwd) to /home in the container.

docker run -it -v `pwd`:/home --network host --privileged yanhui09/nart 

Note: --network host is required for nart monitor to work.

The host networking driver only works on Linux hosts, and is not supported on Docker Desktop for Mac, Docker Desktop for Windows, or Docker EE for Windows Server. [Read more]
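Because host networking is Linux-only, a launcher script could check the host OS before adding --network host. The following is a minimal sketch of that idea; the function name nart_docker_cmd is made up for this example and is not part of NART, and the command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NART): print a docker command suited to the host.
# The kernel name can be passed as $1; it defaults to the output of `uname -s`.
nart_docker_cmd() {
    os="${1:-$(uname -s)}"
    net=""
    if [ "$os" = "Linux" ]; then
        # The host network driver is only available on Linux hosts.
        net=" --network host"
    else
        echo "warning: no host networking on $os; nart monitor needs it" >&2
    fi
    echo "docker run -it -v $(pwd):/home$net --privileged yanhui09/nart"
}

nart_docker_cmd   # prints the command for this machine without running it
```

On a non-Linux host the printed command omits --network host, so nart monitor should not be expected to work inside that container.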

Installation from GitHub repository

Conda is the only required dependency prior to installation. Miniconda is enough for the whole pipeline.

  1. Clone the GitHub repository and create an isolated conda environment
git clone https://github.com/yanhui09/nart.git
cd nart
conda env create -n nart -f env.yaml

You can speed up the whole process if mamba is installed.

mamba env create -n nart -f env.yaml 
  2. Install NART with pip

To avoid inconsistency, we suggest installing NART in the above conda environment.

conda activate nart
pip install --editable .
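For the environment-creation step, a setup script could choose mamba automatically when it is available and fall back to conda otherwise. This is only a sketch; pick_env_tool is a name invented for the example, and the command is printed instead of run.

```shell
#!/usr/bin/env bash
# Sketch: prefer mamba when it is on PATH, otherwise use conda.
# pick_env_tool is a name made up for this example.
pick_env_tool() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    else
        echo conda
    fi
}

# Print (rather than run) the environment-creation command.
echo "$(pick_env_tool) env create -n nart -f env.yaml"
```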

At the moment, NART uses guppy or minibar for custom barcode demultiplexing (in our lab).
Remember to prepare the barcoding files for guppy or minibar if new barcodes are introduced.

Quick start

Remember to activate the conda environment if NART is installed in a conda environment.

conda activate nart 

Amplicon analysis in single batch

nawf can be used to profile any single basecalled fastq file from a Nanopore run or batch.

nawf config -b /path/to/single_basecall_fastq -d /path/to/database # init config file and check
nawf run all # start analysis
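The two commands above can be wrapped in a small script that adds a dry run (the -n, --dryrun option of nawf run) before the real run and stops at the first failure. The sketch below is an illustration, not part of NART: FASTQ, DB and PREVIEW_ONLY are variables invented here, and with the default PREVIEW_ONLY=1 the commands are only echoed.

```shell
#!/usr/bin/env bash
# Placeholders for this example; override them in the environment.
FASTQ="${FASTQ:-/path/to/single_basecall_fastq}"
DB="${DB:-/path/to/database}"
PREVIEW_ONLY="${PREVIEW_ONLY:-1}"   # hypothetical switch: 1 = only print commands

# Echo a command; execute it too unless PREVIEW_ONLY=1.
step() {
    echo "+ $*"
    if [ "$PREVIEW_ONLY" != "1" ]; then "$@"; fi
}

# Configure, dry-run, then run; && stops the chain at the first failure.
single_batch() {
    step nawf config -b "$FASTQ" -d "$DB" &&
        step nawf run -n all &&
        step nawf run all
}

single_batch
```

Setting PREVIEW_ONLY=0 executes the three steps in order, which requires nawf to be installed and on PATH.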

Real-time analysis

nart provides utilities to record, process and profile continuously generated fastq batches.

Before starting real-time analysis, you need nawf to configure the workflow according to your needs.

nawf config -d /path/to/database # init config file and check 

In most cases, you need three independent sessions to handle monitoring, processing and visualization, respectively.

  1. Monitor the basecall output and record new fastq files
nart monitor -q /path/to/basecall_fastq_dir # monitor basecall output
  2. Start amplicon analysis for new fastq
nart run -t 10 # real-time process in batches
  3. Update the feature table for interactive visualization in the browser
nart visual # interactive visualization
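One way to get the three independent sessions above is a terminal multiplexer. The sketch below assumes tmux is installed (NART does not require it); the session name "nart" and the QUERY_DIR variable are choices made for this example. If tmux or nart is missing, the script only lists the commands.

```shell
#!/usr/bin/env bash
# Sketch: one tmux window per NART session. tmux is an assumption of this
# example; the session name "nart" and QUERY_DIR are placeholders.
QUERY_DIR="${QUERY_DIR:-/path/to/basecall_fastq_dir}"

# List the three commands, one per line.
nart_session_cmds() {
    echo "nart monitor -q $QUERY_DIR"
    echo "nart run -t 10"
    echo "nart visual"
}

# Start a detached tmux session with one window per command.
start_nart_tmux() {
    tmux new-session -d -s nart -n monitor "nart monitor -q '$QUERY_DIR'"
    tmux new-window -t nart: -n run "nart run -t 10"
    tmux new-window -t nart: -n visual "nart visual"
    echo "attach with: tmux attach -t nart"
}

# Only launch when both tools are available; otherwise just list the commands.
if command -v tmux >/dev/null 2>&1 && command -v nart >/dev/null 2>&1; then
    start_nart_tmux
else
    nart_session_cmds
fi
```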

Usage

NART is composed of two sets of scripts, nart and nawf, which control real-time analysis and workflow performance, respectively.

nart

Usage: nart [OPTIONS] COMMAND [ARGS]...

  NART: A tool for Nanopore Amplicon Real-Time (NART) analysis. To follow
  updates and report issues, see: https://github.com/yanhui09/nart.

Options:
  -v, --version  Show the version and exit.
  -h, --help     Show this message and exit.

Commands:
  monitor  Start NART to monitor a directory.
  run      Start NART workflow.
  visual   Start NART app to interactively visualize the results.

Usage: nart monitor [OPTIONS]

  Start NART monitor.

Options:
  -q, --query PATH       A query directory to monitor the new fastq files.
  -e, --extension TEXT   The file extension to monitor for (e.g. '.fastq.gz').
                         [default: .fastq.gz]
  -w, --workdir PATH     Workflow working directory. [default: .]
  -t, --timeout INTEGER  Stop query if no new files were generated within the
                         give minutes. [default: 30]
  -h, --help             Show this message and exit.

Usage: nart run [OPTIONS] [SNAKE_ARGS]...

  Start NART.

Options:
  -w, --workdir PATH     Workflow working directory. [default: .]
  -t, --timeout INTEGER  Stop run if no new files were updated in list within
                         the given minutes. [default: 10]
  -c, --configfile FILE  Workflow config file. Use config.yaml in working
                         directory if not specified.
  -j, --jobs INTEGER     Maximum jobs to run in parallel. [default: 6]
  -m, --maxmem FLOAT     Specify maximum memory (GB) to use. Memory is
                         controlled by profile in cluster execution.
  --profile TEXT         Snakemake profile for cluster execution.
  -n, --dryrun           Dry run.
  -h, --help             Show this message and exit.

Usage: nart visual [OPTIONS]

Options:
  -p, --port INTEGER              Port to run the app on. [default: 5000]
  -i, --input PATH                Path to the working directory. [default: .]
  -w, --wait-time INTEGER         Time to wait (in minutes) if input file is
                                  missing. [default: 5]
  --relative                      Use relative abundance instead of absolute
                                  abundance.
  --rm-unmapped                   Remove unmapped reads from the table.
  --min-abundance INTEGER         Minimum absolute abundance of a feature to
                                  plot. [default: 1]
  --order-by [mean|median|alpha]  Order taxonomic features by mean, median, or
                                  alphabetically. [default: mean]
  -h, --help                      Show this message and exit.
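To make the combined effect of --rm-unmapped and --relative concrete, the toy example below drops an unmapped row and rescales the remaining counts to fractions. It is an illustration only: the two-column table (taxon, count), the row name "unmapped" and the function relative_abundance are assumptions of this sketch and do not reflect NART's actual table format or code.

```shell
#!/usr/bin/env bash
# Toy feature table (taxon<TAB>count); not NART's actual output format.
relative_abundance() {
    # Drop the "unmapped" row, then divide each count by the remaining total.
    awk -F'\t' '
        $1 != "unmapped" { name[++n] = $1; count[n] = $2; total += $2 }
        END {
            if (total > 0)
                for (i = 1; i <= n; i++)
                    printf "%s\t%.2f\n", name[i], count[i] / total
        }'
}

# 60 and 20 mapped reads, 20 unmapped: fractions are taken over the 80 mapped reads.
printf 'Lactobacillus\t60\nBacteroides\t20\nunmapped\t20\n' | relative_abundance
```

With the unmapped row removed, the two taxa come out as 0.75 and 0.25; keeping it in the total would give 0.60 and 0.20 instead.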

nawf

Usage: nawf [OPTIONS] COMMAND [ARGS]...

  NAWF: A sub-tool to run Nanopore Amplicon WorkFlow. The workflow command
  initiates the NAWF in a single batch, using either a fastq file from one
  ONT run or a fastq file generated during sequencing. To follow updates and
  report issues, see: https://github.com/yanhui09/nart.

Options:
  -v, --version  Show the version and exit.
  -h, --help     Show this message and exit.

Commands:
  config  Generate the workflow config file.
  run     Start workflow in a single batch.

Usage: nawf config [OPTIONS]

  Config NAWF.

Options:
  -b, --bascfq PATH               Path to a basecalled fastq file. Option is
                                  mutually exclusive with 'demuxdir'.
  -x, --demuxdir PATH             Path to a directory of demultiplexed fastq
                                  files. Option is mutually exclusive with
                                  'bascfq'.
  -d, --dbdir PATH                Path to the taxonomy databases. [required]
  -w, --workdir PATH              Output directory for NAWF. [default: .]
  --demuxer [guppy|minibar]       Demultiplexer. [default: guppy]
  --fqs-min INTEGER               Minimum number of reads for the
                                  demultiplexed fastqs. [default: 50]
  --subsample                     Subsample the reads.
  --chimera-filt                  Filter chimeric reads.
  --primer-check                  Check primer pattern.
  --classifier [emu|minimap2lca|blast2lca]
                                  Classifier. [default: emu]
  --jobs-min INTEGER              Number of jobs for common tasks.
                                  [default: 2]
  --jobs-max INTEGER              Number of jobs for threads-dependent tasks.
                                  [default: 6]
  -h, --help                      Show this message and exit.

Usage: nawf run [OPTIONS] {init|demux|qc|all} [SNAKE_ARGS]...

  Run NAWF in a single batch.

Options:
  -w, --workdir PATH     Workflow working directory. [default: .]
  -c, --configfile FILE  Workflow config file. Use config.yaml in working
                         directory if not specified.
  -j, --jobs INTEGER     Maximum jobs to run in parallel. [default: 6]
  -m, --maxmem FLOAT     Specify maximum memory (GB) to use. Memory is
                         controlled by profile in cluster execution.
  --profile TEXT         Snakemake profile for cluster execution.
  -n, --dryrun           Dry run.
  -h, --help             Show this message and exit.
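Since --bascfq and --demuxdir are mutually exclusive and --classifier accepts only three values, a wrapper script can check these before calling nawf config. The sketch below is a hypothetical helper, not part of NART: the function nawf_config_cmd and the variables BASCFQ, DEMUXDIR, DBDIR and CLASSIFIER are names chosen for this example, and the command is printed instead of run.

```shell
#!/usr/bin/env bash
# Sketch: assemble a `nawf config` call from environment variables.
# BASCFQ, DEMUXDIR, DBDIR and CLASSIFIER are names chosen for this example.
nawf_config_cmd() {
    # -b/--bascfq and -x/--demuxdir are mutually exclusive.
    if [ -n "$BASCFQ" ] && [ -n "$DEMUXDIR" ]; then
        echo "error: set BASCFQ or DEMUXDIR, not both" >&2
        return 1
    fi
    # Only the three classifiers listed in the help text are accepted.
    case "${CLASSIFIER:-emu}" in
        emu|minimap2lca|blast2lca) ;;
        *) echo "error: unknown classifier: $CLASSIFIER" >&2; return 1 ;;
    esac
    cmd="nawf config -d ${DBDIR:-/path/to/database} --classifier ${CLASSIFIER:-emu}"
    if [ -n "$BASCFQ" ]; then cmd="$cmd -b $BASCFQ"; fi
    if [ -n "$DEMUXDIR" ]; then cmd="$cmd -x $DEMUXDIR"; fi
    echo "$cmd"
}

nawf_config_cmd   # with nothing set, prints the default emu configuration call
```

Leaving both BASCFQ and DEMUXDIR empty is allowed here because the real-time setup calls nawf config with only -d.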
