NART is desgined for mapping-based Nanopore Amplicon (Real-Time) analysis, e.g., 16S rRNA gene. NART utils are composed of nart (Nanopore Amplicon Real-Time entry) and nawf (Nanopore Amplicon snakemake WorkFlow entry) in one python package. NART provides an (real-time) end-to-end solution from bascecalled reads to the final count matrix through mapping-based strategy.
Important: NART is under development, and here released as a preview. NART is only tested in Linux systems, i.e., Ubuntu.
nawf provide three options (i.e., emu, minimap2lca and blast2lca) to determine microbial composition. 
The easiest way to use NART is to pull the docker image from Docker Hub for cross-platform support.
docker pull yanhui09/nart To use the docker image, you need to mount your data directory, e.g., pwd, to the /home in the container.
docker run -it -v `pwd`:/home --network host --privileged yanhui09/nart Note: --network host is required for nart monitor to work.
The host networking driver only works on Linux hosts, and is not supported on Docker Desktop for Mac, Docker Desktop for Windows, or Docker EE for Windows Server. [Read more]
Conda is the only required dependency prior to installation. Miniconda is enough for the whole pipeline.
- Clone the Github repository and create an isolated
condaenvironment
git clone https://github.com/yanhui09/nart.git cd nart conda env create -n nart -f env.yaml You can speed up the whole process if mamba is installed.
mamba env create -n nart -f env.yaml - Install
NARTwithpip
To avoid inconsistency, we suggest installing NART in the above conda environment
conda activate nart pip install --editable . At this moment, NART uses guppy or minibar for custom barcode demultiplexing (in our lab).
Remember to prepare the barcoding files in guppy or minibar if new barcodes are introduced. Click me
Remember to activate the conda environment if NART is installed in a conda environment.
conda activate nart nawf can be used to profile any single basecalled fastq file from a Nanopore run or batch.
nawf config -b /path/to/single_basecall_fastq -d /path/to/database # init config file and check nawf run all # start analysis nart provide utils to record, process and profile the continuously generated fastq batch.
Before starting real-time analysis, you need nawf to configure the workflow according to your needs.
nawf config -d /path/to/database # init config file and check In common cases, you need three independent sessions to handle monitor, process and visulization, repectively.
- Minitor the bascall output and record
nart monitor -q /path/to/basecall_fastq_dir # monitor basecall output - Start amplicon analysis for new fastq
nart run -t 10 # real-time process in batches - Update the feature table for interactively visualize in the browser
nart visual # interactive visualization NART is composed of two sets of scripts: nart and nawf, which controls real-time analysis and workflow performance, respectively.
Usage: nart [OPTIONS] COMMAND [ARGS]... NART: A tool for Nanopore Amplicon Real-Time (NART) analysis. To follow updates and report issues, see: https://github.com/yanhui09/nart. Options: -v, --version Show the version and exit. -h, --help Show this message and exit. Commands: monitor Start NART to monitor a directory. run Start NART workflow. visual Start NART app to interactively visualize the results. Usage: nart monitor [OPTIONS] Start NART monitor. Options: -q, --query PATH A query directory to monitor the new fastq files. -e, --extension TEXT The file extension to monitor for (e.g. '.fastq.gz'). [default: .fastq.gz] -w, --workdir PATH Workflow working directory. [default: .] -t, --timeout INTEGER Stop query if no new files were generated within the give minutes. [default: 30] -h, --help Show this message and exit. Usage: nart run [OPTIONS] [SNAKE_ARGS]... Start NART. Options: -w, --workdir PATH Workflow working directory. [default: .] -t, --timeout INTEGER Stop run if no new files were updated in list within the given minutes. [default: 10] -c, --configfile FILE Workflow config file. Use config.yaml in working directory if not specified. -j, --jobs INTEGER Maximum jobs to run in parallel. [default: 6] -m, --maxmem FLOAT Specify maximum memory (GB) to use. Memory is controlled by profile in cluster execution. --profile TEXT Snakemake profile for cluster execution. -n, --dryrun Dry run. -h, --help Show this message and exit. Usage: nart visual [OPTIONS] Options: -p, --port INTEGER Port to run the app on. [default: 5000] -i, --input PATH Path to the working directory. [default: .] -w, --wait-time INTEGER Time to wait (in minutes) if input file is missing. [default: 5] --relative Use relative abundance instead of absolute abundance. --rm-unmapped Remove unmapped reads from the table. --min-abundance INTEGER Minimum absolute abundance of a feature to plot. [default: 1] --order-by [mean|median|alpha] Order taxonomic features by mean, median, or alphabetically. [default: mean] -h, --help Show this message and exit. Usage: nawf [OPTIONS] COMMAND [ARGS]... NAWF: A sub-tool to run Nanopore Amplicon WorkFlow. The workflow command initiates the NAWF in a single batch, using either a fastq file from one ONT run or a fastq file generated during sequencing. To follow updates and report issues, see: https://github.com/yanhui09/nart. Options: -v, --version Show the version and exit. -h, --help Show this message and exit. Commands: config Generate the workflow config file. run Start workflow in a single batch. Usage: nawf config [OPTIONS] Config NAWF. Options: -b, --bascfq PATH Path to a basecalled fastq file. Option is mutually exclusive with 'demuxdir'. -x, --demuxdir PATH Path to a directory of demultiplexed fastq files. Option is mutually exclusive with 'bascfq'. -d, --dbdir PATH Path to the taxonomy databases. [required] -w, --workdir PATH Output directory for NAWF. [default: .] --demuxer [guppy|minibar] Demultiplexer. [default: guppy] --fqs-min INTEGER Minimum number of reads for the demultiplexed fastqs. [default: 50] --subsample Subsample the reads. --chimera-filt Filter chimeric reads. --primer-check Check primer pattern. --classifier [emu|minimap2lca|blast2lca] Classifier. [default: emu] --jobs-min INTEGER Number of jobs for common tasks. [default: 2] --jobs-max INTEGER Number of jobs for threads-dependent tasks. [default: 6] -h, --help Show this message and exit. Usage: nawf run [OPTIONS] {init|demux|qc|all} [SNAKE_ARGS]... Run NAWF in a single batch. Options: -w, --workdir PATH Workflow working directory. [default: .] -c, --configfile FILE Workflow config file. Use config.yaml in working directory if not specified. -j, --jobs INTEGER Maximum jobs to run in parallel. [default: 6] -m, --maxmem FLOAT Specify maximum memory (GB) to use. Memory is controlled by profile in cluster execution. --profile TEXT Snakemake profile for cluster execution. -n, --dryrun Dry run. -h, --help Show this message and exit. 