Manual¶
Modules available:
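nanopolish extract: extract reads in FASTA or FASTQ format from a directory of FAST5 files
nanopolish index: build an index mapping from basecalled reads to the signals measured by the sequencer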
nanopolish call-methylation: predict genomic bases that may be methylated
nanopolish variants: detect SNPs and indels with respect to a reference genome
nanopolish variants --consensus: calculate an improved consensus sequence for a draft genome assembly
nanopolish eventalign: align signal-level events to k-mers of a reference genome
nanopolish phase-reads: phase reads using heterozygous SNVs with respect to a reference genome
nanopolish polya: estimate polyadenylated tail lengths on native RNA reads
extract¶
Overview¶
This module is used to extract reads in FASTA or FASTQ format from a directory of FAST5 files.
Input¶
path to a directory of FAST5 files modified to contain basecall information
Output¶
sequences of reads in FASTA or FASTQ format
Usage example¶
nanopolish extract [OPTIONS] <fast5|dir>
| Argument name(s) | Required | Default value | Description |
|---|---|---|---|
| <fast5\|dir> | Y | NA | FAST5 file or path to a directory of FAST5 files. |
|  | N | NA | Recurse into subdirectories. |
|  | N | FASTA format | Use when you want to extract to FASTQ format. |
|  | N | 2d-or-template | The type of read to extract, one of: {template, complement, 2d, 2d-or-template, any}. |
|  | N | NA | Consider only data produced by basecaller NAME, optionally with the given exact VERSION. |
|  | N | stdout | Write output to FILE. |
index¶
Overview¶
Build an index mapping from basecalled reads to the signals measured by the sequencer
Input¶
path to directory of raw nanopore sequencing data in FAST5 format
basecalled reads
Output¶
gzipped FASTA file of basecalled reads (.index)
index files (.fai, .gzi, .readdb)
Readdb file format¶
The readdb file is a tab-separated file with two columns: the read ID and the path to the FAST5 file that holds the signal data for that read:
read_id_1 /path/to/fast5/containing/read_id_1/signals
read_id_2 /path/to/fast5/containing/read_id_2/signals
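As an illustration of the two-column layout described above, here is a minimal Python sketch that loads a readdb file into a dictionary. The function name `parse_readdb` is hypothetical, and the sketch assumes the read ID is the first column, as in the example lines; it is not part of nanopolish.

```python
from pathlib import Path
from typing import Dict, Iterable


def parse_readdb(lines: Iterable[str]) -> Dict[str, Path]:
    """Map each read ID to the FAST5 path listed beside it.

    Assumes one record per line: read ID, a tab, then the FAST5 path.
    """
    read_to_fast5: Dict[str, Path] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        # Split on the first tab only, so paths containing tabs survive.
        read_id, fast5_path = line.split("\t", 1)
        read_to_fast5[read_id] = Path(fast5_path)
    return read_to_fast5


# Two records in the same shape as the example above.
example_lines = [
    "read_id_1\t/path/to/fast5/containing/read_id_1/signals\n",
    "read_id_2\t/path/to/fast5/containing/read_id_2/signals\n",
]
readdb = parse_readdb(example_lines)
print(readdb["read_id_2"].as_posix())
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of the list.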
Usage example¶
nanopolish index [OPTIONS] -d nanopore_raw_file_directory reads.fastq
| Argument name(s) | Required | Default value | Description |
|---|---|---|---|
| -d | Y | NA | FAST5 file or path to a directory of FAST5 files containing ONT raw signal information. |
|  | N | NA | File containing the paths to each FAST5 file for the run. |
call-methylation¶
Overview¶
Classify nucleotides as methylated or not.
Input¶
Basecalled ONT reads in FASTA format
Output¶
tab-separated file containing per-read log-likelihood ratios
Usage example¶
nanopolish call-methylation [OPTIONS] --reads reads.fa --bam alignments.bam --genome genome.fa
| Argument name(s) | Required | Default value | Description |
|---|---|---|---|
| --reads | Y | NA | the ONT reads are in fasta FILE |
| --bam | Y | NA | the reads aligned to the genome assembly are in bam FILE |
| --genome | Y | NA | the genome we are computing a consensus for is in FILE |
|  | N | 1 | use NUM threads |
|  | N | NA | print out a progress message |
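The per-read log-likelihood ratios in the output can be turned into calls with a simple threshold. The Python sketch below shows one way to do that. The column names `read_name` and `log_lik_ratio`, the cutoff of 2.0, and the convention that a positive ratio favours methylation are all assumptions made for illustration; check them against the header of your own output file before relying on them.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple

# Assumptions for this sketch, not taken from the manual:
READ_COLUMN = "read_name"     # hypothetical column holding the read ID
LLR_COLUMN = "log_lik_ratio"  # hypothetical column holding the ratio
LLR_CUTOFF = 2.0              # hypothetical confidence cutoff


def classify_calls(
    handle: TextIO, cutoff: float = LLR_CUTOFF
) -> Iterator[Tuple[str, str]]:
    """Yield (read ID, label) for each row of a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        llr = float(row[LLR_COLUMN])
        if llr >= cutoff:
            label = "methylated"
        elif llr <= -cutoff:
            label = "unmethylated"
        else:
            label = "ambiguous"  # too close to zero to call either way
        yield row[READ_COLUMN], label


# Small in-memory table standing in for a real output file.
example_tsv = io.StringIO(
    "read_name\tlog_lik_ratio\n"
    "read_a\t3.5\n"
    "read_b\t-4.1\n"
    "read_c\t0.3\n"
)
calls = list(classify_calls(example_tsv))
print(calls)
```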
variants¶
Overview¶
This module is used to call single nucleotide polymorphisms (SNPs) using a signal-level HMM.
Input¶
basecalled reads
alignment info
genome assembly
Output¶
VCF file
Usage example¶
nanopolish variants [OPTIONS] --reads reads.fa --bam alignments.bam --genome genome.fa
| Argument name(s) | Required | Default value | Description |
|---|---|---|---|
|  | N | NA | use flag to only call SNPs |
| --consensus | N | NA | run in consensus calling mode and write polished sequence to FILE |
|  | N | NA | use flag to run the experimental homopolymer caller |
|  | N | NA | minimize compute time while slightly reducing consensus accuracy |
|  | N | NA | find variants in window STR (format: <chromosome_name>:<start>-<end>) |
| --reads | Y | NA | the ONT reads are in fasta FILE |
| --bam | Y | NA | the reads aligned to the reference genome are in bam FILE |
|  | Y | NA | the events aligned to the reference genome are in bam FILE |
| --genome | Y | NA | the reference genome is in FILE |
|  | N | stdout | write result to FILE |
|  | N | 1 | use NUM threads |
|  | N | 0.2 | extract candidate variants from the aligned reads when the variant frequency is at least F |
|  | N | 20 | extract candidate variants from the aligned reads when the depth is at least D |
|  | N | 1000 | consider at most N haplotype combinations |
|  | N | 50 | perform N rounds of consensus sequence improvement |
|  | N | NA | read variant candidates from VCF, rather than discovering them from aligned reads |
|  | N | NA | if an alternative basecaller was used that does not output event annotations, then use basecalled sequences from FILE. The signal-level events will still be taken from the -b bam |
|  | N | NA | when making a call, also calculate the support of the 3 other possible bases |
|  | N | NA | read alternative k-mer models from FILE |
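To make the two candidate thresholds concrete, the following Python sketch applies a minimum variant frequency (default 0.2) and a minimum depth (default 20) to a single site. It assumes that both conditions must hold at once and that frequency means supporting reads divided by depth; the function name `is_candidate` is hypothetical, and this is an illustration of the rule as described in the table, not nanopolish's implementation.

```python
def is_candidate(
    alt_reads: int,
    depth: int,
    min_frequency: float = 0.2,
    min_depth: int = 20,
) -> bool:
    """Return True if a site passes both candidate thresholds.

    alt_reads: number of aligned reads supporting the variant
    depth:     total number of aligned reads covering the site
    """
    if depth <= 0 or depth < min_depth:
        return False  # too little coverage to consider the site
    return alt_reads / depth >= min_frequency


print(is_candidate(alt_reads=5, depth=25))   # frequency 0.20 at depth 25
print(is_candidate(alt_reads=3, depth=25))   # frequency 0.12 at depth 25
print(is_candidate(alt_reads=10, depth=15))  # depth below the minimum
```

Lowering either threshold admits more candidates, which can increase sensitivity at the cost of more haplotypes to evaluate.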
eventalign¶
Overview¶
Align nanopore events to reference k-mers
Input¶
basecalled reads
alignment information
assembled genome
Usage example¶
nanopolish eventalign [OPTIONS] --reads reads.fa --bam alignments.bam --genome genome.fa
| Argument name(s) | Required | Default value | Description |
|---|---|---|---|
|  | N | NA | use to write output in SAM format |
|  | N | NA | compute the consensus for window STR (format: ctg:start_id-end_id) |
| --reads | Y | NA | the ONT reads are in fasta FILE |
| --bam | Y | NA | the reads aligned to the genome assembly are in bam FILE |
| --genome | Y | NA | the genome we are computing a consensus for is in FILE |
|  | N | 1 | use NUM threads |
|  | N | NA | scale events to the model, rather than vice-versa |
|  | N | NA | print out a progress message |
|  | N | NA | print read names instead of indexes |
|  | N | NA | summarize the alignment of each read/strand in FILE |
|  | N | NA | write the raw samples for the event to the tsv output |
|  | N | NA | read alternative k-mer models from FILE |
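The window string format `ctg:start_id-end_id` can be validated before it is passed on the command line. The Python sketch below is one possible parser; the names `Window` and `parse_window` are hypothetical. The manual does not say whether coordinates are 0-based or 1-based, so the sketch returns the numbers unchanged.

```python
from typing import NamedTuple


class Window(NamedTuple):
    contig: str
    start: int
    end: int


def parse_window(window: str) -> Window:
    """Split 'ctg:start-end' into its contig name and coordinates."""
    # Split on the last colon so contig names containing ':' still work.
    contig, colon, span = window.rpartition(":")
    if not colon or not contig:
        raise ValueError(f"expected ctg:start-end, got {window!r}")
    start_text, dash, end_text = span.partition("-")
    if not dash:
        raise ValueError(f"expected start-end after ':', got {span!r}")
    start, end = int(start_text), int(end_text)  # ValueError if not integers
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return Window(contig, start, end)


window = parse_window("ctg1:200-5000")
print(window.contig, window.start, window.end)
```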
phase-reads - (experimental)¶
Overview¶
Phase reads using heterozygous SNVs with respect to a reference genome
Input¶
basecalled reads
alignment information
assembled genome
variants (from nanopolish variants or from other sources, e.g. an Illumina VCF)
Usage example¶
nanopolish phase-reads [OPTIONS] --reads reads.fa --bam alignments.bam --genome genome.fa variants.vcf
polya¶
Overview¶
Estimate the number of nucleotides in the poly(A) tails of native RNA reads.
Input¶
basecalled reads
alignment information
reference transcripts
Usage example¶
nanopolish polya [OPTIONS] --reads=reads.fa --bam=alignments.bam --genome=ref.fa
| Argument name(s) | Required | Default value | Description |
|---|---|---|---|
|  | N | NA | compute only for reads aligning to window of reference STR (format: ctg:start_id-end_id) |
| --reads | Y | NA | the FAST(A/Q) file of native RNA reads |
| --bam | Y | NA | the BAM file of alignments between reads and the reference |
| --genome | Y | NA | the reference transcripts |
|  | N | 1 | use NUM threads |
| -v | N | NA | -v returns raw sample log-likelihoods, while -vv returns event durations |
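When the polya module is called from a script, it can help to assemble the argument list in one place. The Python sketch below builds the command from the usage example above, using only the three required options shown there. The helper name `build_polya_command` is hypothetical, and the file names are the placeholders from the usage line.

```python
import shlex
from typing import List


def build_polya_command(reads: str, bam: str, genome: str) -> List[str]:
    """Return the argument list for the polya usage example."""
    return [
        "nanopolish",
        "polya",
        f"--reads={reads}",
        f"--bam={bam}",
        f"--genome={genome}",
    ]


command = build_polya_command("reads.fa", "alignments.bam", "ref.fa")
# shlex.join quotes each argument so the line is safe to paste into a shell.
print(shlex.join(command))
```

If nanopolish is installed, the list can be passed to `subprocess.run(command, check=True)`; passing a list rather than a single string avoids shell quoting problems with unusual file names.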