RNAplot
RNAplot - manual page for RNAplot 2.6.4
SYNOPSIS
RNAplot [OPTIONS] [<input0>] [<input1>]...
DESCRIPTION
Draw RNA Secondary Structures
The program reads (aligned) RNA sequences and structures in the format
produced by RNAfold or in Stockholm 1.0 format, and produces drawings of the
secondary structure graph.
Coordinates for the structure graphs are produced using either E. Bruccoleri’s
naview routines, or a simple radial layout method.
For aligned sequences and consensus structures (--msa option) the graph may be
annotated by covariance information. Additionally, a color-annotated EPS
alignment figure can be produced, similar to that obtained by RNAalifold and
RNALalifold.
If the sequence was preceded by a FASTA header, or if the multiple sequence
alignment contains an ID field, these IDs will be taken as names for the output
file(s): “name_ss.ps” and “name_aln.ps”. Otherwise “rna.ps” and
“aln.ps” will be used. This behavior may be overridden by explicitly setting
a filename prefix using the --auto-id option.
Existing files of the same name will be overwritten.
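As a rough illustration of the single-sequence input and default naming described above, the following Python sketch parses records laid out the way RNAfold prints them (optional FASTA header, sequence, dot-bracket structure optionally followed by an energy) and derives the default PostScript file name. The helpers `parse_records` and `default_ps_name` are hypothetical and not part of the ViennaRNA package; RNAplot's own parser may accept more variants.

```python
def parse_records(text):
    """Yield (name, sequence, structure) tuples from RNAfold-style text."""
    name, seq = None, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # FASTA header: the first word is used as the record name.
            words = line[1:].split()
            name, seq = (words[0] if words else None), None
        elif seq is None:
            # First non-header line of a record is the sequence.
            seq = line
        else:
            # Structure line; drop a trailing energy such as "( -1.20)".
            structure = line.split()[0]
            yield name, seq, structure
            name, seq = None, None


def default_ps_name(name):
    """Name of the structure plot: "name_ss.ps", or "rna.ps" without an ID."""
    return f"{name}_ss.ps" if name else "rna.ps"
```

Feeding the three lines ">seq1", "GGGAAAUCC", "(((...))) ( -1.20)" to `parse_records` yields one record named "seq1", for which `default_ps_name` returns "seq1_ss.ps".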
- -h, --help
Print help and exit
- --detailed-help
Print help, including all details and hidden options, and exit
- --full-help
Print help, including hidden options, and exit
- -V, --version
Print version and exit
I/O Options:
Command line options for input and output (pre-)processing
- -i, --infile=<filename>
Read a file instead of reading from stdin.
The default behavior of RNAplot is to read input from stdin or the file(s) that follow(s) the RNAplot command. Using this parameter, the user can specify input file names from which data is read. Note that any additional files supplied to RNAplot are still processed as well.
- -a, --msa
Input is multiple sequence alignment in Stockholm 1.0 format. (default=off)
This flag indicates that the input is a multiple sequence alignment (MSA) instead of (a) single sequence(s). Note that only the Stockholm format allows one to specify a consensus structure. Therefore, this is the only supported MSA format for now!
- --mis
Output “most informative sequence” instead of simple consensus (default=off)
For each column of the alignment, the output is the set of nucleotides with frequency greater than average, in IUPAC notation.
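The idea behind the most informative sequence can be sketched in Python as follows. This is only an illustration under stated assumptions: "average" is interpreted here as a nucleotide's mean count per column over the whole alignment, gap characters are ignored, and a column in which no nucleotide exceeds its average is printed as "-". The function name `most_informative_sequence` is hypothetical, and the actual RNAplot implementation may differ in these details.

```python
from collections import Counter

# IUPAC codes keyed by the set of nucleotides they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("U"): "U",
    frozenset("AG"): "R", frozenset("CU"): "Y",
    frozenset("CG"): "S", frozenset("AU"): "W",
    frozenset("GU"): "K", frozenset("AC"): "M",
    frozenset("CGU"): "B", frozenset("AGU"): "D",
    frozenset("ACU"): "H", frozenset("ACG"): "V",
    frozenset("ACGU"): "N",
}


def most_informative_sequence(alignment):
    """Return one IUPAC character per column of an RNA alignment."""
    n_cols = len(alignment[0])
    # Total count of each nucleotide over the whole alignment.
    total = Counter(ch for row in alignment for ch in row.upper())
    result = []
    for col in range(n_cols):
        counts = Counter(row[col].upper() for row in alignment)
        # Keep nucleotides whose column count exceeds their average
        # count per column (assumed meaning of "average").
        chosen = frozenset(
            nt for nt in "ACGU" if counts[nt] > total[nt] / n_cols
        )
        result.append(IUPAC.get(chosen, "-"))
    return "".join(result)
```

For the toy alignment GGAC / GGAU / GCAC, the second column contains both G and C above their averages and is reported as S, and the last column (C and U) as Y.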
- -j, --jobs[=number]
Split batch input into jobs and start processing in parallel using multiple threads. (default=”0”)
Default processing of input data is performed in a serial fashion, i.e. one sequence at a time. Using this switch, a user can instead start the computation for many sequences in the input in parallel. RNAplot will create as many parallel computation slots as specified and assign input sequences of the input file(s) to the available slots. Note that this increases memory consumption, since input alignments have to be kept in memory until an empty compute slot is available, and each running job requires its own dynamic programming matrices. A value of 0 means that as many parallel threads are used as computation cores are available.
- -o, --output-format=ps|gml|xrna|svg
Specify output format. (default=”ps”)
Available formats are: PostScript (ps), Graph Meta Language (gml), Scalable Vector Graphics (svg), and XRNA save file (xrna). Output filenames will end in “.ps”, “.gml”, “.svg”, and “.ss”, respectively.
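The mapping from output format to file extension can be written down as a small Python lookup. This is a sketch only: the helper `structure_plot_name` is hypothetical, and it assumes that the “_ss” stem documented for PostScript output is used for the other formats as well.

```python
# Extension for each value accepted by --output-format.
EXTENSIONS = {"ps": ".ps", "gml": ".gml", "svg": ".svg", "xrna": ".ss"}


def structure_plot_name(fmt="ps", seq_id=None):
    """Guess the structure plot file name for a format and optional ID."""
    if fmt not in EXTENSIONS:
        raise ValueError(f"unknown output format: {fmt!r}")
    # Assumption: "<id>_ss" with an ID, "rna" without one, for all formats.
    stem = f"{seq_id}_ss" if seq_id else "rna"
    return stem + EXTENSIONS[fmt]
```

Under these assumptions, `structure_plot_name("xrna", "tRNA")` gives "tRNA_ss.ss" and `structure_plot_name()` gives "rna.ps".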
- --pre=string
Add annotation macros to the PostScript file, and add the PostScript code in “string” just before the code to draw the structure. This is an easy way to add annotation.
- --post=string
Same as --pre, but the PostScript code in “string” is added just after the code to draw the structure. E.g. to mark position 15 with a circle, use --post="15 cmark".
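When RNAplot is started from a script, the annotation string contains a space and has to be passed as a single argument. The Python sketch below builds such an argument list; `rnaplot_argv` is a hypothetical helper, and only the --pre and --post options documented here are used.

```python
import shlex


def rnaplot_argv(pre=None, post=None):
    """Build an argument list for RNAplot with optional annotation code."""
    argv = ["RNAplot"]
    if pre is not None:
        argv.append(f"--pre={pre}")
    if post is not None:
        argv.append(f"--post={post}")
    return argv


def as_shell_command(argv):
    """Render the argument list as a shell-safe command line."""
    return shlex.join(argv)
```

The list returned by `rnaplot_argv(post="15 cmark")` can be handed to `subprocess.run` directly, while `as_shell_command` shows the quoting a shell would need.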
- --auto-id
Automatically generate an ID for each sequence. (default=off)
The default mode of RNAplot is to automatically determine an ID from the input sequence data if the input file format allows it. Sequence IDs are usually given in the FASTA header of input sequences. If this flag is active, RNAplot ignores any IDs retrieved from the input and automatically generates an ID for each sequence. This ID consists of a prefix and an increasing number. This flag can also be used to add a FASTA header to the output even if the input has none.
- --id-prefix=STRING
Prefix for automatically generated IDs (as used in output file names).
(default=”sequence”)
If this parameter is set, each sequence will be prefixed with the provided string. Hence, the output files will obey the following naming scheme: “prefix_xxxx_ss.ps” (secondary structure plot), “prefix_xxxx_dp.ps” (dot-plot), “prefix_xxxx_dp2.ps” (stack probabilities), etc. where xxxx is the sequence number. Note: Setting this parameter implies --auto-id.
- --id-delim=CHAR
Change the delimiter between prefix and increasing number for automatically generated IDs (as used in output file names).
(default=”_”)
This parameter can be used to change the default delimiter “_” between the prefix string and the increasing number for automatically generated ID.
- --id-digits=INT
Specify the number of digits of the counter in automatically generated alignment IDs.
(default=”4”)
When alignment IDs are automatically generated, they receive an increasing number, starting with 1. This number will always be left-padded with leading zeros, such that the number takes up a certain width. Using this parameter, the width can be set to the user's needs. We allow numbers in the range [1:18]. This option implies --auto-id.
- --id-start=LONG
Specify the first number in automatically generated IDs.
(default=”1”)
When sequence IDs are automatically generated, they receive an increasing number, usually starting with 1. Using this parameter, the first number can be set to the user's requirements. Note: negative numbers are not allowed. Note: Setting this parameter implies ignoring any IDs retrieved from the input data, i.e. it activates the --auto-id flag.
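The interplay of --id-prefix, --id-delim, --id-digits and --id-start can be illustrated with a short Python sketch. The function `auto_ids` is a hypothetical helper that mirrors the defaults and limits stated above; it is not RNAplot's own code.

```python
def auto_ids(count, prefix="sequence", delim="_", digits=4, start=1):
    """Generate IDs such as "sequence_0001" for `count` inputs."""
    if not 1 <= digits <= 18:
        raise ValueError("digits must be in the range [1:18]")
    if start < 0:
        raise ValueError("negative start numbers are not allowed")
    # Zero-pad the counter to the requested width.
    return [
        f"{prefix}{delim}{number:0{digits}d}"
        for number in range(start, start + count)
    ]
```

With the defaults, the first two inputs would be named "sequence_0001" and "sequence_0002", so the structure plot of the first is expected as "sequence_0001_ss.ps".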
- --filename-delim=CHAR
Change the delimiting character used in sanitized filenames.
(default=”ID-delimiter”)
This parameter can be used to change the delimiting character used while sanitizing filenames, i.e. replacing invalid characters. Note that the default delimiter ALWAYS is the first character of the “ID delimiter” as supplied through the --id-delim option. If the delimiter is a whitespace character or empty, invalid characters will simply be removed rather than substituted. Currently, we regard the following characters as illegal for use in filenames: backslash \, slash /, question mark ?, percent sign %, asterisk *, colon :, pipe symbol |, double quote ", triangular brackets < and >.
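The sanitizing rule can be sketched in Python as follows. `sanitize_filename` is a hypothetical helper that follows the description above: illegal characters are replaced by the delimiter, which defaults to the first character of the ID delimiter, and they are dropped when the delimiter is empty or whitespace.

```python
# Characters regarded as illegal in filenames, as listed above.
ILLEGAL = set('\\/?%*:|"<>')


def sanitize_filename(name, id_delim="_", filename_delim=None):
    """Replace (or remove) illegal filename characters in `name`."""
    # Default: first character of the ID delimiter.
    delim = id_delim[:1] if filename_delim is None else filename_delim
    # Whitespace delimiters behave like an empty one: characters are removed.
    if delim.isspace():
        delim = ""
    return "".join(delim if ch in ILLEGAL else ch for ch in name)
```

For example, "a/b:c" becomes "a_b_c" with the defaults and "abc" when the filename delimiter is a space.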
- --filename-full
Use full FASTA header to create filenames. (default=off)
This parameter can be used to deactivate the default behavior of limiting output filenames to the first word of the sequence ID. Consider the following example: An input with FASTA header
>NM_0001 Homo Sapiens some gene
usually produces output files with the prefix “NM_0001” without the additional data available in the FASTA header, e.g. “NM_0001_ss.ps” for secondary structure plots. With this flag set, no truncation of the output filenames is done, i.e. output filenames receive the full FASTA header data as prefixes. Note, however, that invalid characters (such as whitespace) will be substituted by a delimiting character or simply removed (see also the parameter option --filename-delim).
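The difference between the default truncation and --filename-full can be sketched in Python. `filename_prefix` is a hypothetical helper; it assumes that each run of whitespace in the header is replaced by a single delimiter character, which the description above does not spell out.

```python
def filename_prefix(header, full=False, delim="_"):
    """Derive the output filename prefix from a FASTA header line."""
    words = header.lstrip(">").split()
    if not words:
        return ""
    if not full:
        # Default: only the first word of the header is used.
        return words[0]
    # --filename-full: keep everything, substituting whitespace
    # (assumed: one delimiter per run of whitespace).
    return delim.join(words)
```

For the header ">NM_0001 Homo Sapiens some gene" this returns "NM_0001" by default and "NM_0001_Homo_Sapiens_some_gene" with `full=True`.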
Plotting:
Command line options for changing the default behavior of structure layout and pairing probability plots
- --covar
Annotate covariance of base pairs in consensus structure.
(default=off)
- --aln
Produce a colored and structure annotated alignment in PostScript format in the file “aln.ps” in the current directory.
(default=off)
- --aln-EPS-cols=INT
Number of columns in colored EPS alignment output.
(default=”60”)
A value less than 1 indicates that the output should not be wrapped at all.
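The wrapping behavior controlled by --aln-EPS-cols can be pictured with a short Python sketch that cuts an alignment into blocks of at most the given number of columns. `wrap_alignment` is a hypothetical helper that assumes all rows have the same length; it only illustrates the column arithmetic, not the EPS drawing itself.

```python
def wrap_alignment(rows, cols=60):
    """Split aligned rows into blocks of at most `cols` columns."""
    if cols < 1:
        # A value less than 1 means: do not wrap at all.
        return [list(rows)]
    length = len(rows[0])
    return [
        [row[start:start + cols] for row in rows]
        for start in range(0, length, cols)
    ]
```

An alignment of 130 columns would be drawn as three blocks (60, 60 and 10 columns) with the default setting, and as a single block with `cols=0`.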
- -t, --layout-type=INT
Choose the plotting layout algorithm. (default=”1”)
Select the layout algorithm that computes the nucleotide coordinates. Currently, the following algorithms are available:
0: simple radial layout
1: Naview layout (Bruccoleri et al. 1988)
2: circular layout
3: RNAturtle (Wiegreffe et al. 2018)
4: RNApuzzler (Wiegreffe et al. 2018)
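In scripts, the numeric layout codes are easier to read when given names. The Python sketch below does this with an enumeration; the names `Layout` and `layout_option` are hypothetical, and only the integer values come from the list above.

```python
from enum import IntEnum


class Layout(IntEnum):
    """Layout algorithms selectable with -t / --layout-type."""
    SIMPLE = 0
    NAVIEW = 1
    CIRCULAR = 2
    TURTLE = 3
    PUZZLER = 4


def layout_option(layout=Layout.NAVIEW):
    """Return the command line option for a layout (default: Naview)."""
    # Layout(...) raises ValueError for codes outside 0..4.
    return f"--layout-type={int(Layout(layout))}"
```

`layout_option(Layout.PUZZLER)` yields "--layout-type=4", which can be appended to an RNAplot argument list.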
- --noOptimization
Disable the drawing space optimization of RNApuzzler.
(default=off)
- --ignoreExteriorIntersections
Ignore intersections with the exterior loop
within the RNA-tree.
(default=off)
- --ignoreAncestorIntersections
Ignore ancestor intersections within the
RNA-tree.
(default=off)
- --ignoreSiblingIntersections
Ignore sibling intersections within the
RNA-tree.
(default=off)
- --allowFlipping
Allow flipping of exterior loop branches to resolve exterior branch intersections.
(default=off)
REFERENCES
If you use this program in your work you might want to cite:
R. Lorenz, S.H. Bernhart, C. Hoener zu Siederdissen, H. Tafer, C. Flamm, P.F. Stadler and I.L. Hofacker (2011), “ViennaRNA Package 2.0”, Algorithms for Molecular Biology: 6:26
I.L. Hofacker, W. Fontana, P.F. Stadler, S. Bonhoeffer, M. Tacker, P. Schuster (1994), “Fast Folding and Comparison of RNA Secondary Structures”, Monatshefte f. Chemie: 125, pp 167-188
R. Lorenz, I.L. Hofacker, P.F. Stadler (2016), “RNA folding with hard and soft constraints”, Algorithms for Molecular Biology 11:1 pp 1-13
The energy parameters are taken from:
D.H. Mathews, M.D. Disney, J.L. Childs, S.J. Schroeder, M. Zuker, D.H. Turner (2004), “Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure”, Proc. Natl. Acad. Sci. USA: 101, pp 7287-7292
D.H. Turner, D.H. Mathews (2009), “NNDB: The nearest neighbor parameter database for predicting stability of nucleic acid secondary structure”, Nucleic Acids Research: 38, pp 280-282
REPORTING BUGS
If in doubt, our program is right, nature is at fault. Comments should be sent to rna@tbi.univie.ac.at.