Multiple Sequence Alignment Utilities
Functions to extract features from and to manipulate multiple sequence alignments (MSA).
Defines
-
VRNA_ALN_DEFAULT
- #include <ViennaRNA/utils/alignments.h>
Use default alignment settings.
-
VRNA_ALN_RNA
- #include <ViennaRNA/utils/alignments.h>
Convert to RNA alphabet.
-
VRNA_ALN_DNA
- #include <ViennaRNA/utils/alignments.h>
Convert to DNA alphabet.
-
VRNA_ALN_UPPERCASE
- #include <ViennaRNA/utils/alignments.h>
Convert to uppercase nucleotide letters.
-
VRNA_ALN_LOWERCASE
- #include <ViennaRNA/utils/alignments.h>
Convert to lowercase nucleotide letters.
-
VRNA_MEASURE_SHANNON_ENTROPY
- #include <ViennaRNA/utils/alignments.h>
Flag indicating Shannon Entropy measure.
Shannon Entropy is defined as \( H = - \sum_c p_c \cdot \log_2 p_c \)
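As a worked example, assume \(p_c\) denotes the relative frequency of character \(c\) within a single alignment column (an interpretation that is not spelled out above). A column containing A, A, A, C then has \(p_A = 3/4\) and \(p_C = 1/4\), so
\( H = -\left( \tfrac{3}{4} \log_2 \tfrac{3}{4} + \tfrac{1}{4} \log_2 \tfrac{1}{4} \right) \approx 0.311 + 0.5 = 0.811 \) bits.
Under the same assumption, a fully conserved column gives \(H = 0\), and a column in which four distinct characters occur equally often gives \(H = \log_2 4 = 2\) bits.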
Typedefs
-
typedef struct vrna_pinfo_s vrna_pinfo_t
- #include <ViennaRNA/utils/alignments.h>
Typename for the data structure vrna_pinfo_s that represents base pair information.
Functions
-
int vrna_aln_mpi(const char **alignment)
- #include <ViennaRNA/utils/alignments.h>
Compute the mean pairwise identity of an alignment.
- SWIG Wrapper Notes:
This function is available as function aln_mpi(). See, e.g., RNA.aln_mpi() in the Python API.
- Parameters
alignment – Aligned sequences
- Returns
The mean pairwise identity
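To make the notion of mean pairwise identity concrete, here is a minimal standalone sketch. It is not the ViennaRNA implementation: `mean_pairwise_identity()` is a hypothetical helper, and it assumes that the integer result is a truncated percentage, that all rows have equal length, and that every column counts, including columns where both rows carry a gap.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a ViennaRNA function): mean pairwise identity
 * in percent over all pairs of rows of a NULL-terminated alignment.
 * Assumption: identical characters count as a match in every column,
 * gap characters included.
 */
int
mean_pairwise_identity(const char **alignment)
{
  unsigned long identical = 0;
  unsigned long compared  = 0;

  assert(alignment != NULL);

  for (size_t a = 0; alignment[a] != NULL; a++) {
    for (size_t b = a + 1; alignment[b] != NULL; b++) {
      /* walk both rows in parallel until either one ends */
      for (size_t col = 0;
           alignment[a][col] != '\0' && alignment[b][col] != '\0';
           col++) {
        if (alignment[a][col] == alignment[b][col])
          identical++;

        compared++;
      }
    }
  }

  /* truncated integer percentage; 0 if there was nothing to compare */
  return compared > 0 ? (int)(identical * 100 / compared) : 0;
}
```

For the three rows `GCAU`, `GCAA`, and `GCUU`, the pairs share 3, 3, and 2 of 4 columns, i.e. 8 of 12 comparisons, which this sketch reports as 66.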
-
vrna_pinfo_t *vrna_aln_pinfo(vrna_fold_compound_t *fc, const char *structure, double threshold)
- #include <ViennaRNA/utils/alignments.h>
Retrieve an array of vrna_pinfo_t structures from precomputed pair probabilities.
This array of structures contains information about position-wise pair probabilities, base pair entropy, and more.
See also
vrna_pinfo_t and vrna_pf()
- Parameters
fc – The vrna_fold_compound_t of type VRNA_FC_TYPE_COMPARATIVE with precomputed partition function matrices
structure – An optional structure in dot-bracket notation (may be NULL)
threshold – Do not include results with pair probabilities below threshold
- Returns
The vrna_pinfo_t array
-
int *vrna_aln_pscore(const char **alignment, vrna_md_t *md)
- #include <ViennaRNA/utils/alignments.h>
- SWIG Wrapper Notes:
This function is available as overloaded function aln_pscore() where the last parameter may be omitted, indicating md = NULL. See, e.g., RNA.aln_pscore() in the Python API.
-
int vrna_pscore(vrna_fold_compound_t *fc, unsigned int i, unsigned int j)
- #include <ViennaRNA/utils/alignments.h>
-
int vrna_pscore_freq(vrna_fold_compound_t *fc, const unsigned int *frequencies, unsigned int pairs)
- #include <ViennaRNA/utils/alignments.h>
-
char **vrna_aln_slice(const char **alignment, unsigned int i, unsigned int j)
- #include <ViennaRNA/utils/alignments.h>
Slice out a subalignment from a larger alignment.
See also
vrna_aln_free()
Note
The caller is responsible for freeing the memory occupied by the returned subalignment.
- Parameters
alignment – The input alignment
i – The first column of the subalignment (1-based)
j – The last column of the subalignment (1-based)
- Returns
The subalignment between column \(i\) and \(j\)
-
void vrna_aln_free(char **alignment)
- #include <ViennaRNA/utils/alignments.h>
Free memory occupied by a set of aligned sequences.
- Parameters
alignment – The input alignment
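The ownership rule described for vrna_aln_slice() and vrna_aln_free() can be illustrated with a standalone sketch. The functions `slice_columns()` and `free_rows()` below are hypothetical stand-ins written only with the C standard library; they assume a NULL-terminated array of rows that are each at least `j` characters long, and they are not the library's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for vrna_aln_free(): release each row, then the array. */
void
free_rows(char **alignment)
{
  if (alignment == NULL)
    return;

  for (size_t s = 0; alignment[s] != NULL; s++)
    free(alignment[s]);

  free(alignment);
}

/* Hypothetical stand-in for vrna_aln_slice(): copy columns i..j (1-based, inclusive). */
char **
slice_columns(const char **alignment, unsigned int i, unsigned int j)
{
  size_t n = 0;

  assert(alignment != NULL && i >= 1 && i <= j);

  while (alignment[n] != NULL)
    n++;

  char **sub = malloc((n + 1) * sizeof *sub);
  if (sub == NULL)
    return NULL;

  /* keep the array NULL-terminated at all times so cleanup is always safe */
  for (size_t s = 0; s <= n; s++)
    sub[s] = NULL;

  size_t width = (size_t)(j - i) + 1;

  for (size_t s = 0; s < n; s++) {
    assert(strlen(alignment[s]) >= j);

    sub[s] = malloc(width + 1);
    if (sub[s] == NULL) {
      free_rows(sub);
      return NULL;
    }

    memcpy(sub[s], alignment[s] + (i - 1), width);
    sub[s][width] = '\0';
  }

  return sub;
}
```

A caller would pair the two calls, for example `char **sub = slice_columns(aln, 3, 5);` followed by `free_rows(sub);` once the subalignment is no longer needed.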
-
char **vrna_aln_uppercase(const char **alignment)
- #include <ViennaRNA/utils/alignments.h>
Create a copy of an alignment with only uppercase letters in the sequences.
See also
vrna_aln_copy()
- Parameters
alignment – The input sequence alignment (the last entry must be NULL)
- Returns
A copy of the input alignment where lowercase sequence letters are replaced by uppercase letters
-
char **vrna_aln_toRNA(const char **alignment)
- #include <ViennaRNA/utils/alignments.h>
Create a copy of an alignment where DNA alphabet is replaced by RNA alphabet.
See also
vrna_aln_copy()
- Parameters
alignment – The input sequence alignment (the last entry must be NULL)
- Returns
A copy of the input alignment where DNA alphabet is replaced by RNA alphabet (T -> U)
-
char **vrna_aln_copy(const char **alignment, unsigned int options)
- #include <ViennaRNA/utils/alignments.h>
Make a copy of a multiple sequence alignment.
This function creates a copy of a multiple sequence alignment. The options parameter additionally allows for sequence manipulation, such as conversion from DNA to RNA alphabet and conversion to uppercase letters.
- Parameters
alignment – The input sequence alignment (the last entry must be NULL)
options – Option flags indicating whether the aligned sequences should be converted
- Returns
A (manipulated) copy of the input alignment
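The idea of copying an alignment while applying conversions selected by bit flags can be sketched as follows. This is an illustration only: `copy_alignment()` and the `EX_ALN_*` flags are hypothetical names whose values are not the VRNA_ALN_* constants, and the sketch covers just the DNA-to-RNA and uppercase conversions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flags for this sketch; they are NOT the VRNA_ALN_* values. */
#define EX_ALN_DEFAULT    0U
#define EX_ALN_RNA        1U
#define EX_ALN_UPPERCASE  2U

/* Hypothetical helper: deep copy of a NULL-terminated alignment with optional conversions. */
char **
copy_alignment(const char **alignment, unsigned int options)
{
  size_t n = 0;

  assert(alignment != NULL);

  while (alignment[n] != NULL)
    n++;

  char **copy = malloc((n + 1) * sizeof *copy);
  if (copy == NULL)
    return NULL;

  for (size_t s = 0; s < n; s++) {
    size_t len = strlen(alignment[s]);

    copy[s] = malloc(len + 1);
    if (copy[s] == NULL) {
      /* undo the rows allocated so far */
      while (s > 0)
        free(copy[--s]);

      free(copy);
      return NULL;
    }

    /* k <= len so that the terminating '\0' is copied as well */
    for (size_t k = 0; k <= len; k++) {
      char c = alignment[s][k];

      if (options & EX_ALN_UPPERCASE)
        c = (char)toupper((unsigned char)c);

      if (options & EX_ALN_RNA) {
        if (c == 'T')
          c = 'U';
        else if (c == 't')
          c = 'u';
      }

      copy[s][k] = c;
    }
  }

  copy[n] = NULL;
  return copy;
}
```

Combining flags with bitwise OR, as in `copy_alignment(aln, EX_ALN_RNA | EX_ALN_UPPERCASE)`, applies both conversions in one pass, while `EX_ALN_DEFAULT` yields an unmodified copy.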
-
float *vrna_aln_conservation_struct(const char **alignment, const char *structure, const vrna_md_t *md)
- #include <ViennaRNA/utils/alignments.h>
Compute base pair conservation of a consensus structure.
This function computes the base pair conservation (fraction of canonical base pairs) of a consensus structure given a multiple sequence alignment. The base pair types that are considered canonical may be specified using the vrna_md_t.pair array. Passing NULL as parameter md results in default pairing rules, i.e. canonical Watson-Crick and GU wobble pairs.
- SWIG Wrapper Notes:
This function is available as overloaded function aln_conservation_struct() where the last parameter md may be omitted, indicating md = NULL. See, e.g., RNA.aln_conservation_struct() in the Python API.
- Parameters
alignment – The input sequence alignment (the last entry must be NULL)
structure – The consensus structure in dot-bracket notation
md – Model details that specify compatible base pairs (may be NULL)
- Returns
A 1-based vector of base pair conservations
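The computation described above, the fraction of sequences that form a canonical pair at each consensus base pair, can be sketched without the library. `pair_conservation()` and `is_canonical()` are hypothetical helpers. The sketch assumes uppercase RNA letters, the default pairing rules (Watson-Crick plus GU wobble), rows of the same length as the structure, and that both positions of a pair receive the same value while unpaired positions stay at 0; the last point in particular is an assumption about the layout of the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Default pairing rules assumed here: Watson-Crick plus GU wobble, uppercase RNA. */
static int
is_canonical(char a, char b)
{
  return (a == 'A' && b == 'U') || (a == 'U' && b == 'A') ||
         (a == 'G' && b == 'C') || (a == 'C' && b == 'G') ||
         (a == 'G' && b == 'U') || (a == 'U' && b == 'G');
}

/*
 * Hypothetical helper: returns a 1-based vector with strlen(structure) + 1
 * entries (index 0 is unused). The caller frees it with free().
 */
float *
pair_conservation(const char **alignment, const char *structure)
{
  assert(alignment != NULL && alignment[0] != NULL && structure != NULL);

  size_t  n     = strlen(structure);
  float   *cons = calloc(n + 1, sizeof *cons);
  size_t  *open = malloc((n + 1) * sizeof *open); /* stack of open brackets */

  if (cons == NULL || open == NULL) {
    free(cons);
    free(open);
    return NULL;
  }

  size_t top = 0;

  for (size_t j = 1; j <= n; j++) {
    if (structure[j - 1] == '(') {
      open[top++] = j;
    } else if (structure[j - 1] == ')' && top > 0) {
      size_t i = open[--top]; /* (i, j) is a consensus base pair */
      size_t n_seq = 0, n_pair = 0;

      for (size_t s = 0; alignment[s] != NULL; s++) {
        if (is_canonical(alignment[s][i - 1], alignment[s][j - 1]))
          n_pair++;

        n_seq++;
      }

      cons[i] = cons[j] = (float)n_pair / (float)n_seq;
    }
  }

  free(open);
  return cons;
}
```

The bracket stack recovers the pairs of the consensus structure in a single left-to-right pass, so no separate pair table is needed for this illustration.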
-
float *vrna_aln_conservation_col(const char **alignment, const vrna_md_t *md_p, unsigned int options)
- #include <ViennaRNA/utils/alignments.h>
Compute nucleotide conservation in an alignment.
This function computes the conservation of nucleotides in alignment columns. The simplest measure is Shannon entropy, which can be selected by passing the VRNA_MEASURE_SHANNON_ENTROPY flag in the options parameter.
- SWIG Wrapper Notes:
This function is available as overloaded function aln_conservation_col() where the last two parameters may be omitted, indicating md = NULL and options = VRNA_MEASURE_SHANNON_ENTROPY, respectively. See, e.g., RNA.aln_conservation_col() in the Python API.
See also
VRNA_MEASURE_SHANNON_ENTROPY
Note
Currently, only VRNA_MEASURE_SHANNON_ENTROPY is supported as conservation measure.
- Parameters
alignment – The input sequence alignment (the last entry must be NULL)
md_p – Model details that specify known nucleotides (may be NULL)
options – A flag indicating which measure of conservation should be applied
- Returns
A 1-based vector of column conservations
-
char *vrna_aln_consensus_sequence(const char **alignment, const vrna_md_t *md_p)
- #include <ViennaRNA/utils/alignments.h>
Compute the consensus sequence for a given multiple sequence alignment.
- SWIG Wrapper Notes:
This function is available as overloaded function aln_consensus_sequence() where the last parameter may be omitted, indicating md = NULL. See, e.g., RNA.aln_consensus_sequence() in the Python API.
- Parameters
alignment – The input sequence alignment (the last entry must be NULL)
md_p – Model details that specify known nucleotides (may be NULL)
- Returns
The consensus sequence of the alignment, i.e. the most frequent nucleotide for each alignment column
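The definition given in the return value, the most frequent character per column, can be illustrated with a short standalone sketch. `consensus_by_majority()` is a hypothetical helper, not the library routine: it assumes rows of equal length, treats gap characters like any other character, and breaks ties in favour of the character with the lowest character code, which is an arbitrary choice made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: majority-vote consensus of a NULL-terminated alignment.
 * The caller frees the returned string with free().
 */
char *
consensus_by_majority(const char **alignment)
{
  assert(alignment != NULL && alignment[0] != NULL);

  size_t  n     = strlen(alignment[0]);
  char    *cons = malloc(n + 1);

  if (cons == NULL)
    return NULL;

  for (size_t col = 0; col < n; col++) {
    unsigned int counts[256] = { 0 };

    /* tally the characters of this column */
    for (size_t s = 0; alignment[s] != NULL; s++)
      counts[(unsigned char)alignment[s][col]]++;

    /* strict '>' keeps the lowest character code when counts are tied */
    size_t best = 0;
    for (size_t c = 1; c < 256; c++)
      if (counts[c] > counts[best])
        best = c;

    cons[col] = (char)best;
  }

  cons[n] = '\0';
  return cons;
}
```

For the rows `GCAU`, `GGAA`, and `GGCA`, the column majorities are G, G, A, and A, so the sketch yields `GGAA`.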
-
char *vrna_aln_consensus_mis(const char **alignment, const vrna_md_t *md_p)
- #include <ViennaRNA/utils/alignments.h>
Compute the Most Informative Sequence (MIS) for a given multiple sequence alignment.
The most informative sequence (MIS) [Freyhult et al., 2005] displays for each alignment column the nucleotides with frequency greater than the background frequency, projected into IUPAC notation. Columns where gaps are over-represented are in lower case.
- SWIG Wrapper Notes:
This function is available as overloaded function aln_consensus_mis() where the last parameter may be omitted, indicating md = NULL. See, e.g., RNA.aln_consensus_mis() in the Python API.
- Parameters
alignment – The input sequence alignment (the last entry must be NULL)
md_p – Model details that specify known nucleotides (may be NULL)
- Returns
The most informative sequence for the alignment
-
struct vrna_pinfo_s
- #include <ViennaRNA/utils/alignments.h>
A base pair info structure.
For each base pair (i,j) with i,j in [0, n-1], the structure lists:
its probability ‘p’
an entropy-like measure for its well-definedness ‘ent’
the frequency of each type of pair in ‘bp[]’
‘bp[0]’ contains the number of non-compatible sequences
‘bp[1]’ the number of CG pairs, etc.