RNAlib-2.4.14
Multiple Sequence Alignment Utilities

Functions to extract features from and to manipulate multiple sequence alignments.

Detailed Description

Functions to extract features from and to manipulate multiple sequence alignments.

Modules

 Deprecated Interface for Multiple Sequence Alignment Utilities
 

Files

file  alignments.h
 Various utility- and helper-functions for sequence alignments and comparative structure prediction.
 

Data Structures

struct  vrna_pinfo_s
 A base pair info structure.
 

Macros

#define VRNA_ALN_DEFAULT   0U
 Use default alignment settings.
 
#define VRNA_ALN_RNA   1U
 Convert to RNA alphabet.
 
#define VRNA_ALN_DNA   2U
 Convert to DNA alphabet.
 
#define VRNA_ALN_UPPERCASE   4U
 Convert to uppercase nucleotide letters.
 
#define VRNA_ALN_LOWERCASE   8U
 Convert to lowercase nucleotide letters.
 
#define VRNA_MEASURE_SHANNON_ENTROPY   1U
 Flag indicating Shannon Entropy measure.
 

Typedefs

typedef struct vrna_pinfo_s vrna_pinfo_t
 Typename for the data structure vrna_pinfo_s, which represents base pair info.
 

Functions

int vrna_aln_mpi (const char **alignment)
 Get the mean pairwise identity of an alignment.
 
vrna_pinfo_t * vrna_aln_pinfo (vrna_fold_compound_t *vc, const char *structure, double threshold)
 Retrieve an array of vrna_pinfo_t structures from precomputed pair probabilities.
 
char ** vrna_aln_slice (const char **alignment, unsigned int i, unsigned int j)
 Slice out a subalignment from a larger alignment.
 
void vrna_aln_free (char **alignment)
 Free memory occupied by a set of aligned sequences.
 
char ** vrna_aln_uppercase (const char **alignment)
 Create a copy of an alignment with only uppercase letters in the sequences.
 
char ** vrna_aln_toRNA (const char **alignment)
 Create a copy of an alignment where DNA alphabet is replaced by RNA alphabet.
 
char ** vrna_aln_copy (const char **alignment, unsigned int options)
 Make a copy of a multiple sequence alignment.
 
float * vrna_aln_conservation_struct (const char **alignment, const char *structure, const vrna_md_t *md)
 Compute base pair conservation of a consensus structure.
 
float * vrna_aln_conservation_col (const char **alignment, const vrna_md_t *md_p, unsigned int options)
 Compute nucleotide conservation in an alignment.
 
char * vrna_aln_consensus_sequence (const char **alignment, const vrna_md_t *md_p)
 Compute the consensus sequence for a given multiple sequence alignment.
 
char * vrna_aln_consensus_mis (const char **alignment, const vrna_md_t *md_p)
 Compute the Most Informative Sequence (MIS) for a given multiple sequence alignment.
 

Data Structure Documentation

struct vrna_pinfo_s

A base pair info structure.

For each base pair (i,j) with i,j in [0, n-1] the structure lists:

  • its probability 'p'
  • an entropy-like measure for its well-definedness 'ent'
  • the frequency of each type of pair in 'bp[]'
    • 'bp[0]' contains the number of non-compatible sequences
    • 'bp[1]' the number of CG pairs, etc.

Data Fields

unsigned i
 nucleotide position i
 
unsigned j
 nucleotide position j
 
float p
 Probability.
 
float ent
 Pseudo entropy for $ p(i,j) = S_i + S_j - p_{ij} \ln(p_{ij}) $.
 
short bp [8]
 Frequencies of pair_types.
 
char comp
 1 iff pair is in mfe structure
 

Macro Definition Documentation

#define VRNA_MEASURE_SHANNON_ENTROPY   1U

#include <ViennaRNA/utils/alignments.h>

Flag indicating Shannon Entropy measure.

Shannon Entropy is defined as $ H = - \sum_c p_c \cdot \log_2 p_c $
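
As a worked example of the definition above, consider a single alignment column holding the nucleotides A, A, A, C. The column is purely illustrative and not taken from the library; it gives $ p_A = 3/4 $ and $ p_C = 1/4 $:

```latex
\begin{aligned}
H &= -\left( \tfrac{3}{4} \log_2 \tfrac{3}{4} + \tfrac{1}{4} \log_2 \tfrac{1}{4} \right) \\
  &\approx \tfrac{3}{4} \cdot 0.415 + \tfrac{1}{4} \cdot 2 \\
  &\approx 0.811 \text{ bits}
\end{aligned}
```

A fully conserved column yields $ H = 0 $, and a column in which all four nucleotides are equally frequent yields $ H = 2 $ bits, so lower values indicate stronger conservation.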

Function Documentation

int vrna_aln_mpi ( const char **  alignment)

#include <ViennaRNA/utils/alignments.h>

Get the mean pairwise identity of an alignment.

Parameters
alignment  Aligned sequences
Returns
The mean pairwise identity
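
The following minimal sketch illustrates what a mean pairwise identity computation can look like. It is not the library's implementation: mean_pairwise_identity() is a hypothetical helper, and the choices to report an integer percentage (suggested by the int return type), to compare every column of every sequence pair, and to treat gap characters like any other symbol are assumptions. All sequences are assumed to have the same length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of RNAlib): mean pairwise identity of a
 * NULL-terminated alignment, expressed as an integer percentage.
 * Assumptions: all sequences have equal length, and a gap character is
 * compared like any other symbol.
 */
int
mean_pairwise_identity(const char **alignment)
{
  size_t        n_seq = 0, length, s, t, col;
  unsigned long identical = 0, compared = 0;

  assert(alignment != NULL);

  /* count the sequences up to the terminating NULL entry */
  while (alignment[n_seq] != NULL)
    n_seq++;

  if (n_seq < 2)
    return 0; /* no sequence pairs to compare */

  length = strlen(alignment[0]);

  /* visit every unordered pair (s, t) and every column */
  for (s = 0; s + 1 < n_seq; s++)
    for (t = s + 1; t < n_seq; t++)
      for (col = 0; col < length; col++) {
        if (alignment[s][col] == alignment[t][col])
          identical++;
        compared++;
      }

  return compared > 0 ? (int)(identical * 100 / compared) : 0;
}
```

For the three sequences "GCAU", "GCAU" and "GCGU", 10 of the 12 compared positions are identical, so this sketch yields 83.
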
vrna_pinfo_t* vrna_aln_pinfo ( vrna_fold_compound_t *  vc,
const char *  structure,
double  threshold 
)

#include <ViennaRNA/utils/alignments.h>

Retrieve an array of vrna_pinfo_t structures from precomputed pair probabilities.

This array of structures contains information about positionwise pair probabilities, base pair entropy, and more.

See also
vrna_pinfo_t, and vrna_pf()
Parameters
vc  The vrna_fold_compound_t of type VRNA_FC_TYPE_COMPARATIVE with precomputed partition function matrices
structure  An optional structure in dot-bracket notation (may be NULL)
threshold  Do not include results with pair probabilities below threshold
Returns
The vrna_pinfo_t array
char** vrna_aln_slice ( const char **  alignment,
unsigned int  i,
unsigned int  j 
)

#include <ViennaRNA/utils/alignments.h>

Slice out a subalignment from a larger alignment.

Note
The caller is responsible for freeing the memory occupied by the returned subalignment.
See also
vrna_aln_free()
Parameters
alignment  The input alignment
i  The first column of the subalignment (1-based)
j  The last column of the subalignment (1-based)
Returns
The subalignment between columns $i$ and $j$
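
The 1-based, inclusive column convention can be illustrated with a small stand-alone sketch. slice_alignment() and free_alignment() below are hypothetical helpers, not the library functions; returning NULL for an invalid range and assuming that all sequences are at least j characters long are choices made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: release a NULL-terminated array of strings. */
void
free_alignment(char **alignment)
{
  size_t s;

  if (alignment == NULL)
    return;

  for (s = 0; alignment[s] != NULL; s++)
    free(alignment[s]);

  free(alignment);
}

/*
 * Hypothetical helper (not part of RNAlib): copy columns i..j (1-based,
 * inclusive) of a NULL-terminated alignment into a new NULL-terminated
 * array. Returns NULL for an invalid range or on allocation failure.
 */
char **
slice_alignment(const char **alignment, unsigned int i, unsigned int j)
{
  size_t n_seq = 0, s, width;
  char   **sub;

  assert(alignment != NULL);

  if (alignment[0] == NULL)
    return NULL;

  if (i < 1 || i > j || j > strlen(alignment[0]))
    return NULL;

  while (alignment[n_seq] != NULL)
    n_seq++;

  width = (size_t)(j - i) + 1;
  sub   = calloc(n_seq + 1, sizeof *sub); /* last entry stays NULL */
  if (sub == NULL)
    return NULL;

  for (s = 0; s < n_seq; s++) {
    sub[s] = malloc(width + 1);
    if (sub[s] == NULL) {
      free_alignment(sub);
      return NULL;
    }

    /* column i lives at string offset i - 1 */
    memcpy(sub[s], alignment[s] + (i - 1), width);
    sub[s][width] = '\0';
  }

  return sub;
}
```

With RNAlib itself, the corresponding call is vrna_aln_slice(alignment, 2, 4), and the result is released with vrna_aln_free().
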
void vrna_aln_free ( char **  alignment)

#include <ViennaRNA/utils/alignments.h>

Free memory occupied by a set of aligned sequences.

Parameters
alignment  The input alignment
char** vrna_aln_uppercase ( const char **  alignment)

#include <ViennaRNA/utils/alignments.h>

Create a copy of an alignment with only uppercase letters in the sequences.

See also
vrna_aln_copy
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
Returns
A copy of the input alignment where lowercase sequence letters are replaced by uppercase letters
char** vrna_aln_toRNA ( const char **  alignment)

#include <ViennaRNA/utils/alignments.h>

Create a copy of an alignment where DNA alphabet is replaced by RNA alphabet.

See also
vrna_aln_copy
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
Returns
A copy of the input alignment where DNA alphabet is replaced by RNA alphabet (T -> U)
char** vrna_aln_copy ( const char **  alignment,
unsigned int  options 
)

#include <ViennaRNA/utils/alignments.h>

Make a copy of a multiple sequence alignment.

This function allows one to create a copy of a multiple sequence alignment. The options parameter additionally allows for sequence manipulation, such as converting DNA to RNA alphabet, and conversion to uppercase letters.

See also
vrna_aln_copy(), VRNA_ALN_RNA, VRNA_ALN_UPPERCASE, VRNA_ALN_DEFAULT
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
options  Option flags indicating whether the aligned sequences should be converted
Returns
A (manipulated) copy of the input alignment
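
The option flags are bit values that can be combined with bitwise OR. The sketch below shows how such a flag-driven copy might work. It is a stand-alone illustration, not the library's implementation: the ALN_* constants are local stand-ins that mirror the documented values of the VRNA_ALN_* macros, copy_alignment() and convert_char() are hypothetical names, and the precedence of uppercase over lowercase and of RNA over DNA when both flags are set is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins mirroring the documented VRNA_ALN_* flag values */
#define ALN_DEFAULT    0U
#define ALN_RNA        1U
#define ALN_DNA        2U
#define ALN_UPPERCASE  4U
#define ALN_LOWERCASE  8U

/* Apply the conversions selected in `options` to a single character. */
static char
convert_char(char c, unsigned int options)
{
  /* assumption: uppercase wins if both case flags are set */
  if (options & ALN_UPPERCASE)
    c = (char)toupper((unsigned char)c);
  else if (options & ALN_LOWERCASE)
    c = (char)tolower((unsigned char)c);

  /* assumption: RNA wins if both alphabet flags are set */
  if (options & ALN_RNA) {
    if (c == 'T')
      c = 'U';
    else if (c == 't')
      c = 'u';
  } else if (options & ALN_DNA) {
    if (c == 'U')
      c = 'T';
    else if (c == 'u')
      c = 't';
  }

  return c;
}

/* Hypothetical helper: copy a NULL-terminated alignment, converting on the fly. */
char **
copy_alignment(const char **alignment, unsigned int options)
{
  size_t n_seq = 0, s, k, len;
  char   **copy;

  assert(alignment != NULL);

  while (alignment[n_seq] != NULL)
    n_seq++;

  copy = calloc(n_seq + 1, sizeof *copy); /* last entry stays NULL */
  if (copy == NULL)
    return NULL;

  for (s = 0; s < n_seq; s++) {
    len     = strlen(alignment[s]);
    copy[s] = malloc(len + 1);
    if (copy[s] == NULL) {
      while (s > 0)
        free(copy[--s]);
      free(copy);
      return NULL;
    }

    /* k <= len also copies the terminating '\0', which is left unchanged */
    for (k = 0; k <= len; k++)
      copy[s][k] = convert_char(alignment[s][k], options);
  }

  return copy;
}
```

With RNAlib itself, the corresponding call is vrna_aln_copy(alignment, VRNA_ALN_RNA | VRNA_ALN_UPPERCASE).
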
float * vrna_aln_conservation_struct ( const char **  alignment,
const char *  structure,
const vrna_md_t *  md 
)

#include <ViennaRNA/utils/alignments.h>

Compute base pair conservation of a consensus structure.

This function computes the base pair conservation (fraction of canonical base pairs) of a consensus structure given a multiple sequence alignment. The base pair types that are considered canonical may be specified using the vrna_md_t.pair array. Passing NULL as parameter md results in default pairing rules, i.e. canonical Watson-Crick and GU Wobble pairs.

Parameters
alignment  The input sequence alignment (the last entry must be NULL)
structure  The consensus structure in dot-bracket notation
md  Model details that specify compatible base pairs (may be NULL)
Returns
A 1-based vector of base pair conservations
SWIG Wrapper Notes:
This function is available in an overloaded form where the last parameter may be omitted, indicating md = NULL
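
The idea of base pair conservation can be sketched without the library as follows. pair_conservation() and is_canonical() are hypothetical helpers that hard-code the default pairing rules (Watson-Crick and GU wobble pairs on uppercase RNA letters) instead of consulting a vrna_md_t. Assigning the same value to both positions of a pair, leaving unpaired positions at 0, and returning NULL for an unbalanced structure are assumptions of this example, not documented library behavior.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if (a, b) is a Watson-Crick or GU wobble pair (uppercase RNA). */
static int
is_canonical(char a, char b)
{
  static const char *pairs[] = { "AU", "UA", "GC", "CG", "GU", "UG" };
  size_t            k;

  for (k = 0; k < sizeof pairs / sizeof pairs[0]; k++)
    if (pairs[k][0] == a && pairs[k][1] == b)
      return 1;

  return 0;
}

/*
 * Hypothetical helper (not part of RNAlib): for each base pair (i, j) of
 * the dot-bracket `structure`, compute the fraction of sequences that
 * form a canonical pair at these columns. The result is a 1-based vector
 * (index 0 unused). Returns NULL on an unbalanced structure, an empty
 * alignment, or allocation failure.
 */
float *
pair_conservation(const char **alignment, const char *structure)
{
  size_t n, n_seq = 0, top = 0, i, j, s;
  size_t *stack;
  float  *cons;

  assert(alignment != NULL && structure != NULL);

  n = strlen(structure);
  while (alignment[n_seq] != NULL)
    n_seq++;

  if (n_seq == 0)
    return NULL;

  stack = malloc((n + 1) * sizeof *stack);  /* open-bracket positions */
  cons  = calloc(n + 1, sizeof *cons);      /* unpaired positions stay 0 */
  if (stack == NULL || cons == NULL) {
    free(stack);
    free(cons);
    return NULL;
  }

  for (j = 1; j <= n; j++) {
    if (structure[j - 1] == '(') {
      stack[top++] = j;
    } else if (structure[j - 1] == ')') {
      unsigned int count = 0;

      if (top == 0) { /* closing bracket without a partner */
        free(stack);
        free(cons);
        return NULL;
      }

      i = stack[--top];
      for (s = 0; s < n_seq; s++)
        if (is_canonical(alignment[s][i - 1], alignment[s][j - 1]))
          count++;

      cons[i] = cons[j] = (float)count / (float)n_seq;
    }
  }

  free(stack);

  if (top != 0) { /* unmatched opening bracket */
    free(cons);
    return NULL;
  }

  return cons;
}
```

For the structure "((...))" and the sequences "GCAAAGC", "GGAAACC" and "GAAAAAC", the outer pair (1,7) is canonical in all three sequences and the inner pair (2,6) in two of them, giving conservations of 1 and 2/3, respectively.
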
float * vrna_aln_conservation_col ( const char **  alignment,
const vrna_md_t *  md,
unsigned int  options 
)

#include <ViennaRNA/utils/alignments.h>

Compute nucleotide conservation in an alignment.

This function computes the conservation of nucleotides in alignment columns. The simplest measure is Shannon Entropy, which can be selected by passing the VRNA_MEASURE_SHANNON_ENTROPY flag in the options parameter.

Note
Currently, only VRNA_MEASURE_SHANNON_ENTROPY is supported as conservation measure.
See also
VRNA_MEASURE_SHANNON_ENTROPY
Parameters
alignment  The input sequence alignment (the last entry must be NULL)
md  Model details that specify known nucleotides (may be NULL)
options  A flag indicating which measure of conservation should be applied
Returns
A 1-based vector of column conservations
SWIG Wrapper Notes:
This function is available in an overloaded form where the last two parameters may be omitted, indicating md = NULL, and options = VRNA_MEASURE_SHANNON_ENTROPY, respectively.
char* vrna_aln_consensus_sequence ( const char **  alignment,
const vrna_md_t *  md_p 
)

#include <ViennaRNA/utils/alignments.h>

Compute the consensus sequence for a given multiple sequence alignment.

Parameters
alignment  The input sequence alignment (the last entry must be NULL)
md_p  Model details that specify known nucleotides (may be NULL)
Returns
The consensus sequence of the alignment, i.e. the most frequent nucleotide for each alignment column
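
A per-column majority vote can be sketched as follows. consensus_sequence() is a hypothetical stand-alone helper, not the library function. It ignores model details, counts gap characters like any other symbol, and resolves ties in favor of the character seen first when scanning from the first sequence; all three points are assumptions of this example. All sequences are assumed to have the same length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of RNAlib): build a string holding the
 * most frequent character of each alignment column.
 * Assumptions: gaps count like any other symbol, and ties go to the
 * character that appears first when scanning from the first sequence.
 */
char *
consensus_sequence(const char **alignment)
{
  size_t n_seq = 0, length, col, s;
  char   *consensus;

  assert(alignment != NULL);

  while (alignment[n_seq] != NULL)
    n_seq++;

  if (n_seq == 0)
    return NULL;

  length    = strlen(alignment[0]);
  consensus = malloc(length + 1);
  if (consensus == NULL)
    return NULL;

  for (col = 0; col < length; col++) {
    unsigned int  counts[256] = { 0 };
    unsigned char best        = (unsigned char)alignment[0][col];

    /* tally every character of this column */
    for (s = 0; s < n_seq; s++)
      counts[(unsigned char)alignment[s][col]]++;

    /* strict '>' keeps the earliest character on ties */
    for (s = 1; s < n_seq; s++) {
      unsigned char c = (unsigned char)alignment[s][col];

      if (counts[c] > counts[best])
        best = c;
    }

    consensus[col] = (char)best;
  }

  consensus[length] = '\0';

  return consensus;
}
```

For the sequences "GCAUGC", "GCAUGG" and "GAAUCC", this sketch returns "GCAUGC". The caller frees the returned string with free().
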
char* vrna_aln_consensus_mis ( const char **  alignment,
const vrna_md_t *  md_p 
)

#include <ViennaRNA/utils/alignments.h>

Compute the Most Informative Sequence (MIS) for a given multiple sequence alignment.

The most informative sequence (MIS) [9] displays for each alignment column the nucleotides with frequency greater than the background frequency, projected into IUPAC notation. Columns where gaps are over-represented are in lower case.

Parameters
alignment  The input sequence alignment (the last entry must be NULL)
md_p  Model details that specify known nucleotides (may be NULL)
Returns
The most informative sequence for the alignment