RNAlib-2.4.14
(Nucleic Acid Sequence) String Utilities

Functions to parse, convert, manipulate, create, and compare (nucleic acid sequence) strings.

Detailed Description

Functions to parse, convert, manipulate, create, and compare (nucleic acid sequence) strings.

Files

file  strings.h
 General utility and helper functions for RNA sequence and structure strings used throughout the ViennaRNA Package.
 

Macros

#define XSTR(s)   STR(s)
 Stringify a macro after expansion.
 
#define STR(s)   #s
 Stringify a macro argument.
 
#define FILENAME_MAX_LENGTH   80
 Maximum length of filenames that are generated by our programs.
 
#define FILENAME_ID_LENGTH   42
 Maximum length of the ID taken from the FASTA header for filename generation.
 

Functions

char * vrna_strdup_printf (const char *format,...)
 Safely create a formatted string.
 
char * vrna_strdup_vprintf (const char *format, va_list argp)
 Safely create a formatted string.
 
int vrna_strcat_printf (char **dest, const char *format,...)
 Safely append a formatted string to another string.
 
int vrna_strcat_vprintf (char **dest, const char *format, va_list args)
 Safely append a formatted string to another string.
 
char ** vrna_strsplit (const char *string, const char *delimiter)
 Split a string into tokens using a delimiting character.
 
char * vrna_random_string (int l, const char symbols[])
 Create a random string using characters from a specified symbol set.
 
int vrna_hamming_distance (const char *s1, const char *s2)
 Calculate the Hamming distance between two sequences.
 
int vrna_hamming_distance_bound (const char *s1, const char *s2, int n)
 Calculate the Hamming distance between two sequences up to a specified length.
 
void vrna_seq_toRNA (char *sequence)
 Convert an input sequence (possibly containing DNA alphabet characters) to RNA alphabet.
 
void vrna_seq_toupper (char *sequence)
 Convert an input sequence to uppercase.
 
char * vrna_seq_ungapped (const char *seq)
 Remove gap characters from a nucleotide sequence.
 
char * vrna_cut_point_insert (const char *string, int cp)
 Add a separating '&' character into a string according to cut-point position.
 
char * vrna_cut_point_remove (const char *string, int *cp)
 Remove a separating '&' character from a string.
 

Macro Definition Documentation

#define FILENAME_MAX_LENGTH   80

#include <ViennaRNA/utils/strings.h>

Maximum length of filenames that are generated by our programs.

This definition should be used throughout the complete ViennaRNA package wherever a static array holding filenames of output files is declared.

#define FILENAME_ID_LENGTH   42

#include <ViennaRNA/utils/strings.h>

Maximum length of the ID taken from the FASTA header for filename generation.

This has to be smaller than FILENAME_MAX_LENGTH since, in most cases, a suffix is appended to the ID.
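As a rough illustration of how the two length macros and the stringification macros can work together, the sketch below reads an ID from a FASTA header with a width-limited sscanf format and then builds an output filename from it. The helpers sketch_fasta_id() and sketch_output_name() and the "_ss.ps" suffix are hypothetical and not part of RNAlib; the macro values are copied from the definitions on this page.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Macro definitions copied from this page */
#define STR(s)              #s
#define XSTR(s)             STR(s)
#define FILENAME_MAX_LENGTH 80
#define FILENAME_ID_LENGTH  42

/*
 * Hypothetical helper: copy the ID that follows '>' in a FASTA header
 * into id, which must hold at least FILENAME_ID_LENGTH + 1 bytes.
 * Returns 1 on success, 0 if the line is not a FASTA header.
 */
int
sketch_fasta_id(const char *header, char *id)
{
  assert(header != NULL && id != NULL);
  /* XSTR() expands FILENAME_ID_LENGTH first, so the format is ">%42s" */
  return sscanf(header, ">%" XSTR(FILENAME_ID_LENGTH) "s", id) == 1;
}

/*
 * Hypothetical helper: build an output filename from an ID.
 * fname must hold FILENAME_MAX_LENGTH bytes; the suffix is only an example.
 * Returns the value of snprintf(), i.e. the untruncated length.
 */
int
sketch_output_name(const char *id, char *fname)
{
  assert(id != NULL && fname != NULL);
  return snprintf(fname, FILENAME_MAX_LENGTH, "%s_ss.ps", id);
}
```

Using STR(FILENAME_ID_LENGTH) instead would produce the literal text "FILENAME_ID_LENGTH", which is why the two-level XSTR() form is needed in the format string.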

Function Documentation

char* vrna_strdup_printf ( const char *  format,
  ... 
)

#include <ViennaRNA/utils/strings.h>

Safely create a formatted string.

This function is a safe implementation for creating a formatted character array, similar to sprintf. Internally, it uses the asprintf function, if available, to dynamically allocate a character array large enough to store the supplied content. If asprintf is not available, the function mimics its behavior using vsnprintf.

Note
The returned pointer of this function should always be passed to free() to release the allocated memory.
See also
vrna_strdup_vprintf(), vrna_strcat_printf()
Parameters
format  The format string (see also asprintf)
...  The list of variables used to fill the format string
Returns
The formatted, null-terminated string, or NULL if something has gone wrong
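The vsnprintf fallback mentioned above can be sketched as a two-pass approach: measure the formatted length first, then allocate and format. This is a minimal standalone illustration, not the RNAlib source; the sketch_ names are hypothetical, and a C99 compiler is assumed for va_copy and the NULL-buffer form of vsnprintf.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical va_list variant: returns a malloc'ed string or NULL. */
char *
sketch_strdup_vprintf(const char *format, va_list argp)
{
  va_list copy;
  int     len;
  char    *result;

  assert(format != NULL);

  /* pass 1: measure on a copy, because vsnprintf() consumes the list */
  va_copy(copy, argp);
  len = vsnprintf(NULL, 0, format, copy);
  va_end(copy);
  if (len < 0)
    return NULL;

  result = malloc((size_t)len + 1);
  if (result == NULL)
    return NULL;

  /* pass 2: write the formatted text into the exactly sized buffer */
  if (vsnprintf(result, (size_t)len + 1, format, argp) != len) {
    free(result);
    return NULL;
  }

  return result;
}

/* Hypothetical variadic front end, mirroring the printf/vprintf pairing. */
char *
sketch_strdup_printf(const char *format, ...)
{
  va_list argp;
  char    *result;

  va_start(argp, format);
  result = sketch_strdup_vprintf(format, argp);
  va_end(argp);

  return result;
}
```

As with the documented function, the caller owns the result and should pass it to free().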
char* vrna_strdup_vprintf ( const char *  format,
va_list  argp 
)

#include <ViennaRNA/utils/strings.h>

Safely create a formatted string.

This function is the va_list version of vrna_strdup_printf().

Note
The returned pointer of this function should always be passed to free() to release the allocated memory.
See also
vrna_strdup_printf(), vrna_strcat_printf(), vrna_strcat_vprintf()
Parameters
format  The format string (see also asprintf)
argp  The list of arguments to fill the format string
Returns
The formatted, null-terminated string, or NULL if something has gone wrong
int vrna_strcat_printf ( char **  dest,
const char *  format,
  ... 
)

#include <ViennaRNA/utils/strings.h>

Safely append a formatted string to another string.

This function is a safe implementation for appending a formatted character array, similar to a combination of strcat and sprintf. The function automatically allocates enough memory to store both the previous content stored at dest and the appended format string. If the dest pointer is NULL, the function allocates memory only for the format string. The function returns the number of characters in the resulting string, or -1 in case of an error.

See also
vrna_strcat_vprintf(), vrna_strdup_printf(), vrna_strdup_vprintf()
Parameters
dest  The address of a char * pointer where the formatted string is to be appended
format  The format string (see also sprintf)
...  The list of variables used to fill the format string
Returns
The number of characters in the final string, or -1 on error
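The append behavior described above can be approximated with realloc and vsnprintf. The following is a standalone sketch rather than the RNAlib implementation; sketch_strcat_printf() is a hypothetical name, and the sketch assumes that *dest is either NULL or a pointer obtained from malloc/realloc.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: append formatted text to the heap string *dest.
 * Returns the length of the resulting string, or -1 on error
 * (in which case *dest is left unchanged).
 */
int
sketch_strcat_printf(char **dest, const char *format, ...)
{
  va_list argp;
  int     add;
  size_t  old_len;
  char    *buf;

  assert(dest != NULL && format != NULL);

  /* measure how many characters the formatted part needs */
  va_start(argp, format);
  add = vsnprintf(NULL, 0, format, argp);
  va_end(argp);
  if (add < 0)
    return -1;

  /* a NULL *dest is treated as an empty string */
  old_len = (*dest != NULL) ? strlen(*dest) : 0;

  /* realloc(NULL, n) behaves like malloc(n) */
  buf = realloc(*dest, old_len + (size_t)add + 1);
  if (buf == NULL)
    return -1;

  /* format directly behind the previous content */
  va_start(argp, format);
  vsnprintf(buf + old_len, (size_t)add + 1, format, argp);
  va_end(argp);

  *dest = buf;
  return (int)(old_len + (size_t)add);
}
```

Starting from a NULL pointer, repeated calls grow a single string, which the caller eventually releases with free().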
int vrna_strcat_vprintf ( char **  dest,
const char *  format,
va_list  args 
)

#include <ViennaRNA/utils/strings.h>

Safely append a formatted string to another string.

This function is the va_list version of vrna_strcat_printf().

See also
vrna_strcat_printf(), vrna_strdup_printf(), vrna_strdup_vprintf()
Parameters
dest  The address of a char * pointer where the formatted string is to be appended
format  The format string (see also sprintf)
args  The list of arguments to fill the format string
Returns
The number of characters in the final string, or -1 on error
char** vrna_strsplit ( const char *  string,
const char *  delimiter 
)

#include <ViennaRNA/utils/strings.h>

Split a string into tokens using a delimiting character.

This function splits a string into an array of strings using a single character that delimits the elements within the string. The default delimiter is the ampersand '&' and will be used when NULL is passed as a second argument. The returned list is NULL terminated, i.e. the last element is NULL. If the delimiter is not found, the returned list contains exactly one element: the input string.

For instance, the following code:

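/* split at the default delimiter '&' */
char **tok = vrna_strsplit("GGGG&CCCC&AAAAA", NULL);

for (char **ptr = tok; *ptr; ptr++) {
  printf("%s\n", *ptr);
  free(*ptr); /* release each token */
}
free(tok);    /* release the NULL-terminated array itself */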

produces this output:

GGGG
CCCC
AAAAA

and properly frees the memory occupied by the returned element array.

Note
This function internally uses strtok_r() and is therefore considered to be thread-safe. Also note that it is the user's responsibility to free the memory of the array and that of the individual element strings!
Parameters
string  The input string that should be split into elements
delimiter  The delimiting character. If NULL, the delimiter is "&"
Returns
A NULL-terminated list of the elements in the string
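To make the shape of the returned list concrete, here is a minimal standalone sketch of a single-character split that produces the same kind of NULL-terminated array. It is not the RNAlib implementation: it avoids strtok_r() and scans with strchr() instead, skips empty tokens as an assumption about the intended behavior, and the names sketch_strsplit() and sketch_free_tokens() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: free a NULL-terminated token list. */
void
sketch_free_tokens(char **tokens)
{
  if (tokens != NULL) {
    for (char **p = tokens; *p != NULL; p++)
      free(*p);
    free(tokens);
  }
}

/*
 * Hypothetical stand-in for a single-character split.
 * Returns a NULL-terminated list, or NULL on allocation failure.
 */
char **
sketch_strsplit(const char *string, char delim)
{
  size_t      max_tokens = 1, count = 0;
  char        **tokens;
  const char  *p;

  assert(string != NULL);

  /* upper bound on the token count: one more than the delimiters */
  for (p = string; *p != '\0'; p++)
    if (*p == delim)
      max_tokens++;

  tokens = malloc((max_tokens + 1) * sizeof *tokens);
  if (tokens == NULL)
    return NULL;

  p = string;
  while (*p != '\0') {
    const char  *end = strchr(p, delim);
    size_t      len  = end ? (size_t)(end - p) : strlen(p);

    if (len > 0) { /* assumption: empty tokens are skipped */
      tokens[count] = malloc(len + 1);
      if (tokens[count] == NULL) {
        /* tokens[count] is NULL here, so the list is terminated */
        sketch_free_tokens(tokens);
        return NULL;
      }
      memcpy(tokens[count], p, len);
      tokens[count][len] = '\0';
      count++;
    }

    p += len;
    if (*p != '\0')
      p++; /* step over the delimiter */
  }

  tokens[count] = NULL; /* terminate the list */
  return tokens;
}
```

A small cleanup helper such as sketch_free_tokens() keeps the two-level ownership (array plus individual strings) in one place.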
char* vrna_random_string ( int  l,
const char  symbols[] 
)

#include <ViennaRNA/utils/strings.h>

Create a random string using characters from a specified symbol set.

Parameters
l  The length of the sequence
symbols  The symbol set
Returns
A random string of length 'l' containing characters from the symbol set
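The idea of drawing each position independently from a symbol set can be sketched in a few lines. This standalone example uses rand() from the C standard library as a stand-in random source, which is an assumption and may differ from what RNAlib uses internally; sketch_random_string() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'ed string of length l whose
 * characters are drawn from symbols, or NULL on error.
 */
char *
sketch_random_string(int l, const char symbols[])
{
  size_t  n;
  char    *s;

  assert(symbols != NULL);

  n = strlen(symbols);
  if (l < 0 || n == 0)
    return NULL;

  s = malloc((size_t)l + 1);
  if (s == NULL)
    return NULL;

  /* rand() % n has a slight modulo bias, acceptable for a sketch */
  for (int i = 0; i < l; i++)
    s[i] = symbols[(size_t)rand() % n];

  s[l] = '\0';
  return s;
}
```

Passing "ACGU" as the symbol set yields a random RNA sequence; seeding with srand() makes the result reproducible.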
int vrna_hamming_distance ( const char *  s1,
const char *  s2 
)

#include <ViennaRNA/utils/strings.h>

Calculate the Hamming distance between two sequences.

Parameters
s1  The first sequence
s2  The second sequence
Returns
The Hamming distance between s1 and s2
int vrna_hamming_distance_bound ( const char *  s1,
const char *  s2,
int  n 
)

#include <ViennaRNA/utils/strings.h>

Calculate the Hamming distance between two sequences up to a specified length.

This function is similar to vrna_hamming_distance(), but instead of comparing both sequences up to their actual length, only the first 'n' characters are taken into account.

Parameters
s1  The first sequence
s2  The second sequence
n  The length of the subsequences to consider (starting from the 5' end)
Returns
The Hamming distance between the first n characters of s1 and s2
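The difference between the unbounded and the bounded variant is easiest to see side by side. The sketch below counts mismatching positions; it is a standalone illustration with hypothetical sketch_ names, and it assumes that comparison stops at the end of the shorter sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count mismatches over the common length. */
int
sketch_hamming(const char *s1, const char *s2)
{
  int d = 0;

  assert(s1 != NULL && s2 != NULL);

  for (; *s1 != '\0' && *s2 != '\0'; s1++, s2++)
    if (*s1 != *s2)
      d++;

  return d;
}

/* Hypothetical helper: same, but look at no more than n positions. */
int
sketch_hamming_bound(const char *s1, const char *s2, int n)
{
  int d = 0;

  assert(s1 != NULL && s2 != NULL);

  for (int i = 0; i < n && s1[i] != '\0' && s2[i] != '\0'; i++)
    if (s1[i] != s2[i])
      d++;

  return d;
}
```

For "GGGAAACCC" and "GGGUUUCCC", the unbounded sketch counts three mismatches, while a bound of n = 4 only sees the first of them.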
void vrna_seq_toRNA ( char *  sequence)

#include <ViennaRNA/utils/strings.h>

Convert an input sequence (possibly containing DNA alphabet characters) to RNA alphabet.

This function substitutes T and t with U and u, respectively.

Parameters
sequence  The sequence to be converted
void vrna_seq_toupper ( char *  sequence)

#include <ViennaRNA/utils/strings.h>

Convert an input sequence to uppercase.

Parameters
sequence  The sequence to be converted
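Both vrna_seq_toRNA() and vrna_seq_toupper() modify the caller's buffer in place and return nothing, so the input must be writable (not a string literal). The standalone sketch below mirrors the described behavior with hypothetical sketch_ names; it is an illustration, not the RNAlib source.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: replace T with U and t with u, in place. */
void
sketch_seq_toRNA(char *sequence)
{
  if (sequence == NULL)
    return;

  for (char *p = sequence; *p != '\0'; p++) {
    if (*p == 'T')
      *p = 'U';
    else if (*p == 't')
      *p = 'u';
  }
}

/* Hypothetical helper: convert every character to uppercase, in place. */
void
sketch_seq_toupper(char *sequence)
{
  if (sequence == NULL)
    return;

  /* the cast avoids undefined behavior for negative char values */
  for (char *p = sequence; *p != '\0'; p++)
    *p = (char)toupper((unsigned char)*p);
}
```

Applied to a writable array holding "acgtTGCA", the first sketch yields "acguUGCA" and the second then yields "ACGUUGCA".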
char* vrna_seq_ungapped ( const char *  seq)

#include <ViennaRNA/utils/strings.h>

Remove gap characters from a nucleotide sequence.

Parameters
seq  The original, null-terminated nucleotide sequence
Returns
A copy of the input sequence with all gap characters removed
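Unlike the in-place conversions above, this function returns a new string. A minimal standalone sketch of such a filter follows. The gap alphabet "-_~." used here is an assumption, since this page does not list the characters that count as gaps, and sketch_seq_ungapped() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'ed copy of seq without gap
 * characters, or NULL on allocation failure.
 */
char *
sketch_seq_ungapped(const char *seq)
{
  static const char gaps[] = "-_~."; /* assumed gap alphabet */
  size_t            j = 0;
  char              *out;

  assert(seq != NULL);

  /* the result can never be longer than the input */
  out = malloc(strlen(seq) + 1);
  if (out == NULL)
    return NULL;

  for (size_t i = 0; seq[i] != '\0'; i++)
    if (strchr(gaps, seq[i]) == NULL)
      out[j++] = seq[i];

  out[j] = '\0';
  return out;
}
```

The caller owns the returned copy and should release it with free().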
char* vrna_cut_point_insert ( const char *  string,
int  cp 
)

#include <ViennaRNA/utils/strings.h>

Add a separating '&' character into a string according to cut-point position.

If the cut-point position is less than or equal to zero, this function just returns a copy of the provided string. Otherwise, the cut-point character is set at the corresponding position.

Parameters
string  The original string
cp  The cut-point position
Returns
A copy of the provided string including the cut-point character
char* vrna_cut_point_remove ( const char *  string,
int *  cp 
)

#include <ViennaRNA/utils/strings.h>

Remove a separating '&' character from a string.

This function removes the cut-point indicating '&' character from a string and memorizes its position in a provided integer variable. If no '&' is found in the input, the integer variable is set to -1. The function returns a copy of the input string with the '&' being sliced out.

Parameters
string  The original string
cp  Pointer to an integer that receives the cut-point position (-1 if no '&' is found)
Returns
A copy of the input string with the '&' being sliced out
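Together with vrna_cut_point_insert(), this function forms a round trip between the "&"-separated and the plain representation of a two-strand string. The standalone sketch below illustrates that round trip with hypothetical sketch_ names. It assumes that the cut point is a 1-based position denoting the first character of the second strand, and that a cut point beyond the end of the string yields a plain copy; neither detail is stated on this page.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy string with '&' placed before position cp. */
char *
sketch_cut_point_insert(const char *string, int cp)
{
  size_t  len, k;
  char    *out;

  assert(string != NULL);

  len = strlen(string);
  out = malloc(len + 2); /* room for '&' and the terminator */
  if (out == NULL)
    return NULL;

  if (cp <= 0 || (size_t)cp > len) {
    memcpy(out, string, len + 1); /* plain copy */
    return out;
  }

  k = (size_t)cp - 1; /* 0-based index of the second strand (assumed) */
  memcpy(out, string, k);
  out[k] = '&';
  memcpy(out + k + 1, string + k, len - k + 1); /* includes terminator */
  return out;
}

/* Hypothetical helper: copy string without its first '&', report cp. */
char *
sketch_cut_point_remove(const char *string, int *cp)
{
  const char  *amp;
  size_t      len, k;
  char        *out;

  assert(string != NULL && cp != NULL);

  len = strlen(string);
  out = malloc(len + 1);
  if (out == NULL)
    return NULL;

  amp = strchr(string, '&');
  if (amp == NULL) {
    *cp = -1;
    memcpy(out, string, len + 1);
    return out;
  }

  k   = (size_t)(amp - string);
  *cp = (int)k + 1; /* 1-based position (assumed convention) */
  memcpy(out, string, k);
  memcpy(out + k, amp + 1, len - k); /* rest plus terminator */
  return out;
}
```

Under these assumptions, inserting at cp = 5 turns "GGGGCCCC" into "GGGG&CCCC", and removing the '&' again restores the original string and reports cp = 5.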