Nucleic Acid Sequences and Structures
Defines
-
VRNA_OPTION_MULTILINE
- #include <ViennaRNA/io/file_formats.h>
Tell a function that an input is assumed to span several lines.
When passed as an input option, a function may also return this flag to signal that it has read data from multiple lines.
-
VRNA_CONSTRAINT_MULTILINE
- #include <ViennaRNA/io/file_formats.h>
Parse a multiline constraint.
-
VRNA_INPUT_VERBOSE
- #include <ViennaRNA/io/file_formats.h>
Functions
-
void vrna_file_helixlist(const char *seq, const char *db, float energy, FILE *file)
- #include <ViennaRNA/io/file_formats.h>
Print a secondary structure as helix list.
- Parameters
seq – The RNA sequence
db – The structure in dot-bracket format
energy – Free energy of the structure in kcal/mol
file – The file handle to print to (defaults to ‘stdout’ if file == NULL)
-
void vrna_file_connect(const char *seq, const char *db, float energy, const char *identifier, FILE *file)
- #include <ViennaRNA/io/file_formats.h>
Print a secondary structure as connect table.
Connect table file format looks like this:
    300 ENERGY = 7.0 example
    1 G 0 2 22 1
    2 G 1 3 21 2
where the header line is followed by 6 columns with:
Base number: index n
Base (A, C, G, T, U, X)
Index n-1 (0 if first nucleotide)
Index n+1 (0 if last nucleotide)
Number of the base to which n is paired. No pairing is indicated by 0 (zero).
Natural numbering.
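The six columns listed above can be derived from a sequence and its dot-bracket string. The following is a minimal sketch of that derivation, not the implementation used by vrna_file_connect(): the helper names db_to_pairs() and connect_row() are hypothetical, only '(' and ')' are treated as pairing characters, and columns are separated by a single space rather than whatever field widths the library may use.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of ViennaRNA): fill pair[1..n] with the
 * 1-based pairing partner of each position, or 0 if it is unpaired.
 * 'pair' must provide room for strlen(db) + 1 entries.
 * Returns 0 on success, -1 on unbalanced brackets or allocation failure. */
int db_to_pairs(const char *db, int *pair)
{
  int n      = (int)strlen(db);
  int top    = 0;
  int *stack = malloc((size_t)(n + 1) * sizeof *stack);

  if (!stack)
    return -1;

  for (int i = 1; i <= n; i++) {
    pair[i] = 0;
    if (db[i - 1] == '(') {
      stack[top++] = i;               /* remember the opening position */
    } else if (db[i - 1] == ')') {
      if (top == 0) {                 /* closing bracket without partner */
        free(stack);
        return -1;
      }
      int j = stack[--top];
      pair[i] = j;
      pair[j] = i;
    }
  }

  free(stack);
  return top == 0 ? 0 : -1;           /* leftover '(' means unbalanced */
}

/* Hypothetical helper: write the six columns for position n (1-based):
 * index, base, n-1, n+1 (0 for the last base), partner (0 = unpaired),
 * and the natural numbering. Returns what snprintf() returns. */
int connect_row(char *buf, size_t size, const char *seq, const int *pair, int n)
{
  int len = (int)strlen(seq);

  assert(n >= 1 && n <= len);
  return snprintf(buf, size, "%d %c %d %d %d %d",
                  n, seq[n - 1], n - 1, (n < len) ? n + 1 : 0, pair[n], n);
}
```

Calling db_to_pairs() once and then connect_row() for n = 1 up to the sequence length yields the body of the table; the header line (length, energy, identifier) would be printed separately.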
- Parameters
seq – The RNA sequence
db – The structure in dot-bracket format
energy – The free energy of the structure
identifier – An optional identifier for the sequence
file – The file handle to print to (defaults to ‘stdout’ if file == NULL)
-
void vrna_file_bpseq(const char *seq, const char *db, FILE *file)
- #include <ViennaRNA/io/file_formats.h>
Print a secondary structure in bpseq format.
- Parameters
seq – The RNA sequence
db – The structure in dot-bracket format
file – The file handle to print to (defaults to ‘stdout’ if file == NULL)
-
void vrna_file_json(const char *seq, const char *db, double energy, const char *identifier, FILE *file)
- #include <ViennaRNA/io/file_formats.h>
Print a secondary structure in JSON format.
- Parameters
seq – The RNA sequence
db – The structure in dot-bracket format
energy – The free energy
identifier – An identifier for the sequence
file – The file handle to print to (defaults to ‘stdout’ if file == NULL)
-
unsigned int vrna_file_fasta_read_record(char **header, char **sequence, char ***rest, FILE *file, unsigned int options)
- #include <ViennaRNA/io/file_formats.h>
Get a (fasta) data set from a file or stdin.
This function may be used to obtain complete datasets from a file handle or stdin. A dataset is always defined to contain at least a sequence. If data starts with a fasta header, i.e. a line like
    >some header info
then vrna_file_fasta_read_record() will assume that the sequence that follows the header may span over several lines. To disable this behavior and to assign a single line to the argument ‘sequence’ one can pass VRNA_INPUT_NO_SPAN in the ‘options’ argument. If no fasta header is read at the beginning of a data block, the sequence must not span multiple lines!
Unless the options VRNA_INPUT_NOSKIP_COMMENTS or VRNA_INPUT_NOSKIP_BLANK_LINES are passed, a sequence may be interrupted by lines starting with a comment character or empty lines.
A sequence is regarded as completely read if it was either assumed to not span over multiple lines, a secondary structure or structure constraint follows the sequence on the next line, or a new header marks the beginning of a new sequence…
All lines following the sequence (this includes comments) that do not initiate a new dataset according to the above definition are available through the line-array ‘rest’. Here one can usually find the structure constraint or other information belonging to the current dataset. Filling of ‘rest’ may be prevented by passing VRNA_INPUT_NO_REST to the options argument.
The main purpose of this function is to make it easy to parse blocks of data in the header of a loop, where all calculations for the current block are done inside the loop. The loop may then be left on certain return values, e.g.:
    char *id, *seq, **rest;
    int  i;

    id   = seq = NULL;
    rest = NULL;

    while (!(vrna_file_fasta_read_record(&id, &seq, &rest, NULL, 0) &
             (VRNA_INPUT_ERROR | VRNA_INPUT_QUIT))) {
      if (id)
        printf("%s\n", id);

      printf("%s\n", seq);

      if (rest)
        for (i = 0; rest[i]; i++) {
          printf("%s\n", rest[i]);
          free(rest[i]);
        }

      free(rest);
      free(seq);
      free(id);
    }
In the example above, the while loop will be terminated when vrna_file_fasta_read_record() returns either an error, EOF, or a user initiated quit request.
As long as data is read from stdin (we are passing NULL as the file pointer), the id is printed if it is available for the current block of data. The sequence will be printed in any case and if some more lines belong to the current block of data each line will be printed as well.
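The record definition given above (optional header, a sequence that may span several lines only after a header, and trailing lines that belong to the same block) can be illustrated with a small stand-alone reader. This is a simplified sketch and not the library's parser: read_record_sketch() and record_sketch_t are hypothetical names, '#' is assumed to be the only comment character, only lines starting with '.', '(' or ')' are treated as trailing structure lines, and fixed-size buffers replace the dynamically allocated strings the library returns.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_LEN 1024 /* assumed maximum line length for this sketch */

/* Hypothetical record type; the library uses separately allocated strings. */
typedef struct {
  char header[LINE_LEN];       /* text after '>', empty if there is no header */
  char sequence[4 * LINE_LEN]; /* concatenated sequence lines                 */
  char rest[4 * LINE_LEN];     /* concatenated trailing structure lines       */
} record_sketch_t;

/* Look at the first character of the next line without consuming it. */
static int peek_char(FILE *fp)
{
  int c = fgetc(fp);

  if (c != EOF)
    ungetc(c, fp);

  return c;
}

/* Read one line and strip its line ending; returns 0 at end of input. */
static int read_line(FILE *fp, char *buf, size_t size)
{
  if (!fgets(buf, (int)size, fp))
    return 0;

  buf[strcspn(buf, "\r\n")] = '\0';
  return 1;
}

static int starts_structure(int c)
{
  return c == '.' || c == '(' || c == ')';
}

/* Returns 1 if a record was read, 0 at end of input. */
int read_record_sketch(FILE *fp, record_sketch_t *rec)
{
  char line[LINE_LEN];
  int  c, multiline;

  assert(fp != NULL && rec != NULL);
  rec->header[0] = rec->sequence[0] = rec->rest[0] = '\0';

  /* skip blank lines and comments that precede the record */
  while ((c = peek_char(fp)) == '\n' || c == '\r' || c == '#')
    read_line(fp, line, sizeof line);

  if (c == EOF)
    return 0;

  /* only a header allows the sequence to span several lines */
  multiline = (c == '>');
  if (multiline) {
    read_line(fp, line, sizeof line);
    snprintf(rec->header, sizeof rec->header, "%s", line + 1);
  }

  /* collect sequence lines until a structure line, a new header, or EOF */
  for (;;) {
    c = peek_char(fp);
    if (c == EOF || c == '>' || starts_structure(c))
      break;

    read_line(fp, line, sizeof line);
    if (line[0] == '\0' || line[0] == '#')
      continue; /* blank lines and comments may interrupt the sequence */

    strncat(rec->sequence, line,
            sizeof rec->sequence - strlen(rec->sequence) - 1);
    if (!multiline)
      break;    /* without a header the sequence is a single line */
  }

  /* structure-like lines that follow the sequence go to 'rest' */
  while (starts_structure(peek_char(fp))) {
    read_line(fp, line, sizeof line);
    strncat(rec->rest, line, sizeof rec->rest - strlen(rec->rest) - 1);
  }

  return 1;
}
```

The sketch peeks one character ahead with ungetc() instead of keeping state between calls, so unlike the library function described here it needs no global variable. Lines longer than LINE_LEN are not handled.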
Note
This function will exit any program with an error message if no sequence could be read!
This function is NOT thread-safe! It uses a global variable to store information about the next data block. Do not forget to free the memory occupied by header, sequence and rest!
- Parameters
header – A pointer which will be set such that it points to the header of the record
sequence – A pointer which will be set such that it points to the sequence of the record
rest – A pointer which will be set such that it points to an array of lines which also belong to the record
file – A file handle to read from (if NULL, this function reads from stdin)
options – Some options which may be passed to alter the behavior of the function, use 0 for no options
- Returns
A flag with information about what the function actually did read
-
char *vrna_extract_record_rest_structure(const char **lines, unsigned int length, unsigned int option)
- #include <ViennaRNA/io/file_formats.h>
Extract a dot-bracket structure string from a (multiline) character array.
This function extracts a dot-bracket structure string from the ‘rest’ array as returned by vrna_file_fasta_read_record() and returns it. All occurrences of comments within the ‘lines’ array will be skipped as long as they do not break the structure string. If no structure could be read, this function returns NULL.
See also
vrna_file_fasta_read_record()
- Parameters
lines – The (multiline) character array to be parsed
length – The assumed length of the dot-bracket string (passing a value < 1 results in no length limit)
option – Some options which may be passed to alter the behavior of the function, use 0 for no options
- Pre
The argument ‘lines’ has to be a 2-dimensional character array as obtained by vrna_file_fasta_read_record()
- Returns
The dot-bracket string read from lines or NULL
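As an illustration of the behavior described for this function, the sketch below joins the dot-bracket parts of a NULL-terminated line array, skips comment lines, and honors an optional length limit. It is not the library's code: extract_structure_sketch() is a hypothetical name, '#' is assumed to mark a comment line, and only '.', '(' and ')' are accepted as structure characters, whereas the library may recognize further constraint symbols.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: concatenate the leading dot-bracket characters of
 * each line in the NULL-terminated array 'lines'. A 'length' of 0 means
 * "no limit". Returns a newly allocated string, or NULL if no structure
 * characters were found (or on allocation failure). */
char *extract_structure_sketch(const char **lines, unsigned int length)
{
  size_t cap = 1, used = 0;
  char   *out;

  assert(lines != NULL);

  /* upper bound for the result: total length of all lines plus '\0' */
  for (size_t i = 0; lines[i]; i++)
    cap += strlen(lines[i]);

  out = malloc(cap);
  if (!out)
    return NULL;

  for (size_t i = 0; lines[i]; i++) {
    const char *l = lines[i];

    if (l[0] == '#')
      continue;                       /* assumed comment marker */

    /* take only the leading run of structure characters, so trailing
     * annotations such as an energy value are ignored */
    size_t n = strspn(l, ".()");

    if (length > 0 && used + n > length)
      n = length - used;              /* respect the requested length */

    memcpy(out + used, l, n);
    used += n;

    if (length > 0 && used == length)
      break;
  }

  if (used == 0) {
    free(out);
    return NULL;
  }

  out[used] = '\0';
  return out;
}
```

As with the strings returned by the library function, the caller owns the returned string and must free() it.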
-
int vrna_file_SHAPE_read(const char *file_name, int length, double default_value, char *sequence, double *values)
- #include <ViennaRNA/io/file_formats.h>
Read data from a given SHAPE reactivity input file.
This function parses the information from a given file and stores the result in the preallocated string sequence and the double array values.
- Parameters
file_name – Path to the constraints file
length – Length of the sequence (file entries exceeding this limit will cause an error)
default_value – Value for missing indices
sequence – Pointer to an array used for storing the sequence obtained from the SHAPE reactivity file
values – Pointer to an array used for storing the values obtained from the SHAPE reactivity file
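The roles of length, default_value, sequence and values can be illustrated with a small parser over in-memory lines. This is a sketch under stated assumptions rather than the library's behavior: parse_shape_lines() is a hypothetical name, each line is assumed to hold a 1-based position, an optional nucleotide letter and an optional reactivity, values is assumed to be indexed from 1 with length + 1 entries, and missing nucleotides are filled with 'N'.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: parse SHAPE-like lines of the assumed form
 *   <position> [nucleotide] [reactivity]
 * 'sequence' needs room for length + 1 chars, 'values' for length + 1
 * doubles (index 0 is unused). Returns the number of parsed entries,
 * or -1 if a position lies outside 1..length. */
int parse_shape_lines(const char **lines, int length, double default_value,
                      char *sequence, double *values)
{
  int count = 0;

  assert(lines && sequence && values && length > 0);

  /* start from "nothing known": unknown bases and default reactivities */
  memset(sequence, 'N', (size_t)length);
  sequence[length] = '\0';
  for (int i = 0; i <= length; i++)
    values[i] = default_value;

  for (size_t k = 0; lines[k]; k++) {
    char *end, *after;
    long pos = strtol(lines[k], &end, 10);

    if (end == lines[k])
      continue;                         /* no leading position: ignore line */

    if (pos < 1 || pos > length)
      return -1;                        /* entry exceeds the given length */

    while (isspace((unsigned char)*end))
      end++;

    if (isalpha((unsigned char)*end))   /* optional nucleotide column */
      sequence[pos - 1] = *end++;

    double v = strtod(end, &after);
    if (after != end)                   /* optional reactivity column */
      values[pos] = v;

    count++;
  }

  return count;
}
```

Positions that never appear in the input keep default_value, which mirrors the description of that parameter above.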
-
int vrna_file_connect_read_record(FILE *fp, char **id, char **sequence, char **structure, char **remainder, unsigned int options)
- #include <ViennaRNA/io/file_formats.h>
-
int vrna_file_RNAstrand_db_read_record(FILE *fp, char **name_p, char **sequence_p, char **structure_p, char **source_p, char **fname_p, char **id_p, unsigned int options)
- #include <ViennaRNA/io/file_formats.h>
-
void vrna_extract_record_rest_constraint(char **cstruc, const char **lines, unsigned int option)
- #include <ViennaRNA/io/file_formats.h>
Extract a hard constraint encoded as pseudo dot-bracket string.
- Deprecated:
Use vrna_extract_record_rest_structure() instead!
See also
vrna_file_fasta_read_record(), VRNA_CONSTRAINT_DB_PIPE, VRNA_CONSTRAINT_DB_DOT, VRNA_CONSTRAINT_DB_X, VRNA_CONSTRAINT_DB_ANG_BRACK, VRNA_CONSTRAINT_DB_RND_BRACK
- Parameters
cstruc – A pointer to a character array that is used as pseudo dot-bracket output
lines – A 2-dimensional character array with the extension lines from the FASTA input
option – The option flags that define the behavior and recognition pattern of this function
- Pre
The argument ‘lines’ has to be a 2-dimensional character array as obtained by vrna_file_fasta_read_record()
-
char *extract_record_rest_structure(const char **lines, unsigned int length, unsigned int option)
- #include <ViennaRNA/io/file_formats.h>
-
unsigned int read_record(char **header, char **sequence, char ***rest, unsigned int options)
- #include <ViennaRNA/io/file_formats.h>
Get a data record from stdin.
- Deprecated:
This function is deprecated! Use vrna_file_fasta_read_record() as a replacement.
-
unsigned int get_multi_input_line(char **string, unsigned int options)
- #include <ViennaRNA/io/file_formats.h>
-