bpp-seq  2.2.0
bpp::SequenceTools Class Reference

SequenceTools static class.

#include <Bpp/Seq/SequenceTools.h>

Public Member Functions

 SequenceTools ()
 
virtual ~SequenceTools ()
 

Static Public Member Functions

static Sequence * subseq (const Sequence &sequence, size_t begin, size_t end) throw (IndexOutOfBoundsException, Exception)
 Get a sub-sequence.

static Sequence * concatenate (const Sequence &seq1, const Sequence &seq2) throw (AlphabetMismatchException, Exception)
 Concatenate two sequences.

static Sequence & complement (Sequence &seq) throw (AlphabetException)
 Complement the nucleotide sequence itself.

static Sequence * getComplement (const Sequence &sequence) throw (AlphabetException)
 Get the complementary sequence of a nucleotide sequence.

static Sequence * transcript (const Sequence &sequence) throw (AlphabetException)
 Get the transcription sequence of a DNA sequence.

static Sequence * reverseTranscript (const Sequence &sequence) throw (AlphabetException)
 Get the reverse-transcription sequence of an RNA sequence.

static Sequence & invert (Sequence &seq)
 Invert a sequence from 5'->3' to 3'->5' and vice versa.

static Sequence * getInvert (const Sequence &sequence)
 Invert a sequence from 5'->3' to 3'->5' and vice versa.

static Sequence & invertComplement (Sequence &seq)
 Invert and complement a sequence.
 
static double getPercentIdentity (const Sequence &seq1, const Sequence &seq2, bool ignoreGaps=false) throw (AlphabetMismatchException, SequenceNotAlignedException)
 
static size_t getNumberOfSites (const Sequence &seq)
 
static size_t getNumberOfCompleteSites (const Sequence &seq)
 
static Sequence * getSequenceWithCompleteSites (const Sequence &seq)
 Keep only complete sites in a sequence.

static size_t getNumberOfUnresolvedSites (const Sequence &seq)

static void removeGaps (Sequence &seq)
 Remove gaps from a sequence.

static Sequence * getSequenceWithoutGaps (const Sequence &seq)
 Get a copy of the sequence without gaps.

static void removeStops (Sequence &seq, const GeneticCode &gCode) throw (Exception)
 Remove stops from a codon sequence.

static Sequence * getSequenceWithoutStops (const Sequence &seq, const GeneticCode &gCode) throw (Exception)
 Get a copy of the codon sequence without stops.

static void replaceStopsWithGaps (Sequence &seq, const GeneticCode &gCode) throw (Exception)
 Replace stop codons by gaps.
 
static BowkerTest * bowkerTest (const Sequence &seq1, const Sequence &seq2) throw (SequenceNotAlignedException)
 Bowker's test for homogeneity.

static void getPutativeHaplotypes (const Sequence &seq, std::vector< Sequence * > &hap, unsigned int level=2)
 Get all putative haplotypes from a heterozygous sequence.

static Sequence * combineSequences (const Sequence &s1, const Sequence &s2) throw (AlphabetMismatchException)
 Combine two sequences.

static Sequence * subtractHaplotype (const Sequence &s, const Sequence &h, std::string name="", unsigned int level=1) throw (SequenceNotAlignedException)
 Subtract a haplotype from a heterozygous sequence.

static Sequence * RNYslice (const Sequence &sequence, int ph) throw (AlphabetException)
 Get the RNY decomposition of a DNA sequence. With a given phase between 1 and 3, it gives the decomposition in this phase; in phase 1, the first triplet is centered on the first character. Without a phase, the function gives the alternative succession in phases 1, 2 and 3.

static Sequence * RNYslice (const Sequence &sequence) throw (AlphabetException)

static void getCDS (Sequence &sequence, const GeneticCode &gCode, bool checkInit, bool checkStop, bool includeInit=true, bool includeStop=true)
 Extract the CDS part from a codon sequence. Optionally check for initiator codons, stop codons, or both.
 
static size_t findFirstOf (const Sequence &seq, const Sequence &motif, bool strict=true)
 Find the position of a motif in a sequence.

static Sequence * getRandomSequence (const Alphabet *alphabet, size_t length)
 Get a random sequence of a given size and alphabet, with all states equally probable.

static void getCounts (const SymbolList &list, std::map< int, size_t > &counts)
 Count all states in the list.

static void getCounts (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, size_t > > &counts) throw (DimensionException)
 Count all pairs of states for two lists of the same size.

static void getCounts (const SymbolList &list, std::map< int, double > &counts, bool resolveUnknowns)
 Count all states in the list, optionally resolving unknown characters.

static void getCounts (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, double > > &counts, bool resolveUnknowns) throw (DimensionException)
 Count all pairs of states for two lists of the same size, optionally resolving unknown characters.

static void getFrequencies (const SymbolList &list, std::map< int, double > &frequencies, bool resolveUnknowns=false)
 Get all state frequencies in the list.

static void getFrequencies (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, double > > &frequencies, bool resolveUnknowns=false) throw (DimensionException)
 Get all state pair frequencies for two lists of the same size.

static double getGCContent (const SymbolList &list, bool ignoreUnresolved=true, bool ignoreGap=true) throw (AlphabetException)
 Get the GC content of a symbol list.

static size_t getNumberOfDistinctPositions (const SymbolList &l1, const SymbolList &l2) throw (AlphabetMismatchException)
 Get the number of distinct positions.

static size_t getNumberOfPositionsWithoutGap (const SymbolList &l1, const SymbolList &l2) throw (AlphabetMismatchException)
 Get the number of positions without gap.

static void changeGapsToUnknownCharacters (SymbolList &l)
 Change all gap elements to unknown characters.

static void changeUnresolvedCharactersToGaps (SymbolList &l)
 Change all unknown characters to gap elements.
 

Static Private Attributes

static DNA _DNA
 
static RNA _RNA
 
static RNY _RNY
 
static NucleicAcidsReplication _DNARep
 
static NucleicAcidsReplication _RNARep
 
static NucleicAcidsReplication _transc
 

Detailed Description

SequenceTools static class.

Implements methods to manipulate sequences.

Definition at line 97 of file SequenceTools.h.

Constructor & Destructor Documentation

◆ SequenceTools()

bpp::SequenceTools::SequenceTools ( )
inline

Definition at line 109 of file SequenceTools.h.

◆ ~SequenceTools()

virtual bpp::SequenceTools::~SequenceTools ( )
inline, virtual

Definition at line 110 of file SequenceTools.h.

Member Function Documentation

◆ bowkerTest()

BowkerTest * SequenceTools::bowkerTest (const Sequence & seq1, const Sequence & seq2) throw (SequenceNotAlignedException)
static

Bowker's test for homogeneity.

Computes the contingency table of occurrences of all pairs of states and tests its symmetry using Bowker's (1948) test.

Reference:

Ababneh F. Bioinformatics 2006 22(10) 1225-1231

Parameters
    seq1    The first sequence.
    seq2    The second sequence.
Returns
    A BowkerTest object with the computed statistic and p-value (computed from a chi-square distribution).
Exceptions
    SequenceNotAlignedException    If the two sequences do not have the same length.

Definition at line 412 of file SequenceTools.cpp.

References bpp::Alphabet::getSize(), bpp::Alphabet::isGap(), bpp::Alphabet::isUnresolved(), bpp::BowkerTest::setPValue(), and bpp::BowkerTest::setStatistic().
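
The symmetry statistic described above can be sketched as follows. This is a standalone illustration on plain `std::string` DNA sequences, not the Bio++ implementation: the names `bowkerStatistic`, `BowkerResult` and `stateIndex` are hypothetical, sites with gaps or ambiguity codes are skipped, empty cell pairs are left out of the degrees of freedom (one common convention, assumed here), and the p-value step is omitted.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Result of the symmetry test: the statistic and its degrees of freedom.
struct BowkerResult {
  double statistic;
  std::size_t degreesOfFreedom;
};

// Hypothetical helper: index of a resolved DNA state, or -1 for anything else
// (gaps and ambiguity codes are skipped in this sketch).
int stateIndex(char c) {
  switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
  }
}

BowkerResult bowkerStatistic(const std::string& seq1, const std::string& seq2) {
  if (seq1.size() != seq2.size())
    throw std::invalid_argument("sequences must have the same length");

  // 4 x 4 contingency table: n[i][j] counts sites with state i in seq1 and j in seq2.
  std::vector<std::vector<double>> n(4, std::vector<double>(4, 0.0));
  for (std::size_t k = 0; k < seq1.size(); ++k) {
    const int i = stateIndex(seq1[k]);
    const int j = stateIndex(seq2[k]);
    if (i >= 0 && j >= 0) n[i][j] += 1.0;
  }

  // Sum (n_ij - n_ji)^2 / (n_ij + n_ji) over i < j, skipping empty pairs.
  BowkerResult r{0.0, 0};
  for (int i = 0; i < 4; ++i) {
    for (int j = i + 1; j < 4; ++j) {
      const double sum = n[i][j] + n[j][i];
      if (sum > 0.0) {
        const double diff = n[i][j] - n[j][i];
        r.statistic += diff * diff / sum;
        ++r.degreesOfFreedom;
      }
    }
  }
  return r;
}
```

The statistic would then be compared against a chi-square distribution with `degreesOfFreedom` degrees of freedom to obtain the p-value.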

◆ changeGapsToUnknownCharacters()

void SymbolListTools::changeGapsToUnknownCharacters (SymbolList & l)
static, inherited

Change all gap elements to unknown characters.

Parameters
    l    The input list of characters.

Definition at line 180 of file SymbolListTools.cpp.

References bpp::SymbolList::getAlphabet(), bpp::Alphabet::getUnknownCharacterCode(), bpp::Alphabet::isGap(), and bpp::SymbolList::size().

◆ changeUnresolvedCharactersToGaps()

void SymbolListTools::changeUnresolvedCharactersToGaps (SymbolList & l)
static, inherited

Change all unknown characters to gap elements.

Parameters
    l    The input list of characters.

Definition at line 189 of file SymbolListTools.cpp.

References bpp::SymbolList::getAlphabet(), bpp::Alphabet::getGapCharacterCode(), bpp::Alphabet::isUnresolved(), and bpp::SymbolList::size().

◆ combineSequences()

Sequence * SequenceTools::combineSequences (const Sequence & s1, const Sequence & s2) throw (AlphabetMismatchException)
static

Combine two sequences.

Author
    Sylvain Gaillard

Definition at line 516 of file SequenceTools.cpp.

References bpp::Alphabet::getGeneric().

◆ complement()

Sequence & SequenceTools::complement (Sequence & seq) throw (AlphabetException)
static

Complement the nucleotide sequence itself.

Parameters
    seq    The sequence to be complemented.
Returns
    A reference to the complemented sequence.
Exceptions
    AlphabetException    If the sequence is not a nucleotide sequence.
Author
    Sylvain Gaillard

Definition at line 108 of file SequenceTools.cpp.

References bpp::NucleicAcidsReplication::translate().

◆ concatenate()

Sequence * SequenceTools::concatenate (const Sequence & seq1, const Sequence & seq2) throw (AlphabetMismatchException, Exception)
static

Concatenate two sequences.

Sequences must have the same name and alphabet. Only the first sequence's comments are kept.

Parameters
    seq1    The first sequence.
    seq2    The second sequence.
Returns
    A new sequence object with the concatenation of the two sequences.
Exceptions
    AlphabetMismatchException    If the two alphabets do not match.
    Exception                    If the sequence names do not match.

Definition at line 89 of file SequenceTools.cpp.

◆ findFirstOf()

size_t SequenceTools::findFirstOf (const Sequence & seq, const Sequence & motif, bool strict = true)
static

Find the position of a motif in a sequence.

Parameters
    seq       The reference sequence.
    motif     The motif to find.
    strict    If true (default), find an exact match of the motif. If false, find a compatible match.
Returns
    The position of the first occurrence of the motif, or the length of seq if the motif is not found.

Definition at line 684 of file SequenceTools.cpp.

References bpp::SymbolList::getAlphabet(), bpp::SymbolList::getValue(), bpp::AlphabetTools::match(), and bpp::SymbolList::size().
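
The return convention (first position, or the sequence length when there is no match) can be sketched on plain strings. This is an illustrative stand-in for the strict mode only; `findFirstExact` is a hypothetical name, and the compatible (non-strict) matching of ambiguity codes is not modelled.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first exact occurrence of motif in seq,
// or seq.size() when the motif does not occur.
std::size_t findFirstExact(const std::string& seq, const std::string& motif) {
  if (motif.size() > seq.size()) return seq.size();
  for (std::size_t i = 0; i + motif.size() <= seq.size(); ++i) {
    // Compare the window starting at i against the whole motif.
    if (seq.compare(i, motif.size(), motif) == 0) return i;
  }
  return seq.size();
}
```

A caller can therefore test for "not found" by comparing the result with the sequence length.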

◆ getCDS()

void SequenceTools::getCDS (Sequence & sequence, const GeneticCode & gCode, bool checkInit, bool checkStop, bool includeInit = true, bool includeStop = true)
static

Extract the CDS part from a codon sequence. Optionally check for initiator codons, stop codons, or both.

Parameters
    sequence       The sequence to be reduced to its CDS part.
    gCode          The genetic code according to which start and stop codons are specified.
    checkInit      If true, everything before the initiator codon will be removed, together with the initiator codon if includeInit is false.
    checkStop      If true, everything after the first stop codon will be removed, together with the stop codon if includeStop is false.
    includeInit    Whether the initiator codon should be kept or removed. No effect if checkInit is false.
    includeStop    Whether the stop codon should be kept or removed. No effect if checkStop is false.

Definition at line 655 of file SequenceTools.cpp.

References bpp::SymbolList::deleteElement(), bpp::SymbolList::getAlphabet(), bpp::GeneticCode::isStart(), bpp::GeneticCode::isStop(), and bpp::SymbolList::size().
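
How the four flags interact can be illustrated on a vector of codon strings. This is only a sketch under stated assumptions: `trimToCDS`, `isStart` and `isStop` are hypothetical names, the standard genetic code with ATG as the sole initiator stands in for the GeneticCode argument, and a sequence without any initiator is emptied here, which may differ from the library's behaviour.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption: standard genetic code, with ATG as the only initiator.
bool isStart(const std::string& codon) { return codon == "ATG"; }
bool isStop(const std::string& codon) {
  return codon == "TAA" || codon == "TAG" || codon == "TGA";
}

void trimToCDS(std::vector<std::string>& codons, bool checkInit, bool checkStop,
               bool includeInit = true, bool includeStop = true) {
  if (checkInit) {
    std::size_t i = 0;
    while (i < codons.size() && !isStart(codons[i])) ++i;
    // Drop everything before the initiator, and the initiator itself if requested.
    if (i < codons.size() && !includeInit) ++i;
    codons.erase(codons.begin(), codons.begin() + i);
  }
  if (checkStop) {
    std::size_t i = 0;
    while (i < codons.size() && !isStop(codons[i])) ++i;
    // Drop everything after the first stop, and the stop itself if requested.
    if (i < codons.size() && includeStop) ++i;
    codons.erase(codons.begin() + i, codons.end());
  }
}
```

With both checks enabled and the default include flags, GCA ATG AAA TAA GGG is reduced to ATG AAA TAA.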

◆ getComplement()

Sequence * SequenceTools::getComplement (const Sequence & sequence) throw (AlphabetException)
static

Get the complementary sequence of a nucleotide sequence.

See also
    DNAReplication
Parameters
    sequence    The sequence to complement.
Returns
    A new sequence object with the complementary sequence.
Exceptions
    AlphabetException    If the sequence is not a nucleotide sequence.

Definition at line 133 of file SequenceTools.cpp.

References bpp::NucleicAcidsReplication::translate().

◆ getCounts() [1/4]

static void bpp::SymbolListTools::getCounts (const SymbolList & list, std::map< int, size_t > & counts)
inline, static, inherited

Count all states in the list.

Author
    J. Dutheil
Parameters
    list      The list.
    counts    The output map to store the counts (existing counts will be incremented).

Definition at line 70 of file SymbolListTools.h.

References bpp::SymbolList::getContent().

Referenced by bpp::SiteTools::getNumberOfDistinctCharacters(), bpp::SequenceApplicationTools::getSitesToAnalyse(), bpp::SiteTools::isParsimonyInformativeSite(), and bpp::CodonSiteTools::numberOfNonSynonymousSubstitutions().

◆ getCounts() [2/4]

static void bpp::SymbolListTools::getCounts (const SymbolList & list1, const SymbolList & list2, std::map< int, std::map< int, size_t > > & counts) throw (DimensionException)
inline, static, inherited

Count all pairs of states for two lists of the same size.

NB: The two lists do not need to share the same alphabet. The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.

Author
    J. Dutheil
Parameters
    list1     The first list.
    list2     The second list.
    counts    The output map to store the counts (existing counts will be incremented).

Definition at line 90 of file SymbolListTools.h.

◆ getCounts() [3/4]

void SymbolListTools::getCounts (const SymbolList & list, std::map< int, double > & counts, bool resolveUnknowns)
static, inherited

Count all states in the list, optionally resolving unknown characters.

For instance, in DNA, N will be counted as A=1/4, T=1/4, C=1/4, G=1/4.

Author
    J. Dutheil
Parameters
    list               The list.
    counts             The output map to store the counts (existing counts will be incremented).
    resolveUnknowns    Whether unknown characters must be resolved, as in the DNA example above.

Definition at line 51 of file SymbolListTools.cpp.

References bpp::Alphabet::getAlias(), bpp::SymbolList::getAlphabet(), and bpp::SymbolList::getContent().
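
The fractional counting of unknown characters can be sketched on plain IUPAC DNA strings. This is an assumption-laden illustration rather than the library code: `countStates` and `expandIupac` are hypothetical names, and the map is keyed by character here whereas the documented signature uses integer state codes.

```cpp
#include <map>
#include <string>

// Hypothetical helper: the resolved states an IUPAC DNA code stands for.
// Resolved states, gaps and unrecognised symbols yield an empty string.
std::string expandIupac(char c) {
  switch (c) {
    case 'R': return "AG";
    case 'Y': return "CT";
    case 'S': return "CG";
    case 'W': return "AT";
    case 'K': return "GT";
    case 'M': return "AC";
    case 'B': return "CGT";
    case 'D': return "AGT";
    case 'H': return "ACT";
    case 'V': return "ACG";
    case 'N': return "ACGT";
    default:  return "";
  }
}

// Adds the counts of seq to the map (existing counts are incremented).
void countStates(const std::string& seq, std::map<char, double>& counts,
                 bool resolveUnknowns) {
  for (char c : seq) {
    const std::string states = expandIupac(c);
    if (resolveUnknowns && !states.empty()) {
      // Spread one observation evenly over the states the code may stand for.
      for (char s : states) counts[s] += 1.0 / static_cast<double>(states.size());
    } else {
      counts[c] += 1.0;
    }
  }
}
```

For the sequence AACN with resolution enabled, A receives 2.25, C receives 1.25, and G and T receive 0.25 each.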

◆ getCounts() [4/4]

void SymbolListTools::getCounts (const SymbolList & list1, const SymbolList & list2, std::map< int, std::map< int, double > > & counts, bool resolveUnknowns) throw (DimensionException)
static, inherited

Count all pairs of states for two lists of the same size, optionally resolving unknown characters.

For instance, in DNA, N will be counted as A=1/4, T=1/4, C=1/4, G=1/4.

NB: The two lists do not need to share the same alphabet. The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.

Author
    J. Dutheil
Parameters
    list1              The first list.
    list2              The second list.
    counts             The output map to store the counts (existing counts will be incremented).
    resolveUnknowns    Whether unknown characters must be resolved, as in the DNA example above.

Definition at line 73 of file SymbolListTools.cpp.

◆ getFrequencies() [1/2]

void SymbolListTools::getFrequencies (const SymbolList & list, std::map< int, double > & frequencies, bool resolveUnknowns = false)
static, inherited

Get all state frequencies in the list.

Author
    J. Dutheil
Parameters
    list               The list.
    frequencies        The output map with all states and corresponding frequencies. Existing frequencies will be erased, if any.
    resolveUnknowns    Whether unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4, T=1/4, C=1/4, G=1/4.

Definition at line 96 of file SymbolListTools.cpp.

References bpp::SymbolList::size().

Referenced by bpp::CodonSiteTools::generateCodonSiteWithoutRareVariant(), bpp::SiteContainerTools::getConsensus(), bpp::SequenceApplicationTools::getSitesToAnalyse(), bpp::CodonSiteTools::meanNumberOfSynonymousPositions(), bpp::CodonSiteTools::piNonSynonymous(), bpp::CodonSiteTools::piSynonymous(), and bpp::SiteContainerTools::removeGapSites().

◆ getFrequencies() [2/2]

void SymbolListTools::getFrequencies (const SymbolList & list1, const SymbolList & list2, std::map< int, std::map< int, double > > & frequencies, bool resolveUnknowns = false) throw (DimensionException)
static, inherited

Get all state pair frequencies for two lists of the same size.

Author
    J. Dutheil
Parameters
    list1              The first list.
    list2              The second list.
    frequencies        The output map with all state pairs and corresponding frequencies. Existing frequencies will be erased, if any.
    resolveUnknowns    Whether unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4, T=1/4, C=1/4, G=1/4.

Definition at line 107 of file SymbolListTools.cpp.

◆ getGCContent()

double SymbolListTools::getGCContent (const SymbolList & list, bool ignoreUnresolved = true, bool ignoreGap = true) throw (AlphabetException)
static, inherited

Get the GC content of a symbol list.

Parameters
    list                The list.
    ignoreUnresolved    Do not count unresolved states. Otherwise, weight by each state probability in case of ambiguity (e.g. the R state counts for 0.5).
    ignoreGap           Do not count gaps in the total.
Returns
    The proportion of G and C states in the list.
Exceptions
    AlphabetException    If the list is not made of nucleotide states.

Definition at line 119 of file SymbolListTools.cpp.
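
The two flags can be read as controlling which positions enter the numerator and the denominator. The sketch below illustrates one such reading on plain IUPAC DNA strings; `gcContent`, `gcWeight` and `isResolved` are hypothetical names, and the choices to drop ignored positions from the total and to return 0 for an empty total are assumptions, not confirmed library behaviour.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: probability that an IUPAC DNA code is G or C,
// or -1 for symbols that are not nucleotide codes.
double gcWeight(char c) {
  switch (c) {
    case 'G': case 'C': case 'S': return 1.0;
    case 'A': case 'T': case 'W': return 0.0;
    case 'R': case 'Y': case 'K': case 'M': case 'N': return 0.5;
    case 'B': case 'V': return 2.0 / 3.0;
    case 'D': case 'H': return 1.0 / 3.0;
    default: return -1.0;
  }
}

bool isResolved(char c) { return c == 'A' || c == 'C' || c == 'G' || c == 'T'; }

double gcContent(const std::string& seq, bool ignoreUnresolved = true,
                 bool ignoreGap = true) {
  double gc = 0.0;
  std::size_t total = 0;
  for (char c : seq) {
    if (c == '-') {
      if (!ignoreGap) ++total;  // a gap adds to the total but never to GC
      continue;
    }
    const double w = gcWeight(c);
    if (w < 0.0) continue;                              // not a nucleotide code
    if (!isResolved(c) && ignoreUnresolved) continue;   // skip ambiguity codes
    gc += w;                                            // ambiguous codes add their G/C probability
    ++total;
  }
  return total == 0 ? 0.0 : gc / static_cast<double>(total);
}
```

For example, GR yields 1.0 with the defaults (R is ignored) and 0.75 when ignoreUnresolved is false, since R then counts for 0.5.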

◆ getInvert()

Sequence * SequenceTools::getInvert (const Sequence & sequence)
static

Invert a sequence from 5'->3' to 3'->5' and vice versa.

ABCDEF becomes FEDCBA, and the sense attribute is changed (may be inhibited).

Parameters
    sequence    The sequence to invert.
Returns
    A new sequence object containing the inverted sequence.
Author
    Sylvain Gaillard

Definition at line 198 of file SequenceTools.cpp.

References bpp::Sequence::clone().

◆ getNumberOfCompleteSites()

size_t SequenceTools::getNumberOfCompleteSites (const Sequence & seq)
static

Parameters
    seq    The sequence to analyse.
Returns
    The number of complete sites in the sequence, i.e. all positions without gaps and unresolved states (generic characters).

Definition at line 293 of file SequenceTools.cpp.

References bpp::SymbolList::getAlphabet(), bpp::Alphabet::isGap(), bpp::Alphabet::isUnresolved(), and bpp::SymbolList::size().

◆ getNumberOfDistinctPositions()

size_t SymbolListTools::getNumberOfDistinctPositions (const SymbolList & l1, const SymbolList & l2) throw (AlphabetMismatchException)
static, inherited

Get the number of distinct positions.

The comparison is performed from position 0 up to the minimum size of the two vectors.

Parameters
    l1    SymbolList 1.
    l2    SymbolList 2.
Returns
    The number of distinct positions.
Exceptions
    AlphabetMismatchException    If the two lists do not have the same alphabet type.

Definition at line 158 of file SymbolListTools.cpp.

◆ getNumberOfPositionsWithoutGap()

size_t SymbolListTools::getNumberOfPositionsWithoutGap (const SymbolList & l1, const SymbolList & l2) throw (AlphabetMismatchException)
static, inherited

Get the number of positions without gap.

The comparison is performed from position 0 up to the minimum size of the two vectors.

Parameters
    l1    SymbolList 1.
    l2    SymbolList 2.
Returns
    The number of positions without gap.
Exceptions
    AlphabetMismatchException    If the two lists do not have the same alphabet type.

Definition at line 169 of file SymbolListTools.cpp.

◆ getNumberOfSites()

size_t SequenceTools::getNumberOfSites (const Sequence & seq)
static

Parameters
    seq    The sequence to analyse.
Returns
    The number of sites in the sequence, i.e. all positions without gaps.

Definition at line 279 of file SequenceTools.cpp.

References bpp::SymbolList::getAlphabet(), bpp::Alphabet::isGap(), and bpp::SymbolList::size().

◆ getNumberOfUnresolvedSites()

size_t SequenceTools::getNumberOfUnresolvedSites (const Sequence & seq)
static

Parameters
    seq    The sequence to analyse.
Returns
    The number of unresolved sites in the sequence.
Author
    Sylvain Gaillard

Definition at line 323 of file SequenceTools.cpp.

References bpp::SymbolList::getAlphabet(), bpp::Alphabet::isUnresolved(), and bpp::SymbolList::size().

◆ getPercentIdentity()

double SequenceTools::getPercentIdentity (const Sequence & seq1, const Sequence & seq2, bool ignoreGaps = false) throw (AlphabetMismatchException, SequenceNotAlignedException)
static

Parameters
    seq1          The first sequence.
    seq2          The second sequence.
    ignoreGaps    If true, only positions without gaps will be used for the counting.
Returns
    The percent identity of the two sequences. A match is counted when the two sequences have identical states at a position.
Exceptions
    AlphabetMismatchException      If the two sequences do not have the same alphabet.
    SequenceNotAlignedException    If the two sequences do not have the same length.

Definition at line 245 of file SequenceTools.cpp.
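
The counting rule can be sketched on two aligned plain strings. This is a standalone illustration with a hypothetical name (`percentIdentity`); treating '-' as the gap symbol, counting two aligned gaps as a match when ignoreGaps is false, and returning 0 when no position is compared are all assumptions of the sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

double percentIdentity(const std::string& seq1, const std::string& seq2,
                       bool ignoreGaps = false) {
  if (seq1.size() != seq2.size())
    throw std::invalid_argument("sequences must have the same length");

  std::size_t matches = 0;
  std::size_t compared = 0;
  for (std::size_t i = 0; i < seq1.size(); ++i) {
    const bool hasGap = seq1[i] == '-' || seq2[i] == '-';
    if (ignoreGaps && hasGap) continue;  // position left out of the comparison
    ++compared;
    if (seq1[i] == seq2[i]) ++matches;   // identical states count as one match
  }
  return compared == 0 ? 0.0 : 100.0 * matches / compared;
}
```

Note that with ignoreGaps set, the denominator shrinks along with the numerator, so A- against AT scores 100.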

◆ getPutativeHaplotypes()

void SequenceTools::getPutativeHaplotypes (const Sequence & seq, std::vector< Sequence * > & hap, unsigned int level = 2)
static

Get all putative haplotypes from a heterozygous sequence.

Parameters
    seq      The sequence to resolve.
    hap      The vector to fill with the new sequences.
    level    The maximum number of states that a generic character may code for (if this number is higher than level, the state will not be resolved). For instance, if level = 3 and the alphabet is DNA, all generic characters will be resolved except N.
Author
    Sylvain Gaillard

Definition at line 463 of file SequenceTools.cpp.

References bpp::SymbolList::addElement(), bpp::Alphabet::getAlias(), bpp::SymbolList::getAlphabet(), bpp::Alphabet::getGapCharacterCode(), bpp::Sequence::getName(), bpp::Sequence::setName(), and bpp::SymbolList::size().

◆ getRandomSequence()

Sequence * SequenceTools::getRandomSequence (const Alphabet * alphabet, size_t length)
static

Get a random sequence of a given size and alphabet, with all states equally probable.

Parameters
    alphabet    The alphabet to use.
    length      The length of the sequence to generate.
Returns
    A pointer to a new Sequence object.

Definition at line 716 of file SequenceTools.cpp.

References bpp::Alphabet::getSize().

◆ getSequenceWithCompleteSites()

Sequence * SequenceTools::getSequenceWithCompleteSites (const Sequence & seq)
static

Keep only complete sites in a sequence.

The deleteElement method of the Sequence object will be used where appropriate.

Parameters
    seq    The sequence to analyse.

Definition at line 307 of file SequenceTools.cpp.

References bpp::Sequence::clone(), bpp::SymbolList::getAlphabet(), bpp::Alphabet::isGap(), bpp::Alphabet::isUnresolved(), bpp::Sequence::setContent(), and bpp::SymbolList::size().

◆ getSequenceWithoutGaps()

Sequence * SequenceTools::getSequenceWithoutGaps (const Sequence & seq)
static

Get a copy of the sequence without gaps.

A whole new sequence will be created by adding all non-gap positions. The original sequence will be cloned to serve as a template.

Parameters
    seq    The sequence to analyse.
Returns
    A new sequence object without gaps.

Definition at line 337 of file SequenceTools.cpp.

References bpp::Sequence::clone(), bpp::SymbolList::getAlphabet(), bpp::Alphabet::isGap(), bpp::Sequence::setContent(), and bpp::SymbolList::size().

◆ getSequenceWithoutStops()

Sequence * SequenceTools::getSequenceWithoutStops (const Sequence & seq, const GeneticCode & gCode) throw (Exception)
static

Get a copy of the codon sequence without stops.

A whole new sequence will be created by adding all non-stop positions. The original sequence will be cloned to serve as a template.

Parameters
    seq      The sequence to analyse.
    gCode    The genetic code according to which stop codons are specified.
Returns
    A new sequence object without stops.
Exceptions
    Exception    If the input sequence does not have a codon alphabet.

Definition at line 365 of file SequenceTools.cpp.

References bpp::Sequence::setContent().

◆ invert()

Sequence & SequenceTools::invert (Sequence & seq)
static

Invert a sequence from 5'->3' to 3'->5' and vice versa.

ABCDEF becomes FEDCBA, and the sense attribute is changed (may be inhibited).

Parameters
    seq    The sequence to invert.
Returns
    A reference to the sequence.
Author
    Sylvain Gaillard

Definition at line 181 of file SequenceTools.cpp.

References bpp::SymbolList::getValue(), bpp::SymbolList::setElement(), and bpp::SymbolList::size().

◆ invertComplement()

Sequence & SequenceTools::invertComplement (Sequence & seq)
static

Invert and complement a sequence.

This method is more accurate than calling invert and complement separately.

Parameters
    seq    The sequence to invert and complement.
Returns
    A reference to the sequence.
Author
    Sylvain Gaillard

Definition at line 207 of file SequenceTools.cpp.

References bpp::SymbolList::getAlphabet(), bpp::Alphabet::getAlphabetType(), bpp::SymbolList::getValue(), bpp::SymbolList::setElement(), bpp::SymbolList::size(), and bpp::NucleicAcidsReplication::translate().
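
Inverting and complementing in a single pass can be sketched on a plain DNA string by walking inwards from both ends. The names `reverseComplementInPlace` and `complementBase` are hypothetical, and only A, C, G, T, N and the gap symbol are handled in this illustration; it is not the library's implementation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Complement of a single DNA symbol; N and gaps map to themselves.
char complementBase(char c) {
  switch (c) {
    case 'A': return 'T';
    case 'T': return 'A';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'N': return 'N';
    case '-': return '-';
    default: throw std::invalid_argument("not a DNA symbol");
  }
}

// Swaps and complements symbol pairs from the two ends towards the middle.
std::string& reverseComplementInPlace(std::string& seq) {
  std::size_t i = 0;
  std::size_t j = seq.size();
  while (i < j) {
    --j;
    const char left = complementBase(seq[i]);
    const char right = complementBase(seq[j]);
    seq[i] = right;  // when i == j the middle symbol is simply complemented
    seq[j] = left;
    ++i;
  }
  return seq;
}
```

Like the documented method, the sketch modifies its argument and returns a reference to it, so calls can be chained.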

◆ removeGaps()

void SequenceTools::removeGaps (Sequence & seq)
static

Remove gaps from a sequence.

The deleteElement method of the Sequence object will be used where appropriate.

Parameters
    seq    The sequence to analyse.

Definition at line 353 of file SequenceTools.cpp.

References bpp::SymbolList::deleteElement(), bpp::SymbolList::getAlphabet(), bpp::Alphabet::isGap(), and bpp::SymbolList::size().

Referenced by bpp::SiteContainerTools::alignNW().
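
The in-place and copying variants of gap removal can be contrasted on plain strings. This sketch uses the standard erase-remove idiom rather than per-element deletion, so it is a swapped-in technique, not the documented deleteElement approach; `removeGapsInPlace` and `withoutGaps` are hypothetical names and '-' is assumed to be the gap symbol.

```cpp
#include <algorithm>
#include <string>

// In-place variant: compacts the non-gap symbols, then trims the tail.
void removeGapsInPlace(std::string& seq, char gap = '-') {
  seq.erase(std::remove(seq.begin(), seq.end(), gap), seq.end());
}

// Copying variant: leaves the input untouched and returns a gap-free copy.
std::string withoutGaps(const std::string& seq, char gap = '-') {
  std::string copy = seq;
  removeGapsInPlace(copy, gap);
  return copy;
}
```

The single compaction pass avoids shifting the tail of the sequence once per gap.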

◆ removeStops()

void SequenceTools::removeStops (Sequence & seq, const GeneticCode & gCode) throw (Exception)
static

Remove stops from a codon sequence.

The deleteElement method of the Sequence object will be used where appropriate.

Parameters
    seq      The sequence to analyse.
    gCode    The genetic code according to which stop codons are specified.
Exceptions
    Exception    If the input sequence does not have a codon alphabet.

Definition at line 383 of file SequenceTools.cpp.

◆ replaceStopsWithGaps()

void SequenceTools::replaceStopsWithGaps (Sequence & seq, const GeneticCode & gCode) throw (Exception)
static

Replace stop codons by gaps.

The setElement method of the Sequence object will be used where appropriate.

Parameters
    seq      The sequence to analyse.
    gCode    The genetic code according to which stop codons are specified.
Exceptions
    Exception    If the input sequence does not have a codon alphabet.

Definition at line 397 of file SequenceTools.cpp.

References bpp::AbstractAlphabet::getGapCharacterCode().

◆ reverseTranscript()

Sequence * SequenceTools::reverseTranscript (const Sequence & sequence) throw (AlphabetException)
static

Get the reverse-transcription sequence of an RNA sequence.

Translate an RNA sequence into a DNA sequence.

See also
    DNAReplication
Parameters
    sequence    The sequence to reverse-transcribe.
Returns
    A new sequence object with the reverse-transcription sequence.
Exceptions
    AlphabetException    If the sequence is not an RNA sequence.

Definition at line 168 of file SequenceTools.cpp.

◆ RNYslice() [1/2]

Sequence * SequenceTools::RNYslice (const Sequence & sequence, int ph) throw (AlphabetException)
static

Get the RNY decomposition of a DNA sequence. With a given phase between 1 and 3, it gives the decomposition in this phase; in phase 1, the first triplet is centered on the first character. Without a phase, the function gives the alternative succession in phases 1, 2 and 3.

Parameters
    sequence    The sequence to decompose.
    ph          The phase to use (1, 2 or 3).
Returns
    A new sequence object with the RNY decomposition.
Exceptions
    AlphabetException    If the sequence is not a DNA sequence.
Author
    Laurent Guéguen

Definition at line 575 of file SequenceTools.cpp.

Referenced by bpp::SequenceApplicationTools::getSiteContainer().

◆ RNYslice() [2/2]

Sequence * SequenceTools::RNYslice (const Sequence & sequence) throw (AlphabetException)
static

Definition at line 617 of file SequenceTools.cpp.

◆ subseq()

Sequence * SequenceTools::subseq (const Sequence & sequence, size_t begin, size_t end) throw (IndexOutOfBoundsException, Exception)
static

Get a sub-sequence.

Parameters
    sequence    The sequence to truncate.
    begin       The first position of the subsequence.
    end         The last position of the subsequence.
Returns
    A new sequence object with the given subsequence.
Exceptions
    IndexOutOfBoundsException, Exception    In case of bad indices.

Definition at line 68 of file SequenceTools.cpp.

Referenced by bpp::GeneticCode::getCodingSequence().
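
Since begin and end are described as the first and last positions, the sketch below reads end as inclusive; that reading is an assumption drawn from the parameter descriptions. `subseqInclusive` is a hypothetical name, and standard exceptions stand in for the library's exception types.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the symbols from position begin to position end, both included.
std::string subseqInclusive(const std::string& seq, std::size_t begin,
                            std::size_t end) {
  if (end >= seq.size())
    throw std::out_of_range("end is past the last position");
  if (begin > end)
    throw std::invalid_argument("begin must not exceed end");
  return seq.substr(begin, end - begin + 1);
}
```

Under this reading, positions 2 to 4 of ATTCGGG give TCG, and begin equal to end gives a single symbol.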

◆ subtractHaplotype()

Sequence * SequenceTools::subtractHaplotype (const Sequence & s, const Sequence & h, std::string name = "", unsigned int level = 1) throw (SequenceNotAlignedException)
static

Subtract a haplotype from a heterozygous sequence.

Subtract a haplotype (i.e. a fully resolved sequence) from a heterozygous sequence to get the other haplotype. The new haplotype could be an unresolved sequence if unresolved characters in the sequence code for more than 2 states.

For example:

>heterozygous sequence
ATTCGGGKWTATRYRM
>haplotype
ATTCGGGTATATGCAA
>subtracted haplotype
ATTCGGGGTTATATGC

Parameters
    s        The heterozygous sequence.
    h        The haplotype to subtract.
    name     The name of the newly computed haplotype.
    level    The number of states from which the site is set to fully unresolved.
Exceptions
    SequenceNotAlignedException    If s and h do not have the same size.
Author
    Sylvain Gaillard

Definition at line 541 of file SequenceTools.cpp.

References bpp::Alphabet::getAlias(), bpp::Alphabet::getGeneric(), bpp::Alphabet::getName(), bpp::Alphabet::getUnknownCharacterCode(), bpp::Alphabet::intToChar(), and bpp::Alphabet::isUnresolved().
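
The worked example above can be reproduced with a small standalone sketch on plain IUPAC strings. It is a simplification under stated assumptions: `subtractHaplotype` here is a free function sharing only its name with the documented method, `expandTwoState` is a hypothetical helper, only two-state ambiguity codes are resolved, the level and name parameters are not modelled, and a site whose haplotype symbol is incompatible is set to N.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: the two states behind a two-state IUPAC code,
// or an empty string for any other symbol.
std::string expandTwoState(char c) {
  switch (c) {
    case 'R': return "AG";
    case 'Y': return "CT";
    case 'K': return "GT";
    case 'M': return "AC";
    case 'S': return "CG";
    case 'W': return "AT";
    default:  return "";
  }
}

std::string subtractHaplotype(const std::string& s, const std::string& h) {
  if (s.size() != h.size())
    throw std::invalid_argument("sequences must have the same length");

  std::string out;
  out.reserve(s.size());
  for (std::size_t i = 0; i < s.size(); ++i) {
    const std::string states = expandTwoState(s[i]);
    if (states.empty()) {
      out += s[i];            // resolved (or unsupported) site, copied as is
    } else if (h[i] == states[0]) {
      out += states[1];       // the other state of the heterozygous site
    } else if (h[i] == states[1]) {
      out += states[0];
    } else {
      out += 'N';             // haplotype incompatible with this site
    }
  }
  return out;
}
```

Applied to the heterozygous sequence and haplotype of the example, the sketch yields ATTCGGGGTTATATGC.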

◆ transcript()

Sequence * SequenceTools::transcript (const Sequence & sequence) throw (AlphabetException)
static

Get the transcription sequence of a DNA sequence.

Translate a DNA sequence into an RNA sequence.

See also
    DNAReplication
Parameters
    sequence    The sequence to transcribe.
Returns
    A new sequence object with the transcription sequence.
Exceptions
    AlphabetException    If the sequence is not a DNA sequence.

Definition at line 155 of file SequenceTools.cpp.

Member Data Documentation

◆ _DNA

DNA SequenceTools::_DNA
static, private

Definition at line 101 of file SequenceTools.h.

◆ _DNARep

NucleicAcidsReplication bpp::SequenceTools::_DNARep
static, private

Definition at line 104 of file SequenceTools.h.

◆ _RNA

RNA SequenceTools::_RNA
static, private

Definition at line 102 of file SequenceTools.h.

◆ _RNARep

NucleicAcidsReplication bpp::SequenceTools::_RNARep
static, private

Definition at line 105 of file SequenceTools.h.

◆ _RNY

RNY SequenceTools::_RNY
static, private

Definition at line 103 of file SequenceTools.h.

◆ _transc

NucleicAcidsReplication bpp::SequenceTools::_transc
static, private

Definition at line 106 of file SequenceTools.h.

