bpp-seq
2.2.0
bpp::SiteTools Class Reference
Utility methods dealing with sites.
#include <Bpp/Seq/SiteTools.h>
Public Member Functions
SiteTools () |
virtual | ~SiteTools () |
Static Public Member Functions
static bool | hasGap (const Site &site) |
static bool | isGapOnly (const Site &site) |
static bool | isGapOrUnresolvedOnly (const Site &site) |
static bool | hasUnknown (const Site &site) |
static bool | isComplete (const Site &site) |
static bool | isConstant (const Site &site, bool ignoreUnknown=false, bool unresolvedRaisesException=true) throw (EmptySiteException) |
Tell if a site is constant, that is, displaying the same state in all sequences that do not present a gap.
static bool | areSitesIdentical (const Site &site1, const Site &site2) |
static double | variabilityShannon (const Site &site, bool resolveUnknowns) throw (EmptySiteException) |
Compute the Shannon entropy index of a site.
static double | variabilityFactorial (const Site &site) throw (EmptySiteException) |
Compute the factorial diversity index of a site.
static double | mutualInformation (const Site &site1, const Site &site2, bool resolveUnknowns) throw (DimensionException,EmptySiteException) |
Compute the mutual information between two sites.
static double | entropy (const Site &site, bool resolveUnknowns) throw (EmptySiteException) |
Compute the entropy of a site. This is an alias of method variabilityShannon.
static double | jointEntropy (const Site &site1, const Site &site2, bool resolveUnknowns) throw (DimensionException,EmptySiteException) |
Compute the joint entropy between two sites.
static double | heterozygosity (const Site &site) throw (EmptySiteException) |
Compute the heterozygosity index of a site.
static size_t | getNumberOfDistinctCharacters (const Site &site) throw (EmptySiteException) |
Give the number of distinct characters at a site.
static bool | hasSingleton (const Site &site) throw (EmptySiteException) |
Tell if a site has singletons.
static bool | isParsimonyInformativeSite (const Site &site) throw (EmptySiteException) |
Tell if a site is a parsimony informative site.
static bool | isTriplet (const Site &site) throw (EmptySiteException) |
Tell if a site has more than 2 distinct characters.
static void | getCounts (const SymbolList &list, std::map< int, size_t > &counts) |
Count all states in the list.
static void | getCounts (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, size_t > > &counts) throw (DimensionException) |
Count all pairs of states for two lists of the same size.
static void | getCounts (const SymbolList &list, std::map< int, double > &counts, bool resolveUnknowns) |
Count all states in the list, optionally resolving unknown characters.
static void | getCounts (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, double > > &counts, bool resolveUnknowns) throw (DimensionException) |
Count all pairs of states for two lists of the same size, optionally resolving unknown characters.
static void | getFrequencies (const SymbolList &list, std::map< int, double > &frequencies, bool resolveUnknowns=false) |
Get the frequencies of all states in the list.
static void | getFrequencies (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, double > > &frequencies, bool resolveUnknowns=false) throw (DimensionException) |
Get the frequencies of all state pairs for two lists of the same size.
static double | getGCContent (const SymbolList &list, bool ignoreUnresolved=true, bool ignoreGap=true) throw (AlphabetException) |
Get the GC content of a symbol list.
static size_t | getNumberOfDistinctPositions (const SymbolList &l1, const SymbolList &l2) throw (AlphabetMismatchException) |
Get the number of distinct positions.
static size_t | getNumberOfPositionsWithoutGap (const SymbolList &l1, const SymbolList &l2) throw (AlphabetMismatchException) |
Get the number of positions without gap.
static void | changeGapsToUnknownCharacters (SymbolList &l) |
Change all gap elements to unknown characters.
static void | changeUnresolvedCharactersToGaps (SymbolList &l) |
Change all unknown characters to gap elements.
Utility methods dealing with sites.
Definition at line 57 of file SiteTools.h.
|
SiteTools () [inline]
Definition at line 61 of file SiteTools.h.
|
virtual ~SiteTools () [inline]
Definition at line 62 of file SiteTools.h.
static bool areSitesIdentical (const Site &site1, const Site &site2)
site1 | The first site. |
site2 | The second site. |
Definition at line 121 of file SiteTools.cpp.
References bpp::BasicSymbolList::getAlphabet(), bpp::Alphabet::getAlphabetType(), and bpp::BasicSymbolList::size().
|
static void changeGapsToUnknownCharacters (SymbolList &l) [inherited]
Change all gap elements to unknown characters.
l | The input list of characters. |
Definition at line 180 of file SymbolListTools.cpp.
References bpp::SymbolList::getAlphabet(), bpp::Alphabet::getUnknownCharacterCode(), bpp::Alphabet::isGap(), and bpp::SymbolList::size().
|
static void changeUnresolvedCharactersToGaps (SymbolList &l) [inherited]
Change all unknown characters to gap elements.
l | The input list of characters. |
Definition at line 189 of file SymbolListTools.cpp.
References bpp::SymbolList::getAlphabet(), bpp::Alphabet::getGapCharacterCode(), bpp::Alphabet::isUnresolved(), and bpp::SymbolList::size().
|
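static double entropy (const Site &site, bool resolveUnknowns) throw (EmptySiteException) [inline]
Compute the entropy of a site. This is an alias of method variabilityShannon.
$$I = -\sum_x f_x \ln(f_x)$$
where $f_x$ is the frequency of state $x$.
site | A site. |
resolveUnknowns | Tell if unknown characters must be resolved. |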
EmptySiteException | If the site has size 0. |
Definition at line 178 of file SiteTools.h.
References variabilityShannon().
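As an illustration of the quantity described here (minus the sum over observed states of the frequency times its logarithm), the following is a minimal standalone sketch. It is not the bpp-seq implementation: `shannonEntropySketch` is a hypothetical name, states are plain integer codes rather than a `Site`, the natural logarithm is assumed, and an empty input returns 0 where the library throws `EmptySiteException`. It requires C++11.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper, not part of bpp-seq: Shannon entropy of a column of
// integer state codes, using the natural logarithm (an assumption).
double shannonEntropySketch(const std::vector<int>& states)
{
  if (states.empty()) return 0.0; // The library throws EmptySiteException here.

  // Count how many times each state occurs.
  std::map<int, std::size_t> counts;
  for (int s : states) ++counts[s];

  // Accumulate -f * ln(f) over the observed states.
  const double n = static_cast<double>(states.size());
  double h = 0.0;
  for (const auto& kv : counts)
  {
    const double f = static_cast<double>(kv.second) / n;
    h -= f * std::log(f);
  }
  return h;
}
```

A site with two equally frequent states gives ln 2 under these assumptions, and a constant site gives 0.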
|
static void getCounts (const SymbolList &list, std::map< int, size_t > &counts) [inline, inherited]
Count all states in the list.
list | The list. |
counts | The output map to store the counts (existing counts will be incremented). |
Definition at line 70 of file SymbolListTools.h.
References bpp::SymbolList::getContent().
Referenced by getNumberOfDistinctCharacters(), bpp::SequenceApplicationTools::getSitesToAnalyse(), isParsimonyInformativeSite(), and bpp::CodonSiteTools::numberOfNonSynonymousSubstitutions().
|
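static void getCounts (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, size_t > > &counts) throw (DimensionException) [inline, inherited]
Count all pairs of states for two lists of the same size.
NB: The two lists do not need to share the same alphabet! The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.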
list1 | The first list. |
list2 | The second list. |
counts | The output map to store the counts (existing counts will be incremented). |
Definition at line 90 of file SymbolListTools.h.
|
static void getCounts (const SymbolList &list, std::map< int, double > &counts, bool resolveUnknowns) [inherited]
Count all states in the list, optionally resolving unknown characters.
For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4.
list | The list. |
counts | The output map to store the counts (existing counts will be incremented). |
resolveUnknowns | Tell if unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4. |
Definition at line 51 of file SymbolListTools.cpp.
References bpp::Alphabet::getAlias(), bpp::SymbolList::getAlphabet(), and bpp::SymbolList::getContent().
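The fractional counting of unknown characters can be sketched as follows. This is a simplified standalone illustration, not the library code: `countStatesSketch` is a hypothetical name, states are DNA letters in a `std::string` instead of integer codes in a `SymbolList`, and only `N` is resolved (the library resolves ambiguity through the alphabet). As in the documented behaviour, existing entries in the output map are incremented rather than reset.

```cpp
#include <map>
#include <string>

// Hypothetical helper, not part of bpp-seq: count DNA letters, optionally
// spreading each 'N' evenly over A, C, G and T (1/4 each).
void countStatesSketch(const std::string& seq,
                       std::map<char, double>& counts,
                       bool resolveUnknowns)
{
  const std::string bases = "ACGT";
  for (char c : seq)
  {
    if (c == 'N' && resolveUnknowns)
    {
      for (char b : bases) counts[b] += 0.25; // N contributes 1/4 to each base.
    }
    else
    {
      counts[c] += 1.0; // Existing counts are incremented, not overwritten.
    }
  }
}
```

With `resolveUnknowns` set to false, `N` is simply counted as a state of its own in this sketch.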
|
static void getCounts (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, double > > &counts, bool resolveUnknowns) throw (DimensionException) [inherited]
Count all pairs of states for two lists of the same size, optionally resolving unknown characters.
For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4.
NB: The two lists do not need to share the same alphabet! The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.
list1 | The first list. |
list2 | The second list. |
counts | The output map to store the counts (existing counts will be incremented). |
resolveUnknowns | Tell if unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4. |
Definition at line 73 of file SymbolListTools.cpp.
|
static void getFrequencies (const SymbolList &list, std::map< int, double > &frequencies, bool resolveUnknowns=false) [inherited]
Get the frequencies of all states in the list.
list | The list. |
resolveUnknowns | Tell if unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4. |
frequencies | The output map with all states and corresponding frequencies. Existing frequencies will be erased if any. |
Definition at line 96 of file SymbolListTools.cpp.
References bpp::SymbolList::size().
Referenced by bpp::CodonSiteTools::generateCodonSiteWithoutRareVariant(), bpp::SiteContainerTools::getConsensus(), bpp::SequenceApplicationTools::getSitesToAnalyse(), bpp::CodonSiteTools::meanNumberOfSynonymousPositions(), bpp::CodonSiteTools::piNonSynonymous(), bpp::CodonSiteTools::piSynonymous(), and bpp::SiteContainerTools::removeGapSites().
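The difference from the counting functions is that the output map is cleared first and the values are normalised by the list size. A minimal standalone sketch of that behaviour, assuming plain integer state codes and no resolution of unknown characters (`frequenciesSketch` is a hypothetical name, not a bpp-seq function):

```cpp
#include <map>
#include <vector>

// Hypothetical helper, not part of bpp-seq: state frequencies of a list of
// integer codes. Any previous content of 'frequencies' is erased.
void frequenciesSketch(const std::vector<int>& states,
                       std::map<int, double>& frequencies)
{
  frequencies.clear();
  if (states.empty()) return;

  // Count occurrences, then divide by the list size.
  for (int s : states) frequencies[s] += 1.0;
  const double n = static_cast<double>(states.size());
  for (auto& kv : frequencies) kv.second /= n;
}
```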
|
static void getFrequencies (const SymbolList &list1, const SymbolList &list2, std::map< int, std::map< int, double > > &frequencies, bool resolveUnknowns=false) throw (DimensionException) [inherited]
Get the frequencies of all state pairs for two lists of the same size.
list1 | The first list. |
list2 | The second list. |
resolveUnknowns | Tell if unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4. |
frequencies | The output map with all state pairs and corresponding frequencies. Existing frequencies will be erased if any. |
Definition at line 107 of file SymbolListTools.cpp.
|
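static double getGCContent (const SymbolList &list, bool ignoreUnresolved=true, bool ignoreGap=true) throw (AlphabetException) [inherited]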
Get the GC content of a symbol list.
list | The list. |
ignoreUnresolved | Do not count unresolved states. Otherwise, weight by each state probability in case of ambiguity (e.g. the R state counts for 0.5). |
ignoreGap | Do not count gaps in total. |
AlphabetException | If the list is not made of nucleotide states. |
Definition at line 119 of file SymbolListTools.cpp.
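To make the effect of the two flags concrete, here is a standalone sketch of a GC-content computation over IUPAC nucleotide letters. It is not the bpp-seq implementation, and both function names are hypothetical. Two assumptions are made that the description above does not settle: an ignored unresolved state is left out of both the GC sum and the total, and letters that are not nucleotide codes are skipped, where the library throws `AlphabetException` for a non-nucleotide list.

```cpp
#include <string>

// Hypothetical helper: expected G+C fraction of an IUPAC nucleotide code,
// or -1 if the letter is not handled by this sketch.
double gcWeightSketch(char c)
{
  switch (c)
  {
    case 'G': case 'C': case 'S': return 1.0;
    case 'A': case 'T': case 'W': return 0.0;
    case 'R': case 'Y': case 'K': case 'M': case 'N': return 0.5;
    case 'B': case 'V': return 2.0 / 3.0;
    case 'D': case 'H': return 1.0 / 3.0;
    default: return -1.0;
  }
}

// Hypothetical helper, not part of bpp-seq: GC content of a DNA string.
double gcContentSketch(const std::string& seq,
                       bool ignoreUnresolved = true,
                       bool ignoreGap = true)
{
  double gc = 0.0, total = 0.0;
  for (char c : seq)
  {
    if (c == '-')
    {
      if (!ignoreGap) total += 1.0; // Gaps only enlarge the denominator.
      continue;
    }
    const bool resolved = (c == 'A' || c == 'C' || c == 'G' || c == 'T');
    if (!resolved && ignoreUnresolved) continue; // Assumption: dropped entirely.
    const double w = gcWeightSketch(c);
    if (w < 0.0) continue; // Unknown letters are skipped in this sketch.
    gc += w;
    total += 1.0;
  }
  return total > 0.0 ? gc / total : 0.0;
}
```

For example, under these assumptions the string `GR` yields 1 when unresolved states are ignored and 0.75 when `R` is weighted as 0.5.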
|
static size_t getNumberOfDistinctCharacters (const Site &site) throw (EmptySiteException)
Give the number of distinct characters at a site.
site | a Site |
Definition at line 333 of file SiteTools.cpp.
References bpp::SymbolListTools::getCounts(), and isConstant().
Referenced by isTriplet(), and bpp::CodonSiteTools::numberOfSubsitutions().
|
static size_t getNumberOfDistinctPositions (const SymbolList &l1, const SymbolList &l2) throw (AlphabetMismatchException) [inherited]
Get the number of distinct positions.
The comparison is performed from position 0 to the minimum size of the two vectors.
l1 | SymbolList 1. |
l2 | SymbolList 2. |
AlphabetMismatchException | If the two lists do not have the same alphabet type. |
Definition at line 158 of file SymbolListTools.cpp.
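The comparison range matters when the two lists differ in length: positions beyond the shorter list are never examined. A minimal standalone sketch of this reading, using plain integer vectors and omitting the alphabet check (`countDistinctPositionsSketch` is a hypothetical name, not a bpp-seq function):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of bpp-seq: number of positions at which two
// lists differ, compared from position 0 up to the shorter list's size.
std::size_t countDistinctPositionsSketch(const std::vector<int>& l1,
                                         const std::vector<int>& l2)
{
  const std::size_t n = std::min(l1.size(), l2.size());
  std::size_t distinct = 0;
  for (std::size_t i = 0; i < n; ++i)
  {
    if (l1[i] != l2[i]) ++distinct;
  }
  return distinct;
}
```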
|
static size_t getNumberOfPositionsWithoutGap (const SymbolList &l1, const SymbolList &l2) throw (AlphabetMismatchException) [inherited]
Get the number of positions without gap.
The comparison is performed from position 0 to the minimum size of the two vectors.
l1 | SymbolList 1. |
l2 | SymbolList 2. |
AlphabetMismatchException | If the two lists do not have the same alphabet type. |
Definition at line 169 of file SymbolListTools.cpp.
|
static bool hasGap (const Site &site)
site | A site. |
Definition at line 56 of file SiteTools.cpp.
References bpp::BasicSymbolList::getAlphabet(), bpp::Alphabet::isGap(), and bpp::BasicSymbolList::size().
Referenced by bpp::NoGapSiteContainerIterator::nextSiteWithoutGapPosition(), bpp::CodonSiteTools::numberOfNonSynonymousSubstitutions(), bpp::CodonSiteTools::numberOfSubsitutions(), and bpp::NoGapSiteContainerIterator::previousSiteWithoutGapPosition().
|
static bool hasSingleton (const Site &site) throw (EmptySiteException)
Tell if a site has singletons.
site | a Site. |
Definition at line 354 of file SiteTools.cpp.
References isConstant().
|
static bool hasUnknown (const Site &site)
site | A site. |
Definition at line 95 of file SiteTools.cpp.
References bpp::BasicSymbolList::getAlphabet(), bpp::Alphabet::getUnknownCharacterCode(), and bpp::BasicSymbolList::size().
|
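static double heterozygosity (const Site &site) throw (EmptySiteException)
Compute the heterozygosity index of a site.
$$H = 1 - \sum_x f_x^2$$
where $f_x$ is the frequency of state $x$.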
site | A site. |
EmptySiteException | If the site has size 0. |
Definition at line 319 of file SiteTools.cpp.
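Assuming the usual definition of heterozygosity as one minus the sum of squared state frequencies, the index can be sketched as below. This is a standalone illustration on integer state codes, not the bpp-seq code; `heterozygositySketch` is a hypothetical name, and `std::invalid_argument` stands in for the library's `EmptySiteException`.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of bpp-seq: 1 - sum of squared frequencies.
double heterozygositySketch(const std::vector<int>& states)
{
  if (states.empty()) throw std::invalid_argument("empty site");

  std::map<int, std::size_t> counts;
  for (int s : states) ++counts[s];

  const double n = static_cast<double>(states.size());
  double sumSquares = 0.0;
  for (const auto& kv : counts)
  {
    const double f = static_cast<double>(kv.second) / n;
    sumSquares += f * f;
  }
  return 1.0 - sumSquares;
}
```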
|
static bool isComplete (const Site &site)
site | A site. |
Definition at line 108 of file SiteTools.cpp.
References bpp::BasicSymbolList::getAlphabet(), bpp::Alphabet::isGap(), bpp::Alphabet::isUnresolved(), and bpp::BasicSymbolList::size().
Referenced by bpp::CompleteSiteContainerIterator::nextCompleteSitePosition(), and bpp::CompleteSiteContainerIterator::previousCompleteSitePosition().
|
static bool isConstant (const Site &site, bool ignoreUnknown=false, bool unresolvedRaisesException=true) throw (EmptySiteException)
Tell if a site is constant, that is, displaying the same state in all sequences that do not present a gap.
site | A site. |
ignoreUnknown | If true, positions with unknown characters will be ignored. Otherwise, a site with a single state plus any uncertain state will not be considered constant. |
unresolvedRaisesException | In ambiguous cases (a gap-only site, for instance), throw an exception. Otherwise, return false. |
EmptySiteException | If the site has size 0, or if the site cannot be resolved (for instance, it is made of gaps only) and unresolvedRaisesException is set to true. |
Definition at line 141 of file SiteTools.cpp.
Referenced by bpp::CodonSiteTools::fixedDifferences(), bpp::CodonSiteTools::generateCodonSiteWithoutRareVariant(), getNumberOfDistinctCharacters(), hasSingleton(), bpp::CodonSiteTools::isFourFoldDegenerated(), bpp::CodonSiteTools::isMonoSitePolymorphic(), isParsimonyInformativeSite(), bpp::CodonSiteTools::isSynonymousPolymorphic(), bpp::CodonSiteTools::numberOfNonSynonymousSubstitutions(), bpp::CodonSiteTools::numberOfSubsitutions(), bpp::CodonSiteTools::piNonSynonymous(), and bpp::CodonSiteTools::piSynonymous().
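The interplay between the two flags can be illustrated with a simplified standalone sketch. It is one reading of the description above, not the library implementation: `isConstantSketch` is a hypothetical name, the site is a string of DNA letters with `-` as the gap and `N` as the only unknown character, and `std::invalid_argument` stands in for `EmptySiteException`.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of bpp-seq: tell whether all non-gap
// characters of a site are the same.
bool isConstantSketch(const std::string& site,
                      bool ignoreUnknown = false,
                      bool unresolvedRaisesException = true)
{
  if (site.empty()) throw std::invalid_argument("empty site");

  char reference = '\0'; // First informative character seen so far.
  for (char c : site)
  {
    if (c == '-') continue;                  // Gaps never break constancy.
    if (c == 'N' && ignoreUnknown) continue; // Unknowns are skipped on request.
    if (reference == '\0') reference = c;
    else if (c != reference) return false;
  }

  if (reference == '\0')
  {
    // Nothing informative was found, e.g. a gap-only site.
    if (unresolvedRaisesException)
      throw std::invalid_argument("site cannot be resolved");
    return false;
  }
  return true;
}
```

In this sketch, `AAN` is constant only when `ignoreUnknown` is true, and a gap-only site either throws or returns false depending on `unresolvedRaisesException`.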
|
static bool isGapOnly (const Site &site)
site | A site. |
Definition at line 69 of file SiteTools.cpp.
References bpp::BasicSymbolList::getAlphabet(), bpp::Alphabet::isGap(), and bpp::BasicSymbolList::size().
Referenced by bpp::SiteContainerTools::removeGapOnlySites(), and bpp::SiteContainerTools::removeGapOrUnresolvedOnlySites().
|
static bool isGapOrUnresolvedOnly (const Site &site)
site | A site. |
Definition at line 82 of file SiteTools.cpp.
References bpp::BasicSymbolList::getAlphabet(), bpp::Alphabet::isGap(), bpp::Alphabet::isUnresolved(), and bpp::BasicSymbolList::size().
Referenced by bpp::SiteContainerTools::removeGapOrUnresolvedOnlySites().
|
static bool isParsimonyInformativeSite (const Site &site) throw (EmptySiteException)
Tell if a site is a parsimony informative site.
At least two distinct characters must be present.
site | a Site. |
Definition at line 374 of file SiteTools.cpp.
References bpp::SymbolListTools::getCounts(), and isConstant().
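In phylogenetics, a site is conventionally called parsimony informative when at least two distinct states each occur in at least two sequences. The sketch below implements that conventional definition on integer state codes. Whether the library applies exactly this criterion is an assumption, and `isParsimonyInformativeSketch` is a hypothetical name.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper, not part of bpp-seq: true if at least two distinct
// states are each observed at least twice.
bool isParsimonyInformativeSketch(const std::vector<int>& states)
{
  std::map<int, std::size_t> counts;
  for (int s : states) ++counts[s];

  std::size_t repeatedStates = 0;
  for (const auto& kv : counts)
  {
    if (kv.second >= 2) ++repeatedStates;
  }
  return repeatedStates >= 2;
}
```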
|
static bool isTriplet (const Site &site) throw (EmptySiteException)
Tell if a site has more than 2 distinct characters.
site | a Site. |
Definition at line 397 of file SiteTools.cpp.
References getNumberOfDistinctCharacters().
|
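static double jointEntropy (const Site &site1, const Site &site2, bool resolveUnknowns) throw (DimensionException,EmptySiteException)
Compute the joint entropy between two sites.
$$H_{i,j} = -\sum_x \sum_y p_{x,y} \ln(p_{x,y})$$
where $p_{x,y}$ is the frequency of the pair $(x,y)$.
site1 | First site |
site2 | Second site |
resolveUnknowns | Tell if unknown characters must be resolved. |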
DimensionException | If the sites do not have the same length. |
EmptySiteException | If the sites have size 0. |
Definition at line 269 of file SiteTools.cpp.
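A standalone sketch of the joint entropy of two equally long columns, computed from the frequencies of the observed state pairs. It is an illustration only: `jointEntropySketch` is a hypothetical name, states are integer codes, the natural logarithm is assumed, unknown characters are not resolved, and `std::invalid_argument` stands in for `DimensionException` and `EmptySiteException`.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not part of bpp-seq: -sum over pairs of p * ln(p).
double jointEntropySketch(const std::vector<int>& site1,
                          const std::vector<int>& site2)
{
  if (site1.size() != site2.size())
    throw std::invalid_argument("sites differ in length");
  if (site1.empty()) throw std::invalid_argument("empty sites");

  // Count each (state in site1, state in site2) pair.
  std::map<std::pair<int, int>, std::size_t> counts;
  for (std::size_t i = 0; i < site1.size(); ++i)
    ++counts[std::make_pair(site1[i], site2[i])];

  const double n = static_cast<double>(site1.size());
  double h = 0.0;
  for (const auto& kv : counts)
  {
    const double p = static_cast<double>(kv.second) / n;
    h -= p * std::log(p);
  }
  return h;
}
```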
|
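static double mutualInformation (const Site &site1, const Site &site2, bool resolveUnknowns) throw (DimensionException,EmptySiteException)
Compute the mutual information between two sites.
$$MI = \sum_x \sum_y p_{x,y} \ln\left(\frac{p_{x,y}}{p_x \cdot p_y}\right)$$
where $p_x$ and $p_y$ are the frequencies of states $x$ and $y$, and $p_{x,y}$ is the frequency of the pair $(x,y)$.
site1 | First site |
site2 | Second site |
resolveUnknowns | Tell if unknown characters must be resolved. |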
DimensionException | If the sites do not have the same length. |
EmptySiteException | If the sites have size 0. |
Definition at line 222 of file SiteTools.cpp.
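The mutual information combines the marginal frequencies of each site with the frequencies of the observed pairs. The following standalone sketch shows one way to compute it. It is not the bpp-seq implementation: `mutualInformationSketch` is a hypothetical name, states are integer codes, the natural logarithm is assumed, unknown characters are not resolved, and `std::invalid_argument` stands in for the library's exceptions.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not part of bpp-seq:
// sum over observed pairs of p_xy * ln(p_xy / (p_x * p_y)).
double mutualInformationSketch(const std::vector<int>& site1,
                               const std::vector<int>& site2)
{
  if (site1.size() != site2.size())
    throw std::invalid_argument("sites differ in length");
  if (site1.empty()) throw std::invalid_argument("empty sites");

  std::map<int, double> px, py;               // Marginal frequencies.
  std::map<std::pair<int, int>, double> pxy;  // Pair frequencies.
  const double w = 1.0 / static_cast<double>(site1.size());
  for (std::size_t i = 0; i < site1.size(); ++i)
  {
    px[site1[i]] += w;
    py[site2[i]] += w;
    pxy[std::make_pair(site1[i], site2[i])] += w;
  }

  double mi = 0.0;
  for (const auto& kv : pxy)
  {
    const double expected = px[kv.first.first] * py[kv.first.second];
    mi += kv.second * std::log(kv.second / expected);
  }
  return mi;
}
```

Two independent columns give a value of 0 in this sketch, while comparing a two-state balanced column with itself gives ln 2, which equals its entropy.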
|
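static double variabilityFactorial (const Site &site) throw (EmptySiteException)
Compute the factorial diversity index of a site.
The index is computed from the counts $p_x$, where $p_x$ is the number of times state $x$ is observed in the site.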
site | A site. |
EmptySiteException | If the site has size 0. |
Definition at line 304 of file SiteTools.cpp.
|
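static double variabilityShannon (const Site &site, bool resolveUnknowns) throw (EmptySiteException)
Compute the Shannon entropy index of a site.
$$I = -\sum_x f_x \ln(f_x)$$
where $f_x$ is the frequency of state $x$.
site | A site. |
resolveUnknowns | Tell if unknown characters must be resolved. |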
EmptySiteException | If the site has size 0. |
Definition at line 202 of file SiteTools.cpp.
Referenced by entropy().