bpp-phyl  2.2.0
bpp::HierarchicalClustering Class Reference

Hierarchical clustering. More...

#include <Bpp/Phyl/Distance/HierarchicalClustering.h>


Public Member Functions

 HierarchicalClustering (const std::string &method, bool verbose=false)
 Builds a new clustering object. More...
 
 HierarchicalClustering (const std::string &method, const DistanceMatrix &matrix, bool verbose=false) throw (Exception)
 
virtual ~HierarchicalClustering ()
 
HierarchicalClustering * clone () const
 
std::string getName () const
 
TreeTemplate< Node > * getTree () const
 Get the computed tree, if there is one. More...
 
virtual void setDistanceMatrix (const DistanceMatrix &matrix) throw (Exception)
 Set the distance matrix to use. More...
 
virtual void computeTree () throw (Exception)
 Compute the tree corresponding to the distance matrix. More...
 
void setVerbose (bool yn)
 
bool isVerbose () const
 

Static Public Attributes

static const std::string COMPLETE = "Complete"
 
static const std::string SINGLE = "Single"
 
static const std::string AVERAGE = "Average"
 
static const std::string MEDIAN = "Median"
 
static const std::string WARD = "Ward"
 
static const std::string CENTROID = "Centroid"
 

Protected Member Functions

std::vector< size_t > getBestPair () throw (Exception)
 Get the best pair of nodes to agglomerate. More...
 
std::vector< double > computeBranchLengthsForPair (const std::vector< size_t > &pair)
 Compute the branch lengths for two nodes to agglomerate. More...
 
double computeDistancesFromPair (const std::vector< size_t > &pair, const std::vector< double > &branchLengths, size_t pos)
 Updates the distance matrix according to a given pair and the corresponding branch lengths. More...
 
void finalStep (int idRoot)
 Method called when there are only three remaining nodes to agglomerate; creates the root node of the tree. More...
 
virtual Node * getLeafNode (int id, const std::string &name)
 Get a leaf node. More...
 
virtual Node * getParentNode (int id, Node *son1, Node *son2)
 Get an inner node. More...
 

Protected Attributes

std::string method_
 
DistanceMatrix matrix_
 
Tree * tree_
 
std::map< size_t, Node * > currentNodes_
 
bool verbose_
 
bool rootTree_
 

Detailed Description

Hierarchical clustering.

This class implements the complete, single, average (= UPGMA), median, Ward and centroid linkage methods.

Definition at line 64 of file HierarchicalClustering.h.

Constructor & Destructor Documentation

◆ HierarchicalClustering() [1/2]

bpp::HierarchicalClustering::HierarchicalClustering ( const std::string &  method,
bool  verbose = false 
)
inline

Builds a new clustering object.

Parameters
method   The linkage method to use. Should be one of COMPLETE, SINGLE, AVERAGE, MEDIAN, WARD or CENTROID.
verbose  Tells whether progress information should be displayed.

Definition at line 85 of file HierarchicalClustering.h.

Referenced by clone().

◆ HierarchicalClustering() [2/2]

bpp::HierarchicalClustering::HierarchicalClustering ( const std::string &  method,
const DistanceMatrix &  matrix,
bool  verbose = false 
)
throw (Exception
)
inline

◆ ~HierarchicalClustering()

virtual bpp::HierarchicalClustering::~HierarchicalClustering ( )
inline virtual

Definition at line 95 of file HierarchicalClustering.h.

Member Function Documentation

◆ clone()

HierarchicalClustering* bpp::HierarchicalClustering::clone ( ) const
inline

Definition at line 97 of file HierarchicalClustering.h.

References HierarchicalClustering().

◆ computeBranchLengthsForPair()

vector< double > HierarchicalClustering::computeBranchLengthsForPair ( const std::vector< size_t > &  pair)
protected virtual

Compute the branch lengths for two nodes to agglomerate.

+---l1-----N1
|
+---l2-----N2

This method computes l1 and l2 given N1 and N2.

Parameters
pair  The indices of the nodes to be agglomerated.
Returns
A size 2 vector with branch lengths.
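
As an illustration of what l1 and l2 can mean, the sketch below assumes an ultrametric (clock-like) convention in which the parent of N1 and N2 sits at half their distance, and each branch covers the gap between the parent height and the height of the child. This is an assumption made for illustration, not a description of the Bio++ implementation; the function `branchLengths` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Bio++): branch lengths for an
// ultrametric tree.
//   dist : distance between N1 and N2
//   h1   : height of N1 above its leaves (0 for a leaf)
//   h2   : height of N2 above its leaves (0 for a leaf)
std::vector<double> branchLengths(double dist, double h1, double h2) {
  double parentHeight = dist / 2.0;
  // Holds for monotone linkages; methods that allow inversions
  // (e.g. centroid or median) can violate it.
  assert(parentHeight >= h1 && parentHeight >= h2);
  return {parentHeight - h1, parentHeight - h2};
}
```

With a distance of 4 between an inner node of height 1 and a leaf, this convention gives branch lengths 1 and 2.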

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 104 of file HierarchicalClustering.cpp.

◆ computeDistancesFromPair()

double HierarchicalClustering::computeDistancesFromPair ( const std::vector< size_t > &  pair,
const std::vector< double > &  branchLengths,
size_t  pos 
)
protected virtual

Updates the distance matrix according to a given pair and the corresponding branch lengths.

Parameters
pair           The indices of the nodes to be agglomerated.
branchLengths  The corresponding branch lengths.
pos            The index of the node whose distance must be updated.
Returns
The distance between the 'pos' node and the agglomerated pair.
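
For the six linkage methods listed on this page, the distance from a node to a newly merged pair is commonly written with the Lance-Williams recurrence. The sketch below uses the textbook coefficients and is only an assumption about how such an update can be computed; it does not reproduce the Bio++ source. The function name `lanceWilliams` and its parameters are hypothetical. Note that the median, centroid and Ward coefficients are usually stated for squared Euclidean distances.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Bio++). Distance between node k and the
// cluster obtained by merging i and j, using the Lance-Williams recurrence:
//   d(ij,k) = a1*d(i,k) + a2*d(j,k) + b*d(i,j) + g*|d(i,k) - d(j,k)|
// ni, nj and nk are the numbers of leaves below i, j and k.
double lanceWilliams(const std::string& method,
                     double dik, double djk, double dij,
                     double ni, double nj, double nk) {
  assert(ni > 0 && nj > 0 && nk > 0);
  double a1 = 0.5, a2 = 0.5, b = 0.0, g = 0.0;
  if (method == "Single") {
    g = -0.5;                       // equals min(dik, djk)
  } else if (method == "Complete") {
    g = 0.5;                        // equals max(dik, djk)
  } else if (method == "Average") {
    a1 = ni / (ni + nj);            // size-weighted mean (UPGMA)
    a2 = nj / (ni + nj);
  } else if (method == "Median") {
    b = -0.25;
  } else if (method == "Centroid") {
    a1 = ni / (ni + nj);
    a2 = nj / (ni + nj);
    b = -ni * nj / ((ni + nj) * (ni + nj));
  } else if (method == "Ward") {
    double n = ni + nj + nk;
    a1 = (ni + nk) / n;
    a2 = (nj + nk) / n;
    b = -nk / n;
  } else {
    throw std::invalid_argument("unknown linkage method: " + method);
  }
  return a1 * dik + a2 * djk + b * dij + g * std::fabs(dik - djk);
}
```

The method strings match the values of the static constants documented on this page ("Single", "Complete", and so on), so a call such as `lanceWilliams("Average", 3.0, 5.0, 4.0, 1.0, 3.0, 1.0)` weights the two distances by cluster size.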

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 113 of file HierarchicalClustering.cpp.

◆ computeTree()

void AbstractAgglomerativeDistanceMethod::computeTree ( )
throw (Exception
)
virtual inherited

Compute the tree corresponding to the distance matrix.

This method implements the following algorithm:
1) Build all leaf nodes (getLeafNode method).
2) Get the best pair to agglomerate (getBestPair method).
3) Compute the branch lengths for this pair (computeBranchLengthsForPair method).
4) Build the parent node of the pair (getParentNode method).
5) For each remaining node, update distances from the pair (computeDistancesFromPair method).
6) Return to step 2 while there are more than 3 remaining nodes.
7) Perform the final step, and produce a rooted or unrooted tree.
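
The steps above can be sketched as a small standalone program fragment. It does not use the Bio++ classes: `Cluster` and `clusterAverage` are hypothetical names, the linkage is fixed to average (UPGMA), and, as a simplification, the loop runs until a single node remains instead of stopping early and delegating to a separate final step. Treat it as a minimal illustration under those assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for a node under construction (not a Bio++ type).
struct Cluster {
  std::string newick;  // subtree written so far
  double height;       // distance from this node down to its leaves
  std::size_t size;    // number of leaves below this node
};

// Average-linkage clustering of a symmetric distance matrix.
// Returns the rooted tree as a Newick string.
std::string clusterAverage(const std::vector<std::string>& names,
                           std::vector<std::vector<double>> d) {
  assert(!names.empty() && names.size() == d.size());
  // Step 1: build all leaf nodes.
  std::vector<Cluster> nodes;
  for (const std::string& n : names) nodes.push_back({n, 0.0, 1});

  while (nodes.size() > 1) {
    // Step 2: get the best pair (smallest distance).
    std::size_t bi = 0, bj = 1;
    double best = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < nodes.size(); ++i)
      for (std::size_t j = i + 1; j < nodes.size(); ++j)
        if (d[i][j] < best) { best = d[i][j]; bi = i; bj = j; }

    // Step 3: branch lengths, with the parent placed at height best / 2.
    double h = best / 2.0;
    double l1 = h - nodes[bi].height;
    double l2 = h - nodes[bj].height;

    // Step 4: build the parent node of the pair.
    Cluster parent{"(" + nodes[bi].newick + ":" + std::to_string(l1) + "," +
                       nodes[bj].newick + ":" + std::to_string(l2) + ")",
                   h, nodes[bi].size + nodes[bj].size};

    // Step 5: update distances from the pair to each remaining node,
    // storing the result in row and column bi.
    double wi = static_cast<double>(nodes[bi].size);
    double wj = static_cast<double>(nodes[bj].size);
    for (std::size_t k = 0; k < nodes.size(); ++k) {
      if (k == bi || k == bj) continue;
      double v = (wi * d[bi][k] + wj * d[bj][k]) / (wi + wj);
      d[bi][k] = v;
      d[k][bi] = v;
    }
    nodes[bi] = parent;

    // Drop node bj and its row and column (bj > bi, so bi stays valid).
    std::ptrdiff_t off = static_cast<std::ptrdiff_t>(bj);
    nodes.erase(nodes.begin() + off);
    d.erase(d.begin() + off);
    for (std::vector<double>& row : d) row.erase(row.begin() + off);
    // Step 6: loop again while more than one node remains.
  }
  // Step 7: the last remaining node is the root.
  return nodes[0].newick + ";";
}
```

For three taxa A, B and C with d(A,B) = 2 and d(A,C) = d(B,C) = 4, the sketch first joins A and B at height 1, then joins that pair with C at height 2.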

Implements bpp::DistanceMethod.

Reimplemented in bpp::BioNJ.

Definition at line 62 of file AbstractAgglomerativeDistanceMethod.cpp.

References bpp::Node::setDistanceToFather().

Referenced by HierarchicalClustering(), bpp::NeighborJoining::NeighborJoining(), and bpp::PGMA::PGMA().

◆ finalStep()

void HierarchicalClustering::finalStep ( int  idRoot)
protected virtual

Method called when there are only three remaining nodes to agglomerate; creates the root node of the tree.

Parameters
idRoot  The id of the root node.

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 173 of file HierarchicalClustering.cpp.

References bpp::Node::addSon(), and bpp::Node::setDistanceToFather().

◆ getBestPair()

vector< size_t > HierarchicalClustering::getBestPair ( )
throw (Exception
)
protected virtual

Get the best pair of nodes to agglomerate.

Defines the criterion used to choose the next pair of nodes to agglomerate. This criterion uses the matrix_ distance matrix.

Returns
A size 2 vector with the indices of the nodes.
Exceptions
Exception  If an error occurred.
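
One common criterion for agglomerative clustering is to pick the pair with the smallest off-diagonal distance. The sketch below illustrates that criterion on a plain square matrix; it is an assumption for illustration rather than the Bio++ code, and `bestPair` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Bio++): indices of the closest pair
// in a symmetric distance matrix, returned as a size 2 vector.
std::vector<std::size_t> bestPair(const std::vector<std::vector<double>>& m) {
  if (m.size() < 2) throw std::invalid_argument("need at least two nodes");
  std::size_t bi = 0, bj = 1;
  for (std::size_t i = 0; i < m.size(); ++i) {
    assert(m[i].size() == m.size());  // the matrix must be square
    for (std::size_t j = i + 1; j < m.size(); ++j) {
      // Strict comparison: on ties, the first pair found is kept.
      if (m[i][j] < m[bi][bj]) { bi = i; bj = j; }
    }
  }
  return {bi, bj};
}
```

Only the upper triangle is scanned, so the first returned index is always smaller than the second.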

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 60 of file HierarchicalClustering.cpp.

◆ getLeafNode()

Node * HierarchicalClustering::getLeafNode ( int  id,
const std::string &  name 
)
protected virtual

Get a leaf node.

Create a new node with the given id and name.

Parameters
id    The id of the node.
name  The name of the node.
Returns
A pointer to a new node object.

Reimplemented from bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 190 of file HierarchicalClustering.cpp.

References bpp::ClusterInfos::length, bpp::ClusterInfos::numberOfLeaves, and bpp::NodeTemplate< NodeInfos >::setInfos().

◆ getName()

std::string bpp::HierarchicalClustering::getName ( ) const
inline virtual
Returns
The name of the distance method.

Implements bpp::DistanceMethod.

Definition at line 100 of file HierarchicalClustering.h.

References method_.

◆ getParentNode()

Node * HierarchicalClustering::getParentNode ( int  id,
Node *  son1,
Node *  son2 
)
protected virtual

Get an inner node.

Create a new node with the given id, and set its sons.

Parameters
id    The id of the node.
son1  The first son of the node.
son2  The second son of the node.
Returns
A pointer to a new node object.

Reimplemented from bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 200 of file HierarchicalClustering.cpp.

References bpp::Node::addSon(), bpp::Node::getDistanceToFather(), bpp::ClusterInfos::length, and bpp::ClusterInfos::numberOfLeaves.

◆ getTree()

TreeTemplate< Node > * HierarchicalClustering::getTree ( ) const
virtual

Get the computed tree, if there is one.

Returns
A copy of the computed tree if there is one, 0 otherwise.

Reimplemented from bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 54 of file HierarchicalClustering.cpp.

◆ isVerbose()

bool bpp::AbstractAgglomerativeDistanceMethod::isVerbose ( ) const
inline virtual inherited
Returns
True if verbose mode is enabled.

Implements bpp::DistanceMethod.

Definition at line 149 of file AbstractAgglomerativeDistanceMethod.h.

References bpp::AbstractAgglomerativeDistanceMethod::verbose_.

◆ setDistanceMatrix()

void AbstractAgglomerativeDistanceMethod::setDistanceMatrix ( const DistanceMatrix &  matrix)
throw (Exception
)
virtual inherited

Set the distance matrix to use.

Parameters
matrix  The matrix to use.
Exceptions
Exception  If an incorrect matrix is provided (e.g. smaller than 3).

Implements bpp::DistanceMethod.

Reimplemented in bpp::BioNJ, bpp::NeighborJoining, and bpp::PGMA.

Definition at line 53 of file AbstractAgglomerativeDistanceMethod.cpp.

Referenced by bpp::AbstractAgglomerativeDistanceMethod::AbstractAgglomerativeDistanceMethod(), bpp::PGMA::setDistanceMatrix(), and bpp::NeighborJoining::setDistanceMatrix().

◆ setVerbose()

void bpp::AbstractAgglomerativeDistanceMethod::setVerbose ( bool  yn)
inline virtual inherited
Parameters
yn  Enable/Disable verbose mode.

Implements bpp::DistanceMethod.

Definition at line 148 of file AbstractAgglomerativeDistanceMethod.h.

References bpp::AbstractAgglomerativeDistanceMethod::verbose_.

Member Data Documentation

◆ AVERAGE

const string HierarchicalClustering::AVERAGE = "Average"
static

Definition at line 70 of file HierarchicalClustering.h.

◆ CENTROID

const string HierarchicalClustering::CENTROID = "Centroid"
static

Definition at line 73 of file HierarchicalClustering.h.

◆ COMPLETE

const string HierarchicalClustering::COMPLETE = "Complete"
static

Definition at line 68 of file HierarchicalClustering.h.

◆ currentNodes_

std::map<size_t, Node*> bpp::AbstractAgglomerativeDistanceMethod::currentNodes_
protected inherited

◆ matrix_

DistanceMatrix bpp::AbstractAgglomerativeDistanceMethod::matrix_
protected inherited

◆ MEDIAN

const string HierarchicalClustering::MEDIAN = "Median"
static

Definition at line 71 of file HierarchicalClustering.h.

◆ method_

std::string bpp::HierarchicalClustering::method_
protected

Definition at line 76 of file HierarchicalClustering.h.

Referenced by getName().

◆ rootTree_

bool bpp::AbstractAgglomerativeDistanceMethod::rootTree_
protected inherited

◆ SINGLE

const string HierarchicalClustering::SINGLE = "Single"
static

Definition at line 69 of file HierarchicalClustering.h.

◆ tree_

◆ verbose_

◆ WARD

const string HierarchicalClustering::WARD = "Ward"
static

Definition at line 72 of file HierarchicalClustering.h.


The documentation for this class was generated from the following files: