cluster

A cluster is an ordered set of hits, related to a model, that satisfy the distance constraints of that model.

cluster API reference

cluster

class macsypy.cluster.Cluster(hits, model, hit_weights)[source]

Handle a set of hits, relative to a model, that are collocated.

__contains__(v_hit)[source]
Parameters

v_hit (macsypy.hit.ModelHit object) – The hit to test

Returns

True if the hit is in the cluster hits, False otherwise

__init__(hits, model, hit_weights)[source]
Parameters
__str__()[source]
Returns

a string representation of this cluster

__weakref__

list of weak references to the object (if defined)

_check_replicon_consistency()[source]
Raises

MacsypyError – if the hits of the cluster are not all related to the same replicon
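
The check described above can be sketched as follows. This is a minimal illustration, not the macsypy implementation: `ToyHit`, `ToyError` and `check_replicon_consistency` are hypothetical stand-ins for the hit class, `MacsypyError` and the private method.

```python
from collections import namedtuple

# Hypothetical stand-in for a hit; only the replicon name matters here
ToyHit = namedtuple("ToyHit", ["gene_name", "replicon_name"])


class ToyError(Exception):
    """Hypothetical stand-in for MacsypyError."""


def check_replicon_consistency(hits):
    """Raise ToyError unless every hit lies on the same replicon."""
    replicons = {hit.replicon_name for hit in hits}
    if len(replicons) > 1:
        raise ToyError(f"hits span several replicons: {sorted(replicons)}")


# Both hits are on the same replicon, so the check passes silently
check_replicon_consistency([ToyHit("a", "chr1"), ToyHit("b", "chr1")])
```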

fulfilled_function(*genes)[source]
Parameters

genes – The genes to test.

Returns

The functions common to genes and this cluster.

Return type

set of str
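
Conceptually, this amounts to a set intersection. The sketch below assumes that genes are given as plain strings and that the functions of the cluster are already available as a set. The standalone function and its `cluster_functions` argument are hypothetical and do not reproduce the method's actual signature.

```python
def fulfilled_function(cluster_functions, *genes):
    """Return the functions shared by the given gene names and the cluster.

    cluster_functions: set of function names carried by the cluster.
    genes: gene names to test (plain strings in this sketch).
    """
    return set(genes) & set(cluster_functions)


# The cluster encodes functions 'a' and 'b'; only 'b' is among the tested genes
common = fulfilled_function({"a", "b"}, "b", "c")
```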

property functions
Returns

The set of functions encoded by this cluster. Here, function means the gene name, or the reference gene name for exchangeable genes. For instance:

    <model vers="2.0">
        <gene a presence="mandatory"/>
        <gene b presence="accessory">
            <exchangeable>
                <gene c />
            </exchangeable>
        </gene>
    </model>

the functions for a cluster corresponding to this model will be {'a', 'b'}

Return type

frozenset
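
The mapping from gene names to functions can be sketched as below, assuming a model in which gene c is exchangeable for gene b. `EXCHANGEABLE_OF` and the standalone `functions` helper are hypothetical names introduced for illustration only.

```python
# Hypothetical mapping: exchangeable gene name -> reference gene name
EXCHANGEABLE_OF = {"c": "b"}


def functions(hit_gene_names):
    """Map each hit's gene name to its function and collect them in a frozenset."""
    return frozenset(EXCHANGEABLE_OF.get(name, name) for name in hit_gene_names)


# A cluster with hits for genes 'a' and 'c' encodes the functions 'a' and 'b'
funcs = functions(["a", "c"])
```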

property hit_weights
Returns

The weights of the hits, used to compute the score

Return type

macsypy.hit.HitWeight

property loner
Returns

True if this cluster is made only of hits representing the same gene and this gene is tagged as loner, False otherwise, that is, when the cluster:

  • contains several hits coding for different genes

  • contains one hit but the gene is not tagged as loner (max_gene_required = 1)
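
A minimal sketch of this test, assuming each hit exposes its gene name and a loner flag. `ToyHit` and `is_loner_cluster` are hypothetical names, not part of macsypy.

```python
from collections import namedtuple

# Hypothetical stand-in for a hit: a gene name and whether the gene is tagged loner
ToyHit = namedtuple("ToyHit", ["gene_name", "is_loner"])


def is_loner_cluster(hits):
    """True if all hits represent one gene and that gene is tagged loner."""
    gene_names = {hit.gene_name for hit in hits}
    return len(gene_names) == 1 and all(hit.is_loner for hit in hits)
```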

merge(cluster, before=False)[source]

Merge the cluster into this one (in place).

Parameters
  • cluster (macsypy.cluster.Cluster object) – The cluster to merge into this one.

  • before (bool) – If False, the hits of the cluster are appended after the hits of this one; otherwise they are inserted before the hits of this one.

Returns

None

Raises

MacsypyError – if the two clusters do not have the same model
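
The effect of the before flag can be illustrated with a toy class. This is a sketch under simplifying assumptions: hits are plain strings, the model is a plain string, and `ToyCluster` and `ToyError` are hypothetical stand-ins for `Cluster` and `MacsypyError`.

```python
class ToyError(Exception):
    """Hypothetical stand-in for MacsypyError."""


class ToyCluster:
    """Hypothetical stand-in for Cluster; hits are plain strings here."""

    def __init__(self, hits, model):
        self.hits = list(hits)
        self.model = model

    def merge(self, cluster, before=False):
        """Merge the hits of cluster into this one, in place."""
        if cluster.model != self.model:
            raise ToyError("cannot merge clusters built from different models")
        if before:
            self.hits = cluster.hits + self.hits  # other hits come first
        else:
            self.hits.extend(cluster.hits)  # other hits go at the end


c1 = ToyCluster(["h3", "h4"], model="model_x")
c1.merge(ToyCluster(["h5"], model="model_x"))
c1.merge(ToyCluster(["h1", "h2"], model="model_x"), before=True)
# c1.hits is now ["h1", "h2", "h3", "h4", "h5"]
```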

property multi_system
Returns

True if this cluster is made of only one hit representing a multi_system gene, False otherwise, that is, when the cluster:

  • contains several hits

  • contains one hit but the gene is not tagged as multi_system (max_gene_required = 1)

replace(old, new)[source]

Replace the hit old in this cluster with the hit new (in place).

Parameters
Returns

None
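
An in-place replacement that keeps the position of the replaced hit can be sketched as below. The helper operates on a plain list of strings and is hypothetical. How the real method behaves when old is absent is not stated in this reference.

```python
def replace_hit(hits, old, new):
    """Replace old by new in the list hits, in place, keeping its position."""
    idx = hits.index(old)  # raises ValueError if old is absent (sketch behaviour)
    hits[idx] = new


hits = ["h1", "h2", "h3"]
replace_hit(hits, "h2", "h2_bis")
# hits is now ["h1", "h2_bis", "h3"]
```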

property replicon_name
Returns

The name of the replicon where this cluster is located

Return type

str

property score
Returns

The score for this cluster

Return type

float

build_clusters

macsypy.cluster.build_clusters(hits, rep_info, model, hit_weights)[source]

From a list of filtered hits and replicon information (topology, length), build all lists of hits that satisfy the constraints:

  • max_gene_inter_space

  • loner

  • multi_system

Each list of hits that satisfies these constraints becomes a cluster. A cluster contains at least two hits separated by at most max_gene_inter_space, except for loner genes, which are allowed to be alone in a cluster.

Parameters
  • hits (list of macsypy.hit.ModelHit objects) – list of filtered hits

  • rep_info (macsypy.Indexes.RepliconInfo object) – the replicon to analyse

  • model (macsypy.model.Model object) – the model to study

  • hit_weights (macsypy.hit.HitWeight object) – the weights of the hits used to compute the cluster scores

Returns

The list of regular clusters, and the special clusters (loners not in a cluster and multi systems)

Return type

tuple with 2 elements

  • true_clusters: a list of Cluster objects

  • true_loners: a dict {str function: macsypy.hit.Loner | macsypy.hit.LonerMultiSystem object}
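
The grouping step can be sketched as follows. This is a simplified illustration, not the macsypy algorithm. It assumes a linear replicon, ignores multi_system genes and scoring, and measures the distance between two hits as the number of genes lying between them. `ToyHit` and `group_hits` are hypothetical names.

```python
from collections import namedtuple

# Hypothetical stand-in for a hit: gene name, rank of the gene on the replicon,
# and whether the gene is tagged loner in the model
ToyHit = namedtuple("ToyHit", ["gene_name", "position", "is_loner"])


def group_hits(hits, max_gene_inter_space):
    """Split hits into clusters and isolated loners (linear replicon only).

    Two consecutive hits belong to the same group when the number of genes
    between them is at most max_gene_inter_space.
    """
    groups = []
    for hit in sorted(hits, key=lambda h: h.position):
        if groups and hit.position - groups[-1][-1].position - 1 <= max_gene_inter_space:
            groups[-1].append(hit)
        else:
            groups.append([hit])

    clusters, loners = [], {}
    for group in groups:
        if len(group) >= 2:
            clusters.append(group)
        elif group[0].is_loner:
            # a loner may stand alone; this sketch keeps the last one per gene
            loners[group[0].gene_name] = group[0]
        # a single hit whose gene is not a loner is discarded
    return clusters, loners


hits = [
    ToyHit("a", 10, False),
    ToyHit("b", 12, False),
    ToyHit("c", 30, True),
    ToyHit("d", 50, False),
]
clusters, loners = group_hits(hits, max_gene_inter_space=2)
```

In this example, a and b form one cluster, c is kept as an isolated loner, and d is dropped because it is alone and not tagged as loner.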