
This module implements the classes related to hits and some functions that operate on hit objects.


Models an HMM hit on the replicon. There is only one CoreHit per CoreGene.


Models a hit and its relation to the Model.


Parent class of Loner and MultiSystem. It inherits from ModelHit.


Models a “true” Loner.


Models a hit which can be used in several Systems (same model).


Models a hit representing a gene that is both Loner and MultiSystem.


The weights applied to hits to compute the score.


Return the best hit for a given function.


Sort hits.


Choose the best one among several multisystem hits.


If several profiles hit the same gene, return the best hit.

Inheritance diagram of macsypy.hit: ModelHit → AbstractCounterpartHit → Loner and MultiSystem; LonerMultiSystem inherits from both Loner and MultiSystem; CoreHit stands on its own.
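
The inheritance relations of the diagram can be sketched with empty placeholder classes. This is only an illustration of the hierarchy: the class bodies are omitted, the real macsypy classes take constructor arguments that are not reproduced here, and the order of the two bases of LonerMultiSystem is an assumption.

```python
class ModelHit:
    """Placeholder for a hit attached to a gene of a model."""


class AbstractCounterpartHit(ModelHit):
    """Placeholder for hits that can have counterparts."""


class Loner(AbstractCounterpartHit):
    """Placeholder for a true loner hit."""


class MultiSystem(AbstractCounterpartHit):
    """Placeholder for a hit usable in several systems."""


class LonerMultiSystem(Loner, MultiSystem):
    """Placeholder for a hit that is both loner and multi-system.

    The base order (Loner first) is an assumption made for this sketch.
    """


# Names of the classes in the method resolution order of the most derived class
mro_names = [cls.__name__ for cls in LonerMultiSystem.__mro__]
print(mro_names)
```

Because both Loner and MultiSystem share AbstractCounterpartHit as a parent, Python linearizes the diamond so that the shared parent appears only once, after both of its children.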

And a diagram showing the interactions between CoreGene, ModelGene, Model, Hit, Loner, …


The diagram above represents the models, genes and hits generated from the definitions below.

<model name="A" inter_gene_max_space="2">
    <gene name="abc" presence="mandatory"/>
    <gene name="def" presence="accessory"/>
</model>

<model name="B" inter_gene_max_space="5">
    <gene name="def" presence="mandatory"/>
            <gene name="abc"/>
    <gene name="ghj" presence="accessory"/>
</model>
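
To illustrate why one gene can be attached to several models, the following sketch parses well-formed copies of the two definitions with the Python standard library and lists, for each gene name, the models that mention it. The closing tags and the flat gene layout in the embedded strings are assumptions made to obtain parsable XML; this is not how MacSyFinder itself loads definitions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Well-formed copies of the definitions (closing tags added as an assumption).
DEFINITIONS = [
    """<model name="A" inter_gene_max_space="2">
           <gene name="abc" presence="mandatory"/>
           <gene name="def" presence="accessory"/>
       </model>""",
    """<model name="B" inter_gene_max_space="5">
           <gene name="def" presence="mandatory"/>
           <gene name="abc"/>
           <gene name="ghj" presence="accessory"/>
       </model>""",
]

# Map each gene name to the list of model names that reference it.
models_by_gene = defaultdict(list)
for xml_text in DEFINITIONS:
    model = ET.fromstring(xml_text)
    # iter() walks the whole subtree, so nested gene elements are found too.
    for gene in model.iter("gene"):
        models_by_gene[gene.get("name")].append(model.get("name"))

for gene_name, model_names in sorted(models_by_gene.items()):
    print(gene_name, model_names)
```

Genes listed under more than one model are the ones for which several model-level hits can coexist for a single underlying hit.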

hit API reference


class macsypy.hit.CoreHit(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]

Handle the hits filtered from the Hmmer search. The hits are instantiated by the HMMReport.extract() method. In one run of MacSyFinder, there exists only one CoreHit per gene. These hits are independent of any macsypy.model.Model instance.


__eq__(other)

Return True if two hits are totally equivalent, False otherwise.

Parameters

other (macsypy.hit.CoreHit object) – the hit to compare to the current object

Returns

the result of the comparison

Return type

bool


__gt__(other)

Compare two hits. If the sequence identifier is the same, compare the scores. Otherwise, compare the sequence identifiers alphabetically.

Parameters

other (macsypy.hit.CoreHit object) – the hit to compare to the current object

Returns

True if self is > other, False otherwise


__hash__()

Makes the object hashable, which is required to put it in a set or to use it as a dict key.

__init__(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]
  • gene (macsypy.gene.CoreGene object) – the gene corresponding to this profile

  • hit_id (str) – the identifier of the hit

  • hit_seq_length (int) – the length of the hit sequence

  • replicon_name (str) – the name of the replicon

  • position_hit (int) – the rank of the sequence matched in the input dataset file

  • i_eval (float) – the best-domain evalue (i-evalue, “independent evalue”)

  • score (float) – the score of the hit

  • profile_coverage (float) – percentage of the profile that matches the hit sequence

  • sequence_coverage (float) – percentage of the hit sequence that matches the profile

  • begin_match (int) – where the hit with the profile starts in the sequence

  • end_match (int) – where the hit with the profile ends in the sequence


__lt__(other)

Compare two hits. If the sequence identifier is the same, compare the scores. Otherwise, compare the sequence identifiers alphabetically.

Parameters

other (macsypy.hit.CoreHit object) – the hit to compare to the current object

Returns

True if self is < other, False otherwise
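
The comparison rules described for CoreHit can be sketched on a small stand-in class. ToyHit and its two attributes are hypothetical and much simpler than CoreHit; the equality test on (id, score) is an assumption, since the reference only says the hits must be "totally equivalent".

```python
from functools import total_ordering


@total_ordering
class ToyHit:
    """Hypothetical stand-in for a hit; not the macsypy CoreHit class."""

    def __init__(self, hit_id: str, score: float):
        self.id = hit_id
        self.score = score

    def __eq__(self, other):
        if not isinstance(other, ToyHit):
            return NotImplemented
        # Assumption: two toy hits are equal when id and score both match.
        return (self.id, self.score) == (other.id, other.score)

    def __lt__(self, other):
        if not isinstance(other, ToyHit):
            return NotImplemented
        if self.id == other.id:
            # Same sequence identifier: compare on the score.
            return self.score < other.score
        # Different identifiers: alphabetical comparison.
        return self.id < other.id

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one explicitly.
        return hash((self.id, self.score))


a = ToyHit("seq_1", 10.0)
b = ToyHit("seq_1", 25.0)
c = ToyHit("seq_2", 5.0)
sorted_ids = [(h.id, h.score) for h in sorted([c, b, a])]
print(sorted_ids)
```

Here functools.total_ordering derives the remaining comparison methods from __eq__ and __lt__, which is a convenience of the sketch rather than a statement about how CoreHit is written.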


__str__()

Return useful information on the CoreHit: Hmmer statistics and sequence information.

Return type

str


__weakref__

list of weak references to the object (if defined)


the position of the hit (rank in the input dataset file)

Return type



class macsypy.hit.ModelHit(hit, gene_ref, gene_status)[source]

Encapsulates a macsypy.hit.CoreHit. This class stores a CoreHit that has been attributed to a putative system. Thus, it also stores:

  • the system,

  • the status of the gene in this system (‘mandatory’, ‘accessory’, …),

  • the gene in the model for which it is an occurrence.

For one gene, several ModelHit instances can exist, one for each Model containing this gene.


__eq__(value)

Return self==value.


__gt__(value)

Return self>value.


__hash__()

Makes the object hashable, which is required to put it in a set or to use it as a dict key.

__init__(hit, gene_ref, gene_status)[source]

__lt__(value)

Return self<value.


__str__()

Return str(self).


__weakref__

list of weak references to the object (if defined)

property hit

The CoreHit underlying this ModelHit

Return type

macsypy.hit.CoreHit object

property loner

True if the hit represents a loner macsypy.gene.ModelGene, False otherwise. A true Loner is a hit that represents a gene with the loner attribute and that is not included in a cluster.

  • a hit representing a loner gene but included in a cluster is not a true Loner

  • a hit which is not included in a cluster with other genes but does not represent a loner gene is not a true Loner (this situation may happen when min_genes_required = 1)

Return type

bool

property multi_model

True if the hit represents a multi_model macsypy.gene.ModelGene, False otherwise.

Return type

bool

property multi_system

True if the hit represents a multi_system macsypy.gene.ModelGene, False otherwise.

Return type

bool



class macsypy.hit.AbstractCounterpartHit(hit, gene_ref=None, gene_status=None, counterpart=None)[source]

Abstract class to handle a ModelHit with equivalents, for instance a Loner or MultiSystem hit.

__init__(hit, gene_ref=None, gene_status=None, counterpart=None)[source]

__str__()

Return str(self).

property counterpart

The set of hits that can play the same role

property loner

True if the hit represents a loner macsypy.gene.ModelGene, False otherwise. A true Loner is a hit that represents a gene with the loner attribute and that is not included in a cluster.

  • a hit representing a loner gene but included in a cluster is not a true Loner

  • a hit which is not included in a cluster with other genes but does not represent a loner gene is not a true Loner (this situation may happen when min_genes_required = 1)

Return type

bool

property multi_system

True if the hit represents a multi_system macsypy.gene.ModelGene, False otherwise.

Return type

bool



class macsypy.hit.Loner(hit, gene_ref=None, gene_status=None, counterpart=None)[source]

Handles hits that encode a gene tagged as loner and that do not cluster with other hits.

__init__(hit, gene_ref=None, gene_status=None, counterpart=None)[source]

A hit that is outside a cluster; the gene_ref is a loner.

property loner

True if the hit represents a loner macsypy.gene.ModelGene, False otherwise. A true Loner is a hit that represents a gene with the loner attribute and that is not included in a cluster.

  • a hit representing a loner gene but included in a cluster is not a true Loner

  • a hit which is not included in a cluster with other genes but does not represent a loner gene is not a true Loner (this situation may happen when min_genes_required = 1)

Return type

bool
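
The definition of a true Loner given for the loner property combines two independent conditions. A minimal sketch of that rule follows; the function name and its two boolean arguments are hypothetical and are not part of the macsypy API.

```python
def is_true_loner(gene_is_loner: bool, in_cluster: bool) -> bool:
    """Return True only for a loner gene whose hit lies outside any cluster."""
    return gene_is_loner and not in_cluster


# Enumerate the four possible situations.
for gene_is_loner in (True, False):
    for in_cluster in (True, False):
        verdict = is_true_loner(gene_is_loner, in_cluster)
        print(f"loner gene={gene_is_loner!s:5} in cluster={in_cluster!s:5} true loner={verdict}")
```

Only one of the four combinations qualifies: the two bullet cases of the definition each fail one of the conditions.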



class macsypy.hit.MultiSystem(hit, gene_ref=None, gene_status=None, counterpart=None)[source]

Handles hits that encode a gene tagged as multi-system, i.e. hits which can be used in several Systems (same model).

__init__(hit, gene_ref=None, gene_status=None, counterpart=None)[source]

A hit that is outside a cluster; the gene_ref is multi_system.

property multi_system

True if the hit represents a multi_system macsypy.gene.ModelGene, False otherwise.

Return type

bool



class macsypy.hit.LonerMultiSystem(hit, gene_ref=None, gene_status=None, counterpart=None)[source]
Handles hits that encode a gene which is:
  • tagged as multi-system,

  • also tagged as loner,

  • and whose hit does not cluster with other hits.

__init__(hit, gene_ref=None, gene_status=None, counterpart=None)[source]

A hit that is outside a cluster; the gene_ref is loner and multi_system.



class macsypy.hit.HitWeight(itself: float = 1, exchangeable: float = 0.8, mandatory: float = 1, accessory: float = 0.5, neutral: float = 0, out_of_cluster: float = 0.7)[source]

The weights used to compute the cluster and system scores; see the user documentation on how MacSyFinder functions for further details. By default:

  • itself = 1

  • exchangeable = 0.8

  • mandatory = 1

  • accessory = 0.5

  • neutral = 0

  • out_of_cluster = 0.7
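
The default values above can be mirrored in a small dataclass to show how a single weight might be overridden while the others keep their defaults. ToyHitWeight is a hypothetical stand-in written for this sketch; making it frozen is an assumption and nothing here calls macsypy itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyHitWeight:
    """Hypothetical container mirroring the default weights listed above."""
    itself: float = 1
    exchangeable: float = 0.8
    mandatory: float = 1
    accessory: float = 0.5
    neutral: float = 0
    out_of_cluster: float = 0.7


default_weights = ToyHitWeight()
# Build a modified copy; the frozen original is left untouched.
custom_weights = replace(default_weights, accessory=0.25)

# Look up the weight that corresponds to a gene status by name.
status = "accessory"
print(status, getattr(default_weights, status), getattr(custom_weights, status))
```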


list of weak references to the object (if defined)


macsypy.hit.get_best_hit_4_func(function, hits, key='score')[source]

Select the best Loner among several hits encoding the same function, based on one of the following criteria:

  • score

  • i_evalue

  • profile_coverage

Parameters

  • function (str) – the name of the function fulfilled by the hits (all hits must have the same function)

  • hits (sequence of macsypy.hit.ModelHit object) – the hits to filter.

  • key (str) – the criterion used to select the best hit: ‘score’, ‘i_evalue’ or ‘profile_coverage’

Returns

the best hit

Return type

macsypy.hit.ModelHit object
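
Selecting a single best hit by one of the three criteria can be sketched as follows. The ToyHit tuple, the pick_best helper and the direction assumed for each criterion (higher score and coverage are better, lower i-evalue is better) are assumptions of this sketch, not the macsypy implementation.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical minimal hit record used only for this sketch.
ToyHit = namedtuple("ToyHit", "id score i_evalue profile_coverage")

# Assumption: True means a larger value is better for that criterion.
HIGHER_IS_BETTER = {"score": True, "i_evalue": False, "profile_coverage": True}


def pick_best(hits, key="score"):
    """Return the hit that is best according to the chosen criterion."""
    if key not in HIGHER_IS_BETTER:
        raise ValueError(f"unsupported criterion: {key!r}")
    chooser = max if HIGHER_IS_BETTER[key] else min
    return chooser(hits, key=attrgetter(key))


hits = [
    ToyHit("h1", 50.0, 1e-10, 0.90),
    ToyHit("h2", 80.0, 1e-5, 0.75),
    ToyHit("h3", 65.0, 1e-20, 0.95),
]
for criterion in HIGHER_IS_BETTER:
    print(criterion, pick_best(hits, key=criterion).id)
```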



Sort macsypy.hit.ModelHit objects per function.

Parameters

model_hits – a sequence of macsypy.hit.ModelHit

Returns

dict {str function name: [model_hit, …] }
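
The shape of the returned dictionary can be illustrated with a short grouping sketch. ToyModelHit and its function attribute are hypothetical; how macsypy derives the function name from a real ModelHit is not shown here.

```python
from collections import defaultdict, namedtuple

# Hypothetical record: an identifier plus the name of the function it fulfils.
ToyModelHit = namedtuple("ToyModelHit", "id function")


def group_by_function(model_hits):
    """Group hits in a dict {function name: [hit, ...]}, keeping input order."""
    grouped = defaultdict(list)
    for hit in model_hits:
        grouped[hit.function].append(hit)
    return dict(grouped)


model_hits = [
    ToyModelHit("h1", "abc"),
    ToyModelHit("h2", "def"),
    ToyModelHit("h3", "abc"),
]
grouped = group_by_function(model_hits)
for function_name, function_hits in grouped.items():
    print(function_name, [h.id for h in function_hits])
```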






macsypy.hit.get_best_hits(hits, key='score')[source]

If several hits match the same protein, keep only the best match, based on one of the following criteria:

  • score

  • i_evalue

  • profile_coverage

Parameters

  • hits ([ macsypy.hit.CoreHit object, …]) – the hits to filter, all hits must match the same protein.

  • key (str) – the criterion used to select the best hit: ‘score’, ‘i_evalue’ or ‘profile_coverage’

Returns

the list of the best hits

Return type

[ macsypy.hit.CoreHit object, …]
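
Keeping one hit per protein can be sketched as below, using only the score criterion. ToyCoreHit, best_per_protein and the choice of (replicon_name, position) as the identity of a protein are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record carrying only what the sketch needs.
ToyCoreHit = namedtuple("ToyCoreHit", "gene replicon_name position score")


def best_per_protein(hits):
    """Keep, for each protein, the hit with the highest score."""
    best = {}
    for hit in hits:
        # Assumption: a protein is identified by its replicon and its rank.
        protein = (hit.replicon_name, hit.position)
        if protein not in best or hit.score > best[protein].score:
            best[protein] = hit
    return sorted(best.values(), key=lambda h: (h.replicon_name, h.position))


hits = [
    ToyCoreHit("abc", "rep_1", 10, 42.0),
    ToyCoreHit("def", "rep_1", 10, 57.5),  # same protein, better score
    ToyCoreHit("ghj", "rep_1", 12, 30.0),
]
best_hits = best_per_protein(hits)
print([(h.gene, h.position) for h in best_hits])
```

When two profiles hit the same protein with equal scores, this sketch keeps the first one encountered; the reference does not say how such ties are resolved.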