The Parser of Systems definitions
The system parser object “SystemParser” instantiates Systems and Genes objects from XML system definitions (see Macromolecular systems definition). The parsing consists of three phases.
Phase 1.
- each Gene is parsed from the System in which it is defined
- from the list of Systems to detect, the list of Systems to parse is established
Phase 2.
- For each System to parse:
  - create the System
  - add this System to the system_bank
  - create the Genes defined in this System with their attributes but not their Homologs
  - add these Genes to the gene_bank
Phase 3.
- For each System to search:
  - For each Gene defined in this System:
    - create the Homologs by encapsulating Genes from the gene_bank
    - add the Gene to the System
For instance:
Syst_1
    <system inter_gene_max_space="10">
        <gene name="A" mandatory="1" loner="1">
            <homologs>
                <gene name="B" sys_ref="Syst_2" />
            </homologs>
        </gene>
    </system>
Syst_2
    <system inter_gene_max_space="15">
        <gene name="B" mandatory="1">
            <homologs>
                <gene name="B" sys_ref="Syst_1" />
                <gene name="C" sys_ref="Syst_3" />
            </homologs>
        </gene>
    </system>
Syst_3
    <system inter_gene_max_space="20">
        <gene name="C" mandatory="1" />
    </system>
With the example above, assuming Syst_1 is the only System to detect:
- Syst_1 has a gene_A
- gene_A has the homolog gene_B
- gene_B has a reference to Syst_2
- the gene_B attributes from Syst_2 are used to build the Gene
- Syst_2 has the attributes defined in the corresponding XML file (inter_gene_max_space, …)
Conversely:
- gene_B has no Homologs
- Syst_2 has no Genes
Note
The only “full” Systems (i.e., with all corresponding Genes created) are those to detect.
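The three phases can be illustrated with a minimal sketch. It is not the macsypy implementation: the `DEFINITIONS` dictionary, the toy `System`, `Gene` and `Homolog` classes, the plain dictionaries used as banks, and the assumption that references are followed transitively through `sys_ref` are all hypothetical stand-ins chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-in for the XML definition files.
DEFINITIONS = {
    "Syst_1": """<system inter_gene_max_space="10">
        <gene name="A" mandatory="1" loner="1">
            <homologs><gene name="B" sys_ref="Syst_2" /></homologs>
        </gene>
    </system>""",
    "Syst_2": """<system inter_gene_max_space="15">
        <gene name="B" mandatory="1">
            <homologs>
                <gene name="B" sys_ref="Syst_1" />
                <gene name="C" sys_ref="Syst_3" />
            </homologs>
        </gene>
    </system>""",
    "Syst_3": """<system inter_gene_max_space="20">
        <gene name="C" mandatory="1" />
    </system>""",
}


class System:  # toy class, not macsypy.system.System
    def __init__(self, name, attrs):
        self.name, self.attrs, self.genes = name, attrs, []


class Gene:  # toy class, not macsypy.gene.Gene
    def __init__(self, name, system_name, attrs):
        self.name, self.system_name, self.attrs = name, system_name, attrs
        self.homologs = []


class Homolog:  # toy wrapper "encapsulating" a Gene from the gene bank
    def __init__(self, gene, gene_ref):
        self.gene, self.gene_ref = gene, gene_ref


def systems_to_parse(to_detect):
    """Phase 1: follow sys_ref attributes to list every System to parse."""
    roots, pending = {}, list(to_detect)
    while pending:
        name = pending.pop()
        if name in roots:
            continue  # already parsed, avoids looping on circular references
        roots[name] = ET.fromstring(DEFINITIONS[name])
        for node in roots[name].iter("gene"):
            if node.get("sys_ref") is not None:
                pending.append(node.get("sys_ref"))
    return roots


def parse(to_detect):
    roots = systems_to_parse(to_detect)
    system_bank, gene_bank = {}, {}
    # Phase 2: create every System and its Genes, without Homologs.
    for name, root in roots.items():
        system_bank[name] = System(name, dict(root.attrib))
        for node in root.findall("gene"):  # direct children only
            gene_name = node.get("name")
            gene_bank[gene_name] = Gene(gene_name, name, dict(node.attrib))
    # Phase 3: only the Systems to detect are filled with Genes and Homologs.
    for name in to_detect:
        for node in roots[name].findall("gene"):
            gene = gene_bank[node.get("name")]
            for h_node in node.findall("homologs/gene"):
                gene.homologs.append(Homolog(gene_bank[h_node.get("name")], gene))
            system_bank[name].genes.append(gene)
    return system_bank, gene_bank


system_bank, gene_bank = parse(["Syst_1"])
```

With `Syst_1` as the only System to detect, all three definitions end up in the bank, but only `Syst_1` holds Genes, and only `gene_A` carries a Homolog.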
SystemParser API reference
- class macsypy.system_parser.SystemParser(cfg, system_bank, gene_bank)
  Build a System instance from the corresponding System definition described in the XML file (named after the system’s name) found at the dedicated location (“-d” command-line option).
- __init__(cfg, system_bank, gene_bank)
  Constructor
  Parameters:
  - cfg (macsypy.config.Config object) – the configuration object of this run
  - system_bank (macsypy.system.SystemBank object) – the system factory
  - gene_bank (macsypy.gene.GeneBank object) – the gene factory
- __weakref__
  list of weak references to the object (if defined)
- _create_genes(system, system_node)
  Create the genes belonging to the system. Be careful: the returned genes do not have their homologs/analogs set yet.
  Parameters:
  - system (macsypy.system.System object) – the System currently being parsed
  - system_node (Et.ElementTree object) – the element gene
  Returns: a list of the genes belonging to the system.
  Return type: [macsypy.gene.Gene, …]
- _create_system(system_name, system_node)
  Parameters:
  - system_name (string) – the name of the system to create. This name must match an XML file in the definition directory (“-d” option on the command line)
  - system_node (Et.ElementTree object) – the node corresponding to the system.
  Returns: the system corresponding to the name.
  Return type: macsypy.system.System object.
- _fill(system, system_node)
  Fill the system with genes found in this system definition. Add homologs to the genes if necessary.
  Parameters:
  - system (macsypy.system.System object) – the system to fill
  - system_node (Et.ElementTree object) – the “node” in the XML hierarchy corresponding to the system
- _parse_analog(node, gene_ref, curr_system)
  Parse an XML gene element and build the corresponding object
  Parameters:
  - node (xml.etree.ElementTree.Element object) – a “node” corresponding to the gene element in the XML hierarchy
  - gene_ref (macsypy.gene.Gene object) – the gene to which this gene is an analog
  Returns: the gene object corresponding to the node
  Return type: macsypy.gene.Analog object
- _parse_homolog(node, gene_ref, curr_system)
  Parse an XML gene element and build the corresponding object
  Parameters:
  - node (xml.etree.ElementTree.Element object) – a “node” corresponding to the gene element in the XML hierarchy
  - gene_ref (macsypy.gene.Gene object) – the gene to which this gene is a homolog
  Returns: the gene object corresponding to the node
  Return type: macsypy.gene.Homolog object
- check_consistency(systems)
  Check the consistency of the co-localization features between the different values given as input: XML definitions, configuration file, and command-line options.
  Parameters: systems (list of macsypy.system.System objects) – the list of systems to check
  Raises: macsypy.macsypy_error.SystemInconsistencyError if one test fails (see below)
  Depending on the situation, different requirements need to be fulfilled (“mandatory_genes” and “accessory_genes” are the lists of genes defined as such in the system definition):
  - If: min_mandatory_genes_required = None ; min_genes_required = None
    Then: min_mandatory_genes_required = min_genes_required = len(mandatory_genes)
    (always True by Systems design)
  - If: min_mandatory_genes_required = Value ; min_genes_required = None
    Then: min_mandatory_genes_required <= len(mandatory_genes)
    AND min_genes_required = min_mandatory_genes_required
    (always True by design)
  - If: min_mandatory_genes_required = None ; min_genes_required = Value
    Then: min_mandatory_genes_required = len(mandatory_genes)
    AND min_genes_required >= min_mandatory_genes_required
    AND min_genes_required <= len(mandatory_genes+accessory_genes)
    (to be checked)
  - If: min_mandatory_genes_required = Value ; min_genes_required = Value
    Then: min_genes_required <= len(accessory_genes+mandatory_genes)
    AND min_genes_required >= min_mandatory_genes_required
    AND min_mandatory_genes_required <= len(mandatory_genes)
    (to be checked)
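The four situations above can be condensed into one small function. This is a minimal sketch rather than the macsypy code: `resolve_quorum` and `InconsistencyError` are hypothetical names, the gene lists are reduced to their lengths, and the same three inequalities are applied in every case, which is redundant for the situations described as always True.

```python
class InconsistencyError(Exception):
    """Hypothetical stand-in for macsypy.macsypy_error.SystemInconsistencyError."""


def resolve_quorum(n_mandatory, n_accessory,
                   min_mandatory_genes_required=None,
                   min_genes_required=None):
    """Apply the defaults, then check the inequalities listed above.

    n_mandatory and n_accessory stand for len(mandatory_genes) and
    len(accessory_genes). Returns the resolved pair
    (min_mandatory_genes_required, min_genes_required).
    """
    # Default: every mandatory gene is required.
    if min_mandatory_genes_required is None:
        min_mandatory_genes_required = n_mandatory
    # Default: the overall quorum equals the mandatory quorum.
    if min_genes_required is None:
        min_genes_required = min_mandatory_genes_required

    if min_mandatory_genes_required > n_mandatory:
        raise InconsistencyError(
            "min_mandatory_genes_required exceeds the number of mandatory genes")
    if min_genes_required < min_mandatory_genes_required:
        raise InconsistencyError(
            "min_genes_required is lower than min_mandatory_genes_required")
    if min_genes_required > n_mandatory + n_accessory:
        raise InconsistencyError(
            "min_genes_required exceeds the total number of genes")
    return min_mandatory_genes_required, min_genes_required
```

For a system with 3 mandatory and 2 accessory genes, calling `resolve_quorum(3, 2, min_genes_required=4)` resolves to `(3, 4)`, while `min_genes_required=6` raises the error because only 5 genes are defined.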
- parse(systems_2_detect)
  Parse system definitions in XML format to build the corresponding system objects, and add them to the system factory after checking their consistency. To get a system, ask the system_bank for it.
  Parameters: systems_2_detect (list of string) – a list with the names of the systems to parse (e.g. ‘T2SS’)