The align package
This package uses a pangenome as a reference to compute elements, such as RGP and spot predictions, for a given genome or a given set of proteins. As such, analyses that are usually run on multiple genomes can be run on the single genome or set of proteins provided. To run its analyses, this package depends on the following packages:
formats, to check the pangenome status.
annotate, to read the given input files, which can be in GFF or GBFF format.
cluster, to write gene sequences from annotations.
RGP, to optionally compute RGP and spot predictions.
It depends on the following modules:
pangenome
utils
Submodules
ppanggolin.align.alignOnPang module
- ppanggolin.align.alignOnPang.align(pangenome, proteinFile, output, tmpdir, identity=0.8, coverage=0.8, defrag=False, cpu=1, getinfo=False, draw_related=False)
- ppanggolin.align.alignOnPang.alignSeqToPang(pangFile, seqFile, output, tmpdir, cpu=1, defrag=False, identity=0.8, coverage=0.8, is_nucl=False, code=11)
- ppanggolin.align.alignOnPang.getFam2RGP(pangenome, multigenics)
Associates families with the RGPs they belong to and with those they border.
- ppanggolin.align.alignOnPang.getFam2spot(pangenome, output, multigenics)
Reads a pangenome object and returns a dictionary of family to RGP and of family to spot, indicating where each family is located.
- ppanggolin.align.alignOnPang.linkNewGenomeFamilies(orgPangenome, formerPangenome, blastTab)
- ppanggolin.align.alignOnPang.projectRGP(pangenome, annotation, output, tmpdir, identity=0.8, coverage=0.8, defrag=False, cpu=1, translation_table=11, pseudo=False)
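The kind of mapping that getFam2RGP describes (families to the RGPs that contain them, and to the RGPs they border) can be illustrated with a minimal sketch. This is not the package's implementation: the function name families_to_regions, the n_border parameter, and the plain dict and list inputs are all hypothetical stand-ins for the pangenome objects, and the handling of multigenic families is left out.

```python
from collections import defaultdict


def families_to_regions(regions, contig_families, n_border=1):
    """Hypothetical sketch, not the ppanggolin API.

    regions: dict of region name -> (contig name, start index, stop index),
             with both indices inclusive.
    contig_families: dict of contig name -> ordered list of family names.
    n_border: assumed number of flanking families counted as bordering.
    """
    contains = defaultdict(set)  # family -> regions that include it
    borders = defaultdict(set)   # family -> regions it flanks

    for name, (contig, start, stop) in regions.items():
        fams = contig_families[contig]
        # Families located inside the region.
        for fam in fams[start:stop + 1]:
            contains[fam].add(name)
        # Families immediately before and after the region.
        left = fams[max(0, start - n_border):start]
        right = fams[stop + 1:stop + 1 + n_border]
        for fam in left + right:
            borders[fam].add(name)

    return dict(contains), dict(borders)


# Toy data: one contig with five families, one region covering famB and famC.
contig_families = {"contig_1": ["famA", "famB", "famC", "famD", "famE"]}
regions = {"RGP_1": ("contig_1", 1, 2)}

contains, borders = families_to_regions(regions, contig_families)
print(contains)
print(borders)
```

With this toy input, famB and famC map to RGP_1 as contained families, while famA and famD map to RGP_1 as bordering families. Keeping the two relations in separate dictionaries makes it easy to ask either question for a newly aligned family.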