phold is a sensitive annotation tool for bacteriophage genomes and metagenomes that uses protein structural homology.

phold uses the ProstT5 protein language model to translate protein amino acid sequences into the 3Di token alphabet used by Foldseek. Foldseek then searches these 3Di sequences against a database of approximately 803k protein structures, most of them predicted using ColabFold.
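
To make the translation step more concrete, here is a minimal sketch of how an amino acid sequence is typically formatted before being passed to ProstT5. It follows the input conventions described on the ProstT5 model card (rare residues replaced with `X`, residues separated by spaces, and an `<AA2fold>` prefix to request amino acid to 3Di translation). The function name `prepare_for_prostt5` is hypothetical and this is not phold's own code; phold handles this step for you internally.

```python
import re


def prepare_for_prostt5(seq: str) -> str:
    """Format one protein sequence as ProstT5 input text (illustrative helper).

    Assumes the conventions from the ProstT5 model card:
    - rare/ambiguous residues U, Z, O, B are mapped to X
    - residues are separated by single spaces
    - the "<AA2fold>" prefix asks the model for AA -> 3Di translation
    """
    cleaned = re.sub(r"[UZOB]", "X", seq.upper())
    return "<AA2fold> " + " ".join(cleaned)


if __name__ == "__main__":
    print(prepare_for_prostt5("MKUB"))  # prints: <AA2fold> M K X X
```

The resulting string would then be tokenized and fed to the model, which emits one 3Di token per residue; those 3Di sequences are what Foldseek searches against the database.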

Alternatively, you can specify protein structures that you have pre-computed for your phage(s) instead of using ProstT5.

The phold database consists of approximately 803k protein structures generated using ColabFold from the following databases:

Google Colab Notebooks

If you don’t want to install phold locally, you can run it without any code using one of the following Google Colab notebooks: