################### T-Coffee Web Server ################### .. warning:: This chapter is currently under maintenance. Its aim is to describe what the T-Coffee web server is doing programmatically when you are using it; it provides technical details about the restrictions and commands in use. In this chapter we describe briefly the available T-Coffee web servers and their usage (command lines and examples files); if you need more advanced options, you should use the T-Coffee package via command lines. The web server is a reduced version of the T-Coffee package, containing all T-Coffee modes for aligning protein, RNA or DNA sequences; it also contains the evaluation and downstream analysis tools (TCS, iRMSD/APDB, STRIKE, T-RMSD). Currently, all reformatting utilities are not available on the web server, however, you can choose some reformatting options related to the ouput format. Here we present briefly the webserver and the T-Coffee commands it contains. ******* General ******* Most of the following command lines used by the web server contains lots of different options called with flags; here is a summary of the options common to all comand lines: - **-in**: your input file - **-output**: specifies the output files you require - **-maxnseq**: maximum number of sequences you can run (variable depending on the mode) - **-maxlength**: maximum length of your sequences (variable depending on the mode) - **-case=upper**: all residues/nucleotids will be upper case - **-seqnos=off**: the size of the sequences is not indicated on the MSA - **-outorder=input**: orders the sequences in the final MSA as in the input dataset - **-run_name**: name of the job on the cluster - **-multi_core=4**: uses only 4 cores on your server when running the job - **-quiet=stdout**: standard verbose output ******************* T-Coffee Simple MSA ******************* The central part of the web server is the T-Coffee aligner, you can use it to align any kind of sequences (Protein, RNA or DNA alike). The command it runs is the following: :: $#: t_coffee -in=data_93c5fbb0.in -mode=regular -output=score_html clustalw_aln fasta_aln \ score_ascii phylip -maxnseq=150 -maxlen=10000 -case=upper -seqnos=off -outorder=input \ -run_name=result -multi_core=4 -quiet=stdout ***************** Protein Sequences ***************** Expresso (using 3D structures) ============================== :: $#: t_coffee -in=data_93c5fbb0.in -mode=expresso -blast=LOCAL -pdb_db=/db/pdb/derived_data_\ format/blast/2016-01-01/pdb_seqres.fa -evaluate_mode=t_coffee_slow -output=score_html \ clustalw_aln fasta_aln score_ascii phylip -maxnseq=150 -maxlen=2500 -case=upper -seqnos= \ off -outorder=input -run_name=result -multi_core=4 -quiet=stdout M-Coffee (combining multiple methods) ===================================== :: $#: t_coffee -in=data_93c5fbb0.in Mpcma_msa Mmafft_msa Mclustalw_msa Mdialigntx_msa Mpoa_msa \ Mmuscle_msa Mprobcons_msa Mt_coffee_msa -output=score_html clustalw_aln fasta_aln \ score_ascii phylip -tree -maxnseq=150 -maxlen=2500 -case=upper -seqnos=off -outorder=input \ -run_name=result -multi_core=4 -quiet=stdout PSI/TM-Coffee (transmembrane proteins) ====================================== Two options are available (in addition to the choice of the database): without transmembrane prediction or with prediction (displayed color coded on the html file). :: $#: tmcoffee.sh -in data_9df741d4.in -mode psicoffee -blast_server LOCAL --search-db 'UniRef50 \ -- Very Fast/Rough' --search-type '' -prot_min_sim 50 -prot_max_sim 90 -prot_min_cov 70 \ --search-out 'clustalw_aln fasta_aln score_ascii phylip score_html' -maxnseq 1000 -maxlen \ =5000 -case upper -seqnos=off -outorder input -run_name result -multi_core 4 -quiet=stdout $#: tmcoffee.sh -in data_9df741d4.in -mode psicoffee -blast_server LOCAL --search-db 'UniRef50 \ -- Very Fast/Rough' --search-type 'transmembrane' -prot_min_sim 50 -prot_max_sim 90 \ -prot_min_cov 70 --search-out 'clustalw_aln fasta_aln score_ascii phylip score_html' -maxnseq 1000 -maxlen 5000 -case upper -seqnos off -outorder input -run_name result ยบ -multi_core 4 -quiet=stdout PSI-Coffee (homology extension) =============================== :: $#: t_coffee -in=data_93c5fbb0.in -mode=psicoffee -blast=LOCAL -protein_db=/db/ncbi/201511/ \ blast/db/nr.fa -output=score_html clustalw_aln fasta_aln score_ascii phylip -maxnseq=150 \ -maxlen=2500 -case=upper -seqnos=off -outorder=input -run_name=result -multi_core=4 \ -quiet=stdout ************* RNA Sequences ************* R-Coffee (using 2D prediction) ============================== :: $#: t_coffee -in=data_29091222.in -method=mafft_msa muscle_msa probconsRNA_msa -output= \ score_html clustalw_aln fasta_aln score_ascii phylip -maxnseq=150 -maxlen=2500 -case=upper \ -seqnos=off -outorder=input -run_name=result -tree -special_mode=rcoffee -method_limits= \ consan_pair 5 150 -multi_core=4 -quiet=stdout SARA-Coffee (using 3D structures) ============================== SARA-Coffee is a bit complicated to run, it uses several third party packages and is run through a script and environment variables we set up; here is what it looks like: :: ##: export X3DNA=/data/www-cn/sara_coffee_package/X3DNA; ##: export PDB_DIR=/data/www-cn/sara_coffee_package/PDBdir/; ##: export NO_REMOTE_PDB_DIR=1; ##: unset MAFFT_BINARIES; ##: cd $CACHE_4_TCOFFEE ##: ln -s /data/www-cn/sara_coffee_package/pdb_entry_type.txt); $#: t_coffee -in data_3e6e7aec.in -method sara_pair -template_file \ /data/www-cn/sara_coffee_package/TEMPLATEFILE,RNA -extend_mode rna2 -relax_lib 0 -transform \ dna2rna -run_name=result -output score_html clustalw_aln -case=upper -seqnos=off -outorder= \ input -multi_core=4 -pdb_min_sim 0 -quiet stdout RM-Coffee (combining multiple methods) ====================================== Not yet available... ************* DNA Sequences ************* M-Coffee (combining multiple methods) ===================================== For now, M-Coffee by default is the same for DNA, RNA and protein sequences alike. There is no specific M-Coffee for DNA sequences. Pro-Coffee (homologous promoter regions) ======================================== :: $#: t_coffee -in=data_476efe5f.in -mode=procoffee -output=score_html clustalw_aln fasta_aln \ score_ascii phylip -maxnseq=150 -maxlen=10000 -case=upper -seqnos=off -outorder=input \ -run_name=result -multi_core=4 -quiet=stdout **************** Evaluation Tools **************** TCS (Transitive Consistency Score) ================================== :: ##: tcs.sh -infile data_a98d61a6.in -in Mproba_pair -score 1 -output clustalw_aln fasta_aln \ phylip score_ascii tcs_weighted tcs_replicate score_html -maxnseq 1000 -maxlen 8000 \ -seqnos=off -run_name result -multi_core 4 --filter-type column --filter-min 4 --filter-max \ 9 --filter-gap yes -quiet=stdout iRMSD/APDB (MSA structural evaluation) (under maintenance...) ====================================== :: $#: t_coffee -other_pg apdb -aln data_c7151320.in -apdb_outfile default -outfile default \ -io_format hsg3 -output score_html -maximum_distance 10 -md_threshold 2.0 -similarity_ \ threshold 70 -template_file EXPRESSO -run_name result -quiet stdout T-RMSD (structural clustering) ============================== :: $#: t_coffee -in=data_b89d3438.in -mode=expresso -cache=$PWD -blast=LOCAL -pdb_db=/db/pdb/ \ derived_data_format/blast/2016-01-01/pdb_seqres.fa -evaluate_mode=t_coffee_slow -output= \ aln score_html -maxnseq=150 -maxlen=2500 -case=upper -outorder=input -run_name=result \ -multi_core=4 -quiet=stdout; t_coffee -other_pg trmsd result.aln -template_file \ result_pdb1.template_list -output color_html 2>&1; [ -e result.struc_tree.consensus ] STRIKE (MSA evaluation with single structure) ============================================= :: $#: wget ftp://ftp.rcsb.org/pub/pdb/derived_data/pdb_seqres.txt $#: install ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST $#: t_coffee -other_pg strike -template_file PDB -pdb_db -blast_server LOCAL