PHI-base (Pathogen-Host Interactions) is a database that catalogues experimentally verified pathogenicity, virulence and effector genes from fungal, oomycete and bacterial pathogens that infect animal, plant, fungal and insect hosts. The mission of PHI-base is to provide expertly curated molecular and biological information on genes proven to affect the outcome of pathogen-host interactions. PHI-base terms are attached to genes in thousands of species in Ensembl Bacteria, Fungi and Protists. Information is also given on the target sites of some anti-infective chemistries.
The DIAMOND software was used with default parameters to map and annotate query sequences against the PHI-base database (version 4.6, updated in November 2018, containing 6438 genes, 11340 interactions, 263 pathogens, 194 hosts and 510 diseases).
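As an illustration only, a DIAMOND run of this kind could be assembled as sketched below. The file names, the database path and the helper build_diamond_command are hypothetical and are not the pipeline's actual settings. The sketch assumes blastp for amino acid queries and blastx for nucleic acid queries, and passes no sensitivity or filtering options, so DIAMOND's defaults apply.

```python
def build_diamond_command(query_fasta, database, output, nucleotide=False):
    """Return a DIAMOND command line as a list of arguments.

    All paths are hypothetical placeholders. Only the required options
    are given, so every other setting keeps its DIAMOND default.
    """
    # blastx translates nucleotide queries; blastp aligns protein queries.
    mode = "blastx" if nucleotide else "blastp"
    return [
        "diamond", mode,
        "--db", database,        # DIAMOND-formatted PHI-base protein database
        "--query", query_fasta,  # input FASTA file
        "--out", output,         # alignment results
        "--outfmt", "6",         # BLAST-style tabular output
    ]


cmd = build_diamond_command("query.faa", "phibase.dmnd", "hits.tsv")
print(" ".join(cmd))
```

The list form can be passed directly to subprocess.run once DIAMOND is installed and the database has been built.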
Input files
FASTA file of nucleic acid or amino acid query sequences.
Results
1. Mapping and annotation results
2. Statistics of alignment
3. Distribution of E-values.
1. Mapping and annotation results
Query_id: ID of the query sequence
Subject_id: ID of the matched sequence in the database
Query_start: start position of the aligned region on the query sequence
Query_end: end position of the aligned region on the query sequence
Subject_start: start position of the aligned region on the subject sequence
Subject_end: end position of the aligned region on the subject sequence
Align_length: length of the alignment
Positive: number of positive-scoring matches (bases or amino acids)
Gap: number of gaps
Coverage: percentage of the query sequence covered by the alignment to the database sequence
Identity(%): percent identity of the alignment
E_value: expect value of the alignment; the lower the better
Score: alignment score; the higher the better
DB_Type: the database of reference genes
Accession: ID of the reference gene in the source database
Gene_name: gene name
Pathogen_NCBI_ID: NCBI taxonomy ID of the pathogenic species
Pathogen_species: systematic name of the pathogenic species
Disease_name: name of the disease caused by the pathogen-host interaction
Host_Description: description of the host
Host_NCBI_ID: NCBI taxonomy ID of the host organism
Experimental_host: common name of the host organism
Function: function of the protein
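The coverage value can be related to the alignment coordinates with a short calculation. The sketch below uses one common definition (aligned query positions divided by query length). The exact formula used by the tool is not documented here, and query_length is an assumed extra input that is not one of the listed columns.

```python
def query_coverage(query_start, query_end, query_length):
    """Percentage of the query covered by the alignment.

    Coordinates are assumed to be 1-based and inclusive.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    # abs() also handles reverse-strand hits, where start is greater than end.
    aligned = abs(query_end - query_start) + 1
    return 100.0 * aligned / query_length


print(query_coverage(11, 110, 200))  # 50.0
```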
2. Statistics of alignment
Shows the number of mapped and unmapped query sequences.
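A minimal sketch of how such counts could be derived is given below. It assumes a query counts as mapped if its ID appears at least once in the Query_id column. The helper names and the example data are hypothetical.

```python
def fasta_ids(fasta_text):
    """Collect sequence IDs (first word of each '>' header line)."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.append(fields[0])
    return ids


def mapping_stats(query_ids, mapped_ids):
    """Count queries with and without at least one hit."""
    queries = set(query_ids)
    # A query with several hits is still counted once.
    mapped = queries & set(mapped_ids)
    return {"mapped": len(mapped), "unmapped": len(queries) - len(mapped)}


fasta = ">seq1 demo\nMKV\n>seq2\nMLA\n>seq3\nMTT\n"
stats = mapping_stats(fasta_ids(fasta), ["seq1", "seq3", "seq1"])
print(stats)  # {'mapped': 2, 'unmapped': 1}
```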
3. Distribution of E-values.
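The E-value is the number of alignments with an equal or better score that would be expected to occur by chance. The smaller the E-value, the more reliable the alignment. The E-values are divided into five ranges, and the number of alignments in each range is shown in a pie chart.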
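The binning step can be sketched as follows. The five range boundaries are hypothetical, since the tool's actual cut-offs are not stated here. The resulting counts are what a pie chart would display.

```python
from bisect import bisect_left

# Hypothetical upper bounds; the tool's actual five ranges are not stated.
BOUNDS = [1e-100, 1e-50, 1e-20, 1e-5]
LABELS = [
    "<=1e-100",
    "(1e-100, 1e-50]",
    "(1e-50, 1e-20]",
    "(1e-20, 1e-5]",
    ">1e-5",
]


def bin_evalues(evalues):
    """Count E-values per range; each range includes its upper bound."""
    counts = {label: 0 for label in LABELS}
    for e in evalues:
        # bisect_left returns the index of the first bound >= e.
        counts[LABELS[bisect_left(BOUNDS, e)]] += 1
    return counts


result = bin_evalues([0.0, 1e-60, 1e-50, 1e-30, 1e-3, 2.0])
for label, n in result.items():
    print(label, n)
```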