Welcome to the BactMentha Documentation!
I. Project Overview
BactMentha is a unified resource that integrates, through a fully automated workflow,
bacterial-host PPIs for three organisms (human, mouse and rat) from IMEx consortium databases with annotations
from the BastionHub resource, the Virulence Factor DataBase (VFDB), and clinically relevant bacterial strain
information from the World Health Organization (WHO) and the UK Health and Safety Executive (HSE), through the
Pathogens Portal. BactMentha provides easy access to the available experimentally identified binding regions of
the interaction partners, which are reported as biological features in IMEx-annotated interaction data. These
experimental annotations are complemented with putative interaction interfaces predicted using the mimicINT tool.
II. Data Gathering
Protein interaction data gathering and annotation rely on a computational pipeline,
the BactMentha workflow, which consists of three main steps: (i) the bacteria-host protein interaction data
collection; (ii) the functional and clinical/safety annotation of bacterial proteins and strains, respectively;
and (iii) the protein interaction interface prediction. In the first step, the BactMentha workflow retrieves
the available PPIs for three host organisms (human, mouse and rat) from the International Molecular Exchange
(IMEx) Consortium databases via the PSICQUIC web service. Only PPIs associated with a valid PubMed identifier are kept. Next, for each host
organism, the PPI data are filtered to retain only the interactions with bacterial proteins by checking the
taxonomy identifier of the host protein's interaction partner. The resulting dataset is further parsed to extract relevant
fields including, when available, experimentally identified binding regions and mutation data impacting protein
interactions. In the second step, the pipeline assigns functional annotations to the bacterial proteins in the
BactMentha dataset by performing BLAST pairwise sequence comparisons against sequences automatically retrieved
from BastionHub, a database of translocated effectors, and from the Virulence Factor DataBase (VFDB). The functional
association is retained only if the two proteins share at least 30% sequence identity with an alignment coverage
of at least 75%. In the third step,
the BactMentha workflow integrates the mimicINT pipeline to predict putative interaction interfaces by exploiting
known domain-domain and SLiM-domain interaction templates. The prediction procedure is run only on the set of
bacteria-human interactions, using default parameters. It first checks whether the bacterial protein
contains at least one domain or SLiM for which an interaction template is available. Then, if the interacting
human protein contains the cognate domain, it infers the interaction interface. Finally, bacterial strains in
the dataset are annotated with available priority categories (i.e., critical, high and medium) according to the
2024 World Health Organization (WHO) priority pathogens list (https://www.who.int/publications/i/item/9789240093461),
and with hazard group classification from the 2023 Health and Safety Executive (HSE) Approved List of Biological
Agents (https://www.hse.gov.uk/pubns/misc208.pdf).
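The filtering rules of the first step can be illustrated with a minimal Python sketch. It is not the BactMentha code: it assumes the interactions are available as PSI-MI TAB (MITAB) 2.5 lines, as returned by PSICQUIC services, treats a "valid PubMed identifier" as a numeric `pubmed:` cross-reference, and takes a precomputed set of bacterial taxonomy identifiers (`bacterial_taxids`, a hypothetical input) rather than resolving taxonomic lineages. The function names are illustrative only.

```python
import re

# 0-based MITAB 2.5 column indices used by this sketch.
COL_PUBLICATION = 8   # e.g. "pubmed:12345678|imex:IM-12345"
COL_TAXID_A = 9       # e.g. "taxid:9606(human)|taxid:9606(Homo sapiens)"
COL_TAXID_B = 10

# Assumption: a valid PubMed identifier is purely numeric.
PUBMED_RE = re.compile(r"pubmed:(\d+)\b")
TAXID_RE = re.compile(r"taxid:(-?\d+)")


def pubmed_ids(field):
    """Return the numeric PubMed identifiers found in a publication field."""
    return set(PUBMED_RE.findall(field))


def taxids(field):
    """Return the taxonomy identifiers found in an organism field."""
    return {int(t) for t in TAXID_RE.findall(field)}


def keep_interaction(mitab_line, host_taxid, bacterial_taxids):
    """Keep a PPI only if it has a PubMed ID and pairs the host with a bacterium."""
    cols = mitab_line.rstrip("\n").split("\t")
    if len(cols) <= COL_TAXID_B:
        return False  # malformed or truncated line
    if not pubmed_ids(cols[COL_PUBLICATION]):
        return False  # no numeric PubMed identifier
    tax_a = taxids(cols[COL_TAXID_A])
    tax_b = taxids(cols[COL_TAXID_B])
    # The host may be reported as interactor A or as interactor B.
    host_a = host_taxid in tax_a and bool(tax_b & bacterial_taxids)
    host_b = host_taxid in tax_b and bool(tax_a & bacterial_taxids)
    return host_a or host_b
```

Calling `keep_interaction(line, 9606, bacterial_taxids)` on each MITAB line would retain the human-bacterium interactions under these assumptions; the same call with the mouse or rat taxonomy identifier would cover the other two hosts.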
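The identity and coverage thresholds of the second step can be sketched as follows. This is an illustration under stated assumptions, not the workflow's implementation: it assumes BLAST+ tabular output produced with `-outfmt "6 qseqid sseqid pident qstart qend qlen"`, and it assumes that coverage is measured on the query sequence, which the description above does not specify. `Hit`, `parse_hit` and `is_retained` are hypothetical names.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float   # percent identity reported by BLAST
    coverage: float   # percent of the query covered by the alignment


def parse_hit(line):
    """Parse one tab-separated line: qseqid sseqid pident qstart qend qlen."""
    qseqid, sseqid, pident, qstart, qend, qlen = line.rstrip("\n").split("\t")[:6]
    # Assumption: coverage is computed relative to the query length.
    coverage = 100.0 * (int(qend) - int(qstart) + 1) / int(qlen)
    return Hit(qseqid, sseqid, float(pident), coverage)


def is_retained(hit, min_identity=30.0, min_coverage=75.0):
    """Apply the thresholds quoted in the text (>= 30% identity, >= 75% coverage)."""
    return hit.identity >= min_identity and hit.coverage >= min_coverage
```

Both thresholds are inclusive here, matching the "at least" wording of the text.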
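The two-stage logic of the third step (a bacterial domain or SLiM with an available template, then a cognate domain in the human partner) can be pictured with a toy sketch. It does not call mimicINT and does not reproduce its internals: the `templates` dictionary, the function name and all identifiers in the usage note are hypothetical placeholders.

```python
def infer_interfaces(bacterial_elements, host_domains, templates):
    """Return (bacterial element, host domain) pairs supported by a template.

    bacterial_elements: set of domain or SLiM identifiers found in the bacterial protein
    host_domains:       set of domain identifiers found in the human protein
    templates:          dict mapping a bacterial element to the set of cognate domains
    """
    interfaces = []
    for element in sorted(bacterial_elements):
        cognates = templates.get(element)
        if not cognates:
            continue  # stage 1: no interaction template for this element
        # Stage 2: the human partner must carry a cognate domain.
        for domain in sorted(cognates & host_domains):
            interfaces.append((element, domain))
    return interfaces
```

For instance, with `templates = {"SLIM_X": {"DOMAIN_A"}}`, a bacterial protein carrying `SLIM_X` and a human partner carrying `DOMAIN_A` would yield the single pair `("SLIM_X", "DOMAIN_A")`, while a partner lacking `DOMAIN_A` would yield no interface.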
III. User guide
IV. Access and Usage
No registration or login is necessary to access the BactMentha database.
BactMentha is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0).
V. Contributions, Funding and Collaborations
Author contributions:
Lou Bergogne: Data Curation, Formal Analysis, Investigation, Methodology, Software, Visualization, Writing –
Original Draft Preparation; Mégane Boujeant: Data Curation, Investigation, Methodology, Software; Marta Iannuccelli:
Data Curation, Investigation; Christine Brun: Funding Acquisition, Supervision, Writing – Review & Editing; Luana Licata:
Data Curation, Funding Acquisition, Investigation, Supervision, Writing – Review & Editing; Andreas Zanzoni: Conceptualization,
Data Curation, Formal Analysis, Funding Acquisition, Investigation, Methodology, Project Administration, Supervision, Writing –
Original Draft Preparation, Writing – Review & Editing.
Funding:
This work received support from the JPI HDHL-INTIMIC action co-funded by the Agence Nationale de la Recherche
(ANR-17-HDIM-0001) and from the French government under the France 2030 investment plan, as part of the Initiative
d'Excellence d'Aix-Marseille Université - A*MIDEX (AMX-21-PEP-043).
VI. Useful links
BastionHub:
ELM database:
IMEx Consortium:
InterPro database:
mimicINT:
The Approved List of biological agents (from HSE):
VFDB:
WHO priority groups definition:
VII. Glossary
ANR: Agence Nationale de la Recherche
AMU: Aix-Marseille Université
API: Application Programming Interface
BLAST: Basic Local Alignment Search Tool
BR: Binding regions (experimentally detected regions of interaction)
CSV: Comma-Separated Values
EBI: European Bioinformatics Institute
ELM: Eukaryotic Linear Motif
HSE: Health and Safety Executive
HUPO: Human Proteome Organization
IMEx Consortium: International Molecular Exchange Consortium
INSERM: Institut National de la Santé et de la Recherche Médicale
JPI: Joint Programming Initiative
MESR: Ministère de l'Enseignement Supérieur et de la Recherche
MI: mimicINT interfaces (inferred regions of interactions)
NCBI: National Center for Biotechnology Information
OS: Ontology Search
PA: Protein Annotation (referring to bacterial proteins being annotated as virulence factors or effectors)
PPI: Protein-protein interaction
SLiM: Short Linear Motif
TAGC: Theories and Approaches of Genomic Complexity
VFDB: Virulence Factor DataBase
WHO: World Health Organization