Title of publication:
Year of Publication:
medium resp. publishing house / place:
related to project:
The human leukocyte antigen (HLA) class II proteins present peptides to CD4+ T cells through an interaction with T cell receptors (TCRs). HLA proteins are therefore key players in shaping immunogenicity and immunodominance. Nevertheless, the factors governing peptide presentation by HLA-II proteins are still poorly understood. To address this problem, we profiled the blood transcriptome and immunopeptidome of 20 healthy individuals and integrated the profiles with publicly available immunopeptidomics datasets. In-depth multi-omics analysis identified expression levels and subcellular locations as important sequence-independent features governing presentation. Leveraging this knowledge, we developed the Peptide Immune Annotator Multimodal (PIA-M) tool, a novel pan multimodal transformer-based framework that utilises sequence-dependent along with sequence-independent features to model presentation by HLA-II proteins. PIA-M showed consistently superior performance relative to existing tools across two independent test datasets (area under the curve: 0.93 vs. 0.84 and 0.95 vs. 0.86, respectively). Besides achieving higher predictive accuracy, PIA-M, with its Rust-based pre-processing engine, had significantly shorter runtimes. PIA-M is freely available under a permissive licence as a standalone pipeline and as a webserver (https://hybridcomputing.ikmb.uni-kiel.de/pia). In conclusion, PIA-M sets a new state of the art in predicting peptide presentation by HLA-II proteins in vivo.