Here we present NEPdb, a database containing more than 17,000 validated human immunogenic and non-immunogenic neoepitope entries with human leukocyte antigen (HLA) and T cell information, curated from the published literature.
NEPdb also provides pan-cancer predicted neoepitopes derived from common cancer somatic mutations, generated with NetMHCpan 4.0 and HLAthena.
Data content | HLA-Ⅰ data | HLA-Ⅱ data | Total data |
---|---|---|---|
Entry (Total) | 12239 | 5310 | 17549 |
Entry (Positive) | 155 | 18 | 173 |
Entry (Negative) | 12084 | 5292 | 17376 |
Tumor type | 22 | 11 | 23 |
HLA allele | 60 | 35 | 95 |
Gene | 2063 | 811 | 2068 |
Protein sequence | 2332 | 895 | 2337 |

Name | Count |
---|---|
Cancer gene | 683 |
Non-synonymous mutation | 16745 |
Neopeptide | 516036 |
HLA class Ⅰ | 95 |
Total prediction | 49023420 |
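
The counts in the table above are consistent with one prediction per neopeptide and HLA class Ⅰ allele pair. This is an inference from the numbers, not something stated by the source, and the short check below is only a sketch of that arithmetic with variable names chosen for illustration.

```python
# Counts copied from the table above.
neopeptides = 516_036
hla_class_i_alleles = 95
total_predictions = 49_023_420

# Assumption: every neopeptide is scored against every HLA class I allele,
# giving one prediction per (neopeptide, allele) pair.
expected = neopeptides * hla_class_i_alleles

print(expected == total_predictions)  # prints True
```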
Overall performance of nine HLA class Ⅰ prediction algorithms (immunogenic data from NEPdb)
Nine commonly used peptide-MHC binding prediction algorithms were each evaluated on the positive (immunogenic) samples from our Validated Neopeptide Dataset.
NetMHCcons 1.1, NetMHCpan 4.0 and HLAthena performed better than the others under this criterion.
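
The exact scoring criterion is not spelled out here. One common way to compare binding predictors on positive-only data is sensitivity: the fraction of validated immunogenic peptide-HLA pairs that a tool ranks within a percentile-rank cutoff. The sketch below assumes that approach, with a 2% cutoff chosen for illustration. The function `sensitivity_at_rank`, the tool names, and the rank values are all hypothetical and are not NEPdb data or results.

```python
from typing import Dict, List


def sensitivity_at_rank(ranks: List[float], cutoff: float = 2.0) -> float:
    """Fraction of positive peptide-HLA pairs with percentile rank <= cutoff."""
    if not ranks:
        raise ValueError("need at least one positive sample")
    return sum(r <= cutoff for r in ranks) / len(ranks)


# Toy percentile ranks for four positive pairs per tool.
# Illustrative values only; tool names are placeholders, not real algorithms.
toy_ranks: Dict[str, List[float]] = {
    "tool_a": [0.1, 0.5, 1.5, 3.0],
    "tool_b": [0.2, 2.5, 4.0, 6.0],
}

scores = {name: sensitivity_at_rank(r) for name, r in toy_ranks.items()}

# List tools from highest to lowest sensitivity.
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

Because only positive samples are scored, a metric like this says nothing about false positives; a tool that calls every peptide a binder would score 1.0.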