The rapid mutational evolution of SARS-CoV-2 may adversely impact B cell therapies, which aim to prevent or treat the infection by targeting a small subset of viral proteins. T cell therapies may avoid this problem because T cells can target peptides from across the entire viral genome, providing the opportunity to avoid highly mutagenic targets. We hypothesized that SARS-CoV-2 proteins have different mutation rates based on 1) length [i.e., the quantity of amino acids (AA) available to mutate], 2) whether they are structural or non-structural (structural proteins having more exposure to antibody pressure), and 3) location in regions that enhance binding, cell penetration, and transmissibility, such as the RBD region of Spike (S). If true, targets for T cell therapy could be selected from less mutagenic proteins. Using the Insight 68 SARS-CoV-2 database to sort and compare all sequences of this virus downloaded from NCBI, we selected the most frequent isolates (equaling 50% or more of the total) for each viral protein. At a minimum, all sequences present in 5% or more of the total isolates were included. We noted the length of each protein and its sequence heterogeneity, which was calculated as the percentage of isolates that represented <5% of the total isolates (more such isolates = higher heterogeneity). After identifying the proteins with the highest variation, we examined the isolates of these proteins across 12 of the most frequent HLA class I A and B alleles (HLA-A*02:01, *01:01, *03:01, *11:01, *23:01, *24:02, HLA-B*07:02, *08:01, *15:01, *35:01, *40:01, *44:02) to assess the effects of mutations on eluted ligand (EL) binding scores using NetMHCpan software. In terms of length, nsp3 was the longest protein (1922 AA), followed by S (1273 AA), RdRp (932 AA), and nsp2 (628 AA). S, nsp3, N, and nsp2 had the highest sequence heterogeneity, with 75.35%, 50.4%, 42.72%, and 34.36%, respectively, of all isolates representing <5% of the total sequences for that protein.
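The heterogeneity measure described above can be illustrated with a minimal sketch. It assumes one reading of the definition: the percentage of isolates whose exact protein sequence accounts for less than 5% of all isolates of that protein. The function name `sequence_heterogeneity`, the `threshold` parameter, and the toy input are hypothetical and are not taken from the Insight 68 database or its software.

```python
from collections import Counter


def sequence_heterogeneity(sequences, threshold=0.05):
    """Percent of isolates whose exact sequence is rare.

    Assumption: a sequence is "rare" when it makes up less than
    `threshold` (5% by default) of all isolates for the protein.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    total = len(sequences)
    counts = Counter(sequences)
    # Sum the isolates that carry a sequence below the frequency cutoff.
    rare_isolates = sum(n for n in counts.values() if n / total < threshold)
    return 100.0 * rare_isolates / total


# Toy data (not real isolates): 90 copies of one sequence, 6 of a second,
# and two sequences seen twice each.
toy = ["MFVF"] * 90 + ["MFVL"] * 6 + ["MFIF"] * 2 + ["LFVF"] * 2
print(sequence_heterogeneity(toy))
```

In this toy set only the two sequences seen twice each fall under the 5% cutoff, so 4 of the 100 isolates count toward heterogeneity.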
Examining isolates from these 5 proteins further, we found that the S protein contained 66 insertions/deletions and sequence mismatches amongst the 14 most common S sequences, 60 of which were in the S1 subunit (29 localized to the RBD region), with 6 sequence changes in the less antibody-exposed S2 subunit. In contrast, the longest and second most heterogeneous protein, nsp3 (a non-structural protein), contained only 8 relevant sequence alterations amongst its 11 most frequent isolates. The shorter non-structural proteins, RdRp and nsp2, each demonstrated 2 relevant sequence alterations amongst 3 isolates, while the shortest protein analyzed, N, which is a structural protein, contained 5 relevant alterations amongst 13 isolates. The impact of the identified mutations on potential T cell target peptides was then analyzed. In the case of S, there was extensive variation across a number of HLA restrictions, again focused in the S1 subunit. For example, the EL score for the HLA-A*03:01 restricted peptide RQIAPGQTGK, present in 6 isolates, was 0.9045, dropping to 0.0 in 8 isolates with the substitution of S for R and N for K at AA positions 403 and 412. EL scores also varied with mutations in the other analyzed proteins, but not nearly as widely or significantly, owing to the far smaller number and complexity of mutations in these proteins. There were instances of peptide targets that were identical between different HLA restrictions, most commonly in alleles within an HLA superfamily. Considering all of the mutations across the 5 analyzed proteins, EL score was affected most often for HLA-B*15:01 (21 times), compared with HLA-B*40:01 (3 times) and HLA-A*02:01 (5 times). While AA length contributes to the quantity of mutations in SARS-CoV-2 proteins, exposure to combined humoral and cellular immunologic pressure, as well as changes that enhance infectivity, primarily drives mutational events.
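As a small illustration of how a variant peptide can be compared with its reference before rescoring, the sketch below lists the substitutions between two equal-length peptides. The helper `peptide_substitutions` is hypothetical, and the variant string is constructed here from the substitutions described above (S for R, N for K) rather than copied from an isolate record. Positions are 1-based within the peptide, not Spike residue numbers, and no binding prediction is run.

```python
def peptide_substitutions(reference, variant):
    """Return (position, reference_aa, variant_aa) for each mismatch.

    Positions are 1-based within the peptide. Insertions and deletions
    are not handled: the two peptides must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("peptides must have the same length")
    return [
        (i, ref_aa, var_aa)
        for i, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1)
        if ref_aa != var_aa
    ]


reference = "RQIAPGQTGK"  # HLA-A*03:01 restricted peptide from the text
variant = "SQIAPGQTGN"    # assumed form after the R-to-S and K-to-N changes
print(peptide_substitutions(reference, variant))
# prints [(1, 'R', 'S'), (10, 'K', 'N')]
```

Each reference/variant pair flagged this way would then be scored separately per HLA allele with the prediction tool of choice.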
Peptide targets for T cell therapies are more likely to be preserved when they lie in protein regions with reduced exposure to antibody pressure (which may include the S2 subunit of S) and at sites unlikely to contribute to virulence/infectivity. While longer proteins will have higher overall mutation rates, in the absence of other mutational pressures the rate of mutation at any particular location should be similar for all target peptides. Whether peptides recognized together with more frequent HLA alleles or superfamilies provide stronger T cell pressure for mutation should be further examined.
Rudolph: Tevogen Bio: Current Employment. Grosso: Tevogen Bio: Current Employment, Current equity holder in publicly-traded company. Ji: Insight 68: Current Employment. Boyle: Tevogen Bio: Current Employment. Chhipa: Tevogen Bio: Current Employment, Current equity holder in publicly-traded company. Gervasi: Tevogen Bio: Current Employment, Current equity holder in publicly-traded company. Wong: Tevogen Bio: Current Employment. Chen: Tevogen Bio: Current Employment, Current equity holder in publicly-traded company. O'Connor: Tevogen Bio: Current Employment, Current equity holder in publicly-traded company. O'Neill: Tevogen Bio: Current Employment, Current equity holder in publicly-traded company. Flomenberg: Tevogen Bio: Current Employment, Current equity holder in publicly-traded company.