Advertisement
Research ArticleCOVID-19 Free access | 10.1172/JCI152702
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Duerr, R. in: JCI | PubMed | Google Scholar
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Dimartino, D. in: JCI | PubMed | Google Scholar
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Marier, C. in: JCI | PubMed | Google Scholar
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Zappile, P. in: JCI | PubMed | Google Scholar
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Wang, G. in: JCI | PubMed | Google Scholar
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Lighter, J. in: JCI | PubMed | Google Scholar
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Elbel, B. in: JCI | PubMed | Google Scholar
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Troxel, A. in: JCI | PubMed | Google Scholar |
1Department of Microbiology, NYU Grossman School of Medicine, New York, New York, USA.
2Genome Technology Center, Office of Science and Research, NYU Langone Health, New York, New York, USA.
3Department of Pathology,
4Department of Pediatric Infectious Diseases, and
5Department of Population Health, NYU Grossman School of Medicine, New York, New York, USA.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Find articles by Heguy, A. in: JCI | PubMed | Google Scholar
Published August 10, 2021 - More info
The efficacy of COVID-19 mRNA vaccines is high, but breakthrough infections still occur. We compared the SARS-CoV-2 genomes of 76 breakthrough cases after full vaccination with BNT162b2 (Pfizer/BioNTech), mRNA-1273 (Moderna), or JNJ-78436735 (Janssen) to unvaccinated controls (February–April 2021) in metropolitan New York, including their phylogenetic relationship, distribution of variants, and full spike mutation profiles. The median age of patients in the study was 48 years; 7 required hospitalization and 1 died. Most breakthrough infections (57/76) occurred with B.1.1.7 (Alpha) or B.1.526 (Iota). Among the 7 hospitalized cases, 4 were infected with B.1.1.7, including 1 death. Both unmatched and matched statistical analyses considering age, sex, vaccine type, and study month as covariates supported the null hypothesis of equal variant distributions between vaccinated and unvaccinated in χ2 and McNemar tests (P > 0.1), highlighting a high vaccine efficacy against B.1.1.7 and B.1.526. There was no clear association among breakthroughs between type of vaccine received and variant. In the vaccinated group, spike mutations in the N-terminal domain and receptor-binding domain that have been associated with immune evasion were overrepresented. The evolving dynamic of SARS-CoV-2 variants requires broad genomic analyses of breakthrough infections to provide real-life information on immune escape mediated by circulating variants and their spike mutations.
The novel betacoronavirus SARS-CoV-2 arose as a new human pathogen at the end of 2019, and rapidly spread to every corner of the globe, causing a pandemic of enormous proportions, with 179 million cumulative infections and almost 4 million deaths counted worldwide as of mid-June 2021 (1). As soon as the causative agent was identified and sequenced (2), massive efforts to develop vaccines were initiated and tested simultaneously in multiple clinical trials. These efforts led to the rapid deployment of several of these vaccines by December 2020, less than a year after the first SARS-CoV-2 viral sequence had been determined. In the United States, vaccination started in December 2020, using 2 novel mRNA-based vaccines, BNT162b2 (Pfizer/BioNTech) and mRNA-1273 (Moderna), both using the SARS-CoV-2 spike mRNA sequence of the original Wuhan variant. The Janssen COVID-19 vaccine, JNJ-78436735, was also deployed shortly after the mRNA vaccines, and these vaccines were shown to have high efficacy in clinical trials (3–5). With millions of people vaccinated in multiple countries, the vaccines have proven to be highly effective in the real world (6–12). A very small percentage of breakthrough infections have occurred, as expected (13–23). Contemporaneously with the vaccination efforts in several countries, new SARS-CoV-2 variants emerged, 4 of which were designated by the WHO as variants of concern (VOCs), B.1.1.7 (Alpha), B.1.351 (Beta), P.1 (Gamma), and B.1.617.2 (Delta), which arose originally in the United Kingdom, South Africa, Brazil, and India, respectively (24, 25). VOCs are classified as such if they are more transmissible, cause more severe disease, or cause a significant reduction in neutralization by antibodies generated via previous infection or after vaccination. In addition, there are variants of interest (VOIs), which carry mutations that have been associated with changes to receptor binding, reduced neutralization by anti–SARS-CoV-2 antibodies, or may increase transmissibility or severity. One of these variants is B.1.526 (Iota), which arose in New York City in late December 2020 (26, 27). Although it is not yet clear that any particular VOC or VOI is associated with vaccine breakthrough, data from Israel and from Washington, USA, suggest that there might be higher vaccine breakthrough rates with VOCs (13, 28).
In addition, in vitro selection, deep-mutational scanning of spike libraries, yeast display, and epidemiological and structural studies have revealed critical spike mutations that can escape monoclonal antibodies and convalescent sera (29–35). Thus, it is possible that certain amino acid mutations in spike, irrespective of affiliation to a specific VOC or VOI, are critical for vaccine breakthrough, but this has only been rudimentarily studied (18, 28). There is scarcity of data about determinants of vaccine breakthrough. The few reported studies have included breakthrough cases after first or second immunization; however, breakthrough cases after full vaccination remained moderate or low (13–23). Thus, comprehensive studies with site-specific mutation analyses are needed on a larger number of fully vaccinated individuals from different geographic regions. Here, we added such data from the New York metropolitan area. We carried out full SARS-CoV-2 genome sequencing of SARS-CoV-2–positive individuals 14 days after their completed vaccination series with mostly Pfizer and Moderna vaccines in the multi-center NYU Langone Health System. Analyses included statistical comparison of variant distribution as well as mutation rates at every residue in spike between vaccinated and unvaccinated individuals.
Alpha and Iota variants dominate vaccine breakthrough infections in metropolitan New York. A total of 126,367 fully vaccinated individuals were recorded in our electronic health records by April 30, 2021, of whom the majority (123,511, 98%) were vaccinated with mRNA vaccines (Pfizer/BioNTech = 103,166 and Moderna = 20,345), and the rest (2856) with the adenovirus-based Janssen COVID-19 vaccine, administered as a single dose. We recorded 101 cases of vaccine breakthrough infection (77 Pfizer/BioNTech, 17 Moderna, and 7 Janssen) between February 1 and April 30, 2021, representing 1.4% of the 7147 total SARS-CoV-2–positive cases in our healthcare system and 0.08% of the fully vaccinated population in our medical records. Out of the 101 cases, 76 cases (75%) yielded full SARS-CoV2 genomes (61 Pfizer/BioNTech, 11 Moderna, and 4 Janssen) that passed quality control (QC) and allowed us to determine the Pango lineage and mutations across the viral genome, including the spike gene. The median cycle threshold (Ct) value for the 101 breakthroughs was 27 (range 13–42). As expected, the 25 excluded breakthrough cases with low genome coverage and failed QC had significantly higher Ct values (median: 34, range: 27–42) than the 76 samples with full viral genome coverage (P < 0.0001, Mann Whitney test), which had Ct values below 36 (median: 24, range: 13–36) (Figure 1A).
RT-qPCR Ct values in postvaccine breakthrough infections. (A) Ct plots of samples that yielded a full genome with sufficient coverage to determine lineage and mutations (QC passed: >23,000 bp and >4000× coverage) compared with those that failed (QC failed: <23,000 bp or <4000× coverage). (B) Ct plots of all samples that passed QC, by lineages, B.1.1.7 (Alpha) and B.1.526 (Iota), compared with all others. Hospitalized cases are shown in larger sized black symbols.
Although B.1.1.7 (Alpha) infection has been associated with overall higher viral load and lower Ct values (36, 37), the Ct values in our recorded breakthrough infections were not significantly different for B.1.1.7 compared with B.1.526 (Iota) or all other variants (Figure 1B).
Among the 76 COVID-19 cases after vaccination with adequate SARS-CoV-2 genome coverage, the median age was 48 years, 37 were male and 39 were female. Seven required hospitalization for COVID-19, among whom there was one death in an elderly patient with multiple comorbidities who already was on in-home oxygen previous to post-vaccination COVID-19 infection and had a lengthy stay at the ICU (Tables 1 and 2 and Supplemental Table 1; supplemental material available online with this article; https://doi.org/10.1172/JCI152702DS1).
The distribution of Pango lineages in the 76 vaccine breakthroughs was as follows: 26 (34.2%) had B.1.1.7, 31 (40.7%) had B.1.526 (including sublineages B.1.526.1 and B.1.526.2), 1 (1.3%) had P.1, and 18 (23.7%) had other variants (Supplemental Figure 1). Among the 1046 sequences from the group of unvaccinated patients, 304 (29.0%) had B.1.1.7, 423 (40.4%) had B.1.526 (including sublineages B.1.526.1, B.1.526.2, and B.1.526.3), 12 (0.07%) had P.1, and 307 (29.3%) had other variants. Of the 7 COVID-19 hospitalizations, 4 were infected with B.1.1.7, including the fatal case, 2 with B.1.526, and 1 with B.1 (containing P681H; not a VOC or VOI). All hospitalizations due to COVID-19 occurred in patients who got infected less than 60 days after completion of the vaccination series (Table 1). There were no hospitalizations in patients who were infected more than 60 days after completion of the vaccination series (Table 2).
VOCs and VOIs are equally distributed in vaccinated and unvaccinated individuals with infections. To compare vaccinated breakthrough infection cases with unvaccinated controls statistically, we included 1046 unvaccinated individuals from the same study cohort who became SARS-CoV-2 infected in the same study months (February–April 2021) as the vaccine breakthrough cases. A χ2 test for rejecting the null hypothesis of equal Pango lineage distributions (B.1.1.7, B.1.526, and other variants combined) between vaccinated and unvaccinated patients resulted in a P value of 0.70.
To address confounding and other sources of bias arising from the use of observational data, we estimated a propensity score for the likelihood of full vaccination (38), and successfully matched all 76 vaccinated patients to unvaccinated patients, including age, sex, county of residence, and study month (February, March, and April 2021) as covariates. The standardized mean difference between the matched pairs was 0.0263, reduced by 96.9% from 0.738 prior to matching.
Supplemental Table 2 shows the distribution of variants in the matched pairs. McNemar’s test of the null hypothesis of equal distributions between vaccinated and unvaccinated patients, assessing the 3 VOCs/VOIs separately, could not be calculated due to sparse data. When we collapsed the table to reflect all VOCs/VOIs compared with other variants, McNemar’s test resulted in a P value of 0.692 (Supplemental Table 3). Thus, vaccinated and unvaccinated individuals in the metropolitan New York area were similarly affected by the regionally circulating VOCs and VOIs. In addition, there was no clear association among vaccinated patients between type of vaccine received and Pango lineage (χ2 test, P = 0.63).
Widespread phylogenetic dispersal of vaccine breakthrough sequences among unvaccinated controls. As a way to ascertain potential bias in our sampling, we carried out a phylogenetic analysis of our 76 breakthrough sequences in the context of 1046 unvaccinated SARS-CoV-2–positive controls (selected randomly as part of our greater New York genomic surveillance area) together with subsampled sequences from the United States as well as globally, the latter 2 groups based on Nextstrain builds of GISAID sequences (Figure 2). The main variants circulating in the New York area (purple) between the months of February and April 2021 were B.1.1.7 and B.1.526. Accordingly, our vaccine breakthrough samples (orange branch symbols and gray rays) mostly engaged B.1.1.7 and B.1.526 clades and were interspersed among the unvaccinated controls as well as other United States sequences. There was no evidence of extensive clustering that might indicate onward transmissions or transmission chains of vaccine breakthrough infections. Instead, they were widely distributed and appeared to mostly involve independent clusters of infections.
Maximum likelihood tree of SARS-CoV-2 vaccine breakthrough, unvaccinated matched control, and global reference sequences. IQ tree of 4923 SARS-CoV-2 full-genome sequences (base pairs 202–29,657 according to Wuhan-Hu-1 as reference), including 76 vaccine breakthrough (orange) and 1046 unvaccinated control SARS-CoV-2 sequences from our NYU cohort (greater NYC area) (purple) together with 1361 other US (cyan) and 2440 non-US global reference sequences (black). The tree was generated with a GTR+I+G substitution model and 1000 bootstrap replicates. The substitution scale of the tree is indicated at the bottom right. The branches of the tree are colored as indicated. Vaccine breakthrough sequences are highlighted by orange triangles as branch symbols and gray rays radiating from the root to the outer rim of the tree. Hospitalizations due to COVID-19 among the vaccine breakthrough infections are indicated by black triangles (h). The variants responsible for most vaccine breakthrough infections in our study cohort are labeled with respective Pango lineages (WHO classification in parenthesis).
Enrichment of N-terminal domain deletions and receptor binding domain escape mutations in vaccine breakthrough compared with unvaccinated control sequences. To screen whether vaccine breakthrough preferentially occurred with distinct vaccine escape mutations, we performed a comparative analysis of spike mutations between case and control groups. In our aligned data set of 76 vaccine breakthrough and 1046 unvaccinated control sequences, spike mutations (compared with Wuhan-Hu-1) occurred at 230 amino acid residues. While most of these sites were more frequently mutated in control cases (182 sites), and with 1 site (D614) being equally mutated in both groups (100% D614G), 47 sites exhibited increased mutation rates in the vaccine breakthrough group (Figure 3A). Interestingly, the degree of enrichment (Δ mut) was higher at the 47 breakthrough-enriched sites compared with the 182 sites enriched in controls. Contributing factors presumably included random mutations in the absence of immune pressure in controls, adaptive selection of immune escape mutations in vaccine breakthroughs, but also uneven case numbers in both groups. When we disregarded unique mutations per data set in our calculations, the mutation analysis yielded 23 distinct spike sites with enriched mutations in breakthrough infections (Figure 3B and Supplemental Table 4). Although individual sites did not achieve significance in Fisher’s exact tests, the array of sieved mutation sites drew a striking pattern of N-terminal domain (NTD) deletions (ΔY144 and ΔV69-H70), receptor binding domain (RBD) mutations (E484K, N501Y, and K417N/T), an S1 mutation known to modulate the RBD up or down positioning (A570D/V) (39), a mutation right in front of the furin binding site known to affect/improve S1/S2 cleavage (P681H/R) (40), and also C-terminal mutations in S2 (T716I, S982A, T1027I, and D1118H), all of which have been associated with enhanced immune evasion, ACE2 receptor binding, and/or recurrence in VOCs/VOIs (24, 41). The overrepresentation was most pronounced for E484K, followed by A570D/V, P681H/R, and ΔY144, which surpassed background spike mutation levels in unvaccinated controls (compared with Wuhan-Hu-1) by more than 12-fold. Higher levels of NTD deletion ΔY144 as well as S1 mutation A570D/V were based on the (nonsignificant) overrepresentation of the B.1.1.7 variant in breakthrough cases (34.2%) compared with nonvaccinated controls (29.0%), and, in the case of ΔY144, were supported by a slight difference in frequencies of sublineage B.1.526.1 with its characteristic ΔY144 deletion in breakthrough (9.2%) versus control cases (8.4%; Supplemental Figure 1). Higher levels of P681H/R mutations in breakthrough cases traced back to infections with B.1.1.7 but also other viruses carrying this mutation (e.g., within lineage B.1.575 and B.1.1.519). RBD E484K mutations were found in different B.1.526 subsets (Iota and B.1.526.2 sublineage) and occurred more frequently in breakthrough compared with control cases (Supplemental Figure 1).
Site-specific spike mutation analysis in SARS-CoV-2 vaccine breakthrough sequences compared with unvaccinated matched controls. (A) Comparison of site-specific amino acid mutation (mut) frequencies in spike in 76 vaccine breakthrough sequences (Vacc) compared with 1046 unvaccinated matched controls (Unvacc) from the same cohort. The Wuhan-Hu-1 sequence served as reference to call mutations per site, and all spike mutation sites of the study sequences are shown along the x axis according to their spike position (n = 230). The mirror plot displays differences of mutation frequencies between Vacc and Unvacc groups; orange bars (top) refer to higher mutation rates in Vacc sequences, whereas black bars (bottom) refer to higher mutation rates in Unvacc matched controls. (B) Enrichment of spike mutations in SARS-CoV-2 vaccine breakthrough sequences. All sites with greater spike mutation rates in Vacc compared with Unvacc controls are shown; sites with unique occurrences of mutations in breakthrough cases were disregarded. Mutation sites in the spike NTD, RBD, the C-terminal S1 region affecting RBD, and S1/S2 interface region that have been associated with VOCs, neutralization immune escape, and/or affecting important biological functions of the virus are highlighted. The dashed black line indicates the average mutation frequency across all spike residues in the unvaccinated control data set compared with Wuhan-Hu-1 as reference (n = 1046).
Our data from a large metropolitan healthcare system in the greater New York City area underline a high vaccine efficacy in fully vaccinated individuals more than 14 days after the last dose of vaccine. Efficacy is maintained against several circulating variants including VOCs and VOIs. Compared with the large number of SARS-CoV-2 infections among unvaccinated individuals, the recorded breakthrough cases between February and April 2021 (n = 76) remained at approximately 1% of total infections, with the caveat that both breakthrough cases as well as unvaccinated controls were not exhaustively screened and covered.
Despite the overall effectiveness of vaccination, our full spike mutation analysis revealed a broad set of spike mutations (n = 23) to be elevated in the vaccine breakthrough group. The analysis indicates that adaptive selection is in progress that may subsequently come into full effect. At this point, the breakthrough cases and differences in mutation rates between case and control groups are still too low to draw meaningful conclusions. However, the modest overrepresentation of functionally important spike mutations including NTD deletions ΔH69-V70 and ΔY144 together with RBD mutations E484K and N501Y as well as A570D/V (S1 mutation modulating RBD up/down positioning) and P681H/R (next to the S1/S2 interface tuning cleavage) may indicate a starting sieve effect at individual or combinations of functional mutations. Spike mutations and deletions reported to confer neutralization escape in vitro (25, 29, 30), or regulation of biological processes of the virus (39, 40), might thus be responsible for a sieve effect in a real-life situation (i.e., among vaccinated individuals).
During the time of our sample and data collection, there were 2 major variants arising in the New York City metro area constituting about two-thirds of all cases, B.1.1.7, which was first reported in the United Kingdom (42), and B.1.526, which arose in New York City around December of 2020 (26, 27). B.1.1.7 was deemed a VOC by the WHO and CDC and was associated with higher viral loads in infected individuals (36, 37), enhanced epidemiological spread (42–44), an extended window of acute infection (45), less effective innate and adap’tive immune clearance (46), and increased death rates in elderly patients and/or individuals with comorbidities (47–50).
The B.1.1.7 variant (e.g., through the RBD N501Y mutation but also through NTD deletions) acquired an enhanced affinity to ACE2, higher infectivity and virulence (51–54), while maintaining sensitivity to neutralization though with slightly impaired nAb titers (51, 55–57). The B.1.526 variant and its derivatives were designated VOIs because of the presence of RBD mutations such as E484K or S477N that favor immune evasion (25, 27).
To study whether the more transmissible B.1.1.7 or B.1.526, because of its critical RBD mutations and prevalence/place of origin in our city, were overrepresented in the breakthrough cases, we performed a comparative statistical analysis. Extending a few other studies reported so far (13–21), we focused on a strict threshold of more than 14 days after last dose of vaccination and we performed both unmatched and matched analyses side-by-side. We adjusted for the confounding factors of sex, age, study month, and residence area, though we are aware that other confounding factors resulting from differences in sampling or behavioral factors between groups might also play a role. Therefore, we cannot guarantee a representative set of breakthrough infections; it is possible that infected individuals had milder symptoms after vaccination and were less likely to seek testing, narrowing down the pool of breakthrough infections to more severe cases including VOCs/VOIs.
These caveats reinforce our finding that B.1.1.7 and B.1.526 are not preferentially represented in the vaccine breakthrough group but distributed at similar proportions as other variants in case and control groups, implying that the studied mRNA vaccines are comparably effective against these predominant variants in the NYC area. This is in agreement with recent data from Israel evaluating B.1.1.7 in this context (13). A sieve effect of the B.1.526 variant has not been studied in this detail, except for 2 studies with small sample sizes of n = 2 and n = 11 that did not allow strong conclusions (15, 19). A recent CDC study reported a total number of 10,262 SARS-CoV-2 vaccine breakthrough infections across the USA as of the end of April 2021, causing hospitalization in 10% of cases (n = 995) (20), a rate identical to our study.
Interestingly, the median Ct of our breakthrough infections, including those yielding an inadequate genome, was 26, with 50% of them (51 samples) exhibiting Ct values of 25 or less. It implies a moderate to high viral load in many of our breakthrough cases, at least in the nasopharynx from where our samples were collected. These moderate to high viral loads suggest the possibility of transmission to others (58, 59). Although our phylogenetic analysis does not provide evidence of transmission to others from our breakthrough cases at this time, this should be expected with the growing number of breakthrough cases in the next months. This may have significant epidemiologic and clinical significance if transmissible breakthrough infections carry specific spike mutations associated with immune evasion.
In conclusion, our data indicate that vaccine breakthrough in fully mRNA-vaccinated individuals is not more frequent with the VOC Alpha or the VOI Iota, which underlines the high efficacy of mRNA COVID-19 vaccines against different variants. The increased presence of mutations in key regions of the spike protein in the vaccine breakthrough group is of potential concern and requires continued monitoring. Genomic surveillance of vaccine breakthrough cases should be carried out on a broader scale throughout the United States and the entire world to achieve higher case numbers and the inclusion of different VOCs and VOIs.
Study design and sample collection. The design is an observational case control study of SARS-CoV-2 vaccine breakthrough infections in the NYU Langone Health system, a large healthcare system in the New York City metro area, with primary care hospitals located in Manhattan (New York County), Brooklyn (Kings County), and Mineola and Lake Success (Nassau County). The case group consisted of individuals who tested positive by real-time RT-PCR for SARS-CoV-2 RNA regardless of Ct, any time after 14 days of inoculation with the second dose of BNT162b2 (Pfizer/BioNTech) or mRNA-1273 (Moderna) vaccines, or with the single-dose Janssen vaccine, according to our electronic health records. The control group consisted of randomly selected full genome–sequenced SARS-CoV-2–positive cases in our health system and which had a Ct of 30 or less and were collected in the same time period as the breakthrough infections. Nasopharyngeal swabs were sampled from individuals suspected to have an infection with SARS-CoV-2 as part of clinical diagnostics or hospital admittance. Samples were collected in 3 mL viral transport medium (VTM; universal transport medium or equivalent; Copan). Clinical testing was performed using various FDA emergency use authorization–approved platforms for detection of SARS-CoV-2 (i.e., the Roche Cobas 6800 SARS-CoV-2 [90% of the samples in this study] and Cepheid Xpert SARS-CoV-2 or SARS-CoV-2/Flu/RSV assays).
RNA extraction, cDNA synthesis, and library preparation and sequencing. RNA was extracted from 400 μL of each nasopharyngeal swab specimens using the MagMAX Viral/Pathogen Nucleic Acid Isolation Kit on the KingFisher flex system (Thermo Fisher Scientific) following the manufacturer’s instructions. Total RNA (11 μL) was converted to first-strand cDNA by random priming using the Superscript IV first-strand synthesis system (Invitrogen, catalog 180901050). Libraries were prepared using Swift Normalase Amplicon Panel (SNAP) for SARS-CoV-2 and a SARS-CoV-2 additional Genome Coverage Panel (catalog SN-5X296 core kit, 96rxn), using 10 μL first-strand cDNA following the manufacturer’s instructions (60). Final libraries were run on Agilent TapeStation 2200 with high-sensitivity DNA ScreenTape to verify the amplicon size of about 450 bp. Normalized pools were run on the Illumina NovaSeq 6000 system with the SP 300 cycle flow cell. Run metrics consisted of 150 paired-end cycles with dual indexing reads. Typically, 2 pools representing 2 full 96-well plates (192 samples) were sequenced on each SP300 NovaSeq flow cell.
Sequenced read processing. Sequencing reads were demultiplexed using the Illumina bcl2fastq2 Conversion Software v2.20 and adapters and low-quality bases were trimmed with Trimmomatic v.0.36 (61). The Burrows-Wheeler Alignment tool v.0.7.17 (62) was used for mapping reads to the SARS-CoV-2 reference genome (NC_045512.2, wuhCor1) and mapped reads were soft-clipped to remove SNAP tiled primer sequences using Primerclip v.0.3.8 (63). BCFtools v.1.9 (64) was used to call mutations and assemble consensus sequences, which were then assigned phylogenetic lineage designations according to Pango nomenclature (65). Sequences that did not yield a near-complete viral genome (< 23,000 bp, < 4000× coverage) were discarded from further analysis. All sequences are publicly available in the Sequence Read Archive (BioProject PRJNA751078) and were deposited in GISAID (GISAID accession numbers are provided in Supplemental Table 1).
Phylogenetic analyses. SARS-CoV-2 full-genome sequences were aligned using MAFFT v.7 (66). The alignment was cropped to base pairs 202-29657 according to the Wuhan-Hu-1 reference to remove N- and C-terminal regions with unassigned base pairs. Maximum likelihood IQ trees were performed using the IQ-TREE XSEDE tool, multicore version 2.1.2, on the Cipres Science gateway v.3.3 (67). GTR+I+G was chosen as the best-fit substitution model. Support values were generated with 1000 bootstrap replicates and the ultrafast bootstrapping method. Phylogenetic trees were visualized in Interactive Tree Of Life (iTOL) v.6 (68). The tree was constructed using 76 vaccine breakthrough and 1046 unvaccinated control SARS-CoV-2 sequences from our NYU cohort (greater NYC area) together with 3801 global reference sequences for a total of 4923 SARS-CoV-2 genomic sequences. The reference sequences were retrieved from a Nextstrain build with North America–focused global subsampling (41) and included 1361 other US sequences.
Mutation analysis. Spike amino acid counts and calculations of site-specific mutation frequencies compared with Wuhan-Hu-1 as reference were done on MAFFT-aligned SARS-CoV-2 sequences using R (69) and R Studio (70) with scripts based on the seqinr and tidyverse (dplyr, lubridate, magrittr) packages. The calculations exclusively considered residues that were accurately covered by sequencing (i.e., nonambiguous characters and gaps). Fisher’s exact tests and multiplicity corrections (Benjamini-Hochberg) were done in Program R/RStudio. Multiplicity corrected P values (q) less than 0.05 were considered significant. Highlighter analyses were performed on MAFFT-aligned SARS-CoV-2 amino acid sequences of the spike genomic region. SARS-CoV-2 vaccine breakthrough sequences were compared with Wuhan-Hu-1 as master using the Highlighter tool provided by the Los Alamos HIV sequence database (71).
Statistics. To address confounding and other sources of bias arising from the use of observational data, we estimated a propensity score for the likelihood of full vaccination, and matched vaccinated to unvaccinated patients, including age, sex, county of residence, and study month (February, March, April 2021) as covariates (38). Propensity-score matching was implemented using the nearest neighbor strategy and a 1:1 ratio without replacement, with the MatchIt algorithm in RStudio v.1.4.1106 (72). Before and after matching, we evaluated the presence of 3 Pango lineages, B.1.1.7, B.1.526, and P.1, compared with all other lineages combined. On unmatched data, we calculated a Pearson χ2 test statistic; on matched data we calculated McNemar’s test. The tables used for these calculations are in the Supplemental Material (Supplemental Tables 1 and 2). Comparisons of Ct values and mutation counts between groups were made using nonparametric Mann-Whitney tests or Kruskal-Wallis tests with Dunn’s multiplicity correction.
Study approval. This study was approved by the NYU Langone Health Institutional Review Board, protocol number i21-00493.
RD designed the study, analyzed data and wrote the manuscript; DD generated sequencing data; CM analyzed genomic data; PZ generated sequencing data; GW collected samples and performed clinical SARS-CoV-2 detection; JL reviewed clinical information; BE monitored epidemiological data and generated lists of breakthrough infections; ABT performed statistical analyses; AH designed the study, generated genomic data, and wrote the manuscript. All authors reviewed and edited the manuscript.
The authors wish to thank Vincent Setang and NYU Langone Health DataCore for support extracting the data for this study from clinical databases, and Maria Agüero-Rosenfeld and the clinical laboratory technicians for assistance in testing, saving, and retrieving specimens, especially Joanna Fung for her assistance with RNA extraction project. We also thank Joan Cangiarella for her continuous support of genomic surveillance for SARS-CoV-2 at NYU Langone Health, including providing institutional funding for this study. We are grateful to the authors and submitting laboratories who deposited data in GISAID, in particular to those whose sequences we used to create the phylogenetic tree. These are all named in Supplemental Table 5. The NYU Langone Health Genome Technology Center is partially supported by the Cancer Center Support Grant P30CA016087 at the Laura and Isaac Perlmutter Cancer Center.
Address correspondence to: Ralf Duerr, Department of Microbiology, NYU Grossman School of Medicine, Alexandria Center for Life Science (ACLS), West Tower, 430 East 29th Street, Room 323, New York, New York 10016, USA. Phone: 212.263.4159; Email: ralf.duerr@nyulangone.org. Or to: Adriana Heguy, Department of Pathology, NYU Grossman School of Medicine, Genome Technology Center, NYU Langone Health, 550 First Avenue, MSB 294A, New York, New York 10016, USA. Phone: 212.263.8048; Email: adriana.heguy@nyulangone.org.
Conflict of interest: The authors have declared that no conflict of interest exists.
Copyright: © 2021, American Society for Clinical Investigation.
Reference information: J Clin Invest. 2021;131(18):e152702.https://doi.org/10.1172/JCI152702.