Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases
Infect. Genet. Evol. · Sep 2021
Global variation in SARS-CoV-2 proteome and its implication in pre-lockdown emergence and dissemination of 5 dominant SARS-CoV-2 clades.
SARS-CoV-2 is currently causing major havoc worldwide owing to its efficient transmission and propagation. To track the emergence and persistence of mutations during the early stage of the pandemic, a comparative analysis of SARS-CoV-2 whole-proteome sequences was performed using 31,389 manually curated whole-genome sequences from 84 countries. Among the 7 highly recurring (percentage frequency ≥ 10%) mutations (Nsp2:T85I, Nsp6:L37F, Nsp12:P323L, Spike:D614G, ORF3a:Q57H, N protein:R203K and N protein:G204R), N protein:R203K and N protein:G204R are co-occurring (dependent) mutations. ⋯ These clades evolved during the early stage of the pandemic and disseminated across several countries. Further, Nsp10 was found to be highly resistant to mutations; thus, it can be exploited for drug/vaccine development, and the corresponding gene sequence can be used for diagnosis. In short, the study reports the diversity of SARS-CoV-2 antigens across the globe during the early stage of the pandemic and facilitates the understanding of viral evolution.
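
The two quantities the abstract relies on, a mutation's percentage frequency (with the 10% cut-off for "highly recurring") and the co-occurrence of two mutations, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the toy data, and the use of a Jaccard index as the co-occurrence measure are assumptions made here for illustration only.

```python
from collections import Counter

def recurrent_mutations(samples, threshold=10.0):
    """Return {mutation: percentage frequency} for mutations at or above threshold.

    `samples` is a list of sets, one set of mutation labels per sequence.
    """
    total = len(samples)
    counts = Counter(m for s in samples for m in set(s))
    freqs = {m: 100.0 * c / total for m, c in counts.items()}
    return {m: f for m, f in freqs.items() if f >= threshold}

def jaccard_cooccurrence(samples, a, b):
    """Share of sequences carrying either mutation that carry both.

    Assumption: Jaccard index as a simple co-occurrence measure;
    the study may have used a different statistic.
    """
    both = sum(1 for s in samples if a in s and b in s)
    either = sum(1 for s in samples if a in s or b in s)
    return both / either if either else 0.0

# Toy data, invented for illustration (not taken from the study).
toy = [
    {"Spike:D614G", "Nsp12:P323L"},
    {"Spike:D614G", "Nsp12:P323L", "N:R203K", "N:G204R"},
    {"Nsp6:L37F"},
    {"Spike:D614G", "N:R203K", "N:G204R"},
    set(),  # a sequence with no mutations relative to the reference
]

freqs = recurrent_mutations(toy)
pair_score = jaccard_cooccurrence(toy, "N:R203K", "N:G204R")
```

In the toy data, `N:R203K` and `N:G204R` always appear together, so their score is 1.0, which mirrors the dependent pair described in the abstract. A real analysis would first derive the per-sequence mutation sets by aligning each proteome to a reference.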