Sequencing the SARS-CoV-2 genome has greatly aided efforts to understand and help control the COVID-19 pandemic through comparative genetics and molecular epidemiology. However, because closely related non-human coronaviruses have been sampled only sparsely, the origin of some genomic regions remains an open question. The region that has probably attracted the most interest and speculation is the polybasic furin cleavage site insertion in the Spike open reading frame of SARS-CoV-2, which is absent from all closely related Sarbecoviruses sampled to date (Andersen et al., 2020).
The most notable attempt to identify the origin of this furin site was made by William Gallaher (2020), who suggested a copy-choice recombination error between the proximal ancestor of SARS-CoV-2 and an as-yet-unsampled betacoronavirus. This suggestion is based on detectable sequence homology between the SARS-CoV-2 oligomer insert and a downstream region of another bat coronavirus genome, HKU9-1. Could this recombination event have occurred within the SARS-CoV-2 Sarbecovirus lineage? While investigating the sequence similarity between the SARS-CoV-2 genome and the newly sampled RmYN02 sequence (Zhou et al., 2020), we identified an overlooked fragment of sequence homology with the SARS-CoV-2 furin site, providing a clue about the likely sequence from which the insertion was derived by recombination (MacLean/Lytras et al., 2020).