SARS-CoV-2 variants that may display undesirable characteristics such as immune escape,
increased transmissibility or pathogenicity. Early prediction of the emergence of new strains
with these features is critical for pandemic preparedness. We present Strainflow, a
supervised and causally predictive model that uses unsupervised latent-space features of
SARS-CoV-2 genome sequences. Strainflow was trained and validated on 0.9 million sequences …