and test data follow the same distribution. In medical imaging which increasingly begun
acquiring datasets from multiple sites or scanners, this identical distribution assumption
often fails to hold due to systematic variability induced by site or scanner dependent factors.
Therefore, we cannot simply expect a model trained on a given dataset to consistently work
well, or generalize, on a dataset from another distribution. In this work, we address this …