features for tasks such as classification. But there has been less exploration in learning the
factors of variation apart from the classification signal. By augmenting autoencoders with
simple regularization terms during training, we demonstrate that standard deep architectures
can discover and explicitly represent factors of variation beyond those relevant for
categorization. We introduce a cross-covariance penalty (XCov) as a method to disentangle …