We address the problems of channel estimation and optimal training sequence design for multiple-input multiple-output systems over flat fading channels in the presence of colored interference. In practice, knowledge of the unknown channel is often obtained by sending known training symbols to the receiver. During the training period, we obtain the best linear unbiased estimates of the channel parameters based on the received training block. We determine the optimal training sequence set that minimizes the mean square error of the channel estimator under a total transmit power constraint. In order to obtain the advantage of the optimal training sequence design, long-term statistics of the interference correlation are needed at the transmitter. Hence, this information needs to be estimated at the receiver and fed back to the transmitter. Obviously it is desirable that only a minimal amount of information needs to be fed back from the receiver to gain the advantage in reducing the estimation error of the short-term channel fading parameters. We develop such a feedback strategy in this paper.