On the dangers of cross-validation. An experimental evaluation

RB Rao, G Fung, R Rosales - Proceedings of the 2008 SIAM international …, 2008 - SIAM
Proceedings of the 2008 SIAM international conference on data mining, 2008, SIAM
Abstract
Cross validation allows models to be tested using the full training set by means of repeated resampling, thus maximizing the total number of points used for testing and potentially helping to protect against overfitting. Improvements in computational power, recent reductions in the computational cost of classification algorithms, and the development of closed-form solutions for performing cross validation in certain classes of learning algorithms make it possible to test thousands or millions of variants of learning models on the data. It is therefore now possible to calculate cross validation performance on a much larger number of tuned models than would have been possible otherwise. However, we show empirically that with such a large number of models the risk of overfitting increases, and the performance estimated by cross validation is no longer an effective estimate of generalization; hence, this paper provides an empirical reminder of the dangers of cross validation. We use a closed-form solution that makes this evaluation possible for the cross validation problem of interest. In addition, through extensive experiments we expose and discuss the effects of the overuse and misuse of cross validation in several settings, including model selection, feature selection, and data dimensionality. These effects are illustrated on synthetic, benchmark, and real-world data sets.
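The effect described in the abstract can be reproduced in a few lines. The sketch below is not the paper's closed-form method or its experimental setup; it is a minimal illustration under stated assumptions: pure-noise features, labels independent of the features, and a hypothetical one-feature nearest-centroid classifier as the "model". Each feature counts as one candidate model, and the candidate with the highest 5-fold cross validation accuracy is selected. All function names and parameter values are chosen here for illustration only.

```python
import random


def make_noise_data(n, d, rng):
    """Gaussian noise features with alternating labels: there is no signal to learn."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [i % 2 for i in range(n)]
    return X, y


def fit_centroids(xs, ys):
    """Illustrative one-feature 'model': the mean feature value of each class."""
    zeros = [x for x, t in zip(xs, ys) if t == 0]
    ones = [x for x, t in zip(xs, ys) if t == 1]
    return sum(zeros) / len(zeros), sum(ones) / len(ones)


def predict(model, x):
    """Assign x to the class whose centroid is closer."""
    m0, m1 = model
    return 0 if abs(x - m0) <= abs(x - m1) else 1


def cv_accuracy(xs, ys, k=5):
    """k-fold cross validation accuracy using interleaved folds."""
    n = len(xs)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit_centroids(train_x, train_y)
        correct += sum(predict(model, xs[i]) == ys[i] for i in test_idx)
    return correct / n


def run_experiment(n=40, d=300, n_fresh=2000, seed=0):
    """Select the best of d candidate models by CV, then score it on fresh data."""
    rng = random.Random(seed)
    X, y = make_noise_data(n, d, rng)
    scores = [cv_accuracy([row[j] for row in X], y) for j in range(d)]
    best = max(range(d), key=scores.__getitem__)
    # Refit the selected model on all training points.
    model = fit_centroids([row[best] for row in X], y)
    # Fresh data: only the selected feature matters, and it is still pure noise.
    fresh_x = [rng.gauss(0.0, 1.0) for _ in range(n_fresh)]
    fresh_y = [rng.randrange(2) for _ in range(n_fresh)]
    hits = sum(predict(model, x) == t for x, t in zip(fresh_x, fresh_y))
    return scores[best], sum(scores) / d, hits / n_fresh


if __name__ == "__main__":
    best_cv, mean_cv, fresh_acc = run_experiment()
    print(f"best CV accuracy over all candidates: {best_cv:.3f}")
    print(f"mean CV accuracy over all candidates: {mean_cv:.3f}")
    print(f"selected model on fresh data:         {fresh_acc:.3f}")
```

Because the labels carry no information, no candidate can truly do better than chance, yet the maximum cross validation score over many candidates sits well above 50% while the selected model falls back to roughly chance on fresh data. Increasing `d` in this sketch widens the gap, which matches the abstract's point that the estimate degrades as more models are compared.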
Society for Industrial and Applied Mathematics