Authors
César Ferri, Peter Flach, José Hernández-Orallo
Publication date
2002/7/8
Journal
ICML
Volume
2
Pages
139-146
Description
ROC analysis is increasingly being recognised as an important tool for evaluation and comparison of classifiers when the operating characteristics (i.e. class distribution and cost parameters) are not known at training time. Usually, each classifier is characterised by its estimated true and false positive rates and is represented by a single point in the ROC diagram. In this paper, we show how a single decision tree can represent a set of classifiers by choosing different labellings of its leaves, or equivalently, an ordering on the leaves. In this setting, rather than estimating the accuracy of a single tree, it makes more sense to use the area under the ROC curve (AUC) as a quality metric. We also propose a novel splitting criterion which chooses the split with the highest local AUC. To the best of our knowledge, this is the first probabilistic splitting criterion that is not based on weighted average impurity. We present experiments suggesting that the AUC splitting criterion leads to trees with equal or better AUC value, without sacrificing accuracy if a single labelling is chosen.
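The idea of reading a ROC curve off an ordering of the leaves can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-class problem, assumes each leaf is summarised by its (positive, negative) training counts, and assumes leaves are ranked by their local proportion of positives. The names `roc_points`, `auc` and `best_split` are hypothetical, and scoring a candidate split by the AUC of the leaves it produces is one plausible reading of "local AUC".

```python
from typing import List, Tuple

# Assumption: a leaf is summarised by (positives, negatives) training counts.
Leaf = Tuple[int, int]


def roc_points(leaves: List[Leaf]) -> List[Tuple[float, float]]:
    """ROC points (FPR, TPR) obtained by labelling leaves positive one at a
    time, in decreasing order of their local proportion of positives."""
    leaves = [(p, n) for p, n in leaves if p + n > 0]  # drop empty leaves
    total_pos = sum(p for p, _ in leaves)
    total_neg = sum(n for _, n in leaves)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    ordered = sorted(leaves, key=lambda l: l[0] / (l[0] + l[1]), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for p, n in ordered:
        tp += p
        fp += n
        points.append((fp / total_neg, tp / total_pos))
    return points


def auc(leaves: List[Leaf]) -> float:
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    pts = roc_points(leaves)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def best_split(candidates: List[List[Leaf]]) -> int:
    """Index of the candidate split whose child leaves give the highest AUC
    (hypothetical helper; one reading of the 'local AUC' criterion)."""
    return max(range(len(candidates)), key=lambda i: auc(candidates[i]))


if __name__ == "__main__":
    split_a = [(3, 1), (1, 3)]  # children are partly separated
    split_b = [(2, 2), (2, 2)]  # children are indistinguishable
    print(auc(split_a))                    # 0.75
    print(auc(split_b))                    # 0.5
    print(best_split([split_b, split_a]))  # 1
```

Each prefix of the leaf ordering corresponds to one labelling of the tree, so a tree with k non-empty leaves contributes up to k + 1 points to the ROC diagram rather than a single one.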
Total citations
[Chart: citations per year, 2002-2024]