Approximate block coordinate descent for large scale hierarchical classification

A Charuvaka, H Rangwala - Proceedings of the 30th Annual ACM Symposium on Applied Computing, 2015 - dl.acm.org
In the real world, we often encounter hierarchical classification problems with a large number of categories and deep hierarchies. In addition, the majority of the categories do not have enough examples to train classifiers with good generalization performance. The feature space is usually large as well, especially for text classification problems. Binary, multi-class, or multi-label classification approaches that treat hierarchical classification as a flat classification problem disregard the hierarchical relationships, fail to leverage the relatedness of the categories in the learning process and, consequently, perform poorly. Several approaches for hierarchical classification have been proposed in the literature, but a majority of them are not sufficiently scalable to address large-scale classification problems. In this paper, we study a hierarchical classification method that addresses large-scale classification problems within a regularized risk minimization framework. Specifically, the method studied here exploits hierarchical relationships between categories by imposing the constraint that the learned model vector for a category should be similar to that of its parent category. We study and analyze an approximate block coordinate descent procedure and compare its performance to a previously proposed exact coordinate descent method for this problem. We further examine the performance of this method on various aspects of the hierarchical classification problem using large hierarchical text classification datasets.
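The abstract does not spell out the objective or the update rules, so the following is only a minimal sketch of what a parent-child regularized objective and an approximate block coordinate descent loop could look like. Everything specific here is an assumption made for illustration: the squared-norm penalty between a node and its parent, logistic loss at the leaf categories, treating the root's parent as the zero vector, and using a fixed number of gradient steps as the "approximate" solve for each leaf block. The names `objective`, `approx_bcd`, `inner_steps`, and the toy hierarchy are hypothetical and are not taken from the paper.

```python
import numpy as np

def objective(W, parent, leaves, X, Y, C):
    """Assumed objective: parent-child penalty plus logistic loss at the leaves."""
    reg = sum(0.5 * np.sum((W[n] - (W[p] if p is not None else 0.0)) ** 2)
              for n, p in parent.items())
    loss = sum(np.sum(np.logaddexp(0.0, -Y[n] * (X @ W[n]))) for n in leaves)
    return float(reg + C * loss)

def approx_bcd(X, Y, parent, C=1.0, sweeps=20, inner_steps=3):
    """Cycle over nodes; each node's weight vector is one block."""
    d = X.shape[1]
    children = {n: [] for n in parent}
    for n, p in parent.items():
        if p is not None:
            children[p].append(n)
    leaves = [n for n in parent if not children[n]]
    W = {n: np.zeros(d) for n in parent}
    # Step size 1/L, where L bounds the curvature of a leaf subproblem.
    step = 1.0 / (1.0 + C * np.linalg.norm(X, 2) ** 2 / 4.0)
    history = [objective(W, parent, leaves, X, Y, C)]
    for _sweep in range(sweeps):
        for n in parent:
            w_par = W[parent[n]] if parent[n] is not None else np.zeros(d)
            if children[n]:
                # Internal node: the block problem is quadratic, so the exact
                # minimizer is the average of the parent and the children.
                W[n] = (w_par + sum(W[c] for c in children[n])) / (1 + len(children[n]))
            else:
                # Leaf node: approximate solve with a few gradient steps.
                for _step in range(inner_steps):
                    margin = Y[n] * (X @ W[n])
                    sig = np.exp(-np.logaddexp(0.0, margin))  # sigmoid(-margin)
                    grad = (W[n] - w_par) - C * (X.T @ (Y[n] * sig))
                    W[n] = W[n] - step * grad
        history.append(objective(W, parent, leaves, X, Y, C))
    return W, history

# Toy usage with a hypothetical hierarchy: root -> {A, B}, A -> {a1, a2}, B -> {b1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
labels = rng.integers(0, 3, size=60)
parent = {"root": None, "A": "root", "B": "root", "a1": "A", "a2": "A", "b1": "B"}
Y = {leaf: np.where(labels == k, 1.0, -1.0) for k, leaf in enumerate(["a1", "a2", "b1"])}
W, history = approx_bcd(X, Y, parent, C=1.0, sweeps=20, inner_steps=3)
```

In this sketch, raising `inner_steps` moves each leaf update closer to an exact block minimization at a higher cost per sweep, which is the kind of trade-off an approximate-versus-exact comparison would examine. The recorded `history` should not increase from one sweep to the next, because the internal-node update is an exact block minimizer and the chosen step size makes each leaf gradient step a descent step.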