Feature selection for multi-label learning based on kernelized fuzzy rough sets

Y Li, Y Lin, J Liu, W Weng, Z Shi, S Wu - Neurocomputing, 2018 - Elsevier
Abstract
Feature selection is an essential pre-processing step in multi-label learning, which is typically applied to complicated tasks where each sample is associated with multiple labels simultaneously. The fuzzy rough set model is one of the most effective tools for multi-label learning. However, it treats the feature space and the label space separately, and uses only features to describe the structural information of samples. In this paper, we fully consider the internal correlation between the feature space and the label space while fusing kernelized information from both spaces. Moreover, we integrate fuzzy rough sets with multiple kernel learning to perform feature selection. Specifically, we first use one kernel function to measure the similarity between samples in the feature space, and another to assess the degree of label overlap between samples in the label space. Second, we linearly combine the kernelized information from the two spaces to obtain a precise lower approximation and construct a robust multi-label kernelized fuzzy rough set model, called RMFRS. We also discuss its properties and give a theoretical analysis. Finally, we define a measurement criterion for selecting optimal features, which is used to evaluate the performance of the proposed algorithm. Experiments on 10 publicly available data sets validate the effectiveness of our method, and the results show a distinct advantage over state-of-the-art methods.
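The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the RMFRS definition, which the abstract does not give. It assumes a Gaussian kernel for feature similarity, a Jaccard overlap for label similarity, and the standard Kleene-Dienes implicator form of the fuzzy rough lower approximation in place of the paper's linear combination. The function names, the `sigma` parameter, and the greedy search are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(X, sigma=0.5):
    # Pairwise Gaussian similarity between samples in feature space (assumed kernel).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def label_overlap_kernel(Y):
    # Jaccard overlap between binary label vectors (assumed label-space similarity).
    Y = np.asarray(Y, dtype=float)
    inter = Y @ Y.T
    counts = Y.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    # Two samples with no labels at all are treated as fully overlapping.
    return np.where(union > 0, inter / np.maximum(union, 1.0), 1.0)

def dependency(X_sub, K_label, sigma=0.5):
    # Kleene-Dienes lower approximation: inf_j max(1 - R(i, j), A_i(j)),
    # with R the feature kernel and A_i(j) the label overlap of j with i.
    # This stands in for the paper's formula, which is not in the abstract.
    K_feat = gaussian_kernel(X_sub, sigma)
    lower = np.min(np.maximum(1.0 - K_feat, K_label), axis=1)
    return float(lower.mean())

def greedy_select(X, Y, n_select, sigma=0.5):
    # Forward selection: repeatedly add the feature that maximizes dependency.
    X = np.asarray(X, dtype=float)
    K_label = label_overlap_kernel(Y)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_select:
        scores = [dependency(X[:, selected + [f]], K_label, sigma)
                  for f in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `greedy_select(X, Y, n_select)` with a samples-by-features matrix `X` and a binary samples-by-labels matrix `Y` returns feature indices in the order they were picked. Under this sketch, a feature scores highly when samples with dissimilar label sets are also far apart along it.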