Authors
Sotiris B Kotsiantis, Dimitris Kanellopoulos, Panagiotis E Pintelas
Publication date
2006/1
Journal
International Journal of Computer Science
Volume
1
Issue
2
Pages
111-117
Description
(ML) on a given task. The representation and quality of the instance data come first and foremost. If the data contain much irrelevant or redundant information, or are noisy and unreliable, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for every data set, but this is not the case. Thus, we present the best-known algorithms for each step of data pre-processing so that one can achieve the best performance for a given data set.
Total citations
[Chart: citations per year, 2007 to 2024]
Scholar articles
SB Kotsiantis, D Kanellopoulos, PE Pintelas - International journal of computer science, 2006