Outlier Detection and Data Cleaning in Multivariate Non-Normal Samples: The PAELLA Algorithm

MC Limas, JB Ordieres Meré… - Data Mining and …, 2004 - Springer
Data Mining and Knowledge Discovery, 2004 - Springer
Abstract
A new method of outlier detection and data cleaning for both normal and non-normal multivariate data sets is proposed. It is based on an iterated local fit without a priori metric assumptions. We propose a new approach, supported by finite mixture clustering, which provides good results with large data sets. A multi-step structure consisting of three phases is developed. The importance of outlier detection in industrial modeling for open-loop control prediction is also described. The algorithm gives good results both in simulation runs with artificial data sets and with experimental data sets recorded in a rubber factory. Finally, the methodology is discussed.