Selection of the bandwidth parameter in a Bayesian kernel regression model for genomic-enabled prediction

S Pérez-Elizalde, J Cuevas, P Pérez-Rodríguez… - Journal of agricultural …, 2015 - Springer
Journal of agricultural, biological, and environmental statistics, 2015 - Springer
Abstract
One of the most widely used kernel functions in genomic-enabled prediction is the Gaussian kernel. Selection of the bandwidth parameter for kernel regression has generally been based on cross-validation. We propose a Bayesian method for estimating the bandwidth parameter h of a Gaussian kernel as the modal component of the joint posterior distribution of h and the form parameter φ. We present a theory for the Bayesian selection of h in a Transformed Gaussian Kernel (TGK) model and its application to two plant breeding datasets (maize and wheat) that had already been analyzed using the kernel averaging (KA) model in the context of Reproducing Kernel Hilbert Spaces (RKHS KA). We also compared the prediction accuracy of the proposed method with that of a model that also uses a Gaussian kernel but estimates the bandwidth parameter by restricted maximum likelihood (GK REML). Results for the wheat dataset show that the predictive ability of TGK was at least as good as that of RKHS KA, with TGK showing a significantly smaller Predictive Mean Squared Error (PMSE) than the other two approaches. The TGK model was a statistically better predictor than GK REML and RKHS KA in terms of mean PMSE and mean correlations in seven (out of 17) trait-environment combinations in the wheat dataset. Fewer differences were found between models for the maize data; the TGK model generally had prediction accuracy similar or inferior to that of GK REML and RKHS KA in various analyses. The superiority of GK REML over TGK based on mean PMSE was clear in seven maize traits.
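
The idea of taking h as a posterior mode can be illustrated with a minimal Python sketch. This is not the TGK model from the paper, whose details the abstract does not give. It is a generic illustration that assumes a Gaussian marginal likelihood with fixed variance components, a hypothetical Gamma prior on h, squared distances scaled by their median, and a simple grid search for the mode. All function names and default values below are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, h):
    """Gaussian kernel K[i, j] = exp(-h * d_ij^2 / median(d^2)).

    X is an (n lines x p markers) matrix. Scaling by the median squared
    distance is an assumption made here to keep h on a convenient scale.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    scale = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-h * d2 / scale)

def log_posterior_h(h, X, y, sigma2_u=1.0, sigma2_e=1.0, shape=2.0, rate=1.0):
    """Unnormalized log posterior of h under illustrative assumptions:
    y ~ N(mean(y), sigma2_u * K(h) + sigma2_e * I) with fixed variances,
    and a hypothetical Gamma(shape, rate) prior on h.
    """
    n = len(y)
    V = sigma2_u * gaussian_kernel(X, h) + sigma2_e * np.eye(n)
    _, logdet = np.linalg.slogdet(V)
    resid = y - y.mean()
    quad = resid @ np.linalg.solve(V, resid)
    log_lik = -0.5 * (logdet + quad + n * np.log(2.0 * np.pi))
    log_prior = (shape - 1.0) * np.log(h) - rate * h
    return log_lik + log_prior

def posterior_mode_h(X, y, grid):
    """Return the grid value of h that maximizes the log posterior."""
    scores = np.array([log_posterior_h(h, X, y) for h in grid])
    return grid[int(np.argmax(scores))]

def pmse(y_true, y_pred):
    """Predictive mean squared error between observed and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.integers(0, 3, size=(40, 100)).astype(float)  # simulated marker codes
    y = rng.normal(size=40)                               # simulated phenotypes
    h_grid = np.linspace(0.1, 5.0, 25)
    print("grid posterior mode of h:", posterior_mode_h(X, y, h_grid))
```

In contrast with cross-validation, which refits the model for each candidate bandwidth on held-out folds, this kind of approach scores each h once against all the data through the posterior. A grid is used here only for simplicity.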