Insights into colon cancer etiology via a regularized approach to gene set analysis of GWAS data

LS Chen, CM Hutter, JD Potter, Y Liu, RL Prentice, U Peters, L Hsu
The American Journal of Human Genetics, 2010
Genome-wide association studies (GWAS) have successfully identified susceptibility loci from marginal association analysis of SNPs. Valuable insight into genetic variation underlying complex diseases will likely be gained by considering functionally related sets of genes simultaneously. One approach is to further develop gene set enrichment analysis methods, which originated in gene expression studies, to account for the distinctive features of GWAS data. These features include the large number of SNPs per gene, the modest and sparse SNP associations, and the additional information provided by linkage disequilibrium (LD) patterns within genes. We propose a "gene set ridge regression in association studies (GRASS)" algorithm. GRASS summarizes the genetic structure of each gene as eigenSNPs and uses a novel form of regularized regression, termed group ridge regression, to select representative eigenSNPs for each gene and assess their joint association with disease risk. Compared with existing methods, the proposed algorithm greatly reduces the high dimensionality of GWAS data while still accounting for multiple hits and/or LD in the same gene. We show by simulation that this algorithm performs well when the number of predictors is large relative to the sample size. We applied the GRASS algorithm to a genome-wide association study of colon cancer and identified nicotinate and nicotinamide metabolism and transforming growth factor beta signaling as the top two significantly enriched pathways. Elucidating the role of variation in these pathways may enhance our understanding of colon cancer etiology.
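The abstract does not spell out how eigenSNPs are computed, so the following is only a minimal sketch of one plausible reading: the principal-component scores of a gene's standardized genotype matrix. The function name `eigen_snps`, the 0/1/2 minor-allele coding, and the 95% variance cutoff are all assumptions made for illustration, not details taken from the paper, and the group ridge regression step is not reproduced here.

```python
import numpy as np

def eigen_snps(genotypes, var_explained=0.95):
    """Summarize one gene's SNPs as principal-component scores.

    genotypes: (n_samples, n_snps) array of minor-allele counts (0/1/2).
    var_explained: assumed cutoff on cumulative variance (hypothetical).
    Assumes at least one SNP varies across samples.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)               # center each SNP
    sd = X.std(axis=0)
    keep = sd > 0                        # drop monomorphic SNPs
    X = X[:, keep] / sd[keep]            # scale to unit variance
    # Thin SVD: columns of U * s are the principal-component scores.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of components reaching the variance cutoff.
    k = int(np.searchsorted(ratio, var_explained)) + 1
    return U[:, :k] * s[:k]
```

Because SNPs in strong LD are highly correlated, a gene with many SNPs typically yields far fewer components than SNPs, which is the dimensionality reduction the abstract refers to. The resulting score matrices, one per gene, would then serve as predictors in a regularized regression on disease status.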