A data mining paradigm for identifying key factors in biological processes using gene expression data

J Li, L Zheng, A Uchiyama, L Bin, TM Mauro, PM Elias, T Pawelczyk, M Sakowicz-Burkiewicz… - Scientific Reports, 2018 - nature.com
Abstract
A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.
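The "evaluation of consistent signals" step can be illustrated with a minimal sketch. This is not the authors' implementation: the vote-counting rule, the function name `consistent_genes`, the `min_fraction` threshold, and the toy gene names and fold changes are all assumptions made for illustration. The sketch takes the per-dataset results of a primary analysis (here, a hypothetical mapping from gene to log2 fold change) and keeps genes whose direction of change agrees in a large enough share of the datasets.

```python
from collections import Counter


def consistent_genes(datasets, min_fraction=0.75):
    """Return genes whose direction of change agrees across datasets.

    datasets: list of dicts mapping gene symbol -> log2 fold change
    (a hypothetical output format for the per-dataset primary analysis).
    min_fraction: assumed threshold; share of all datasets that must agree.
    """
    up, down = Counter(), Counter()
    for result in datasets:
        for gene, log_fc in result.items():
            if log_fc > 0:
                up[gene] += 1
            elif log_fc < 0:
                down[gene] += 1

    n = len(datasets)
    ranked = []
    for gene in set(up) | set(down):
        votes = max(up[gene], down[gene])
        if votes / n >= min_fraction:
            direction = "up" if up[gene] >= down[gene] else "down"
            ranked.append((gene, direction, votes))

    # Most consistent genes first; ties broken alphabetically for determinism.
    ranked.sort(key=lambda item: (-item[2], item[0]))
    return ranked


# Toy data: four made-up datasets with placeholder gene names.
toy = [
    {"GENE_A": 2.1, "GENE_B": -1.0, "GENE_C": 0.4},
    {"GENE_A": 1.7, "GENE_B": -0.8, "GENE_C": -0.3},
    {"GENE_A": 1.2, "GENE_B": -1.5},
    {"GENE_A": 0.9, "GENE_B": 0.2, "GENE_C": 0.6},
]
print(consistent_genes(toy))
# → [('GENE_A', 'up', 4), ('GENE_B', 'down', 3)]
```

A gene missing from a dataset counts against it here, because the fraction is taken over all datasets; a real analysis might instead normalize by the number of datasets in which the gene was measured, and would typically filter on statistical significance as well as direction.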