TY - JOUR
T1 - Integrative Analysis of High-throughput Cancer Studies With Contrasted Penalization
AU - Shi, Xingjie
AU - Liu, Jin
AU - Huang, Jian
AU - Zhou, Yong
AU - Shia, Benchang
AU - Ma, Shuangge
PY - 2014/2
Y1 - 2014/2
AB - In cancer studies with high-throughput genetic and genomic measurements, integrative analysis provides a way to effectively pool and analyze heterogeneous raw data from multiple independent studies and outperforms "classic" meta-analysis and single-dataset analysis. When marker selection is of interest, the genetic basis of multiple datasets can be described using the homogeneity model or the heterogeneity model. In this study, we consider marker selection under the heterogeneity model, which includes the homogeneity model as a special case and can be more flexible. Penalization methods have been developed in the literature for marker selection. This study advances from the published ones by introducing the contrast penalties, which can accommodate the within- and across-dataset structures of covariates/regression coefficients and, by doing so, further improve marker selection performance. Specifically, we develop a penalization method that accommodates the across-dataset structures by smoothing over regression coefficients. An effective iterative algorithm, which calls an inner coordinate descent iteration, is developed. Simulation shows that the proposed method outperforms the benchmark with more accurate marker identification. The analysis of breast cancer and lung cancer prognosis studies with gene expression measurements shows that the proposed method identifies genes different from those using the benchmark and has better prediction performance.
KW - Contrasted penalization
KW - High-throughput cancer studies
KW - Integrative analysis
KW - Marker selection
UR - http://www.scopus.com/inward/record.url?scp=84892545864&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84892545864&partnerID=8YFLogxK
U2 - 10.1002/gepi.21781
DO - 10.1002/gepi.21781
M3 - Article
C2 - 24395534
AN - SCOPUS:84892545864
SN - 0741-0395
VL - 38
SP - 144
EP - 151
JO - Genetic Epidemiology
JF - Genetic Epidemiology
IS - 2
ER -