TY - JOUR
T1 - Methods for high-throughput MethylCap-Seq data analysis.
AU - Rodriguez, Benjamin A.T.
AU - Frankhouser, David
AU - Murphy, Mark
AU - Trimarchi, Michael
AU - Tam, Hok Hei
AU - Curfman, John
AU - Huang, Rita
AU - Chan, Michael W.Y.
AU - Lai, Hung Cheng
AU - Parikh, Deval
AU - Ball, Bryan
AU - Schwind, Sebastian
AU - Blum, William
AU - Marcucci, Guido
AU - Yan, Pearlly
AU - Bundschuh, Ralf
N1 - Funding Information:
Based on “A scalable, flexible workflow for MethylCap-seq data analysis”, by Benjamin AT Rodriguez, Hok-Hei Tam, David Frankhouser, Michael Trimarchi, Mark Murphy, Chris Kuo, Deval Parikh, Bryan Ball, Sebastian Schwind, John Curfman, William Blum, Guido Marcucci, Pearlly Yan and Ralf Bundschuh which appeared in Genomic Signal Processing and Statistics (GENSIPS), 2011 IEEE International Workshop on. © 2011 IEEE [15]. This work was supported by NCI Comprehensive Cancer Center Support Grant P30 CA016058 (PY and GM) and CA102031 (GM), as well as 5 P50 CA140158-03 (GM and RB). This work was supported in part by an allocation of computing time from the Ohio Supercomputer Center. This article has been published as part of BMC Genomics Volume 13 Supplement 6, 2012: Selected articles from the IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS) 2011. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/13/S6.
PY - 2012
Y1 - 2012
AB - Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measures, and data visualization. Currently there is a lack of workflows for efficient analysis of large, MethylCap-seq datasets containing multiple sample groups. The NGS application MethylCap-seq involves the in vitro capture of methylated DNA and subsequent analysis of enriched fragments by massively parallel sequencing. The workflow we describe performs MethylCap-seq experimental Quality Control (QC), sequence file processing and alignment, differential methylation analysis of multiple biological groups, hierarchical clustering, assessment of genome-wide methylation patterns, and preparation of files for data visualization. Here, we present a scalable, flexible workflow for MethylCap-seq QC, secondary data analysis, tertiary analysis of multiple experimental groups, and data visualization. We demonstrate the experimental QC procedure with results from a large ovarian cancer study dataset and propose parameters which can identify problematic experiments. Promoter methylation profiling and hierarchical clustering analyses are demonstrated for four groups of acute myeloid leukemia (AML) patients. We propose a Global Methylation Indicator (GMI) function to assess genome-wide changes in methylation patterns between experimental groups. We also show how the workflow facilitates data visualization in a web browser with the application Anno-J. This workflow and its suite of features will assist biologists in conducting methylation profiling projects and facilitate meaningful biological interpretation.
UR - http://www.scopus.com/inward/record.url?scp=84876071513&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876071513&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-13-s6-s14
DO - 10.1186/1471-2164-13-s6-s14
M3 - Article
C2 - 23134780
AN - SCOPUS:84876071513
SN - 1471-2164
VL - 13 Suppl 6
JO - BMC Genomics
JF - BMC Genomics
ER -