MaskDNA-PGD: An innovative deep learning model for detecting DNA methylation by integrating mask sequences and adversarial PGD training as a data augmentation method

Zhiwei Zheng, Nguyen Quoc Khanh Le, Matthew Chin Heng Chua

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

DNA methylation is associated with various diseases in mammals, such as cancer and myocardial pain. Researchers have long applied machine learning and deep learning to learn the characteristics of DNA sequences for high-precision methylation classification. However, these studies primarily innovated in sequence encoding and seldom employed deep neural networks for prediction. Hence, this research proposes a framework that applies random masking and adversarial sample generation in the preprocessing stage. The proposed classification model is composed of a convolutional neural network (CNN), bidirectional long short-term memory (Bi-LSTM), and an attention mechanism as predictors. Benchmark results illustrate the automation and advancement of the proposed framework, which can accurately perform binary classification of diverse types of DNA methylation. Ablation experiments show that random masking and adversarial sample generation are effective. In detail, our model achieved best accuracies of 85.07%, 94.97%, and 92.17% in predicting multi-species N4-methylcytosine, 5-methylcytosine, and N6-methyladenine sites, respectively. Moreover, when compared with two other methods using the same datasets and evaluation metrics, the proposed model (namely MaskDNA-PGD) surpasses both. Finally, MaskDNA-PGD can be freely accessed via https://github.com/willyzzz/MaskDNA-PGD.
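The two augmentation ideas named in the abstract, random masking and projected gradient descent (PGD) adversarial sample generation, can be illustrated with a minimal sketch. This is not the authors' implementation: the masking rate, the mask token "N", the one-hot encoding, the PGD hyperparameters (eps, alpha, steps), and the toy logistic classifier standing in for the CNN/Bi-LSTM/attention network are all assumptions made for illustration. The function names are hypothetical as well.

```python
import math
import random

ALPHABET = "ACGT"


def random_mask(seq, mask_rate=0.15, mask_token="N", rng=None):
    """Replace a random subset of positions with mask_token (assumed scheme)."""
    rng = rng or random.Random()
    n_mask = int(round(len(seq) * mask_rate))
    positions = set(rng.sample(range(len(seq)), n_mask))
    return "".join(mask_token if i in positions else ch
                   for i, ch in enumerate(seq))


def one_hot(seq):
    """Flattened one-hot encoding; masked or unknown bases become all zeros."""
    vec = []
    for ch in seq:
        vec.extend(1.0 if ch == base else 0.0 for base in ALPHABET)
    return vec


def logit(x, w, b):
    """Linear score of the toy logistic classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def pgd_attack(x, y, w, b, eps=0.1, alpha=0.02, steps=10):
    """PGD in an L-infinity ball of radius eps around x.

    The classifier is p = sigmoid(w . x + b) with binary cross-entropy loss,
    whose gradient with respect to the input is (p - y) * w.
    """
    x_adv = list(x)
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-logit(x_adv, w, b)))
        grad = [(p - y) * wi for wi in w]
        # Ascent step along the sign of the gradient (increases the loss).
        x_adv = [xa + alpha * ((g > 0) - (g < 0)) for xa, g in zip(x_adv, grad)]
        # Projection back onto the eps-ball around the clean input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


if __name__ == "__main__":
    rng = random.Random(0)
    masked = random_mask("ACGTACGTACGTACGTACGT", mask_rate=0.2, rng=rng)
    x = one_hot(masked)
    w = [rng.uniform(-1.0, 1.0) for _ in x]  # hypothetical model weights
    x_adv = pgd_attack(x, y=1.0, w=w, b=0.0)
    print(masked)
    print(max(abs(a - c) for a, c in zip(x_adv, x)))
```

In a deep model the perturbation would typically be applied to a continuous representation such as an embedding, with gradients obtained by backpropagation; the closed-form gradient here only works because the stand-in classifier is linear.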

Original language: English
Article number: 104715
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 232
DOIs
Publication status: Published - Jan 15 2023

Keywords

  • Adversarial network
  • Bidirectional long short term memory
  • Convolutional neural network
  • Data augmentation
  • DNA methylation
  • Sequence encoding

ASJC Scopus subject areas

  • Analytical Chemistry
  • Software
  • Computer Science Applications
  • Process Chemistry and Technology
  • Spectroscopy
