Gene expression module discovery using Gibbs sampling. Genome informatics. International Conference on Genome Informatics

C.J. Wu, Y. Fu, T.M. Murali, S. Kasif

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

Recent advances in high-throughput profiling of gene expression have catalyzed an explosive growth in functional genomics aimed at the elucidation of genes that are differentially expressed in various tissue or cell types across a range of experimental conditions. These studies can lead to the identification of diagnostic genes, classification of genes into functional categories, association of genes with regulatory pathways, and clustering of genes into modules that are potentially co-regulated by a group of transcription factors. Traditional clustering methods such as hierarchical clustering or principal component analysis are difficult to deploy effectively for several of these tasks, since genes rarely exhibit similar expression patterns across a wide range of conditions. Bi-clustering of gene expression data is a promising methodology for identification of gene groups that show a coherent expression profile across a subset of conditions. This methodology can be a first step towards the discovery of co-regulated and co-expressed genes or modules. Although bi-clustering (also called block clustering) was introduced in statistics in 1974, few robust and efficient solutions exist for extracting gene expression modules in microarray data. In this paper, we propose a simple but promising new approach for bi-clustering based on a Gibbs sampling paradigm. Our algorithm is implemented in the program GEMS (Gene Expression Module Sampler). GEMS has been tested on synthetic data generated to evaluate the effect of noise on the performance of the algorithm, as well as on published leukemia datasets. In our preliminary studies comparing GEMS with other bi-clustering software, we show that GEMS is a reliable, flexible and computationally efficient approach for bi-clustering gene expression data.
Original language: English
Pages (from-to): 239-248
Number of pages: 10
Journal: Genome Inform
Volume: 15
Issue number: 1
Publication status: Published - 2004
Externally published: Yes

Keywords

  • Cluster Analysis
  • Computational Biology
  • Databases, Nucleic Acid
  • Gene Expression
  • Humans
  • Leukemia
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis
  • article
  • biological model
  • biology
  • cluster analysis
  • DNA microarray
  • gene expression
  • genetics
  • human
  • leukemia
  • methodology
  • nucleic acid database
