Bootstrapping of gene-expression data improves and controls the false discovery rate of differentially expressed genes

Theo H.E. Meuwissen; Mike E. Goddard

doi:10.1051/gse:2003058


Volume 36 / No 2 (March-April 2004)

Genet. Sel. Evol., 36 2 (2004) 191-205


Issue: Genet. Sel. Evol. Volume 36, Number 2, March-April 2004
Page(s): 191-205
DOI: https://doi.org/10.1051/gse:2003058



Theo H.E. Meuwissen^a and Mike E. Goddard^b

^a Institute for Animal Science, Agricultural University of Norway, 1432 Ås, Norway
^b Institute of Land and Food Resources, University of Melbourne, Parkville, 3052 Australia, and Victorian Institute of Animal Science, Attwood, Victoria, 3049 Australia

(Received 17 December 2002; accepted 23 October 2003)

Abstract
The ordinary, penalized, and bootstrap t-tests, least squares, and best linear unbiased prediction were compared for their false discovery rates (FDR), i.e. the fraction of falsely discovered genes, which was estimated empirically in a duplicate of the data set. The bootstrap t-test yielded up to 80% lower FDRs than the alternative statistics, and its FDR was always as good as or better than that of any alternative. Generally, the FDR predicted from the bootstrapped P-values agreed well with its empirical estimate, except when the number of mRNA samples was smaller than 16. In a cancer data set, the bootstrap t-test discovered 200 differentially regulated genes at an FDR of 2.6%, and in a knock-out gene expression experiment 10 genes were discovered at an FDR of 3.2%. It is argued that, in the case of microarray data, control of the FDR takes sufficient account of the multiple testing, whilst being less stringent than Bonferroni-type multiple testing corrections. Extensions of the bootstrap simulations to more complicated test statistics are discussed.
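
To illustrate the general idea of a bootstrap t-test combined with a predicted FDR, the following is a minimal sketch in Python. It is not the authors' procedure, whose details are not given in this abstract. It assumes a two-group design, a Welch-type t-statistic, resampling of group-centred values within each gene, and a simple FDR estimate that treats every gene as null (number of genes times the P-value threshold, divided by the number of discoveries). The function names `t_statistics`, `bootstrap_p_values`, and `estimated_fdr` are hypothetical.

```python
import numpy as np


def t_statistics(x, y):
    """Welch-type t-statistic per gene (rows = genes, columns = samples)."""
    nx, ny = x.shape[1], y.shape[1]
    diff = x.mean(axis=1) - y.mean(axis=1)
    se = np.sqrt(x.var(axis=1, ddof=1) / nx + y.var(axis=1, ddof=1) / ny)
    # A resample with zero variance gives inf or nan; silence the warning.
    with np.errstate(divide="ignore", invalid="ignore"):
        return diff / se


def bootstrap_p_values(x, y, n_boot=1000, seed=0):
    """Two-sided bootstrap P-value per gene (assumed scheme, see lead-in)."""
    rng = np.random.default_rng(seed)
    nx = x.shape[1]
    t_obs = np.abs(t_statistics(x, y))
    # Centre each group on its own mean so that the null hypothesis holds,
    # then pool the centred values within each gene.
    pooled = np.hstack([x - x.mean(axis=1, keepdims=True),
                        y - y.mean(axis=1, keepdims=True)])
    n_genes, n = pooled.shape
    exceed = np.zeros(n_genes)
    for _ in range(n_boot):
        # Resample with replacement, independently for every gene.
        idx = rng.integers(0, n, size=(n_genes, n))
        sample = np.take_along_axis(pooled, idx, axis=1)
        t_boot = np.abs(t_statistics(sample[:, :nx], sample[:, nx:]))
        exceed += t_boot >= t_obs
    # Add-one correction keeps every P-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


def estimated_fdr(p_values, threshold):
    """Predicted FDR at a P-value threshold, assuming all genes are null."""
    n_discovered = int(np.sum(p_values <= threshold))
    if n_discovered == 0:
        return 0.0
    return min(1.0, len(p_values) * threshold / n_discovered)


# Simulated example: 200 genes, 8 samples per group, first 20 genes shifted.
rng = np.random.default_rng(1)
group_a = rng.normal(size=(200, 8))
group_b = rng.normal(size=(200, 8))
group_a[:20] += 3.0
p = bootstrap_p_values(group_a, group_b, n_boot=200, seed=2)
print("predicted FDR at P <= 0.01:", estimated_fdr(p, 0.01))
```

Treating every gene as null makes the predicted FDR conservative; an estimate of the proportion of truly null genes could be substituted for the factor of one if it were available.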

Key words: microarray data / gene expression / non-parametric bootstrapping / t-test / false discovery rates

Correspondence and reprints: Theo H.E. Meuwissen

© INRA, EDP Sciences 2004