Issue: Genet. Sel. Evol., Volume 36, Number 2, March-April 2004
Page(s): 191 - 205
DOI: https://doi.org/10.1051/gse:2003058
Bootstrapping of gene-expression data improves and controls the false discovery rate of differentially expressed genes
Theo H.E. Meuwissen (a) and Mike E. Goddard (b)
a Institute for Animal Science, Agricultural University of Norway, 1432 Ås, Norway
b Institute of Land and Food Resources, University of Melbourne, Parkville, 3052 Australia, and Victorian Institute of Animal Science, Attwood, Victoria, 3049 Australia
(Received 17 December 2002; accepted 23 October 2003)
Abstract
The ordinary-, penalized-, and bootstrap t-test, least squares and best
linear unbiased prediction were compared for their false discovery rates
(FDR), i.e. the fraction of falsely discovered genes, which was empirically
estimated in a duplicate of the data set. The bootstrap-t-test yielded up to
80% lower FDRs than the alternative statistics, and its FDR was always as
good as or better than any of the alternatives. Generally, the predicted FDR
from the bootstrapped P-values agreed well with the empirical estimates,
except when the number of mRNA samples was smaller than 16. In a cancer data
set, the bootstrap-t-test discovered 200 differentially regulated genes at an
FDR of 2.6%, and in a knock-out gene expression experiment 10 genes were
discovered at an FDR of 3.2%. It is argued that, in the case of microarray
data, control of the FDR takes sufficient account of the multiple testing,
whilst being less stringent than Bonferroni-type multiple testing
corrections. Extensions of the bootstrap simulations to more complicated
test-statistics are discussed.
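
As an illustration of the general idea only, the sketch below computes a non-parametric bootstrap P-value for a two-group t-statistic and a simple predicted FDR from a vector of P-values. It is not taken from the paper: the resampling scheme (centring each group on its own mean to impose the null hypothesis), the unequal-variance form of the t-statistic, the conservative assumption that all genes are null when predicting the FDR, and the names `t_stat`, `bootstrap_p_value` and `predicted_fdr` are all assumptions made for this example.

```python
import numpy as np


def t_stat(x, y):
    """Two-sample t statistic with unequal variances (assumed form)."""
    se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    return (x.mean() - y.mean()) / se


def bootstrap_p_value(x, y, n_boot=2000, seed=None):
    """Two-sided bootstrap P-value for one gene (hypothetical helper).

    The null hypothesis of no expression difference is imposed by
    subtracting each group's own mean before resampling with replacement.
    """
    rng = np.random.default_rng(seed)
    t_obs = abs(t_stat(x, y))
    x0, y0 = x - x.mean(), y - y.mean()
    exceed = 0
    for _ in range(n_boot):
        xb = rng.choice(x0, size=len(x0), replace=True)
        yb = rng.choice(y0, size=len(y0), replace=True)
        if abs(t_stat(xb, yb)) >= t_obs:
            exceed += 1
    # Add one to numerator and denominator so the P-value is never zero.
    return (exceed + 1) / (n_boot + 1)


def predicted_fdr(p_values, threshold):
    """Predicted FDR when calling all genes with P <= threshold.

    Assumes, conservatively, that every gene is null, so the expected
    number of false discoveries is (number of genes) * threshold.
    """
    p_values = np.asarray(p_values)
    n_called = int((p_values <= threshold).sum())
    if n_called == 0:
        return 0.0
    return min(1.0, len(p_values) * threshold / n_called)
```

Applied gene by gene, `bootstrap_p_value` would give one P-value per gene, and `predicted_fdr` could then be evaluated at a chosen threshold. A less conservative prediction would replace the total number of genes by an estimate of the number of truly non-differentially expressed genes.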
Key words: microarray data / gene expression / non-parametric bootstrapping / t-test / false discovery rates
Correspondence and reprints: Theo H.E. Meuwissen theo.meuwissen@iha.nlh.no
© INRA, EDP Sciences 2004