Genet. Sel. Evol.
Volume 33, Number 4, July-August 2001
Page(s) 337 - 367
DOI: 10.1051/gse:2001122

Sampling genotypes in large pedigrees with loops

Soledad A. Fernández (a, b), Rohan L. Fernando (a, c), Bernt Guldbrandtsen (d), Liviu R. Totir (a) and Alicia L. Carriquiry (b, c)

a  Department of Animal Science, Iowa State University, 225 Kildee Hall, Ames, IA 50011, USA
b  Department of Statistics, Iowa State University, 225 Kildee Hall, Ames, IA 50011, USA
c  Lawrence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA
d  Danish Institute of Animal Science, Foulum, Denmark

(Received 19 October 2000; accepted 23 February 2001)

Markov chain Monte Carlo (MCMC) methods have been proposed to overcome computational problems in linkage and segregation analyses. This approach involves sampling genotypes at the marker and trait loci. The scalar-Gibbs sampler is easy to implement and is widely used in genetics. However, the Markov chain that corresponds to scalar-Gibbs may not be irreducible when the marker locus has more than two alleles, and even when the chain is irreducible, mixing has been observed to be slow. These problems do not arise if the genotypes are sampled jointly from the entire pedigree. This paper proposes a method to sample genotypes jointly. The method combines the Elston-Stewart algorithm and iterative peeling, and is called the ESIP sampler. For a hypothetical pedigree, genotype probabilities were estimated from samples obtained using ESIP and also using scalar-Gibbs. Approximate probabilities were also obtained by iterative peeling. Comparison of these with exact genotypic probabilities obtained by the Elston-Stewart algorithm showed that ESIP and iterative peeling yielded genotypic probabilities that were very close to the exact values. In contrast, probabilities estimated from scalar-Gibbs with a chain of length 235 000, including a burn-in of 200 000 steps, were less accurate than probabilities estimated using ESIP with a chain of length 10 000, including a burn-in of 5 000 steps. The effective chain size (ECS) was estimated from the last 25 000 elements of a chain of length 125 000. For one of the ESIP samplers, the ECS ranged from 21 579 to 22 741, while for the scalar-Gibbs sampler, the ECS ranged from 64 to 671. Genotype probabilities were also estimated for a large real pedigree consisting of 3 223 individuals. For this pedigree, it is not feasible to obtain exact genotype probabilities by the Elston-Stewart algorithm. ESIP and iterative peeling yielded very similar results, whereas results from scalar-Gibbs were less accurate.
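To make the ECS comparison concrete, the following is a minimal Python sketch of one common way to estimate an effective chain size from a sampled sequence. The abstract does not state which estimator the authors used, so the formula here, N / (1 + 2 Σ ρ_k) with the sum truncated at the first non-positive sample autocorrelation, is an assumption, and the function name `effective_chain_size` and the two toy chains are hypothetical illustrations rather than the paper's data or code.

```python
import random


def effective_chain_size(chain):
    """Estimate N / (1 + 2 * sum_k rho_k) from a sequence of numeric samples.

    Assumption: the autocorrelation sum is truncated at the first
    non-positive sample autocorrelation. This is a common choice, not
    necessarily the estimator used in the paper.
    """
    n = len(chain)
    mean = sum(chain) / n
    x = [v - mean for v in chain]
    var = sum(v * v for v in x) / n
    if var == 0.0:
        # A constant chain carries no information about autocorrelation.
        return float("nan")
    rho_sum = 0.0
    for k in range(1, n // 2):
        # Sample autocorrelation at lag k: pairs x[i] with x[i + k].
        rho = sum(a * b for a, b in zip(x, x[k:])) / (n * var)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Toy comparison (hypothetical data): independent 0/1 draws versus a
# "sticky" two-state chain that changes state with probability 0.05 per step.
random.seed(0)
n = 5000
iid = [random.randint(0, 1) for _ in range(n)]

sticky, state = [], 0
for _ in range(n):
    if random.random() < 0.05:
        state = 1 - state
    sticky.append(state)

ecs_iid = effective_chain_size(iid)
ecs_sticky = effective_chain_size(sticky)
print(f"independent draws: ECS ~ {ecs_iid:.0f} of {n}")
print(f"sticky chain:      ECS ~ {ecs_sticky:.0f} of {n}")
```

Under this estimator, a slowly mixing chain yields an ECS far below the number of retained samples. That is the pattern the abstract reports for scalar-Gibbs (64 to 671 out of 25 000), while a well-mixing sampler stays close to the nominal chain length.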

Key words: genotype sampler / Markov chain Monte Carlo / peeling

Correspondence and reprints: Rohan L. Fernando

© INRA, EDP Sciences 2001