The presence of linkage disequilibrium in the human genome
allows the investigator to capture a large share of the common
genetic variation with a selected set of markers. Estimates of the number
of SNPs necessary to account for most of the common sequence variation
across the genome (e.g., to account for SNPs with minor allele frequencies
of 1% or greater) have varied over time,
ranging from as few as 10,000–100,000 to over a million
SNPs. Thanks to the HapMap and the ENCODE resequencing projects
(International HapMap Consortium 2003 and www.hapmap.org),
we are starting to have a better idea of how efficiently SNP assays
have covered, and can cover, the genetic information. These massive
genotyping and resequencing projects have confirmed the existence of long
segments of LD in the genome and thus the possibility of using tag SNPs
(see discussion above) for each of these segments. Data
from several hundred thousand SNPs should be sufficient to cover
most common variants in Caucasians; more SNPs will be necessary
in African populations, where LD extends over shorter distances (for a review,
see Hirschhorn and Daly 2005). Based on the information
in the HapMap and ENCODE databases, Barrett and Cardon (2006)
estimated the coverage of commercially available whole-genome SNP
panels from Illumina (www.Illumina.com) and Affymetrix (www.affymetrix.com).
The Illumina 300K chip offered coverage of 75% in Caucasians
and the Affymetrix 500K chip 65%. These values dropped dramatically
for Yorubans, to 28% and 41%, respectively. While
these whole-genome arrays with 300,000 to 500,000 SNP assays offered
relatively good coverage of the more common variants, rare SNPs were
not well covered. Although these earlier arrays still had
substantial gaps in genome-wide coverage, especially in African
populations, development in this area has been rapid. For example,
Affymetrix now offers a 1.8 million SNP array, and Illumina a 1
million SNP chip. Incomplete coverage should soon become a problem of
the past (Schuster 2008).
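
To illustrate what a coverage figure such as those quoted above measures, the following minimal Python sketch computes the fraction of common SNPs in a reference panel that are either on an array or in strong LD with at least one array SNP. It assumes phased haplotypes coded 0/1 and an r² threshold of 0.8, a widely used convention that is treated here as an assumption rather than taken from the text. The SNP names and toy haplotypes are hypothetical, and the sketch makes no adjustment for variants missing from the reference panel.

```python
from typing import Dict, Iterable, Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise LD (r^2) between two biallelic SNPs from phased 0/1 haplotypes."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # At least one SNP is monomorphic in this sample; r^2 is undefined.
        return 0.0
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b  # the disequilibrium coefficient D
    return d * d / denom


def coverage(
    haplotypes: Dict[str, Sequence[int]],
    array_snps: Iterable[str],
    min_maf: float = 0.01,      # assumed cutoff for "common" variants
    r2_threshold: float = 0.8,  # assumed tagging threshold
) -> float:
    """Fraction of common SNPs that are on the array or tagged by an array SNP."""
    common = []
    for snp, hap in haplotypes.items():
        freq = sum(hap) / len(hap)
        if min(freq, 1 - freq) >= min_maf:
            common.append(snp)
    if not common:
        return 0.0
    tags = [s for s in array_snps if s in haplotypes]
    covered = 0
    for snp in common:
        if snp in tags or any(
            r_squared(haplotypes[snp], haplotypes[t]) >= r2_threshold for t in tags
        ):
            covered += 1
    return covered / len(common)


# Hypothetical toy panel: four SNPs typed on eight phased haplotypes.
toy_panel = {
    "rs_a": [1, 1, 1, 1, 0, 0, 0, 0],
    "rs_b": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly correlated with rs_a
    "rs_c": [1, 0, 1, 0, 1, 0, 1, 0],  # uncorrelated with rs_a
    "rs_d": [1, 1, 1, 0, 0, 0, 0, 0],  # partially correlated with rs_a
}
print(coverage(toy_panel, ["rs_a"]))
```

In this toy panel an array carrying only rs_a covers rs_a itself and rs_b, but not rs_c or rs_d, so the function returns 0.5. Populations with shorter LD would show fewer SNPs passing the r² threshold for the same array, which is the pattern behind the lower coverage reported for Yorubans.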