Population Genetics. Matthew B. Hamilton.

Title	Population Genetics
Author	Matthew B. Hamilton
Genre	Biology
Publisher	Biology
ISBN	9781118436899

of the green labeled amelogenin locus). A full color version of this figure is available on the textbook website.

This is an example of the DNA sequence found at a microsatellite locus. This sequence is the 24.1 allele from the fibrinogen alpha chain gene, or FGA locus (GenBank accession no. AY749636; see Figure 2.8). The integral repeat is the 4 bp sequence CTTT, and most alleles have sequences that differ by some number of full CTTT repeats. However, there are exceptions where alleles have sequences with partial repeats or stutters in the repeat pattern, for example, the TTTCT and CTC sequences embedded in the perfect CTTT repeats. In this case, the 24.1 allele is 1 bp longer than the 24 allele sequence.

GCCCCATAGGTTTTGAACTCACAGATTAAACTGTAACCAAAATAAAATTAGGCATTATTTACAAGCTAGTTT CTTT CTTT CTTT TTTCT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTTT CTC CTTC CTTC CTTT CTTC CTTT CTTT TTTGCTGGCA ATTACAGACAAATCAA
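
The "24.1" label follows the usual naming convention for microsatellite alleles: the number of complete repeats, then a decimal point and the number of extra bases in a partial repeat. A minimal Python sketch of that convention is below. The function name `str_allele_name` and the 96 bp and 97 bp repeat-region lengths are illustrative assumptions, not values taken from the GenBank record.

```python
def str_allele_name(repeat_region_bp, unit_bp=4):
    """Name an allele from the length of its repeat region (hypothetical helper).

    Full repeats form the integer part; leftover bases from a partial
    repeat are written after a decimal point.
    """
    full_repeats, extra_bases = divmod(repeat_region_bp, unit_bp)
    if extra_bases == 0:
        return str(full_repeats)
    return f"{full_repeats}.{extra_bases}"


# Assumed lengths for illustration: 24 full CTTT repeats, with and without 1 extra bp.
print(str_allele_name(96))  # allele with exactly 24 repeats
print(str_allele_name(97))  # allele that is 1 bp longer
```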

Table 2.4 Expected numbers of each of the three MN blood group genotypes under the null hypothesis of Hardy–Weinberg. Genotype frequencies are based on a sample of 1066 Chukchi individuals, a native people of eastern Siberia (Roychoudhury and Nei 1988).

Frequency of M = p̂ = 0.4184; frequency of N = q̂ = 0.5816
Genotype	Observed	Expected number of genotypes	Observed − Expected
MM	165	1066 × (0.4184)² = 186.61	−21.6
MN	562	1066 × 2(0.4184)(0.5816) = 518.80	43.2
NN	339	1066 × (0.5816)² = 360.58	−21.6
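
The arithmetic behind Table 2.4 can be sketched in Python as follows. The function name `hardy_weinberg_expected` is an assumption made for illustration; the genotype counts are those in the table. Because the sketch carries the allele frequencies at full precision rather than rounding them to four decimal places, its expected counts can differ from the table in the second decimal place.

```python
def hardy_weinberg_expected(n_mm, n_mn, n_nn):
    """Estimate allele frequencies and Hardy-Weinberg expected genotype counts.

    Each individual carries two alleles, so the frequency of M is
    (2 * MM + MN) / (2 * sample size).
    """
    n = n_mm + n_mn + n_nn
    p = (2 * n_mm + n_mn) / (2 * n)  # estimated frequency of M
    q = 1.0 - p                      # estimated frequency of N
    expected = (n * p * p, n * 2 * p * q, n * q * q)  # MM, MN, NN
    return p, q, expected


p, q, expected = hardy_weinberg_expected(165, 562, 339)
print(f"p = {p:.4f}, q = {q:.4f}")
for genotype, count in zip(("MM", "MN", "NN"), expected):
    print(f"{genotype}: expected {count:.2f}")
```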

In more general terms, the expected frequency of an event, p, times the number of trials or samples, n, gives the expected number of events, or np. To test the hypothesis that p is the frequency of an event in an actual population, we compare np with np̂, where p̂ is the frequency estimated from the sample. Close agreement suggests that the parameter and the estimate are the same quantity, whereas a large disagreement suggests that p and p̂ are likely to be different probabilities. The chi‐squared (χ²) test is commonly used to compare np and np̂. It provides the probability of obtaining the difference (or more) between the observed (np̂) and expected (np) number of outcomes by chance alone if the null hypothesis is true. As the difference between the observed and expected grows larger, it becomes less probable that the parameter and the parameter estimate are actually the same but differ in a given sample due to chance. The χ² statistic is:

(2.7) $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$

where ∑ (pronounced “sigma”) indicates taking the sum of multiple terms.

The χ² formula makes intuitive sense. In the numerator, there is a difference between the observed and Hardy–Weinberg expected number of individuals. This difference is squared, like a variance, since we do not care about the direction of the difference but only the magnitude of the difference. Then, in the denominator, we divide by the expected number of individuals to make the squared difference relative. For example, a squared difference of 4 is small if the expected number is 100 (it is 4%) but relatively larger if the expected number is 8 (it is 50%). Adding all of these relative squared differences gives the total relative squared deviation observed over all genotypes.

(2.8) $\chi^2 = \frac{(-21.6)^2}{186.61} + \frac{(43.2)^2}{518.80} + \frac{(-21.6)^2}{360.58}$
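
As a check on the arithmetic, the sum of relative squared differences described above can be sketched in a few lines of Python. The function name `chi_squared` is an assumption; the observed and expected counts are copied from Table 2.4. With these inputs the statistic comes out near 7.39.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [165, 562, 339]           # MM, MN, NN counts from Table 2.4
expected = [186.61, 518.80, 360.58]  # Hardy-Weinberg expected counts from Table 2.4

statistic = chi_squared(observed, expected)
print(f"chi-squared = {statistic:.2f}")

# The same squared difference weighs more when the expected count is small.
print(4 / 100, 4 / 8)
```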

We need to compare our statistic to values from the χ² distribution. But first, we need to know the degrees of freedom (commonly abbreviated as df), a measure of how much independent information was used to estimate the χ² statistic. In general, degrees of freedom are based on the number of categories of data: df = no. of classes compared − no. of parameters estimated − 1 for the χ² test itself. In this case, df = 3 − 1 − 1 = 1 for three genotypes and one estimated allele frequency (with two alleles, the other allele frequency is fixed once the first has been estimated).

Figure 2.9 shows a χ² distribution for one degree of freedom. Small deviations of the observed from the expected are more probable since they leave more area of the distribution to the right of the χ² value. As the χ² value gets larger, the probability that the difference between the observed and expected is due to chance sampling alone decreases (the area under the curve to the right gets smaller). Put another way, as the observed and expected get increasingly different, it becomes more improbable that our null hypothesis of Hardy–Weinberg is actually the process that is determining genotype frequencies. Using Table 2.5, we see that a χ² value of 7.39 with 1 df has a probability between 0.01 and 0.001. The conclusion is that a difference this large or larger would be observed less than 1% of the time in samples from a population that actually had Hardy–Weinberg expected genotype frequencies.

By convention, we reject chance as the explanation for the differences if the χ² value has a probability of 0.05 or less. In other words, if chance alone would produce a difference this large in five or fewer samples out of 100, then we reject the hypothesis that the observed and expected patterns are the same. The critical value above which we reject the null hypothesis for a χ² test is 3.84 with 1 df, or in notation χ²_{0.05, 1} = 3.84. In this case, we can clearly see an excess of heterozygotes and deficits of homozygotes, and employing the χ² test allows us to conclude that Hardy–Weinberg expected genotype frequencies are not present in the population.
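
For one degree of freedom, the area to the right of a χ² value can be computed directly rather than read from a table, since a χ² variable with 1 df is the square of a standard normal variable. The sketch below uses only Python's standard library. The function name `chi2_upper_tail_1df` is an assumption, and the statistic 7.39 is the value recomputed from the counts in Table 2.4.

```python
import math


def chi2_upper_tail_1df(x):
    """Probability of a chi-squared value of x or more with 1 df.

    With 1 df, chi-squared is a squared standard normal, so the upper tail
    equals the two-sided normal tail beyond sqrt(x): erfc(sqrt(x / 2)).
    """
    if x < 0:
        raise ValueError("a chi-squared value cannot be negative")
    return math.erfc(math.sqrt(x / 2.0))


statistic = 7.39  # recomputed from the Table 2.4 counts
p_value = chi2_upper_tail_1df(statistic)
reject_null = p_value <= 0.05  # conventional 5% threshold

print(f"P = {p_value:.4f}; reject Hardy-Weinberg: {reject_null}")
print(f"Tail area at the critical value 3.84: {chi2_upper_tail_1df(3.84):.3f}")
```

This shortcut applies only to 1 df; for other degrees of freedom a statistics library or a table such as Table 2.5 is needed.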

Figure 2.9 A χ² distribution with one degree of freedom. The χ² value for the Hardy–Weinberg test with MN blood group genotypes as well as the critical value to reject the null hypothesis are shown. The area under the curve to the right of the arrow indicates the probability of observing that much or more difference between the observed and expected outcomes.
