The Hardy-Weinberg Theorem
By Jon Baker, National Marine Fisheries Service and BioLab
It is fair to say that the Hardy-Weinberg theorem (or law) is the foundation of population genetics. Population geneticists use the theorem regularly in their research. Because of its importance, understanding how it works and what it means is crucial to understanding genetic changes in populations. This activity will introduce you to the Hardy-Weinberg theorem by presenting the equation, defining the mathematical notation used, explaining the theorem through an example, describing the assumptions under which the theorem is true, revealing the consequences, and providing problems for you to solve.
Before we get to the Hardy-Weinberg theorem itself, we need to understand the fundamental genetic structure of populations. We will begin with a one-locus, two-allele example. Imagine that a locus has two alleles, A_{1} and A_{2}. It is a simple jump to see that there are three possible genotypes: A_{1}A_{1}, A_{2}A_{2}, and A_{1}A_{2}. Each genotype has a genotypic frequency, and the genotypic frequencies in the population must sum to 1. You have already calculated gene and genotypic frequencies but have not considered the probability of getting a particular genotype given the gene frequencies. In randomly mating diploid organisms, each parent contributes one allele of a gene pair. The probability of a given genotype from that mating is the product of the frequencies of the two alleles involved. For example, if the A_{1} allele has a frequency of .25 in the population, then the chance that a sperm carries that allele is .25 and the chance that an egg carries it is .25, so the chance of those two coming together to form an A_{1}A_{1} genotype is the product of those two probabilities: .25 x .25 = (.25)^{2} = .0625, the square of the A_{1} frequency. Similarly, the A_{2} allele has a frequency of .75 (the two allele frequencies must sum to 1), and the chance of an A_{2}A_{2} genotype is (.75)^{2} = .5625. The probability of getting an A_{1}A_{2} is a different story. Since you can get this genotype two ways (the sperm can contribute the A_{1} and the egg the A_{2}, or the sperm can contribute the A_{2} and the egg the A_{1}), the probability is the sum of the two probabilities: (.25 x .75) + (.75 x .25) = 2(.25 x .75) = .375.
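The arithmetic in this paragraph can be checked with a few lines of Python. This is only a minimal sketch: the function name genotype_probabilities and the dictionary keys are made up for illustration, and it assumes exactly two alleles whose frequencies sum to 1.

```python
def genotype_probabilities(freq_a1):
    """Expected genotype probabilities under random union of gametes.

    freq_a1 is the frequency of the A1 allele; the A2 frequency is
    whatever is left over, because the two must sum to 1.
    """
    freq_a2 = 1.0 - freq_a1
    return {
        "A1A1": freq_a1 * freq_a1,      # A1 sperm AND A1 egg
        "A1A2": 2 * freq_a1 * freq_a2,  # two ways to make a heterozygote
        "A2A2": freq_a2 * freq_a2,      # A2 sperm AND A2 egg
    }


probs = genotype_probabilities(0.25)
for genotype, prob in probs.items():
    print(genotype, round(prob, 4))
print("total", round(sum(probs.values()), 4))
```

With an A1 frequency of .25, the three probabilities are the same .0625, .375, and .5625 worked out by hand, and they add up to 1.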
Let us introduce a commonly used notation. The frequency of the A_{1} allele will be called p and the frequency of the A_{2} allele q. Applying this notation to the genotypic frequencies calculated above, we have:
frequency of A_{1}A_{1} = p^{2}
frequency of A_{2}A_{2} = q^{2}
frequency of A_{1}A_{2} = 2pq
Because the genotypic frequencies must add to 1 it follows that:
p^{2} + 2pq + q^{2} = 1
It is this equation that is the essence of the Hardy-Weinberg equilibrium. It should look familiar from your algebra class: it is the expansion of (p + q)^{2}, and because p + q = 1, the sum must equal 1.
The Theorem
So, after all this you are probably wondering, "What in the heck is this theorem anyway?" Well, the theorem tells us how randomly mating populations should behave. Provided that the assumptions for the theorem are true (we will get to the assumptions later), allele frequencies do not change from generation to generation, and the genotypic frequencies reach their equilibrium values of p^{2}, 2pq, and q^{2} in ONE generation of random mating (that is fast). The best way to illustrate how the theorem works is by doing the math, so let's give it a try. We'll take a break from salmon for a bit here.
Demonstrating Hardy-Weinberg
Imagine there is a breed of dog in which two codominant alleles determine the length of the tail. AA produces a long tail, Aa a medium tail, and aa a stub tail. A breeder really likes this breed of dog but cannot decide which kind to breed. So she buys 75 AA and 25 aa dogs and lets them mate randomly.
So at time 0, or t_{0}, when breeding begins, the genotypic frequencies are AA = .75, Aa = 0, and aa = .25.
Because there are no heterozygotes yet, the allele frequencies are simply p_{0} = .75 and q_{0} = .25.
Now let's apply the Hardy-Weinberg equation, p^{2} + 2pq + q^{2} = 1, to predict the genotypic frequencies after the first round of matings.
When we do, we see,
LONG (AA): p_{0}^{2} = (.75)^{2} = .5625
MEDIUM (Aa): 2p_{0}q_{0} = 2(.75 x .25) = .375
SHORT (aa): q_{0}^{2} = (.25)^{2} = .0625
So after one generation we see that the frequency of each genotype is
AA = .5625, Aa = .375, aa = .0625
Now that the alleles have been sorted into puppies, and therefore genotypes, what is the frequency of each allele at t_{1}? Using these genotypic frequencies and Method #3 from Activity 5 for calculating allele frequencies from genotypic frequencies, we see:
p_{1} = .5625 + (.375 / 2) = .75
q_{1} = .0625 + (.375 / 2) = .25
After one generation the frequency of each allele is the same as when we began. Also, the alleles are now assorted into new genotypes. When we started there were only AA and aa dogs; now there are AA, Aa, and aa: long-, medium-, and short-tailed dogs! Now if we were to let the dogs breed again, what would be the result? We are starting with the same allele frequencies and we will let the dogs mate randomly. Well, what do you think? You bet! There will be no change in either the gene frequencies or the genotypic frequencies: p^{2}, 2pq, and q^{2} come out to .5625, .375, and .0625 again. So what we have just done is demonstrate the Hardy-Weinberg theorem, and you can see that genetic equilibrium is reached after just one generation, as long as the ASSUMPTIONS are met.
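If you would rather let a computer do the breeding, the dog example can be sketched in Python as follows. The function names allele_frequencies and next_generation are invented for this illustration, and the sketch assumes every Hardy-Weinberg assumption holds exactly, so it works with frequencies only and ignores chance.

```python
def allele_frequencies(freq_AA, freq_Aa, freq_aa):
    """Allele frequencies from genotypic frequencies.

    Each heterozygote carries one A and one a, so half of the Aa
    frequency goes to each allele.
    """
    p = freq_AA + freq_Aa / 2
    q = freq_aa + freq_Aa / 2
    return p, q


def next_generation(freq_AA, freq_Aa, freq_aa):
    """Genotypic frequencies after one round of random mating."""
    p, q = allele_frequencies(freq_AA, freq_Aa, freq_aa)
    return p * p, 2 * p * q, q * q


# The breeder's starting kennel: 75 AA dogs, no Aa dogs, 25 aa dogs.
genotypes = (0.75, 0.0, 0.25)
history = [genotypes]
for _ in range(3):
    genotypes = next_generation(*genotypes)
    history.append(genotypes)

for t, (f_AA, f_Aa, f_aa) in enumerate(history):
    p, q = allele_frequencies(f_AA, f_Aa, f_aa)
    print(f"t{t}: AA={f_AA:.4f} Aa={f_Aa:.4f} aa={f_aa:.4f} p={p:.2f} q={q:.2f}")
```

The genotypic frequencies change between t0 and t1 and then stay put, while p and q are the same in every row.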
The Assumptions
As you can see from the example above, if you leave a population of alleles alone, the gene and genotypic frequencies will reach equilibrium very quickly and remain there indefinitely. However, the "leave the population of alleles alone" clause is based on the following assumptions:
 Random union of gametes (mating)
 Discrete generations
 Large population
 No mutation
 No immigration
 No emigration
 No differential fertility or survival (selection)
This is where the fun in population genetics begins. You see, no population meets all of these assumptions perfectly, and therefore every real population is always evolving.
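To get a feel for what happens when just one assumption fails, here is a rough sketch of the "large population" assumption being violated. It is an illustration only, not part of the activity: the function simulate_drift, its parameters, and the population sizes are all invented here, and the model simply draws each of the 2N alleles in the next generation at random from the current gene pool.

```python
import random


def simulate_drift(p0, population_size, generations, seed=0):
    """Track the frequency of allele A when the population is finite.

    Each generation, 2 * population_size alleles are drawn at random
    from the current gene pool. A seed is used so runs are repeatable.
    """
    rng = random.Random(seed)
    n_alleles = 2 * population_size
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each draw is an A allele with probability p.
        count_A = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = count_A / n_alleles
        trajectory.append(p)
    return trajectory


small = simulate_drift(0.75, population_size=10, generations=20)
large = simulate_drift(0.75, population_size=10000, generations=20)
print("N = 10:    ", [round(p, 2) for p in small])
print("N = 10000: ", [round(p, 2) for p in large])
```

In the small kennel the allele frequency wanders from generation to generation by chance alone, while in the large population it stays close to the starting value of .75.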
Take Home Message
The Hardy-Weinberg theorem shows that genotypic frequencies reach equilibrium in one generation of random mating while gene frequencies stay constant. Also, genetic variation cannot be lost as a result of random mating alone. So without selection, genetic drift, nonrandom mating, mutation, emigration, and immigration, genetic variation will be conserved.
Questions
Determine if each locus is in Hardy-Weinberg equilibrium. To do this you must calculate the allele frequencies, the observed genotypic frequencies, and the expected genotypic frequencies from the known allele frequencies, and then test the expected genotypic frequencies against the observed genotypic frequencies using chi-square. Whew!
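The steps above could be sketched in Python like this. Treat it as one possible way to organize the calculation, not the official answer key: the function name hardy_weinberg_chi_square and the genotype counts (50, 40, 10) are made up for illustration, and the sketch assumes both alleles are present in the sample. The test has 1 degree of freedom (3 genotype classes, minus 1, minus 1 for the allele frequency estimated from the data), and 3.841 is the standard chi-square critical value for 1 degree of freedom at the .05 level.

```python
def hardy_weinberg_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts with
    the counts expected under Hardy-Weinberg equilibrium."""
    n = n_AA + n_Aa + n_aa
    # Allele frequencies from the observed counts.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    # Expected counts: p^2, 2pq, and q^2 times the sample size.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, expected, chi_square


CRITICAL_VALUE = 3.841  # chi-square, 1 degree of freedom, alpha = .05

# Hypothetical sample of 100 individuals, for illustration only.
p, q, expected, chi_square = hardy_weinberg_chi_square(50, 40, 10)
print("p =", round(p, 3), "q =", round(q, 3))
print("expected counts:", [round(e, 2) for e in expected])
print("chi-square =", round(chi_square, 4))
if chi_square > CRITICAL_VALUE:
    print("Reject Hardy-Weinberg equilibrium at the .05 level")
else:
    print("No significant departure from Hardy-Weinberg equilibrium")
```

For these made-up counts the expected numbers are 49, 42, and 9, and the chi-square value falls well below 3.841, so the sample shows no significant departure from equilibrium.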
Notation
The simplest example used to introduce the Hardy-Weinberg theorem is a population with a single locus and two alleles. The alleles will be called A and a, giving three possible genotypes: AA, Aa, and aa. Since in population genetics we like to count things, we must count the number of individuals with each genotype found in the population. We denote these counts by N_{AA}, N_{Aa}, and N_{aa}. We will also use N to denote the number of individuals in the entire population, so
N = N_{AA} + N_{Aa} + N_{aa}
As you may have already guessed, because population geneticists like to calculate the frequencies of things, we will need to determine the frequencies of the three genotypes and of each allele. Here is how you do it!
Genotypic frequencies
The frequency of each genotype is calculated by dividing the number of individuals of a particular genotype by the number of individuals in the population. A letter will be used to denote each genotypic frequency:
D = N_{AA} / N
H = N_{Aa} / N
R = N_{aa} / N
So D + H + R = 1
Gene frequencies
To calculate the frequency of each allele, let p denote the frequency of A and q the frequency of a. Each individual carries two alleles, so there are 2N alleles in all, and
p = (2N_{AA} + N_{Aa}) / 2N
q = (2N_{aa} + N_{Aa}) / 2N
So p is the fraction of A alleles in the population (gene pool) and q is the fraction of a alleles. These frequencies are also the probability of drawing that allele from the gene pool!
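The counting described in this section could be sketched in Python as follows. The function name frequencies_from_counts and the counts (30 AA, 50 Aa, 20 aa) are hypothetical, chosen only to show the arithmetic.

```python
def frequencies_from_counts(n_AA, n_Aa, n_aa):
    """Genotypic frequencies (D, H, R) and allele frequencies (p, q)
    from the number of individuals of each genotype."""
    n = n_AA + n_Aa + n_aa
    D = n_AA / n
    H = n_Aa / n
    R = n_aa / n
    # Every individual carries two alleles, so the gene pool holds 2N.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = (2 * n_aa + n_Aa) / (2 * n)
    return D, H, R, p, q


# Hypothetical counts, for illustration only.
D, H, R, p, q = frequencies_from_counts(30, 50, 20)
print("D =", D, "H =", H, "R =", R)
print("p =", p, "q =", q)
```

Notice that p comes out the same as D + H / 2, which is the shortcut you use when you are given genotypic frequencies instead of counts.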