The Hardy-Weinberg Theorem

By Jon Baker, National Marine Fisheries Service and BioLab

It is fair to say that the Hardy-Weinberg theorem (or law) is the foundation of population genetics. Population geneticists use the theorem regularly in the course of their research. Because of its importance, understanding how it works and what it means is crucial to understanding genetic changes in populations. This activity will introduce you to the Hardy-Weinberg theorem by presenting the equation, defining the mathematical notation used, explaining the theorem through an example, describing the assumptions under which the theorem is true, revealing the consequences, and providing problems for you to solve.

Before we get to the Hardy-Weinberg theorem itself, we need to understand the fundamental genetic structure of populations. We will begin with a one-locus, two-allele example. Imagine that a locus has two alleles, A1 and A2. There are then three possible genotypes: A1A1, A1A2, and A2A2. Each genotype has a genotypic frequency, and the genotypic frequencies in the population must sum to 1. You have already calculated gene and genotypic frequencies but have not considered the probability of getting a particular genotype given the gene frequencies.

In randomly mating diploid organisms, each parent contributes one allele of a gene pair, so the probability of a given pairing of gametes is the product of the two allele frequencies. For example, if the A1 allele has a frequency of .25 in the population, then the chance that a sperm carries that allele is .25 and the chance that an egg carries it is .25. The chance of those two coming together to form an A1A1 genotype is the product of those two probabilities: .25 x .25 = (.25)^2 = .0625, the square of the A1 frequency. Since there are only two alleles, the A2 allele has a frequency of .75, and the chance of an A2A2 genotype is (.75)^2 = .5625, the square of the A2 frequency. The probability of getting an A1A2 is a different story. You can get this genotype two ways (the sperm contributes the A1 and the egg the A2, or the sperm contributes the A2 and the egg the A1), so its probability is the sum of the two: (.25 x .75) + (.75 x .25) = 2(.25 x .75) = .375.
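As a minimal sketch of the arithmetic above, the Python snippet below multiplies gamete probabilities to get genotype probabilities. The function name genotype_probabilities and its argument names are illustrative choices, not part of the original activity.

```python
def genotype_probabilities(freq_a1, freq_a2):
    """Return the probabilities of A1A1, A1A2 and A2A2 offspring
    when gametes unite at random."""
    # Both the sperm and the egg must carry A1.
    prob_a1a1 = freq_a1 * freq_a1
    # Two routes to the heterozygote: (sperm A1, egg A2) or (sperm A2, egg A1).
    prob_a1a2 = freq_a1 * freq_a2 + freq_a2 * freq_a1
    # Both the sperm and the egg must carry A2.
    prob_a2a2 = freq_a2 * freq_a2
    return prob_a1a1, prob_a1a2, prob_a2a2


# Allele frequencies from the text: A1 = .25, A2 = .75
prob_a1a1, prob_a1a2, prob_a2a2 = genotype_probabilities(0.25, 0.75)
print(prob_a1a1, prob_a1a2, prob_a2a2)
```

The three probabilities sum to 1, as genotypic frequencies must.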

Let us introduce a commonly used notation. The frequency of the A1 allele will be called p and the frequency of the A2 allele will be called q. Applying this notation to the genotypic frequencies calculated above, we have:

frequency of A1A1 = p^2

frequency of A2A2 = q^2

frequency of A1A2 = 2pq

Because the genotypic frequencies must add to 1 it follows that:

p^2 + 2pq + q^2 = 1

It is this equation that is the essence of the Hardy-Weinberg equilibrium. It should look familiar from your algebra class if you substitute x and y for the p and q.
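One way to see why the equation looks familiar, assuming A1 and A2 are the only alleles at the locus so that p + q = 1, is to square both sides of that relation:

```latex
\begin{align*}
p + q &= 1 \\
(p + q)^2 &= 1^2 \\
p^2 + 2pq + q^2 &= 1
\end{align*}
```

With p = .25 and q = .75 this gives .0625 + .375 + .5625 = 1.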

The Theorem

So, after all this you are probably wondering, "What in the heck is this theorem anyway?" The theorem tells us how randomly mating populations should behave. Provided that its assumptions hold (we will get to the assumptions later), allele frequencies do not change from generation to generation, and genotypic frequencies reach equilibrium after ONE generation of random mating (that is fast) and then stay there. The best way to illustrate how the theorem works is by doing the math, so let's give it a try. We'll take a break from salmon for a bit here.

Demonstrating Hardy-Weinberg

Imagine there is a breed of dog in which two codominant alleles determine the length of the tail: AA produces a long tail, Aa a medium tail, and aa a stub tail. A breeder really likes this breed of dog but cannot decide which kind to breed. So she buys 75 AA and 25 aa dogs and then lets them mate randomly.

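So at time 0, or t_0, when breeding begins, the genotypic frequencies are AA = .75, Aa = 0, aa = .25

And clearly, since AA dogs carry only A alleles and aa dogs carry only a alleles, p_0 = .75 and q_0 = .25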

Now let's apply the Hardy-Weinberg equation, p^2 + 2pq + q^2 = 1, to predict the genotypic frequencies after the first round of matings.

When we do, we see,

LONG (AA)      p_0^2   = (.75)^2       = .5625

MEDIUM (Aa)    2p_0q_0 = 2(.75 x .25)  = .375

SHORT (aa)     q_0^2   = (.25)^2       = .0625

So after one generation we see that the frequency of each genotype is

AA = .5625      Aa = .375       aa = .0625

Now that the alleles have been sorted into puppies, and therefore genotypes, what is the frequency of each allele at t_1? Using these genotypic frequencies and Method #3 from Activity 5 (calculating allele frequencies from genotypic frequencies), we see:

p_1 = .5625 + (.375 / 2) = .75

q_1 = .0625 + (.375 / 2) = .25

After one generation the frequency of each allele is the same as when we began. Also, the alleles are now assorted into new genotypes. When we started there were only AA and aa dogs; now there are AA, Aa and aa: long, medium and short tailed dogs! Now if we were to let the dogs breed again, what would be the result? We are starting with the same allele frequencies and we will again let the dogs mate randomly, so the Hardy-Weinberg equation predicts the same genotypic frequencies: .5625, .375 and .0625. There is no further change in either the gene frequencies or the genotypic frequencies, and getting here took just one generation. What we have just done is demonstrate the Hardy-Weinberg theorem: after just one generation of random mating, genetic equilibrium is reached, as long as the ASSUMPTIONS are met.
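The dog example can be replayed in a few lines of Python. This is only a sketch under the theorem's assumptions (random mating, no selection, and so on); the function names allele_frequencies and next_generation are illustrative and do not come from the activity.

```python
def allele_frequencies(freq_AA, freq_Aa, freq_aa):
    """Allele frequencies (p, q) from genotypic frequencies.
    Each heterozygote contributes half of its alleles to A and half to a."""
    p = freq_AA + freq_Aa / 2
    q = freq_aa + freq_Aa / 2
    return p, q


def next_generation(freq_AA, freq_Aa, freq_aa):
    """Genotypic frequencies after one round of random mating:
    p^2, 2pq and q^2."""
    p, q = allele_frequencies(freq_AA, freq_Aa, freq_aa)
    return p * p, 2 * p * q, q * q


# Starting point from the text: 75 AA dogs, 25 aa dogs, no Aa dogs.
genotypes = (0.75, 0.0, 0.25)
for generation in range(4):
    p, q = allele_frequencies(*genotypes)
    print(f"t{generation}: AA, Aa, aa = {genotypes}; p = {p}, q = {q}")
    genotypes = next_generation(*genotypes)
```

The allele frequencies print as .75 and .25 in every generation, and the genotypic frequencies stop changing after the first round of mating.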

The Assumptions

As you can see from the example above, if you leave a population of alleles alone, the gene and genotypic frequencies will reach equilibrium very quickly and remain there indefinitely. However, the "leave the population of alleles alone" clause is based on the following assumptions:

  1. Random union of gametes (mating)
  2. Discrete generations
  3. Large population
  4. No mutation
  5. No immigration
  6. No emigration
  7. No differential fertility or survival (selection)

This is where the fun in population genetics begins. You see, no population meets all these assumptions perfectly, and therefore real populations are always evolving.

Take Home Message

The Hardy-Weinberg theorem shows that gene and genotypic frequencies will reach equilibrium in one generation. Also, genetic variation cannot be lost as a result of random mating alone. So without selection, genetic drift, non-random mating, mutation, emigration and immigration, genetic variation will be conserved.

Questions

Determine whether each locus is in Hardy-Weinberg equilibrium. To do this you must calculate the observed genotypic frequencies and the allele frequencies, calculate the expected genotypic frequencies from the allele frequencies, and then test the expected genotypic frequencies against the observed genotypic frequencies using chi-square. Whew!

 

 

Sample   Locus 1   Locus 2   Locus 3   Locus 4   Locus 5
089-1    A A       A B       A B       A B       A A
089-2    A A       A A       A A       A B       A A
089-3    A A       A A       A B       B B       A B
089-4    A A       A A       B B       B B       A A
089-5    A A       A A       A B       B B       A B
089-6    A A       A A       A A       A B       A A
089-7    A A       A A       A A       A A       A B
089-8    A A       A B       A B       A B       A A
089-9    A A       A A       A A       A B       A A
089-10   A A       A B       A B       B B       A A
089-11   A A       A A       B B       A A       A A
089-12   A A       A B       A B       A B       A B
089-13   A A       A B       A B       A B       A B
089-14   A A       A A       A B       A B       A A
089-15   A A       A A       A A       A A       A A
089-16   A A       A A       A B       A A       A B
089-17   A A       A A       A B       A A       A A
089-18   A A       A A       A B       B B       A B
089-19   A A       A A       A B       A A       A A
089-20   A A       A A       A A       A A       A B
089-21   A A       A A       A A       A B       A A
089-22   A A       A A       A A       B B       A A
089-23   A A       A B       A B       A B       A A
089-24   A A       A A       A A       A A       A B
089-25   A A       A A       B B       A B       A A
089-26   A A       B B       B B       A B       A A
089-27   A A       A A       A B       A B       A A
089-28   A A       A B       A A       A B       A A
089-29   A A       A A       B B       A B       A B
089-30   A A       A B       A B       A A       A B
089-31   A A       A A       A B       A B       A A
089-32   A A       A B       A A       A A       A A
089-33   A A       A A       A A       A A       A B
089-34   A A       A B       A B       A B       A B
089-35   A A       A A       A B       A A       A B
089-36   A A       A A       A A       A B       B B
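The steps in the question above can be sketched in Python for a single locus. The genotype counts used here are made up for illustration and are not taken from the table. The function name is hypothetical, and the cutoff of 3.841 (the chi-square critical value for 1 degree of freedom at the 5% level) is the usual textbook choice for a two-allele locus; the activity itself does not state which cutoff to use.

```python
def hardy_weinberg_chi_square(n_AA, n_AB, n_BB):
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations for one locus with two alleles."""
    n = n_AA + n_AB + n_BB
    # Allele frequencies estimated from the observed genotype counts.
    p = (2 * n_AA + n_AB) / (2 * n)
    q = 1 - p
    # Expected counts are the expected frequencies times the sample size.
    expected = (n * p * p, n * 2 * p * q, n * q * q)
    observed = (n_AA, n_AB, n_BB)
    # Skip classes with an expected count of zero to avoid dividing by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Assumed cutoff: 1 degree of freedom (3 genotype classes - 1, minus
# 1 estimated allele frequency), 5% significance level.
CRITICAL_VALUE = 3.841

# Hypothetical counts for illustration only: 20 AA, 12 AB, 4 BB.
chi_square = hardy_weinberg_chi_square(20, 12, 4)
consistent_with_hwe = chi_square < CRITICAL_VALUE
print(round(chi_square, 3), consistent_with_hwe)
```

For these made-up counts the statistic is about 1.03, below the cutoff, so the hypothesis of Hardy-Weinberg equilibrium would not be rejected. With samples this small some expected counts fall below 5, so the chi-square result should be read as approximate.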

 

 

 

 

 

Notation

The simplest example used to introduce the Hardy-Weinberg theorem is a population with a single locus and two alleles. The alleles will be called A and a, giving three possible genotypes: AA, Aa and aa. Since in population genetics we like to count things, we must count the number of individuals with each genotype found in the population. We denote these counts by N_AA, N_Aa and N_aa. We will also use just N to denote the number of individuals in the entire population, so

N = N_AA + N_Aa + N_aa

As you may have already guessed, because population geneticists like to calculate the frequencies of things, we will need to determine the frequencies of the three genotypes and of each allele. Here is how you do it!

 

Genotypic frequencies

The frequency of each genotype is calculated by dividing the number of individuals of a particular genotype by the number of individuals in the population. A letter will be used to denote each genotypic frequency:

D = N_AA / N

H = N_Aa / N

R = N_aa / N

So D + H + R = 1

 

Gene frequencies

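To calculate the frequency of each allele, let's use this notation: p is the frequency of A and q is the frequency of a. Each individual carries two alleles, so the population holds 2N alleles in total, and

p = (2N_AA + N_Aa) / (2N)

q = (2N_aa + N_Aa) / (2N)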

So p is the fraction of A alleles in the population (gene pool) and q is the fraction of a alleles. These frequencies are also the probability of drawing that allele from the gene pool!
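The notation above translates directly into code. The sketch below uses hypothetical counts (50 AA, 40 Aa, 10 aa) chosen only to make the arithmetic easy to follow; the function names are illustrative.

```python
def genotype_frequencies(n_AA, n_Aa, n_aa):
    """Return (D, H, R): each genotype count divided by the population size N."""
    n = n_AA + n_Aa + n_aa
    return n_AA / n, n_Aa / n, n_aa / n


def allele_frequencies_from_counts(n_AA, n_Aa, n_aa):
    """Return (p, q): each homozygote carries two copies of its allele,
    each heterozygote carries one copy of each, out of 2N alleles in total."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = (2 * n_aa + n_Aa) / (2 * n)
    return p, q


# Hypothetical counts for illustration only.
D, H, R = genotype_frequencies(50, 40, 10)
p, q = allele_frequencies_from_counts(50, 40, 10)
print(D, H, R)  # genotypic frequencies; they sum to 1
print(p, q)     # allele frequencies; they also sum to 1
```

Here D = .5, H = .4 and R = .1, which gives p = .7 and q = .3. The same p and q can be obtained from the genotypic frequencies as p = D + H/2 and q = R + H/2.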