No matter which pathway is used, a number of enzymes are required to complete the steps of recombination. The genes that code for these enzymes were first identified in E. coli by the isolation of mutant cells that were deficient in recombination. This research revealed that the recA gene encodes a protein necessary for strand invasion. Meanwhile, the recB, recC, and recD genes code for three polypeptides that join together to form a protein complex known as RecBCD; this complex has the capacity to unwind double-stranded DNA and cleave strands. Two other genes, ruvA and ruvB, encode enzymes that catalyze branch migration, while Holliday structures are resolved by the protein resolvase, which is the product of the ruvC gene. Several enzymes involved in DNA replication, such as ligase and DNA polymerase, also contribute to recombination (Clark, 1973).
In eukaryotes, recombination has been perhaps most thoroughly studied in the budding yeast Saccharomyces cerevisiae. Many of the enzymes identified in this yeast have also been found in other organisms, including mammalian cells. Such studies reveal that the Rad genes (so named because mutations in them make cells sensitive to radiation) play a key role in eukaryotic recombination. In particular, the Rad51 gene, which is homologous to recA, encodes a protein (called Rad51) that has recombinase activity. This gene is highly conserved, but the accessory proteins that assist Rad51 appear to vary among organisms. For example, the Rad52 protein is found in both yeast and humans, but it is missing in Drosophila melanogaster and C. elegans.
In eukaryotic cells, single-stranded DNA (ssDNA) becomes rapidly coated with the protein RPA (replication protein A). RPA has a higher affinity for ssDNA than Rad51, and it therefore can inhibit recombination by blocking Rad51's access to the single strand needed for invasion. In yeast, however, binding of Rad51 to ssDNA is enhanced by the proteins Rad52 and the complex Rad55-Rad57. Once access has been gained, Rad51 polymerizes on the DNA strand to form what is called a presynaptic filament, which is a right-handed helical filament containing six Rad51 molecules and 18 nucleotides per helical repeat. The search for DNA homology and formation of the junction between homologous regions is then carried out within the catalytic center of the filament.
In addition to proteins that assist Rad51 activity, there are also some proteins that inhibit it. In yeast, for instance, the helicase Srs2 dismantles the Rad51-ssDNA complex, while Sgs1 (and its human counterpart, BLM) inhibits the complex. It is thought that these proteins help prevent recombination during DNA replication, when it is not needed.
In humans, the tumor suppressor genes BRCA1 and BRCA2 also play a role in regulating recombination. Individuals who are heterozygous for a BRCA2 mutation are at increased risk of breast and ovarian cancer; loss of both alleles causes Fanconi's anemia, a genetic disease characterized by predisposition to cancer, among other defects. BRCA2 appears to promote Rad51 binding to ssDNA (Li & Heyer, 2008; Modesti & Kanaar, 2001).
Perhaps the most fundamental property of all living things is the ability to reproduce. All organisms inherit the genetic information specifying their structure and function from their parents. Likewise, all cells arise from preexisting cells, so the genetic material must be replicated and passed from parent to progeny cell at each cell division. How genetic information is replicated and transmitted from cell to cell and organism to organism thus represents a question that is central to all of biology. Consequently, elucidation of the mechanisms of genetic transmission and identification of the genetic material as DNA were discoveries that formed the foundation of our current understanding of biology at the molecular level.
The classical principles of genetics were deduced by Gregor Mendel in 1865, on the basis of the results of breeding experiments with peas. Mendel studied the inheritance of a number of well-defined traits, such as seed color, and was able to deduce general rules for their transmission. In all cases, he could correctly interpret the observed patterns of inheritance by assuming that each trait is determined by a pair of inherited factors, which are now called genes. One gene copy (called an allele) specifying each trait is inherited from each parent. For example, breeding two strains of peas—one having yellow seeds, and the other green seeds—yields the following results (Figure 3.1). The parental strains each have two identical copies of the gene specifying yellow (Y) or green (y) seeds, respectively. The progeny plants are therefore hybrids, having inherited one gene for yellow seeds (Y) and one for green seeds (y). All these progeny plants (the first filial, or F1, generation) have yellow seeds, so yellow (Y) is said to be dominant and green (y) recessive. The genotype (genetic composition) of the F1 peas is thus Yy, and their phenotype (physical appearance) is yellow. If one F1 offspring is bred with another, giving rise to F2 progeny, the genes for yellow and green seeds segregate in a characteristic manner such that the ratio between F2 plants with yellow seeds and those with green seeds is 3:1.
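The segregation described above can be checked by enumerating every combination of parental alleles, which is what a Punnett square does. The following Python sketch is only an illustration: the function names cross and phenotype are hypothetical helpers, and it assumes a single gene with complete dominance of Y over y, as in the seed-color example.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Count offspring genotypes for a single-gene cross.

    Each parent is a two-letter genotype such as "Yy"; every pairing of
    one allele from each parent is treated as equally likely.
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that "yY" and "Yy" are counted as the same genotype
        # (uppercase letters sort before lowercase ones).
        offspring["".join(sorted((allele1, allele2)))] += 1
    return offspring


def phenotype(genotype, dominant="Y"):
    """Assumes complete dominance: one dominant allele gives yellow seeds."""
    return "yellow" if dominant in genotype else "green"


# Parental cross: true-breeding yellow (YY) x true-breeding green (yy).
f1 = cross("YY", "yy")

# F1 x F1 cross, giving the F2 generation.
f2 = cross("Yy", "Yy")
f2_phenotypes = Counter()
for genotype, count in f2.items():
    f2_phenotypes[phenotype(genotype)] += count

print("F1 genotypes:", dict(f1))
print("F2 genotypes:", dict(f2))
print("F2 phenotypes:", dict(f2_phenotypes))
```

Every F1 combination is Yy, and the F2 combinations fall into one YY, two Yy, and one yy, which gives three yellow-seeded plants for every green-seeded plant.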
Mendel's findings, apparently ahead of their time, were largely ignored until 1900, when Mendel's laws were rediscovered and their importance recognized. Shortly thereafter, the role of chromosomes as the carriers of genes was proposed. It was realized that most cells of higher plants and animals are diploid—containing two copies of each chromosome. Formation of the germ cells (the sperm and egg), however, involves a unique type of cell division (meiosis) in which only one member of each chromosome pair is transmitted to each progeny cell (Figure 3.2). Consequently, the sperm and egg are haploid, containing only one copy of each chromosome. The union of these two haploid cells at fertilization creates a new diploid organism, now containing one member of each chromosome pair derived from the male and one from the female parent. The behavior of chromosome pairs thus parallels that of genes, leading to the conclusion that genes are carried on chromosomes.
The fundamentals of mutation, genetic linkage, and the relationships between genes and chromosomes were largely established by experiments performed with the fruit fly, Drosophila melanogaster. Drosophila can be easily maintained in the laboratory, and they reproduce about every two weeks, which is a considerable advantage for genetic experiments. Indeed, these features continue to make Drosophila an organism of choice for genetic studies of animals, particularly the genetic analysis of development and differentiation.
In the early 1900s, a number of genetic alterations (mutations) were identified in Drosophila, usually affecting readily observable characteristics such as eye color or wing shape. Breeding experiments indicated that some of the genes governing these traits are inherited independently of each other, suggesting that these genes are located on different chromosomes that segregate independently during meiosis (Figure 3.3). Other genes, however, are frequently inherited together as paired characteristics. Such genes are said to be linked to each other by virtue of being located on the same chromosome. The number of groups of linked genes is the same as the number of chromosomes (four in Drosophila), supporting the idea that chromosomes are carriers of the genes.
Linkage between genes is not complete, however; chromosomes exchange material during meiosis, leading to recombination between linked genes (Figure 3.4). The frequency of recombination between two linked genes depends on the distance between them on the chromosome; genes that are close to each other recombine less frequently than do genes farther apart. Thus, the frequencies with which different genes recombine can be used to determine their relative positions on the chromosome, allowing the construction of genetic maps (Figure 3.5). By 1915, nearly a hundred genes had been defined and mapped onto the four chromosomes of Drosophila, leading to general acceptance of the chromosomal basis of heredity.
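The mapping logic described above can be sketched in a few lines of Python. Everything in this example is illustrative: the genes a, b, and c and their offspring counts are hypothetical, not data from the Drosophila experiments, and the sketch assumes that the two genes with the highest recombination frequency lie at the ends of the group, which is a reasonable approximation only for closely linked genes.

```python
def recombination_frequency(recombinants, total):
    """Fraction of offspring that carry a recombinant combination of traits."""
    return recombinants / total


# Hypothetical counts: (recombinant offspring, total offspring) per gene pair.
counts = {
    ("a", "b"): (50, 1000),
    ("b", "c"): (120, 1000),
    ("a", "c"): (170, 1000),
}

freqs = {pair: recombination_frequency(*c) for pair, c in counts.items()}

# The pair that recombines most often is assumed to be farthest apart,
# so the remaining gene must lie between them.
outer = max(freqs, key=freqs.get)
middle = ({"a", "b", "c"} - set(outer)).pop()
gene_order = (outer[0], middle, outer[1])

for pair, freq in freqs.items():
    print(f"{pair[0]}-{pair[1]}: {freq:.1%} recombination")
print("Inferred gene order:", "-".join(gene_order))
```

With these counts, a and c recombine most often, so b is placed between them. The two shorter intervals (5% and 12%) add up to the longer one (17%) because the counts were chosen that way; in real crosses, frequencies over long distances are less than additive because multiple exchanges can occur between the same two genes.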
Early genetic studies focused on the identification and chromosomal localization of genes that control readily observable characteristics, such as the eye color of Drosophila. How these genes lead to the observed phenotypes, however, was unclear. The first insight into the relationship between genes and enzymes came in 1909, when it was realized that the inherited human disease phenylketonuria (see Molecular Medicine in Chapter 2) results from a genetic defect in metabolism of the amino acid phenylalanine. This defect was hypothesized to result from a deficiency in the enzyme needed to catalyze the relevant metabolic reaction, leading to the general suggestion that genes specify the synthesis of enzymes.
Clearer evidence linking genes with the synthesis of enzymes came from experiments of George Beadle and Edward Tatum, performed in 1941 with the fungus Neurospora crassa. In the laboratory, Neurospora can be grown on minimal or rich media similar to those discussed in Chapter 1 for the growth of E. coli. For Neurospora, minimal media consist only of salts, glucose, and biotin; rich media are supplemented with amino acids, vitamins, purines, and pyrimidines. Beadle and Tatum isolated mutants of Neurospora that grew normally on rich media but could not grow on minimal media. Each mutant was found to require a specific nutritional supplement, such as a particular amino acid, for growth. Furthermore, the requirement for a specific nutritional supplement correlated with the failure of the mutant to synthesize that particular compound. Thus, each mutation resulted in a deficiency in a specific metabolic pathway. Since such metabolic pathways were known to be governed by enzymes, the conclusion from these experiments was that each gene specified the structure of a single enzyme: the one gene-one enzyme hypothesis. Many enzymes are now known to consist of multiple polypeptides, so the currently accepted statement of this hypothesis is that each gene specifies the structure of a single polypeptide chain.
Understanding the chromosomal basis of heredity and the relationship between genes and enzymes did not in itself provide a molecular explanation of the gene. Chromosomes contain proteins as well as DNA, and it was initially thought that genes were proteins. The first evidence leading to the identification of DNA as the genetic material came from studies in bacteria. These experiments represent a prototype for current approaches to defining the function of genes by introducing new DNA sequences into cells, as discussed later in this chapter.
The experiments that defined the role of DNA were derived from studies of the bacterium that causes pneumonia (Pneumococcus). Virulent strains of Pneumococcus are surrounded by a polysaccharide capsule that protects the bacteria from attack by the immune system of the host. Because the capsule gives bacterial colonies a smooth appearance in culture, encapsulated strains are denoted S. Mutant strains that have lost the ability to make a capsule (denoted R) form rough-edged colonies in culture and are no longer lethal when inoculated into mice. In 1928 it was observed that mice inoculated with nonencapsulated (R) bacteria plus heat-killed encapsulated (S) bacteria developed pneumonia and died. Importantly, the bacteria that were then isolated from these mice were of the S type. Subsequent experiments showed that a cell-free extract of S bacteria was similarly capable of converting (or transforming) R bacteria to the S state. Thus, a substance in the S extract (called the transforming principle) was responsible for inducing the genetic transformation of R to S bacteria.
In 1944 Oswald Avery, Colin MacLeod, and Maclyn McCarty established that the transforming principle was DNA, both by purifying it from bacterial extracts and by demonstrating that the activity of the transforming principle is abolished by enzymatic digestion of DNA but not by digestion of proteins (Figure 3.6). Although these studies did not immediately lead to the acceptance of DNA as the genetic material, they were extended within a few years by experiments with bacterial viruses. In particular, it was shown that, when a bacterial virus infects a cell, the viral DNA rather than the viral protein must enter the cell in order for the virus to replicate. Moreover, the parental viral DNA (but not the protein) is transmitted to progeny virus particles. The concurrence of these results with continuing studies of the activity of DNA in bacterial transformation led to acceptance of the idea that DNA is the genetic material.
Our understanding of the three-dimensional structure of DNA, deduced in 1953 by James Watson and Francis Crick, has been the basis for present-day molecular biology. At the time of Watson and Crick's work, DNA was known to be a polymer composed of four nucleic acid bases—two purines (adenine [A] and guanine [G]) and two pyrimidines (cytosine [C] and thymine [T])—linked to phosphorylated sugars. Given the central role of DNA as the genetic material, elucidation of its three-dimensional structure appeared critical to understanding its function. Watson and Crick's consideration of the problem was heavily influenced by Linus Pauling's description of hydrogen bonding and the α helix, a common element of the secondary structure of proteins (see Chapter 2). Moreover, experimental data on the structure of DNA were available from X-ray crystallography studies by Maurice Wilkins and Rosalind Franklin. Analysis of these data revealed that the DNA molecule is a helix that turns every 3.4 nm. In addition, the data showed that the distance between adjacent bases is 0.34 nm, so there are ten bases per turn of the helix. An important finding was that the diameter of the helix is approximately 2 nm, suggesting that it is composed of not one but two DNA chains.
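The arithmetic behind the figure of ten bases per turn can be written out explicitly. This Python sketch uses only the two measurements quoted above; the function names and the 1000-base example are hypothetical and serve only to show how the numbers relate.

```python
# Measurements quoted in the text, in nanometers.
HELIX_PITCH_NM = 3.4   # length of one complete turn of the helix
BASE_RISE_NM = 0.34    # distance between adjacent bases along the helix


def bases_per_turn(pitch_nm=HELIX_PITCH_NM, rise_nm=BASE_RISE_NM):
    """Number of bases stacked within one turn of the helix."""
    # round() guards against floating-point error in the division.
    return round(pitch_nm / rise_nm)


def helix_length_nm(n_bases, rise_nm=BASE_RISE_NM):
    """Approximate length of a helix n_bases long."""
    return n_bases * rise_nm


def helix_turns(n_bases):
    """Number of complete turns made by a helix n_bases long."""
    return n_bases / bases_per_turn()


print("Bases per turn:", bases_per_turn())
print("Length of a 1000-base helix (nm):", helix_length_nm(1000))
print("Turns in a 1000-base helix:", helix_turns(1000))
```

Dividing the 3.4 nm pitch by the 0.34 nm spacing gives ten bases per turn, so a hypothetical stretch of 1000 bases would run about 340 nm and make 100 turns.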
From these data, Watson and Crick built their model of DNA (Figure 3.7). The central features of the model are that DNA is a double helix with the sugar-phosphate backbones on the outside of the molecule. The bases are on the inside, oriented such that hydrogen bonds are formed between purines and pyrimidines on opposite chains. The base pairing is very specific: A always pairs with T and G with C. This specificity accounts for the earlier results of Erwin Chargaff, who had analyzed the base composition of various DNAs and found that the amount of adenine was always equal to that of thymine, and the amount of guanine to that of cytosine. Because of this specific base pairing, the two strands of a DNA molecule are complementary: Each strand contains all the information required to specify the sequences of bases on the other.
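Complementarity is simple enough to express as a lookup table. In the Python sketch below, the ten-base sequence is an arbitrary hypothetical example, and the helper names complement and duplex_composition are assumptions made for illustration. The sketch pairs bases position by position and does not model the orientation of the two strands.

```python
from collections import Counter

# Base-pairing rules from the model: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the bases that pair, position by position, with strand."""
    return "".join(PAIRS[base] for base in strand)


def duplex_composition(strand):
    """Count each base across both strands of the double helix."""
    return Counter(strand + complement(strand))


strand = "ATGCGGATTC"  # hypothetical sequence
partner = complement(strand)
composition = duplex_composition(strand)

print("Strand: ", strand)
print("Partner:", partner)
print("Composition of the duplex:", dict(composition))
```

Because every A on one strand is matched by a T on the other, and every G by a C, the duplex always contains equal amounts of A and T and equal amounts of G and C, whatever the sequence of the individual strand. Applying complement to the partner strand returns the original sequence, which is the sense in which each strand specifies the other.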
The discovery of complementary base pairing between DNA strands immediately suggested a molecular solution to the question of how the genetic material could direct its own replication—a process that is required each time a cell divides. It was proposed that the two strands of a DNA molecule could separate and serve as templates for synthesis of new complementary strands, the sequence of which would be dictated by the specificity of base pairing (Figure 3.8). The process is called semiconservative replication because one strand of parental DNA is conserved in each progeny DNA molecule.
Direct support for semiconservative DNA replication was obtained in 1958 as a result of elegant experiments, performed by Matthew Meselson and Frank Stahl, in which DNA was labeled with isotopes that altered its density (Figure 3.9). E. coli were first grown in media containing the heavy isotope of nitrogen (15N) in place of the normal light isotope (14N). The DNA of these bacteria consequently contained 15N and was heavier than that of bacteria grown in 14N. Such heavy DNA could be separated from DNA containing 14N by equilibrium centrifugation in a density gradient of CsCl. This ability to separate heavy (15N) DNA from light (14N) DNA enabled the study of DNA synthesis. E. coli that had been grown in 15N were transferred to media containing 14N and allowed to replicate one more time. Their DNA was then extracted and analyzed by CsCl density gradient centrifugation. The results of this analysis indicated that all of the heavy DNA had been replaced by newly synthesized DNA with a density intermediate between that of heavy (15N) and that of light (14N) DNA molecules. The implication was that during replication, the two parental strands of heavy DNA separated and served as templates for newly synthesized progeny strands of light DNA, yielding double-stranded molecules of intermediate density. This experiment thus provided direct evidence for semiconservative DNA replication, clearly underscoring the importance of complementary base pairing between strands of the double helix.
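The logic of the experiment can be followed in a toy simulation. This Python sketch is a simplified model, not a description of the actual procedure: each DNA molecule is reduced to a pair of strand labels, "H" for 15N and "L" for 14N, and the function names are hypothetical. The second round of replication shown here is a prediction of the semiconservative model, not a result reported in the text above.

```python
from collections import Counter


def replicate(duplexes, new_label="L"):
    """One round of semiconservative replication.

    Each parental strand is kept and paired with a newly made strand,
    which carries the isotope present in the growth medium.
    """
    progeny = []
    for strand1, strand2 in duplexes:
        progeny.append((strand1, new_label))
        progeny.append((strand2, new_label))
    return progeny


def density_bands(duplexes):
    """Classify each molecule by how many heavy strands it contains."""
    names = {0: "light", 1: "intermediate", 2: "heavy"}
    return Counter(names[duplex.count("H")] for duplex in duplexes)


# Start with one fully heavy molecule from cells grown in 15N.
generation0 = [("H", "H")]

# Transfer to 14N medium: every new strand is light.
generation1 = replicate(generation0)
generation2 = replicate(generation1)

print("Generation 0:", dict(density_bands(generation0)))
print("Generation 1:", dict(density_bands(generation1)))
print("Generation 2:", dict(density_bands(generation2)))
```

After one round, both progeny molecules contain one heavy and one light strand, matching the single band of intermediate density that was observed. Under the same model, a second round would give equal numbers of intermediate and light molecules.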
The ability of DNA to serve as a template for its own replication was further established with the demonstration that an enzyme purified from E. coli (DNA polymerase) could catalyze DNA replication in vitro. In the presence of DNA to act as a template, DNA polymerase was able to direct the incorporation of nucleotides into a complementary DNA molecule.