Molecular Basis Of Inheritance Class 12 Biology Chapter 5 Notes

Structure of DNA

1. Deoxyribonucleotides:

DNA (deoxyribonucleic acid) is composed of repeating units called deoxyribonucleotides.
A deoxyribonucleotide consists of three main components: a deoxyribose sugar molecule, a phosphate group, and a nitrogenous base.

2. Base Pairs:

DNA is typically measured in terms of the number of base pairs (bp) it contains.
Base pairs are formed when two complementary nitrogenous bases interact through hydrogen bonds. The four nitrogenous bases in DNA are adenine (A), thymine (T), cytosine (C), and guanine (G).
The base pairs in DNA are A-T (adenine-thymine) and C-G (cytosine-guanine).

3. Double Helix Structure:

DNA has a double-helix structure, which is often described as a twisted ladder or spiral staircase.
The backbone of the DNA molecule is made up of alternating sugar (deoxyribose) and phosphate groups, which form the sides or rails of the ladder.
The nitrogenous bases project inward from the sugar-phosphate backbone and form the steps or rungs of the ladder.
The base pairs are held together by hydrogen bonds: A pairs with T, and C pairs with G.

4. Complementary Strands:

DNA consists of two complementary strands that run in opposite directions.
The strands are said to be antiparallel: one runs in the 5′ to 3′ direction and the other in the 3′ to 5′ direction.
This antiparallel arrangement allows the formation of complementary base pairs, where A always pairs with T and C always pairs with G.

5. Base Pairing Rules:

Adenine (A) always pairs with thymine (T), forming two hydrogen bonds between them.
Cytosine (C) always pairs with guanine (G), forming three hydrogen bonds between them.
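
The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are made up for this sketch and do not come from the notes.

```python
# Watson-Crick partners and the number of hydrogen bonds in each pair.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}

def complementary_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    # Pair every base, reading backwards because the strands are antiparallel.
    return "".join(PAIR[base] for base in reversed(strand_5_to_3))

def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand)

example = "ATGC"  # hypothetical sequence, chosen only for illustration
print(complementary_strand(example))  # GCAT
print(hydrogen_bonds(example))        # 10
```

Two A-T pairs contribute 2 bonds each and two C-G pairs contribute 3 each, which gives the total of 10.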

6. Base Pairing Specificity:

The specificity of base pairing ensures that the genetic information is accurately replicated during DNA replication and maintained through generations.

7. Overall Function:

DNA serves as the genetic material in most organisms, carrying the instructions for the synthesis of proteins and other cellular components.
It stores, replicates, and transmits genetic information from one generation to the next.

Structure of Polynucleotide Chain

The chemical structure of a polynucleotide chain, whether it’s DNA or RNA, consists of several key components:

1. Nitrogenous Bases: A nucleotide, the monomer of nucleic acids, contains a nitrogenous base. There are two categories of nitrogenous bases: purines (adenine and guanine) and pyrimidines (cytosine, uracil in RNA, and thymine in DNA).

2. Pentose Sugar: The nucleotide also includes a pentose sugar. In DNA, the sugar is deoxyribose, while in RNA, it is ribose. Ribose carries a hydroxyl (-OH) group at the 2′ position; deoxyribose lacks it.

3. Phosphate Group: Each nucleotide also contains a phosphate group.

Here’s how these components come together to form a polynucleotide chain:

A nitrogenous base is attached to the 1′ carbon of the pentose sugar through an N-glycosidic linkage, forming a nucleoside. Examples include adenosine, deoxyadenosine, guanosine, deoxyguanosine, cytidine, deoxycytidine, uridine, and deoxythymidine.
When a phosphate group is linked to the 5′ carbon of the sugar through a phosphoester linkage, a nucleotide (or deoxynucleotide, depending on the type of sugar) is formed. This addition of the phosphate group to the nucleoside completes the nucleotide structure.
Two nucleotides can be linked together through a 3′-5′ phosphodiester linkage, connecting the 3′ carbon of one sugar to the 5′ carbon of the next sugar. This forms a dinucleotide. More nucleotides can be added in a similar manner to create a polynucleotide chain.
The backbone of the polynucleotide chain is formed by the repeating sugar-phosphate units, which create the sides or rails of the DNA or RNA molecule.
The nitrogenous bases are attached to the sugar moieties and project from the backbone; in double-stranded DNA they point inward and form the steps or rungs of the ladder.
In DNA, base pairing occurs between complementary nitrogenous bases: adenine (A) pairs with thymine (T), and cytosine (C) pairs with guanine (G). These base pairs are held together by hydrogen bonds.
The DNA molecule has a double-helix structure, with two complementary strands running in opposite directions (antiparallel). This structure is stabilized by the formation of base pairs and stacking interactions between adjacent base pairs.
The ends of the DNA strands are labeled as the 5′-end (with a free phosphate group) and the 3′-end (with a free hydroxyl group).

Packaging of DNA Helix

The packaging of the long DNA helix into the confined space of a cell (the nucleoid in prokaryotes, the nucleus in eukaryotes) is a remarkable feat of organization and compaction.

Here’s how it’s achieved in both prokaryotic and eukaryotic cells:

Prokaryotic Cells (e.g., E. coli):

DNA is organized in a region called the nucleoid.
DNA is not randomly scattered but forms large loops.
Proteins in the nucleoid interact with DNA through positively charged regions.
This interaction helps condense and organize the DNA within the cell.

Eukaryotic Cells (e.g., Mammalian Cells):

DNA is associated with histone proteins.
Histones are rich in positively charged amino acids.
DNA wraps around a histone octamer, forming a nucleosome.
A nucleosome contains about 200 base pairs of DNA.
Nucleosomes create a “beads-on-a-string” structure.
Chromatin fibers, consisting of nucleosomes and linker DNA, are further coiled and condensed to form chromosomes during cell division.
Non-histone chromosomal (NHC) proteins are involved in this process.

Number of Base Pairs in E. coli DNA:

Length of E. coli DNA: 1.36 mm
Distance between two consecutive base pairs: 0.34 nm (0.34 × 10^(-9) m/bp)
Calculation: Number of base pairs = Length of DNA / Distance between two base pairs
Result: Approximately 4 million base pairs in E. coli DNA.
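
As a quick check of the arithmetic, the division can be carried out in Python. The variable names are illustrative; the two input values are the ones quoted above.

```python
# Values quoted in the notes.
dna_length_m = 1.36e-3   # 1.36 mm expressed in metres
rise_per_bp_m = 0.34e-9  # 0.34 nm between consecutive base pairs, in metres

# Number of base pairs = total length / distance per base pair.
base_pairs = dna_length_m / rise_per_bp_m
print(f"{base_pairs:.2e} bp")  # 4.00e+06 bp
```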

Number of Nucleosomes in a Mammalian Cell:

Human genome size: Approximately 3.3 billion base pairs (bp) per haploid set, so about 6.6 billion bp in a diploid cell
Length of DNA in one nucleosome: About 200 bp
Calculation: Number of nucleosomes = Total DNA / Length of DNA in one nucleosome
Result: Theoretically, about 16.5 million nucleosomes per haploid genome, or about 33 million in a typical diploid mammalian cell nucleus.
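
The same estimate can be written as a short calculation. The haploid genome size and the 200 bp per nucleosome are the figures from the notes; treating a typical cell as diploid (two haploid sets) is the assumption behind the second number.

```python
HAPLOID_GENOME_BP = 3_300_000_000  # about 3.3 billion bp, as quoted above
BP_PER_NUCLEOSOME = 200            # approximate DNA length per nucleosome

def nucleosome_count(total_bp: int) -> int:
    """Theoretical number of nucleosomes needed to package total_bp of DNA."""
    return total_bp // BP_PER_NUCLEOSOME

haploid = nucleosome_count(HAPLOID_GENOME_BP)
diploid = nucleosome_count(2 * HAPLOID_GENOME_BP)  # assumes two haploid sets
print(haploid)  # 16500000
print(diploid)  # 33000000
```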

DNA and DNase are distinct entities with different roles and properties

DNA (Deoxyribonucleic Acid):

DNA is a molecule found in the cell nucleus and sometimes in the mitochondria of eukaryotic cells, as well as in prokaryotic cells.
It carries genetic information and instructions for the development, functioning, growth, and reproduction of all known living organisms.
DNA consists of a double helix structure made up of nucleotides, with adenine (A), thymine (T), cytosine (C), and guanine (G) as the four nitrogenous bases.
DNA is responsible for encoding genes, which are segments of DNA that carry specific instructions for building proteins or performing other cellular functions.

DNase (Deoxyribonuclease):

DNase is an enzyme that catalyzes the hydrolysis of DNA into its constituent nucleotides (deoxyribonucleotides).
It plays a role in DNA degradation and is crucial for various cellular processes, including DNA repair, apoptosis (programmed cell death), and clearance of DNA from the bloodstream.
DNase can be found in different forms, such as DNase I and DNase II, each with specific functions. DNase I, for example, cleaves DNA at random sites, whereas DNase II primarily functions in lysosomes.
DNase is not involved in storing genetic information but rather in breaking down and recycling DNA molecules within cells or in extracellular environments.

The Genetic Material is DNA

Alfred Hershey and Martha Chase’s experiments with bacteriophages provided definitive evidence that DNA is the genetic material responsible for heredity.

Here is a summary of their key findings:

Bacteriophages (Phages): Bacteriophages are viruses that infect bacterial cells. They consist of a protein coat (capsid) and genetic material, either DNA or RNA. In the case of the experiments, they used T2 bacteriophages that had DNA as their genetic material.
Radioactive Labeling: Hershey and Chase used a clever method to label either the DNA or the protein of the bacteriophages with radioactive isotopes. They grew some phages in a medium containing radioactive phosphorus-32 (32P), which specifically labels DNA, and others in a medium containing radioactive sulfur-35 (35S), which labels proteins.
Infection of Bacterial Cells: These radioactive-labeled phages were allowed to infect Escherichia coli (E. coli) bacterial cells. During the infection, the phages inject their genetic material into the bacterial cells.
Blender Experiment: To separate the viral protein coats from the bacterial cells, Hershey and Chase used a blender. This mechanical action removed the outer protein coat of the phages, leaving only the genetic material inside the bacterial cells.
Centrifugation: After the blending, the mixture of infected bacterial cells and detached viral protein coats was spun in a centrifuge. This process separated the heavier bacterial cells, which formed a pellet, from the lighter viral protein coats, which stayed in the supernatant.
Radioactive Tracer: The key observation was that the bacterial cells that had been infected with phages containing radioactive DNA (32P) became radioactive, while those infected with phages containing radioactive proteins (35S) did not; the 35S label remained with the protein coats in the supernatant.

These results conclusively demonstrated that it was the DNA from the phages that entered the bacterial cells and directed the synthesis of new phage particles within the bacterial host. The fact that the radioactive label was found inside the bacterial cells provided strong evidence that DNA, not protein, carried the genetic information responsible for heredity. This experiment played a crucial role in establishing DNA as the genetic material and paved the way for further discoveries in molecular biology.

Properties of Genetic Material (DNA versus RNA)

Here are some of the properties that distinguish DNA from RNA, and why DNA is the predominant genetic material:

Chemical Stability: DNA is chemically stable compared to RNA. RNA contains a 2′-OH (hydroxyl) group on its ribose sugar, making it more reactive and susceptible to hydrolysis. In contrast, DNA lacks this 2′-OH group, making it more stable over time.
Structural Stability: DNA has a double-stranded helical structure, which provides stability and protection to the genetic information. The complementary base pairing between adenine (A) and thymine (T), as well as cytosine (C) and guanine (G), ensures structural stability. RNA, typically single-stranded, is more prone to structural changes.
Replication: Both DNA and RNA can replicate, but DNA is more suited for this purpose. The double-stranded nature of DNA allows for accurate and efficient replication through the complementary base-pairing mechanism. RNA replication, while possible, is less reliable due to its single-stranded structure.
Mutation Rate: RNA molecules have a higher mutation rate compared to DNA. RNA viruses, for example, mutate more rapidly, leading to genetic diversity within viral populations. DNA’s greater stability results in a lower mutation rate, which is important for maintaining genetic integrity over long periods.
Expression of Genetic Information: RNA plays a central role in gene expression as messenger RNA (mRNA) carries the genetic code from DNA to ribosomes for protein synthesis. RNA can directly code for proteins, making it essential for gene expression. DNA, on the other hand, serves as the stable repository of genetic information.
Genetic Storage: DNA’s stability and double-stranded structure make it well-suited for long-term storage of genetic information. It can withstand environmental conditions and cellular processes without significant degradation. This property is crucial for preserving an organism’s genetic heritage.

While both DNA and RNA can function as genetic material, DNA is preferred for long-term storage and genetic stability. RNA, with its dynamic and versatile nature, is better suited for gene expression and rapid adaptation, as seen in RNA viruses. Each nucleic acid serves its unique role in the molecular processes of life.

RNA WORLD

RNA was likely the first genetic material: There is substantial evidence to suggest that RNA played a crucial role in the early stages of life’s evolution. RNA is versatile, as it can store genetic information and act as a catalyst (ribozyme) in various biochemical reactions. This dual functionality suggests that RNA may have been the precursor to both genetic information storage and enzymatic activity.
RNA as a genetic material: Early life forms might have used RNA as their genetic material, with information encoded in RNA sequences. RNA molecules can carry genetic information, replicate themselves (albeit less efficiently than DNA), and form simple structures that could have served as primitive genomes.
RNA as a catalyst: Ribozymes, which are RNA molecules capable of catalyzing chemical reactions, provide evidence for RNA’s role as a catalyst in early life. Some essential cellular processes, such as splicing of RNA molecules, are still catalyzed by ribozymes today.
Transition to DNA: Over time, DNA likely evolved from RNA as a more stable and reliable genetic material. DNA’s double-stranded structure and the presence of thymine instead of uracil are modifications that enhance its stability. DNA’s ability to resist changes and mutations may have conferred an evolutionary advantage, leading to its predominance as the genetic material in modern organisms.
DNA repair mechanisms: DNA evolved repair mechanisms to maintain its integrity. These repair mechanisms help correct errors and damage in the DNA sequence, contributing to DNA’s long-term stability.
The transition from RNA to DNA, and the emergence of complex cellular processes, are fascinating topics in the study of the origin of life. While the details of these processes continue to be investigated, it is clear that both RNA and DNA have played pivotal roles in the evolution and functioning of living organisms.

Replication

The process of DNA replication is a fundamental aspect of biology and is essential for the transmission of genetic information from one generation to the next. Watson and Crick’s proposal of the double helical structure of DNA in 1953 also included a scheme for DNA replication, which they termed “semiconservative DNA replication.” Here’s a brief overview of the process:

Semiconservative DNA Replication:

Separation of DNA Strands: DNA replication begins with the separation of the two complementary DNA strands. This separation is facilitated by an enzyme called DNA helicase, which unwinds the double helix by breaking the hydrogen bonds between base pairs. This results in the formation of two single-stranded DNA templates.
Priming: Before DNA polymerase (the enzyme responsible for DNA synthesis) can start building a new DNA strand, a short RNA primer is synthesized on each of the separated DNA strands. This primer provides a starting point for DNA polymerase.
DNA Synthesis: DNA polymerase adds nucleotides to the growing DNA strand in a 5′ to 3′ direction, using the complementary base-pairing rules (A pairs with T, and C pairs with G). The enzyme reads the template strand and selects the appropriate nucleotide to add to the newly forming strand.
Leading and Lagging Strands: Because DNA synthesis can only occur in the 5′ to 3′ direction, the two template strands are replicated differently. One strand, known as the leading strand, is synthesized continuously in the direction of the replication fork (where the DNA strands are unwinding). The other strand, called the lagging strand, is synthesized discontinuously in short fragments known as Okazaki fragments, away from the replication fork.
Ligase Joins Fragments: On the lagging strand, the RNA primers used to initiate DNA synthesis are eventually replaced by DNA, and the fragments are joined together by an enzyme called DNA ligase. This results in a continuous complementary strand.
Proofreading and Repair: DNA polymerase has a proofreading function to correct errors in the newly synthesized DNA strand. Additional repair mechanisms exist to fix any mismatched or damaged bases.
Result: After replication, each DNA molecule consists of one parental strand and one newly synthesized strand. This process ensures that genetic information is faithfully passed on to daughter cells during cell division.

Semiconservative DNA replication ensures that each new DNA molecule retains one strand from the original DNA molecule, providing genetic continuity between generations.

This process of DNA replication is fundamental to all living organisms and is a key feature that allows for the inheritance of genetic information with high fidelity.

The Experimental Proof

The experiment conducted by Matthew Meselson and Franklin Stahl in 1958 provided strong evidence for the semiconservative replication of DNA.

Here’s a summary of their experiment and its results:

Labeling DNA with Heavy Isotope (15N): E. coli bacteria were grown in a medium containing 15NH4Cl, where 15N is a heavy isotope of nitrogen. During DNA replication, the nitrogen in the nucleotides was incorporated from the medium, leading to the production of DNA with a higher density due to the presence of heavy nitrogen.
Transfer to Normal Medium (14N): The E. coli cells were then transferred to a medium containing normal 14NH4Cl, where 14N is the lighter isotope of nitrogen. This switch allowed the researchers to track the incorporation of the lighter nitrogen isotope into newly synthesized DNA.
Sampling at Different Time Intervals: Samples of DNA were collected at various time intervals as the cells continued to divide and replicate their DNA. The samples were taken at specific time points corresponding to one or more generations of bacterial growth.
Density Gradient Centrifugation: The DNA samples collected at different time intervals were subjected to centrifugation in a cesium chloride (CsCl) density gradient. The CsCl gradient allowed the separation of DNA molecules based on their densities.
Results: The results of the CsCl gradient centrifugation showed that after one generation of bacterial growth (20 minutes), the DNA had an intermediate density, indicating that it was a hybrid molecule consisting of both heavy (15N-labeled) and light (14N-labeled) DNA strands. After another generation (40 minutes), the DNA was composed of equal amounts of this hybrid DNA and “light” DNA (fully labeled with 14N).

The experiment demonstrated that DNA replication is semiconservative because each DNA molecule formed in the first generation contains one original (parental) strand (labeled with 15N) and one newly synthesized strand (labeled with 14N). This pairing of heavy and light strands produced intermediate-density DNA after one generation; after two generations, half of the molecules were still hybrid and half were fully “light,” containing only 14N.

If E. coli were allowed to grow for 80 minutes (four generations), the two original 15N strands would still sit in two hybrid molecules out of sixteen, so the DNA would be 12.5% hybrid and 87.5% light. The hybrid fraction halves with every further generation.
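
The expected band pattern can be reproduced with a small simulation. This is a minimal sketch, assuming one fully heavy starting molecule and strictly semiconservative copying; the function names are invented for the illustration.

```python
from collections import Counter

def replicate(molecules):
    """One semiconservative round: every old strand pairs with a new 14N strand."""
    return [(strand, "14N") for duplex in molecules for strand in duplex]

def classify(duplex):
    """Label a duplex by how many of its two strands carry 15N."""
    return {2: "heavy", 1: "hybrid", 0: "light"}[duplex.count("15N")]

def after_generations(n):
    """Counts of heavy, hybrid, and light duplexes n generations after the switch to 14N."""
    population = [("15N", "15N")]  # a single fully heavy starting duplex
    for _ in range(n):
        population = replicate(population)
    return Counter(classify(duplex) for duplex in population)

for generation in (1, 2, 4):
    print(generation, dict(after_generations(generation)))
```

After one generation both molecules are hybrid, after two the split is 2 hybrid to 2 light, and after four it is 2 hybrid to 14 light.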

This experiment provided strong experimental support for Watson and Crick’s proposal of semiconservative DNA replication and has since become a classic example in molecular biology.

The Machinery and the Enzymes

DNA replication is a complex and highly regulated process in living cells, requiring the involvement of various enzymes and molecular components to ensure accuracy and efficiency.

Here are some key points about DNA replication:

DNA Polymerase: DNA replication is catalyzed by enzymes called DNA polymerases. These enzymes use a DNA template to synthesize a new complementary strand of DNA by adding deoxynucleotides (A, T, C, and G) to the growing strand. DNA polymerases work in the 5′ to 3′ direction, meaning they add nucleotides to the 3′ end of the growing DNA strand.
Efficiency and Speed: DNA replication must occur rapidly and with high accuracy to ensure that genetic information is faithfully copied. In E. coli, replication of its 4.6 million base pairs is completed in about 18 minutes, with an average rate of polymerization of approximately 2000 base pairs per second.
Accuracy: High-fidelity DNA replication is essential to prevent mutations. DNA polymerases have proofreading mechanisms to correct errors that may occur during replication. This proofreading activity helps maintain the integrity of the genetic information.
Energy Source: Deoxyribonucleoside triphosphates (dNTPs), which are the building blocks of DNA, also serve as an energy source for the polymerization reaction. When a dNTP is incorporated into the growing DNA strand, it releases energy from the hydrolysis of its two terminal phosphates.
Replication Fork: In long DNA molecules, replication occurs within a small region known as the replication fork. The replication fork is formed when the DNA double helix is unwound, and the two strands separate, creating a Y-shaped structure.
Leading and Lagging Strands: DNA polymerases synthesize DNA in a continuous manner on one strand (the leading strand) and discontinuously on the other strand (the lagging strand). On the lagging strand, short fragments of DNA called Okazaki fragments are synthesized in the 5′ to 3′ direction away from the replication fork.
DNA Ligase: The discontinuously synthesized fragments on the lagging strand are later joined together by an enzyme called DNA ligase. DNA ligase seals the gaps between Okazaki fragments, creating a continuous strand of DNA.
Origin of Replication: DNA replication does not start randomly at any location in the DNA molecule. In prokaryotes like E. coli, there are specific regions known as the origin of replication where replication initiates. These origins of replication have specific sequences recognized by initiator proteins.
Eukaryotic Replication: In eukaryotic cells, DNA replication occurs during the S-phase of the cell cycle and is tightly regulated. Eukaryotic DNA replication involves multiple origins of replication and additional complexities compared to prokaryotic replication.
Polyploidy: Failure in coordinating DNA replication and cell division can lead to polyploidy, a condition where an organism has extra sets of chromosomes. This can have serious consequences for the organism’s development and viability.
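
The speed quoted under "Efficiency and Speed" can be checked with simple arithmetic. The genome size and replication time are the figures from the notes; the split between two replication forks moving away from a single origin is an assumption added here, not something the notes state.

```python
GENOME_BP = 4_600_000   # E. coli genome size quoted in the notes
REPLICATION_MIN = 18    # replication time quoted in the notes
FORKS = 2               # assumption: two forks moving away from one origin

def rate_per_second(genome_bp, minutes, forks=1):
    """Average base pairs copied per second, optionally shared between forks."""
    return genome_bp / (minutes * 60) / forks

overall = rate_per_second(GENOME_BP, REPLICATION_MIN)
per_fork = rate_per_second(GENOME_BP, REPLICATION_MIN, FORKS)
print(f"{overall:.0f} bp/s overall, {per_fork:.0f} bp/s per fork")
# 4259 bp/s overall, 2130 bp/s per fork
```

Under the two-fork assumption, the per-fork rate comes out close to the approximate figure of 2000 bp per second.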

Overall, DNA replication is a highly orchestrated process involving multiple enzymes, proteins, and regulatory mechanisms to ensure the accurate duplication of genetic information in living cells.

Transcription

Transcription is the process by which genetic information encoded in DNA is copied into a complementary RNA molecule. Here are some key points about transcription:

Complementarity: Like DNA replication, transcription also relies on the principle of complementarity between nucleotide bases. However, in transcription, adenine (A) in the DNA template strand pairs with uracil (U) in the newly synthesized RNA strand, while cytosine (C) in DNA pairs with guanine (G) in RNA. This is because RNA contains uracil instead of thymine (T).
Selective Copying: During transcription, only a specific segment of DNA is copied into RNA. This segment, known as a gene, contains the instructions for making a particular protein or RNA molecule. Genes have specific start and stop signals that define the boundaries of the segment to be transcribed.
Template Strand: In transcription, only one of the DNA strands serves as a template for RNA synthesis. This strand is called the template strand, while the other strand is the non-template or coding strand. The RNA molecule is synthesized in the 5′ to 3′ direction, complementary to the template strand.
Single-Stranded RNA: Unlike DNA replication, which produces a double-stranded DNA molecule, transcription results in a single-stranded RNA molecule. This RNA molecule is complementary to the template strand of DNA and carries genetic information in the form of a linear sequence of nucleotides.
One Gene, One RNA: Each gene on the DNA molecule codes for a specific RNA molecule, which can subsequently be used to synthesize a protein. This one-to-one relationship between genes and RNA molecules ensures that the genetic information is accurately transferred to the protein synthesis machinery.
Preventing Double-Stranded RNA: If both DNA strands were transcribed simultaneously, they would produce two complementary RNA strands that could potentially form a double-stranded RNA molecule. Double-stranded RNA can interfere with normal cellular processes, including protein synthesis. Therefore, transcription is carefully regulated to avoid this situation.

In summary, transcription is a fundamental process that allows genetic information stored in DNA to be converted into RNA molecules, which can then serve as templates for protein synthesis. It is a selective and controlled process that ensures accurate transfer of genetic information and avoids the formation of double-stranded RNA.

Transcription Unit

Definition of a Transcription Unit:

A transcription unit in DNA consists of three key regions:

Promoter
Structural gene
Terminator

DNA Strand Polarity:

Within the structural gene of a transcription unit, DNA strands exhibit opposite polarities.
DNA-dependent RNA polymerase operates in the 5′→3′ direction.
The strand with 3′→5′ polarity acts as the template strand.
The strand with 5′→3′ polarity, with a sequence identical to RNA (except for thymine instead of uracil), is known as the coding strand. Despite its name, it is not transcribed and does not code for any RNA.

Example DNA Sequence:

Template Strand: 3′ -ATGCATGCATGCATGCATGCATGC-5′
Coding Strand: 5′ -TACGTACGTACGTACGTACGTACG-3′

RNA Sequence Transcribed:

RNA sequence transcribed from the above DNA:

5′ -UACGUACGUACGUACGUACGUACG-3′
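
The complementarity rule for this example can be sketched in Python. The function name is invented for the illustration, and the template is assumed to be written 3′ to 5′ exactly as in the example, so the RNA comes out written 5′ to 3′.

```python
# DNA template base -> RNA base it pairs with (uracil replaces thymine in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """RNA written 5' to 3', built from a template strand written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

template = "ATGCATGCATGCATGCATGCATGC"  # template strand, 3' to 5'
coding = "TACGTACGTACGTACGTACGTACG"    # coding strand, 5' to 3'

rna = transcribe(template)
print(rna)  # UACGUACGUACGUACGUACGUACG

# The RNA matches the coding strand with U in place of T.
print(rna == coding.replace("T", "U"))  # True
```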

Promoter and Terminator:

The promoter is located upstream (towards the 5′ end) of the structural gene (with reference to the coding strand’s polarity).
The promoter contains a DNA sequence that serves as a binding site for RNA polymerase.
The presence of a promoter determines which strand acts as the template and coding strand.
Switching the position of the promoter with the terminator can reverse the definitions of coding and template strands.
The terminator is situated downstream (towards the 3′ end) of the coding strand.
The terminator typically marks the end of the transcription process.

Regulatory Sequences:

Additional regulatory sequences may exist both upstream and downstream of the promoter.
These regulatory sequences play roles in the regulation of gene expression.

Transcription Unit and Gene Definition

The Nature of a Gene:

A gene is the fundamental unit of inheritance, residing within DNA.
Defining a gene solely based on DNA sequence can be challenging.
DNA sequences encoding tRNA or rRNA molecules are also considered genes.

Monocistronic and Polycistronic Genes:

A cistron is a DNA segment coding for a polypeptide.
Structural genes in transcription units can be monocistronic (common in eukaryotes) or polycistronic (common in bacteria/prokaryotes).
In eukaryotes, monocistronic structural genes have segmented coding sequences; genes are split.
Exons are coding sequences appearing in mature or processed RNA.
Introns are non-coding sequences that do not appear in mature RNA.
The presence of introns complicates gene definition based on DNA segments.

Role of Promoter and Regulatory Sequences:

Character inheritance is influenced by promoter and regulatory sequences of a structural gene.
Regulatory sequences may be loosely termed “regulatory genes,” although they don’t encode RNA or proteins.

Types of RNA and the Transcription Process

RNA Types in Bacteria:

In bacteria, three main types of RNA are crucial for protein synthesis:

mRNA (messenger RNA): Provides the genetic code template.
tRNA (transfer RNA): Transports amino acids and reads the genetic code.
rRNA (ribosomal RNA): Plays structural and catalytic roles during translation.

Transcription Process in Bacteria:

Bacteria use a single DNA-dependent RNA polymerase for transcription.
Transcription involves three main steps: initiation, elongation, and termination.
RNA polymerase binds to a promoter to initiate transcription.
It uses nucleoside triphosphates as substrates and synthesizes RNA following the complementarity rule.
The helix is opened, and elongation proceeds.
Only a short RNA stretch remains bound to the enzyme.
When the polymerase reaches the terminator region, both the nascent RNA and RNA polymerase fall off, leading to transcription termination.

Regulation of Transcription Steps:

RNA polymerase catalyzes initiation, elongation, and termination.
Transient association with initiation factor (σ) and termination factor (ρ) alters RNA polymerase specificity.
σ factor assists in initiation, while ρ factor aids in termination.

Transcription and Translation Coupling in Bacteria:

In bacteria, mRNA requires no additional processing and is translated in the same compartment (no cytosol/nucleus separation).
Translation can begin before mRNA transcription is complete.
This allows transcription and translation to be coupled.

Complexities in Eukaryotic Transcription:

Eukaryotes have three RNA polymerases in the nucleus (plus organelle polymerases).
RNA polymerase I transcribes rRNAs, RNA polymerase III transcribes tRNAs, 5S rRNA, and snRNAs, while RNA polymerase II transcribes hnRNA (the precursor of mRNA).
Primary transcripts in eukaryotes contain both exons and introns and are non-functional.
Splicing removes introns and joins exons in a defined order.
hnRNA undergoes additional processing: capping (adding methyl guanosine triphosphate to the 5′-end) and tailing (adding adenylate residues to the 3′-end).
The fully processed hnRNA becomes mRNA, which is transported out of the nucleus for translation.

Significance of Eukaryotic Complexity:

Split-gene arrangements and introns are ancient genomic features.
Splicing reflects the dominance of RNA processes.
Understanding RNA and RNA-dependent processes in living systems is gaining importance.

Genetic Code: Deciphering the Language of Life

Replication and transcription involve copying nucleic acids.
However, translation transfers genetic information from nucleic acids to amino acids, posing a challenge due to the lack of complementarity between them.
Evidence suggested that changes in nucleic acids were responsible for changes in amino acids, leading to the concept of the genetic code.

Challenges in Deciphering the Genetic Code:

Deciphering the genetic code was a challenging interdisciplinary effort involving physicists, organic chemists, biochemists, and geneticists.
George Gamow proposed that a combination of three nucleotides (a triplet code) could code for the 20 amino acids: with four bases, triplets give 4 × 4 × 4 = 64 codons, more than the 20 needed, whereas pairs would give only 16.
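
The counting argument behind the triplet code can be sketched as follows. The names are illustrative; the only inputs are the four RNA bases and the 20 amino acids.

```python
from itertools import product

BASES = "ACGU"
AMINO_ACIDS = 20

# Number of distinct "words" of each length that four bases can form.
for length in (1, 2, 3):
    words = ["".join(p) for p in product(BASES, repeat=length)]
    print(length, len(words))  # 1 4, then 2 16, then 3 64

# Smallest word length that gives at least one codon per amino acid.
smallest = next(n for n in range(1, 10) if len(BASES) ** n >= AMINO_ACIDS)
print(smallest)  # 3
```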

Deciphering the Genetic Code:

Har Gobind Khorana’s chemical method synthesized RNA molecules with defined base combinations.
Marshall Nirenberg’s cell-free protein synthesis system played a crucial role in deciphering the code.
Severo Ochoa’s enzyme (polynucleotide phosphorylase) helped polymerize RNA with defined sequences.
A genetic code checkerboard was prepared (Table 5.1).

Salient Features of the Genetic Code:

The codon is a triplet, with 61 codons coding for amino acids and 3 codons serving as stop codons.
Some amino acids are coded by multiple codons, making the code degenerate.
The codons are read on mRNA contiguously, without punctuation.
The code is nearly universal, except for some exceptions in mitochondrial codons and certain protozoans.
AUG serves a dual function, coding for Methionine (Met) and acting as an initiator codon.
UAA, UAG, and UGA are stop codons.

Predicting Amino Acid Sequence from mRNA:

mRNA sequence: AUG UUU UUC UUC UUU UUU UUC
Use the checkerboard to predict the amino acid sequence:
- AUG: Methionine (Met)
- UUU: Phenylalanine (Phe)
- UUC: Phenylalanine (Phe)
- UUC: Phenylalanine (Phe)
- UUU: Phenylalanine (Phe)
- UUU: Phenylalanine (Phe)
- UUC: Phenylalanine (Phe)
Amino acid sequence: Met-Phe-Phe-Phe-Phe-Phe-Phe
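
The lookup above can be sketched in a few lines of Python. The table below is deliberately partial (only the three codons used in this exercise), and the function name `translate` is an illustrative assumption.

```python
# Partial codon table: only the entries needed for this exercise.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "UUC": "Phe",
}

def translate(mrna):
    """Translate a space-separated string of codons into a peptide string."""
    codons = mrna.split()
    return "-".join(CODON_TABLE[codon] for codon in codons)

peptide = translate("AUG UUU UUC UUC UUU UUU UUC")
print(peptide)  # Met-Phe-Phe-Phe-Phe-Phe-Phe
```

Each codon maps to exactly one amino acid, so the result is fully determined by the mRNA.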

Predicting mRNA from Sequence of Amino Acids:

Given amino acid sequence: Met-Phe-Phe-Phe-Phe-Phe-Phe
Try to predict the mRNA sequence.
Predicting in this opposite direction is difficult because the code is degenerate: Phe can be coded by either UUU or UUC, so more than one mRNA sequence fits the same amino acid sequence.
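
To see how many mRNA sequences fit this peptide, the candidates can be enumerated. This is a minimal Python sketch; it lists only the codons for Met and Phe and ignores the stop codon that would follow in a real message.

```python
from itertools import product

# Codons for the two amino acids in this exercise.
CODONS_FOR = {
    "Met": ["AUG"],
    "Phe": ["UUU", "UUC"],
}

def candidate_mrnas(peptide):
    """List every mRNA (as space-separated codons) that encodes the peptide."""
    choices = [CODONS_FOR[amino_acid] for amino_acid in peptide.split("-")]
    return [" ".join(combo) for combo in product(*choices)]

candidates = candidate_mrnas("Met-Phe-Phe-Phe-Phe-Phe-Phe")
print(len(candidates))  # 64, i.e. 1 choice for Met times 2 choices for each of 6 Phe
```

One peptide of only seven residues already has 64 possible messages, which is why the reverse prediction cannot give a single answer.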

Correlating Two Properties of Genetic Code:

The two properties of the genetic code that are correlated in this exercise are:

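The unambiguous (specific) nature of the code: each codon specifies only one amino acid, so the amino acid sequence can be read directly from the mRNA.
The degeneracy of the code: some amino acids are coded by more than one codon, so a given amino acid sequence does not point back to a single mRNA sequence.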

The genetic code is a fundamental concept that governs the translation of genetic information from nucleic acids to proteins, serving as the key to understanding how life’s instructions are carried out.

Mutations and the Genetic Code

Understanding Gene-DNA Relationships Through Mutations:

The relationship between genes and DNA is best understood through mutation studies.
Mutations, changes in the DNA sequence, can have various effects on genes and their functions.
Large deletions and rearrangements in a DNA segment can result in the loss or gain of a gene and its associated function.

Effect of Point Mutations:

Point mutations involve changes in a single base pair of the DNA sequence.
An example of a point mutation is the change of a single base pair in the gene for the beta-globin chain, resulting in the substitution of glutamate with valine and causing sickle cell anemia.

Understanding Point Mutations with an Example:

Point mutations that insert or delete a base in a structural gene can be illustrated with a simple example.
Consider a statement made up of three-letter words, similar to the genetic code, i.e., RAM HAS RED CAP

Insertion Mutation:

Insert the letter “B” between “HAS” and “RED” and regroup the letters into three-letter words, i.e., RAM HAS BRE DCA P

Insertion Mutation with Two Letters:

Insert two letters “BI” at the same place and regroup, i.e., RAM HAS BIR EDC AP

Insertion Mutation with Three Letters:

Insert three letters “BIG” at the same place; the statement now reads RAM HAS BIG RED CAP

Deletion Mutation:

Delete the letters “R,” “E,” and “D” one by one and rearrange:
- RAM HAS RED CAP
- RAM HAS EDC AP
- RAM HAS DCA P
- RAM HAS CAP

Conclusion – Frameshift Mutations:

Insertion or deletion of one or two bases changes the reading frame from the point of insertion or deletion.
These mutations are referred to as frameshift insertion or deletion mutations.
However, insertion or deletion of three bases (or a multiple of three) adds or removes one or more codons, and hence one or more amino acids, but does not alter the reading frame from that point onward.

Frameshift mutations can have profound effects on the protein encoded by the gene, potentially leading to non-functional or altered proteins and associated diseases.
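
The RAM HAS RED CAP exercise can be reproduced with a short Python sketch. The helper names (`read_in_triplets`, `insert_letters`, `delete_letters`) are illustrative assumptions; positions are counted in letters from the start of the sentence, ignoring spaces.

```python
def read_in_triplets(letters):
    """Regroup a string of letters into three-letter words from the left."""
    letters = letters.replace(" ", "")
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))

def insert_letters(sentence, position, extra):
    """Insert `extra` at a letter position, then re-read in triplets."""
    letters = sentence.replace(" ", "")
    return read_in_triplets(letters[:position] + extra + letters[position:])

def delete_letters(sentence, position, count):
    """Delete `count` letters starting at a letter position, then re-read."""
    letters = sentence.replace(" ", "")
    return read_in_triplets(letters[:position] + letters[position + count:])

original = "RAM HAS RED CAP"
# Position 6 is the point between "HAS" and "RED".
for extra in ("B", "BI", "BIG"):
    print(insert_letters(original, 6, extra))
for count in (1, 2, 3):
    print(delete_letters(original, 6, count))
```

Only the three-letter insertion and the three-letter deletion leave every downstream word readable, which matches the conclusion above.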

tRNA – The Adapter Molecule

The Need for an Adapter Molecule:

Francis Crick realized early in the development of the genetic code concept that a mechanism was needed to read and link the genetic code to amino acids.
Amino acids lack structural characteristics to uniquely read the code.

tRNA (Transfer RNA) as an Adapter Molecule:

tRNA, originally known as sRNA (soluble RNA), was postulated to serve as an adapter molecule.
It plays a dual role:

Reading the genetic code.
Binding to specific amino acids.

Key Features of tRNA:

tRNA possesses an anticodon loop with bases complementary to the genetic code.
It also has an amino acid acceptor end where it binds to specific amino acids.
Each tRNA is specific for a particular amino acid.
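
Codon-anticodon complementarity can be illustrated with a small Python sketch. It assumes strict Watson-Crick pairing and ignores wobble pairing at the third position; the function name `anticodon_for` is illustrative.

```python
# RNA base-pairing rules (strict Watson-Crick pairing only).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the anticodon, written 5'->3', for a codon written 5'->3'.

    The two strands pair in antiparallel orientation, so the codon is
    reversed before each base is complemented.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon))

print(anticodon_for("AUG"))  # CAU
print(anticodon_for("UUU"))  # AAA
```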

Special tRNA for Initiation:

There is a specific tRNA for initiation called initiator tRNA.
However, there are no tRNAs for stop codons.

tRNA Secondary Structure:

The secondary structure of tRNA resembles a clover-leaf, as depicted in Figure 5.12.
In reality, tRNA is a compact molecule with an inverted L-shaped structure.

Transfer RNA (tRNA) plays a crucial role in the translation process by decoding the genetic code and ensuring the accurate assembly of amino acids during protein synthesis.

Translation: From mRNA to Polypeptide

Translation Process Overview:

Translation is the process of polymerizing amino acids to form a polypeptide.
The sequence and order of amino acids are determined by the mRNA’s base sequence.
Peptide bonds connect amino acids, requiring energy for formation.

Charging of tRNA:

Amino acids are activated in the presence of ATP and linked to their corresponding tRNA.
This process is known as charging of tRNA or aminoacylation of tRNA.
Two charged tRNAs brought close together favor the formation of a peptide bond.
A catalyst enhances the rate of peptide bond formation.

The Role of Ribosomes:

Ribosomes are the cellular factories responsible for protein synthesis.
They consist of structural RNAs and around 80 different proteins.
In an inactive state, ribosomes exist as two subunits: a large subunit and a small subunit.
When the small subunit encounters mRNA, translation begins.
The large subunit contains two sites for amino acids to bind, facilitating peptide bond formation.
Ribosomes also act as catalysts (23S rRNA in bacteria is a ribozyme) for peptide bond formation.

Translational Units in mRNA:

A translational unit in mRNA is the sequence of RNA between the start codon (AUG) and the stop codon.
This sequence codes for a polypeptide.
mRNA also contains untranslated regions (UTR) that are not translated.
UTRs are present at both the 5′-end (before the start codon) and the 3′-end (after the stop codon) and are essential for efficient translation.
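
The layout of an mRNA into 5′ UTR, translational unit, and 3′ UTR can be sketched as follows. This is a simplified Python illustration: it treats the first AUG as the start codon and returns the start and stop codons as part of the coding region. The example sequence and the function name are made up for the demonstration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_translational_unit(mrna):
    """Return (5' UTR, coding region incl. start and stop codons, 3' UTR)."""
    start = mrna.find("AUG")  # simplification: first AUG is taken as the start
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon in the reading frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")

five_utr, coding, three_utr = split_translational_unit("GGCAUGUUUUUCUAAGCC")
print(five_utr, coding, three_utr)  # GGC AUGUUUUUCUAA GCC
```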

Initiation and Elongation:

Translation initiation begins with the ribosome binding to the mRNA at the start codon (AUG), recognized only by the initiator tRNA.
The ribosome then moves to the elongation phase.
During elongation, complexes consisting of an amino acid linked to tRNA sequentially bind to the mRNA by forming complementary base pairs with the tRNA anticodon.
The ribosome advances from codon to codon along the mRNA.
Amino acids are added one by one, so the information encoded in DNA and carried by the mRNA is translated into the polypeptide sequence.

Termination:

Translation terminates when a release factor binds to the stop codon.
This action releases the complete polypeptide from the ribosome.

Translation is a complex process that transforms the information stored in mRNA into functional polypeptides, critical for the structure and function of proteins in the cell.

Regulation of Gene Expression

Overview of Gene Expression Regulation:

Regulation of gene expression is a broad term; it can operate at various levels.
Since gene expression leads to polypeptide formation, regulation can occur at multiple stages in eukaryotes:

Transcriptional level (formation of primary transcript)
Processing level (regulation of splicing)
Transport of mRNA from nucleus to cytoplasm
Translational level

Purpose of Gene Expression Regulation:

Genes in a cell are expressed to perform specific functions or sets of functions.
For example, the enzyme beta-galactosidase in E. coli is synthesized to catalyze the hydrolysis of lactose into galactose and glucose for energy.
Gene expression regulation is influenced by metabolic, physiological, or environmental conditions.
Development and differentiation from embryos to adult organisms involve coordinated regulation of gene expression in various sets of genes.

Control of Gene Expression in Prokaryotes:

Prokaryotes primarily control gene expression at the transcriptional level.
In a transcription unit, RNA polymerase’s activity at a given promoter is regulated through interaction with accessory proteins.
Regulatory proteins can act as activators (positive regulation) or repressors (negative regulation).
In prokaryotic DNA, the accessibility of the promoter region is often regulated by the interaction of proteins with specific sequences called operators.
The operator region is located adjacent to promoter elements in most operons.
Operator sequences bind repressor proteins, and each operon has its specific operator and repressor.
For example, the lac operator is specific to the lac operon and interacts exclusively with the lac repressor.

Regulation of gene expression is a complex and highly coordinated process influenced by various factors, including environmental conditions and the cell’s metabolic and physiological needs.

The Lac Operon: A Model of Gene Regulation

Discovery of the Lac Operon:

The elucidation of the lac operon resulted from the collaboration between geneticist Francois Jacob and biochemist Jacques Monod.
They were the first to describe a transcriptionally regulated system, a common occurrence in bacteria known as an operon.
Examples of operons include the lac operon, trp operon, ara operon, his operon, val operon, and more.

Components of the Lac Operon:

The lac operon is named after lactose and consists of:
A regulatory gene (i gene): Codes for the repressor protein.
Three structural genes (z, y, and a):

z gene: Codes for beta-galactosidase (β-gal), responsible for lactose hydrolysis.
y gene: Codes for permease, increasing cell permeability to β-galactosides.
a gene: Encodes a transacetylase.

All three gene products are essential for lactose metabolism and are part of the same metabolic pathway.

Lactose as the Inducer:

Lactose is the substrate for beta-galactosidase and regulates the switching on and off of the operon; hence it is termed the inducer.
In the absence of a preferred carbon source like glucose, if lactose is present in the growth medium, it is transported into cells via permease.
This induces the operon as follows.

Operon Regulation:

The repressor protein, constitutively synthesized from the i gene, normally binds to the operator region of the operon.
This binding prevents RNA polymerase from transcribing the operon.
In the presence of an inducer (lactose or allolactose), the repressor is inactivated by interacting with the inducer.
This allows RNA polymerase access to the promoter, and transcription proceeds.
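
The negative regulation described above amounts to a short chain of yes/no conditions. The Python sketch below models only the repressor switch; it does not model glucose effects or positive regulation, and the function and key names are illustrative.

```python
def lac_operon_state(inducer_present):
    """Model the repressor-based (negative) control of the lac operon."""
    # The repressor is made constitutively; the inducer inactivates it.
    repressor_active = not inducer_present
    # An active repressor sits on the operator and blocks RNA polymerase.
    operator_blocked = repressor_active
    # Structural genes (z, y, a) are transcribed only if the operator is free.
    transcription_on = not operator_blocked
    return {
        "repressor_active": repressor_active,
        "operator_blocked": operator_blocked,
        "transcription_on": transcription_on,
    }

for inducer in (False, True):
    print(inducer, lac_operon_state(inducer))
```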

Substrate-Enzyme Regulation Analogy:

The regulation of the lac operon can be likened to the regulation of enzyme synthesis by its substrate.
Glucose or galactose cannot act as inducers for the lac operon.

Duration of lac Operon Expression:

In the presence of lactose, the lac operon is expressed as long as lactose is available and actively transported into the cell.

Types of Regulation:

Regulation of the lac operon by the repressor is known as negative regulation.
Positive regulation also controls the lac operon but is not discussed at this level.

The lac operon serves as a model for understanding gene regulation, offering insights into how genes are controlled in response to environmental conditions and substrate availability.

The Human Genome Project: Unlocking the Genetic Code

The Human Genome Project (HGP) was launched in 1990 as a groundbreaking effort to sequence the entire human genome.
This mega project aimed to determine the sequence of DNA bases in the human genome, which contains approximately 3 billion base pairs.
At the initially estimated cost of US$3 per base pair, sequencing the genome would cost around 9 billion US dollars. A project on this scale required advanced genetic engineering techniques and computational tools.
HGP was closely associated with the development of bioinformatics, a new field in biology.

Goals of HGP:

Identify Genes: Identify all 20,000-25,000 genes in human DNA.
Determine DNA Sequences: Sequence the 3 billion chemical base pairs that make up human DNA.
Database Storage: Store this information in databases.
Data Analysis Tools: Improve tools for data analysis.
Technology Transfer: Transfer related technologies to other sectors, such as industries.
Ethical and Legal Issues: Address the ethical, legal, and social issues (ELSI) that may arise from the project.

Project Coordination:

The HGP was coordinated by the U.S. Department of Energy and the National Institutes of Health.
Partners included the Wellcome Trust (U.K.), Japan, France, Germany, China, and others.
The project was completed in 2003.

Significance of HGP:

Understanding DNA variations among individuals can lead to new ways to diagnose, treat, and prevent disorders.
Knowledge gained from the project can revolutionize healthcare, agriculture, energy production, and environmental remediation.
Sequencing non-human model organisms also provides insights into their natural capabilities.

Methodologies:

Two major approaches: ESTs (Expressed Sequence Tags) for identifying expressed genes and sequencing the entire genome for sequence annotation.
DNA isolation from cells, fragmentation, cloning in bacterial or yeast hosts using BAC or YAC vectors.
Automated DNA sequencers based on Frederick Sanger’s method.
Overlapping sequences assembled using computer-based programs.
Annotation and assignment to specific chromosomes.
Genetic and physical maps created using polymorphism of restriction endonuclease recognition sites and microsatellites.
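
The idea of assembling overlapping sequences by computer can be illustrated with a toy example. The Python sketch below uses a simple greedy merge on made-up fragments; it is not the software actually used in the HGP, and real assemblers must also handle sequencing errors, repeats, and both DNA strands.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best = (0, None, None)  # (overlap length, index of a, index of b)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # nothing overlaps; the remaining pieces stay separate
        merged = frags[i] + frags[j][k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (i, j)]
        frags.append(merged)
    return frags

contigs = greedy_assemble(["ATGCGT", "CGTTAC", "TACGGA"])
print(contigs)  # ['ATGCGTTACGGA']
```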

The Human Genome Project represents a monumental achievement in biology, unlocking the genetic code that underlies human biology and offering a foundation for understanding genetics and its applications.

Salient Features of the Human Genome:

Genome Size: The human genome consists of 3,164.7 million base pairs (bp).
Gene Size Variation: While the average gene comprises around 3,000 bases, gene sizes vary significantly. The largest known human gene is dystrophin, spanning 2.4 million bases.
Gene Count: The estimated total number of genes in the human genome is approximately 30,000. This count is significantly lower than earlier estimates ranging from 80,000 to 140,000 genes.
Sequence Similarity: Almost all (99.9 percent) nucleotide bases are exactly the same in all people.
Unknown Gene Functions: Over 50 percent of discovered genes have unknown functions, highlighting the complexity of gene regulation and function.
Protein-Coding Portion: Less than 2 percent of the human genome codes for proteins, while the majority consists of non-coding DNA.
Repetitive Sequences: A substantial portion of the human genome is composed of repetitive sequences. These are stretches of DNA that are repeated many times, often hundreds to thousands of times, and are not believed to directly code for proteins.
Functional Significance of Repetitive Sequences: Repetitive sequences shed light on chromosome structure, dynamics, and evolution.
Gene Distribution: Chromosome 1 contains the most genes, with a total of 2,968 genes, while the Y chromosome has the fewest genes, with 231.
Single-Nucleotide Polymorphisms (SNPs): Scientists have identified approximately 1.4 million locations in the human genome where single-base DNA differences, known as single nucleotide polymorphisms (SNPs), occur. SNPs have significant implications for identifying disease-associated sequences and tracing human history.
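
A few of the figures above can be turned into absolute numbers with simple arithmetic. The Python sketch below uses only the values quoted in this list; the results are rough orders of magnitude, and the variable names are illustrative.

```python
GENOME_BP = 3_164.7e6  # genome size in base pairs, as quoted above
SNP_SITES = 1.4e6      # approximate number of SNP locations, as quoted above

# Upper bound on protein-coding DNA ("less than 2 percent").
coding_bp_upper = 0.02 * GENOME_BP

# Bases that may differ between two people (100 - 99.9 = 0.1 percent).
differing_bp = 0.001 * GENOME_BP

# Average spacing between SNP locations if they were evenly spread.
bp_per_snp = GENOME_BP / SNP_SITES

print(f"coding DNA: under {coding_bp_upper / 1e6:.1f} million bp")
print(f"variable bases: about {differing_bp / 1e6:.1f} million bp")
print(f"about one SNP per {bp_per_snp:.0f} bp on average")
```

This gives roughly 63 million bp of coding DNA at most, about 3.2 million variable bases, and about one SNP location per 2,260 bp.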

The Human Genome Project has revealed the remarkable complexity and organization of the human genome, providing invaluable insights into genetics, genomics, and human biology.

Applications of Genomic Information

Disease Diagnosis and Treatment: Genomic information has revolutionized the field of medicine. It allows for the identification of genetic mutations and variations associated with diseases, enabling early diagnosis and personalized treatment strategies. Pharmacogenomics uses genetic information to tailor drug prescriptions to individual patients.
Biomedical Research: Genomic data is crucial for biomedical research, helping scientists understand the genetic basis of diseases and uncover potential drug targets. It has led to breakthroughs in cancer research, neurodegenerative diseases, and many other areas.
Agriculture: Genomics is applied in agriculture to develop crops with improved traits, such as resistance to pests or tolerance to environmental stress. This can lead to increased crop yields and sustainable agriculture.
Forensic Science: DNA profiling and genomics are used in forensic science to identify individuals, solve crimes, and establish paternity or familial relationships.
Evolutionary Biology: Genomic information is essential for studying the evolutionary history and relationships between species. It provides insights into how organisms have evolved over time.
Biotechnology: Genomic data is the foundation of biotechnology applications, including genetic engineering, synthetic biology, and the production of biopharmaceuticals.

Future Challenges:

Data Management: As genomic data continues to grow exponentially, managing, storing, and analyzing this vast amount of information becomes a significant challenge. Advanced computational methods and storage solutions are needed.
Ethical and Privacy Concerns: The use of genomic information raises ethical and privacy concerns, particularly in areas like genetic testing and gene editing. Regulations and guidelines must address these issues.
Clinical Integration: Integrating genomic data into routine clinical practice remains a challenge. Physicians need training and tools to interpret genetic information effectively.
Functional Understanding: While we have sequenced many genomes, understanding the functions of all genes and their interactions in complex biological systems is an ongoing challenge.
Global Collaboration: Genomic research is a global endeavor. Ensuring international collaboration and data sharing is essential for maximizing the benefits of genomic information.
Rare Diseases: Identifying genetic causes of rare diseases and developing treatments for them is a priority. These diseases often have a severe impact on affected individuals and families.
Population Diversity: Genomic research must consider the genetic diversity within and among populations to avoid biases and ensure equitable benefits for all communities.

The field of genomics continues to evolve rapidly, and addressing these challenges will be critical for realizing the full potential of genomic information in medicine, agriculture, and beyond.

DNA Fingerprinting

DNA fingerprinting is a powerful technique used to identify individuals based on the unique patterns in their DNA.

Repetitive DNA Sequences: DNA fingerprinting relies on identifying differences in specific regions of DNA known as repetitive DNA sequences. These sequences consist of short stretches of DNA that are repeated multiple times.
Polymorphism: Polymorphism in DNA refers to genetic variation caused by mutations. Polymorphic regions of DNA are those where more than one variant (allele) exists in a population with a frequency greater than 0.01. These polymorphisms can be inherited from generation to generation.
Alec Jeffreys: Alec Jeffreys is the scientist who initially developed the technique of DNA fingerprinting. He used Variable Number of Tandem Repeats (VNTRs), a type of repetitive DNA, as a probe to create DNA fingerprints.

Steps in DNA Fingerprinting:

Isolation of DNA: DNA is extracted from a biological sample, such as blood, hair, or saliva.
DNA Digestion: The extracted DNA is digested using restriction endonucleases, resulting in DNA fragments.
Electrophoresis: The DNA fragments are separated by size through electrophoresis, creating a pattern of bands on a gel.
Blotting: The separated DNA fragments are transferred (blotted) onto synthetic membranes.
Hybridization: A labeled VNTR probe is used to hybridize (bind) to the complementary sequences in the separated DNA fragments.
Autoradiography: The hybridized DNA fragments are detected through autoradiography, creating a unique pattern of bands.

VNTRs: VNTRs (Variable Number of Tandem Repeats) are a type of mini-satellite DNA consisting of short sequences that are repeated in tandem. The copy number of repeats in VNTRs can vary from chromosome to chromosome in an individual, leading to polymorphic patterns in DNA fingerprinting.
Unique Patterns: DNA fingerprinting creates a characteristic pattern of bands for an individual’s DNA. This pattern is unique to each person, except in the case of identical twins who share the same DNA fingerprint.
Sensitivity and PCR: DNA fingerprinting has become more sensitive with the use of Polymerase Chain Reaction (PCR). PCR amplifies DNA from a single cell, making it a highly sensitive technique.
Applications: DNA fingerprinting is used in forensic science for criminal investigations, paternity testing, and identifying human remains. It also has applications in areas such as genetics research, wildlife conservation, and archaeology.
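
How a difference in VNTR copy number turns into a different band can be sketched with a toy simulation of the digestion, separation, and probing steps. In the Python sketch below the locus, repeat unit, and flanking sequence are made up for illustration. EcoRI is used as the example enzyme (it recognizes GAATTC and cuts after the first G), and probe hybridization is simplified to a plain substring match.

```python
SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1   # EcoRI cuts between G and AATTC

def digest(dna):
    """Cut dna at every occurrence of SITE and return the fragments."""
    fragments, start = [], 0
    pos = dna.find(SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(SITE, pos + 1)
    fragments.append(dna[start:])
    return fragments

def locus(copies, repeat="GGAT", flank="ACGTAC"):
    """Hypothetical VNTR locus: a tandem repeat between two cut sites."""
    return "TT" + SITE + flank + repeat * copies + flank + SITE + "TT"

def probe_bands(dna, probe="GGAT"):
    """Lengths of fragments that contain the probe sequence, largest first."""
    return sorted((len(f) for f in digest(dna) if probe in f), reverse=True)

print(probe_bands(locus(3)))  # [30]
print(probe_bands(locus(5)))  # [38]
```

Two alleles that differ only in copy number give probe-positive fragments of different lengths, which is what appears as different band positions after electrophoresis.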

Overall, DNA fingerprinting is a valuable tool for identifying and differentiating individuals based on their unique genetic profiles.