Types of Genes and Their Functions – Structural Analysis of DNA Sequence
Nature of Gene
A gene is a particulate unit of heredity: its transmission from parent to progeny carries a character from one generation to the next. An organism may possess thousands of genes, and together they constitute its foundation. A sexually reproducing organism begins life as a zygote formed by the union of gametes. During fertilization, a male and a female gamete unite to form a zygote that carries two sets of chromosomes, one contributed by each parent. In this fusion, the genes of both parents come together within the zygote and form the foundation of the future organism.
A gene usually resides in the cell nucleus, within a chromosome. Each gene is present as a pair and normally functions in this paired condition, although in certain cases a gene in a single dose is capable of functioning. A gene may exist in alternative forms, which arise through changes in its primary structure; such a change is called a mutation. The alternative forms are known as alleles, and because the alleles differ, the character they express also varies. Variation in the expression of a character is therefore due to mutation in the gene.
In several cases a gene has more than two alternative forms, which are called multiple alleles. A gene occupies a particular site on a chromosome, called its locus. The loci on a chromosome are not arranged haphazardly; they lie in linear order along the chromosome. Genes are responsible for the expression of phenotypic characters, and this expression is usually achieved through the formation of a polypeptide. A functioning gene is transcribed into mRNA, and a polypeptide chain is produced from the mRNA by a process called translation. The polypeptide ultimately forms a protein that expresses the character. The rule of one gene for one character does not hold in many cases. Sometimes one character is expressed through the contribution of several genes; such a character is called a polygenic character.
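The flow from gene to mRNA to polypeptide described above can be sketched in a few lines of Python. This is a minimal illustration, not a full implementation: the DNA sequence is a made-up example, the function names are hypothetical, and the codon table lists only the handful of codons the example needs (their meanings follow the standard genetic code).

```python
# Partial codon table: only the codons used in this toy example.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCA": "Ala",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA codon by codon from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so no polypeptide
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        # Codons missing from the partial table are reported as "Xaa"
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "Xaa")
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide

mrna = transcribe("ATGTTTGGCGCATAA")  # hypothetical gene segment
polypeptide = translate(mrna)
print(mrna)
print("-".join(polypeptide))
```

A real gene is far longer and, in eukaryotes, its transcript is processed before translation; the sketch only shows the direction of information flow.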
The male gamete carries one set of genes from the male parent, covering almost all of his characters, and the female gamete carries one set of genes for all the characters of the female parent. The union of the gametes brings the genes of both parents together and gives the zygote its diploid state. In the zygote, the genes direct the development of an offspring that resembles its parents because it carries genes from both of them. We call this ‘heredity’. The zygote is a single cell with two sets of chromosomes, one from each parent, and is therefore called diploid. The chromosomes carry the genes and act as vehicles for their transmission from the two parents. The male gamete, the sperm, carries one set of chromosomes from the male parent, and the female gamete, the ovum, carries one set from the female parent. These gametes are haploid cells, the products of meiosis in the testis and ovary respectively.
The genes that form the foundation of an organism's body are, by nature, unit elements located on the chromosomes. Each occupies a specific site on a chromosome and may exist in alternative forms. These alternative forms are known as alleles, and one allele may mask the other. The allele that masks is called the dominant allele, and the one that is masked is called the recessive allele. A gene is usually expressed as a character, in most cases through the production of a protein; hence we may say that genes function by producing proteins. The expression of a gene gives the organism its phenotype, while the underlying genetic constitution that produces the phenotype is known as its genotype.
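The relation between genotype, dominance, and phenotype can be illustrated with a small Python sketch of a cross between two heterozygous parents. The allele symbols T (tall, dominant) and t (dwarf, recessive) follow the pea example used later in this article; the function names are hypothetical, and the sketch assumes a single gene with complete dominance.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Combine one allele from each parent in every possible way."""
    offspring = Counter()
    for a, b in product(parent1, parent2):
        # Sorting puts the uppercase allele first, so "tT" is counted as "Tt"
        offspring["".join(sorted(a + b))] += 1
    return offspring

def phenotype(genotype):
    """One dominant allele T is enough to give the tall phenotype."""
    return "tall" if "T" in genotype else "dwarf"

genotypes = cross("Tt", "Tt")
phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))   # genotype counts among the four combinations
print(dict(phenotypes))  # phenotype counts among the four combinations
```

The three genotypes collapse into two phenotypes because the heterozygote Tt looks the same as the homozygote TT.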
Although the genetic element is particulate in nature, it is a chemical compound, DNA (deoxyribonucleic acid). DNA lies within the chromosome as a long chain composed of four types of nucleotides arranged in varying sequences. Two polynucleotide strands lie side by side in a DNA molecule, forming the genetic material of the cell, and a gene is a segment of this DNA. The molecule is normally very stable and is packed within the chromosome in a highly compact condition. It serves the cell as a storehouse of genetic information and, when needed, directs the formation of the various proteins that express phenotypic characters.
Development of the Idea of the Gene
The idea that genes control the expression of phenotypic characters is not very old. Only in 1866 did an Augustinian monk, Gregor Mendel, first propose that particulate factors determine the expression of phenotypic characters in organisms. Mendel's particulate factors were later named genes. That characters are inherited from parents by their progeny, however, has been known since prehistoric times. The domestication of plants and animals, selective breeding for desirable characters, and Egyptian records of palm breeding all indicate that people had general ideas about heredity.
The age-old practice of considering family history when choosing a bride also reflects early ideas about heredity. These ideas grew out of long experience but had no support from experimental evidence, and people knew nothing about the mechanism of inheritance. Before Mendel, some pioneering philosophers and scientists tried to explain the basis of inheritance, but these efforts proved illogical and vague. Among them, the theory of pangenesis received support from some quarters.
The theory states that both parents contribute pangenes to the progeny through their gametes, and that the offspring show a blended phenotype because the pangenes from the two parents mix. The theory rested solely on speculation, with no evidence in its favour, although Charles Darwin was one of its prominent supporters. A concrete account of how characters are transmitted from parents to progeny was first provided by Mendel, on the basis of his experiments on the garden pea, Pisum sativum, from 1856 to 1865. He was the first to realize that organisms possess distinct genetic elements that determine each character, and that the transmission of these elements to the progeny accounts for the transmission of characters. Subsequently, many other discoveries have revealed the nature and behavior of the genetic elements.
The ideas on the nature and behavior of genes developed in several steps. The initial concept, dating from Mendel, is known as the classical concept of the gene; the neoclassical and modern concepts evolved gradually afterwards. According to the classical view, a gene is the indivisible unit of transmission, recombination, mutation, and function. This concept dates from Mendel's work in 1866. The term gene was coined by Johannsen in 1909, who considered the gene to be the unit of heredity. The chromosome theory of inheritance of Sutton and Boveri (1903) added the chromosomal basis of the gene to the classical concept.
The concept received further strong support from the discoveries of Thomas Hunt Morgan and his school, which included Calvin Bridges, Hermann Joseph Muller, and Alfred Henry Sturtevant. Morgan proved the chromosomal location of genes, and Muller showed how physical agents can alter the chromosomal constitution, producing mutations that express new features in the individual. Sturtevant showed that genes are arranged linearly along a chromosome and mapped six sex-linked genes in Drosophila melanogaster. Finally, a gene as a unit of function may be defined on the basis of a complementation test. Suppose two recessive mutations, a and b, are brought together in the trans-heterozygous condition (one mutation on each homologous chromosome). If the heterozygote shows the mutant phenotype, then a and b are mutations of the same gene.
On the other hand, if a and b in the heterozygous condition give the wild-type phenotype, a and b complement each other. This means that a and b are different units of function, i.e., two different genetic units. In such a case, the heterozygous genotype should be written as a +/+ b. In 1941 Beadle and Tatum, from their experiments on Neurospora crassa, elucidated the nature of gene function by proposing the ‘one gene one enzyme’ hypothesis, which appeared as the culmination of the classical view of the gene. Immediately after the classical concept of the gene had been fully formulated, it started to break down. Oliver (1940) and Lewis (1941) noticed intragenic recombination in Drosophila melanogaster. Intragenic recombination was also noticed in Aspergillus nidulans by other investigators (Roper, 1950; Pontecorvo, 1952). By this time the nature of the genetic material had also been identified through the work of Avery et al. (1944) and Hershey and Chase (1952). Following these findings, the gene could no longer be regarded as the indivisible unit of transmission, recombination, mutation, and function.
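The logic of the complementation test can be sketched as a small grouping routine: mutations that fail to complement one another (mutant trans-heterozygote) are placed in the same functional unit. The mutation names and test results below are hypothetical, and the sketch assumes that every pair has been tested and that the results are consistent.

```python
def complementation_groups(mutations, fails_to_complement):
    """Group mutations into functional units (genes).

    fails_to_complement holds the pairs whose trans-heterozygote
    showed the mutant phenotype, i.e. mutations of the same gene.
    """
    groups = []
    for mutation in mutations:
        for group in groups:
            # Join the first group containing a non-complementing partner
            if any(frozenset((mutation, other)) in fails_to_complement
                   for other in group):
                group.append(mutation)
                break
        else:
            # Complements every mutation seen so far: a new functional unit
            groups.append([mutation])
    return groups

# Hypothetical results: a/b is mutant; a/c and b/c are wild type
results = {frozenset(("a", "b"))}
print(complementation_groups(["a", "b", "c"], results))
```

Here a and b fall into one gene, while c, which complements both, defines a second gene.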
Genetic fine-structure analysis of the rII locus of bacteriophage T4, together with Benzer's cistron concept, dealt a blow to the classical concept of the gene. Benzer observed that the rII locus contains two cistrons that can be detected by the cis-trans complementation test, and he found multiple mutation sites within both cistrons. Further findings on intragenic recombination led Benzer to distinguish three units within the gene: the cistron, the recon, and the muton. The cistron is the unit of function, the recon is the unit of recombination, and the muton is the unit of mutation. Cistron is now regarded as another name for a gene detected by the cis-trans test. Thus the gene, once the atom of genetic material, was no longer indivisible, and this gave rise to the neoclassical concept of the gene.
By the 1970s, new discoveries again required the neoclassical concept of the gene to be modified. These discoveries are discussed briefly below.
(a) Repeated Genes:
In many eukaryotic organisms, some genes are present in multiple copies, and the gene cluster is transmitted from parent to progeny as a unit. Hence, the unit of transcription and the unit of transmission are not the same. Examples of such gene clusters are the ribosomal RNA genes and the histone genes. The existence of repeated genes contradicts the classical concept of the gene.
(b) Interrupted Genes and Alternative Splicing:
Split genes, in which introns interrupt the gene and divide it into several exons, raised questions about the classical and neoclassical views of the gene. Split genes are found in many eukaryotic organisms and their viruses, and in many organisms this interrupted organization is the rule. Interrupted genes showed that there is no simple one-to-one linear relationship between gene and polypeptide. The interrupted gene produces a primary transcript that undergoes splicing to yield a functional mRNA containing only the exon sequences. Alternative splicing, which may be tissue- or organ-specific, permits the production of more than one mRNA from a single gene. This contradicts the basic framework of the gene in both the classical and neoclassical concepts. Tissue-specific splicing, for example, was first noticed in the fibrinogen genes of humans and rats (Crabtree and Kant, 1983).
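Splicing and alternative splicing can be pictured as joining selected exon segments of the primary transcript. The sketch below is only an illustration: the transcript sequence and exon coordinates are invented, and the function name is hypothetical.

```python
def splice(primary_transcript, exons):
    """Join exon segments given as (start, end) pairs; end is exclusive."""
    return "".join(primary_transcript[start:end] for start, end in exons)

# Toy primary transcript: exon 1, intron, exon 2, intron, exon 3
pre_mrna = "AUGGCC" + "GUAAAG" + "UUUAAA" + "GUCCAG" + "GGGUGA"
exons = {1: (0, 6), 2: (12, 18), 3: (24, 30)}

# Constitutive splicing keeps every exon
full_mrna = splice(pre_mrna, [exons[1], exons[2], exons[3]])

# Alternative splicing: exon 2 is skipped, giving a second mRNA
short_mrna = splice(pre_mrna, [exons[1], exons[3]])

print(full_mrna)
print(short_mrna)
```

Both mRNAs come from the same gene, which is why a split gene cannot be matched one-to-one with a single polypeptide.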
(c) Overlapping Gene:
Overlapping genes have been observed in many cases. They were first noticed in bacteriophage φX174 (Sanger et al. 1977). In this virus several genes overlap on the same DNA molecule, so the virus can encode several proteins from the same stretch of DNA. In G4 phage the same DNA strand encodes as many as three different proteins. Overlapping genes were also discovered in multicellular eukaryotes such as Drosophila melanogaster. The classical assumption that genes always lie in tandem along the chromosome is contradicted by overlapping genes. Therefore, neither the classical nor the neoclassical concept of the gene applies to them.
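One way overlapping genes can arise is that the same stretch of DNA is read in different reading frames. The following sketch shows how one sequence yields different codons depending on the starting offset; the sequence is invented for illustration and the function name is hypothetical.

```python
def codons(sequence, frame):
    """Split a sequence into complete codons starting at offset 0, 1 or 2."""
    return [sequence[i:i + 3] for i in range(frame, len(sequence) - 2, 3)]

dna = "ATGGCATGACCTAA"  # toy sequence, not taken from any real genome

frame0 = codons(dna, 0)  # reading starts at the first base
frame1 = codons(dna, 1)  # reading starts one base later

print(frame0)
print(frame1)
```

Because the two frames share no codon boundaries, the same bases can contribute to two entirely different amino acid sequences.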
(d) Mobile Gene:
Genetic elements that can move from one position to another in the genome are known as mobile genes. They were first discovered in maize in the 1940s by Barbara McClintock and subsequently in many other organisms. In addition to moving from one region of the chromosomes to another, they can move from one individual to another and, to some extent, from one species to another; this is known as horizontal transfer. The existence of mobile genetic elements thus contradicts the fixed chromosomal location of genes assumed by the classical and neoclassical views.
(e) Complex Promoters:
The promoter is a sequence towards the 5′ end of a gene at which transcription begins. RNA polymerase binds to the promoter and transcribes the gene from its beginning. Many investigators have encountered complex promoters that permit RNA polymerase to bind at different sites, thereby initiating transcription from variable points. Thus more than one type of messenger RNA may be produced from the same DNA sequence, permitting the production of more than one type of polypeptide. Alternative promoters have been encountered in almost all groups of organisms. This again contradicts the framework of the classical and neoclassical views of the gene.
From the above facts it may be concluded that neither the classical nor the neoclassical view holds as a general definition of the gene, and a new definition is needed. According to the modern concept, a gene may be defined as a combination of DNA segments that together constitute an expressible unit. More formally, a gene consists of elements on the chromosome that give a positive result in the cis-trans test. Population geneticists treat the gene as a simple unit of calculation that segregates in the population.
Gene at the Stage of Expression
It has already been mentioned that genes shape life. With the help of its genes an organism takes a particular form and behaves in a particular manner, and its genes make it different from the other organisms around it. Genes become apparent through their expression, and the visible appearance of a character indicates the existence of a gene within the organism. The form in which a gene is expressed is known as the phenotype of the organism. The genetic constitution that gives rise to the character, on the other hand, is known as the genotype.
Genome
The genome is the complete set of genetic instructions needed to build an organism and allow it to grow and develop. It is the collection of all the genes of an organism, and it is sometimes represented by the haploid set of chromosomes contributed by one parent. The genomes of several organisms have been sequenced, and their sizes and gene numbers are presented in the tables below. The instructions in the genome are made up of DNA. In eukaryotic organisms, the DNA lies within the chromosomes, and each chromosome contains one DNA molecule. Within the chromosome, a section is read as a gene that produces a polypeptide for a character.
Every living organism contains a unique genome. The amount of DNA in a haploid set of chromosomes is expressed in picograms (1 pg = 10⁻¹² g), or alternatively as the number of base pairs in one set of chromosomes. This amount of DNA is called the C value. The C value is not consistently related to the complexity of the organism; this is known as the C value paradox.
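The two ways of expressing a C value, in picograms and in base pairs, can be converted into each other. The sketch below assumes the commonly used approximation that 1 pg of double-stranded DNA corresponds to about 0.978 × 10⁹ base pairs; that factor is not stated in this article, and the function names are hypothetical.

```python
# Assumed conversion factor: 1 pg of DNA is roughly 0.978e9 base pairs
BP_PER_PICOGRAM = 0.978e9

def picograms_to_base_pairs(c_value_pg):
    """Convert a C value in picograms to an approximate number of base pairs."""
    return c_value_pg * BP_PER_PICOGRAM

def base_pairs_to_picograms(base_pairs):
    """Convert a genome size in base pairs to an approximate mass in picograms."""
    return base_pairs / BP_PER_PICOGRAM

# A genome of 3.2 billion base pairs, expressed as a mass
human_pg = base_pairs_to_picograms(3.2e9)
print(f"{human_pg:.2f} pg")
```

With this factor, a haploid genome of 3.2 × 10⁹ base pairs weighs a little over 3 pg.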
Genome Size
Genome size is the total number of DNA base pairs in one copy of a haploid genome. A correlation between genome size and the complexity of the organism may be noted in prokaryotes and lower eukaryotes, but it breaks down in higher organisms. The genome sizes of a number of organisms are given below:
Organism | Genome Size (in base pairs) | Special Feature |
Escherichia coli | 4,600,000 | Bacterium |
Polychaos dubium | 670,000,000,000 | Amoeba with the largest known genome |
Arabidopsis thaliana | 157,000,000 | The first plant genome sequenced (Dec. 2000) |
Paris japonica | 150,000,000,000 | Largest known plant genome |
Saccharomyces cerevisiae | 12,100,000 | First eukaryotic genome sequenced |
Caenorhabditis elegans | 100,300,000 | First multicellular animal genome sequenced (Dec. 1998) |
Pratylenchus coffeae | 20,000,000 | Smallest known animal genome |
Drosophila melanogaster | 130,000,000 | Insect, fruit fly |
Bombyx mori | 432,000,000 | Silk moth with 14,623 genes |
Tetraodon nigroviridis | 385,000,000 | A type of puffer fish with the smallest known vertebrate genome |
Homo sapiens | 3,200,000,000 | Human genome with about 25,000 genes |
Protopterus aethiopicus | 130,000,000,000 | Largest known vertebrate genome (Marbled lungfish) |
Genome Composition
The contents of a haploid genome constitute the genome composition, which includes both non-repetitive and repetitive DNA. In prokaryotes, most genomes consist of 85-90% non-repetitive DNA with very few non-coding regions. Eukaryotes, on the other hand, contain varying amounts of repetitive DNA, and in mammals and plants repetitive DNA makes up the major part of the genome. It is noteworthy that repetitive DNA is largely non-coding. Eukaryotes such as plants, protozoans, and animals also contain DNA in organelles besides their nuclear DNA. The genetic information carried by the DNA of these organelles is not considered part of the nuclear genome of the organism. The DNA present within the chloroplast is sometimes called the plastome.
The Genome of Several Organisms:
Name of the Organism | Genome Size (base pairs) | Number of Genes in the Genome | Remark |
1. φX174 | 5,386 | 11 | |
2. Escherichia coli | 4,639,221 | 4,377 | |
3. Mycoplasma genitalium | 580,073 | 517 | |
4. Treponema pallidum | 1,138,011 | 1,039 | The bacterium that causes syphilis |
5. Helicobacter pylori | 1,667,867 | 1,589 | The chief cause of stomach ulcers (not stress and diet) |
6. Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast, a eukaryote |
7. Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants |
8. Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes |
9. Drosophila melanogaster | 122,653,977 | ~17,000 | The ‘fruit fly’ |
10. Caenorhabditis elegans | 100,258,171 | 21,733 | |
11. Tetraodon nigroviridis (Puffer fish) | 3.42 × 10⁸ | 27,918 | Although Tetraodon seems to have more protein-encoding genes than we do, it has much less non-coding DNA, so its total genome is about a tenth the size of ours. |
12. Mouse | 2.8 × 10⁹ | ~23,000 | |
13. Arabidopsis thaliana | 0.135 × 10⁹ | 27,407 | A flowering plant with one of the smallest genomes known in the plant kingdom |
14. Humans | 3.3 × 10⁹ | ~21,000 | |
15. Human mitochondrion | 16,569 | 37 | |
16. Epstein-Barr virus (EBV) | 172,282 | 80 | Causes mononucleosis |
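The lack of a simple relation between genome size and gene number can be made concrete by computing gene density, the number of genes per million base pairs, for a few rows of the table above. This is a minimal sketch; the choice of rows, the variable names, and the function name are illustrative assumptions.

```python
# (genome size in base pairs, number of genes), copied from the table above
GENOMES = {
    "Escherichia coli": (4_639_221, 4_377),
    "Saccharomyces cerevisiae": (12_495_682, 5_770),
    "Tetraodon nigroviridis": (3.42e8, 27_918),
    "Humans": (3.3e9, 21_000),
}

def genes_per_megabase(size_bp, gene_count):
    """Gene density: number of genes per one million base pairs."""
    return gene_count / (size_bp / 1_000_000)

densities = {
    name: genes_per_megabase(size, genes)
    for name, (size, genes) in GENOMES.items()
}

# List the organisms from the most to the least gene-dense genome
for name, density in sorted(densities.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {density:.1f} genes per Mb")
```

The density falls steeply from the bacterium to the human genome, which reflects the growing share of non-coding and repetitive DNA rather than a proportional increase in gene number.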
Defining of Gene and Its Characteristics
A sound definition of the gene emerged only many years after Mendel's discovery. At first the gene was an abstract entity whose existence was inferred only from the expression and transmission of a phenotype. A little later the gene was recognized as occupying a specific locus on a chromosome, and by 1941 Beadle and Tatum had put forward the one gene one enzyme hypothesis as a functional definition of the gene. This was later modified into the one gene one polypeptide hypothesis.
By 1950, however, the gene had been identified as a physical molecule that can be destroyed by the enzyme DNase. Subsequently the gene came to be described structurally as a defined sequence of DNA, extending the idea of a locus associated with a phenotype. Following the Human Genome Project, the gene was described as a segment of DNA that contributes to a phenotype or function. More recently, the gene has been described as a locatable genomic sequence corresponding to a unit of inheritance. Hence, a gene may be characterized as:
- Unit of genetic material with the power of duplication.
- Unit of recombination.
- Unit of genetic material capable of undergoing mutation.
- A unit of genetic material that controls some somatic structure or function.
Although the gene is defined in simple terms as the structural and functional unit of heredity, this unit is so diverse that it is very difficult to give a single definition that satisfies all its properties. Still, investigators have tried to give a more workable definition: “Gene is the unit of genetic information producing one polypeptide or one structural RNA”.
Types of Gene
Because the hereditary unit exists in varied forms, a discussion of its types helps to give a good idea of the nature of genes in organisms. Genes may be categorized into different types on the basis of different criteria, as indicated below.
(a) On the basis of Origin
On the basis of origin, genes may be categorized into two types, wild-type genes and mutant genes. The wild-type gene appears to have remained with the organism throughout its evolution and is well adapted to the natural environment; it is the form frequently observed in the population of the species. The mutant gene is an alternative form of the wild-type gene produced by mutation in an individual of the species; it is infrequent in the population and, as far as adaptability is concerned, is usually less well adapted to the general environment. A few examples of the two forms are given below.
Examples of Some Wild Type and Mutant Characters from Pea Plant, Fruit Fly, and Human:
Organism | Wild Type Character (Gene) | Mutant Character (Gene) |
1. Pea Plant (Pisum sativum) | Tall plant height (T) | Dwarf plant height (t) |
1. Pea Plant (Pisum sativum) | Round seed (W) | Wrinkled seed (w) |
2. Fruit fly (Drosophila melanogaster) | Grey body colour (B) | Black body colour (b) |
2. Fruit fly (Drosophila melanogaster) | Long wing (Vg) | Vestigial wing (vg) |
3. Human (Homo sapiens) | Attached ear lobe | Free ear lobe |
3. Human (Homo sapiens) | Rh positive blood | Rh negative blood |
(b) On the basis of Location in the Cell
On the basis of their location in the cell, genes may be categorized into the following types:
(i) Nuclear Gene:
Most genes of higher eukaryotic organisms are nuclear in location. Nuclear genes are usually present on the chromosomes and are classified by the type of chromosome that carries them: an autosomal gene lies on an autosome; a sex-linked gene lies on the X chromosome; and a holandric gene lies on the Y chromosome, so it is expressed only in males. A gene that behaves as neither strictly autosomal nor strictly sex-linked is called a pseudoautosomal gene. Such a gene lies in the part of the X chromosome that is homologous with part of the Y chromosome, so it is present on both sex chromosomes and is inherited in the manner of an autosomal gene. Sometimes a gene is not permanently confined to a particular chromosome; it may change its location from one chromosome to another, or from one position on a chromosome to another. Such genes are called jumping genes or transposable genetic elements. Examples of characters whose genes are located on different chromosomes are given below.
Examples of Some Nuclear Genes:
Organism | Gene for Character | Chromosome or Element |
1. Fruit fly | Body colour | Autosome |
1. Fruit fly | Eye colour | X chromosome |
1. Fruit fly | Bobbed bristle | Pseudoautosomal region |
2. Human | PTC tasting | Autosome |
2. Human | Colour blindness | X chromosome |
2. Human | Hypertrichosis | Y chromosome |
3. Human | Jumping gene | LINE and SINE |
4. Maize | Jumping gene | Ac-Ds system |
(ii) Cytogene or plasma gene:
In this case, the gene is present in the cytoplasm. The genes of the cytoplasmic organelles, namely mitochondria and chloroplasts, are cytoplasmic genes, i.e., plasma genes.
(c) On the basis of the Time of Functioning of a Gene
Genes vary in their time of functioning. Humans, for example, have about 21,000 genes, but not all of them function all the time. Moreover, although all our body cells contain the same set of genes, only particular genes function in a particular cell. On the basis of their time of functioning, genes may be categorized into the following types.
Genes of different types on the basis of time of activity:
Type of Gene | Properties | Example |
Embryonic gene | Remains active only during embryonic life | Pair-rule gene in the fruit fly |
Geronto gene | Comes to expression only during old age | |
Luxury gene | Tissue- or organ-specific gene that is expressed only when needed | |
Housekeeping gene | Gene needed for the basic functioning of the cell | tRNA and rRNA genes |
(d) On the basis of the Activity of the Gene
On the basis of activity, genes may first be divided into two types according to their impact on the life of the organism. A beneficial or adaptive gene helps the organism to maintain life better and is usually adaptive in its environment. A harmful gene does not help the organism to live in its environment and is therefore non-adaptive. In extreme cases a harmful gene may be lethal, that is, fatal to the organism.
Sometimes the lethality of a gene depends upon conditions, and such a gene is called a conditional lethal gene. Lethality may also depend on dose: in Drosophila melanogaster the curly wing gene is dominant in its effect on the wing but is lethal to the flies only in the homozygous condition. Examples of harmful genes named after their functions are oncogenes and mutator genes. An oncogene promotes the development of cancer in the body, and a mutator gene increases the occurrence of mutation in the body.
The beneficial or adaptive genes, on the other hand, are generally the normal counterparts of the harmful genes, namely proto-oncogenes and tumour suppressor genes. A proto-oncogene may be converted into an oncogene by mutation. Some genes can suppress the expression of other, non-allelic genes; such a gene is called an epistatic gene, and the gene it suppresses is called the hypostatic gene.
Sometimes a gene is related to the expression of more than one character; such a gene is called a pleiotropic gene. Similarly, one character may be expressed through the involvement of more than one gene, and the genes that together express a single character are called polygenes. Some genes, described mainly in prokaryotes, regulate the expression of several other genes and are called regulator genes, whereas the genes they regulate are called structural genes. Sometimes a gene shows no function in the life of an organism and behaves as a dormant gene, although the same gene functioned in the past. This type of gene is called a pseudogene.
(e) On the basis of the Power of Expression
The power of expression varies from one gene to another, and on this basis genes may be classified into the several types shown below. In eukaryotes, the expression of a character usually requires a double dose of the gene, which diploid organisms acquire at zygote formation. The two copies represent the same gene but may differ slightly in structure; such alternative forms of the same gene are called alleles. Allelic variants also differ in their power of expression.
Gene Types on the Basis of their Power of Expression:
Type | Properties | Example |
Dominant Gene | An allele that suppresses the expression of the other allele. | Gene for brachydactyly |
Recessive Gene | An allele that remains suppressed in the presence of its dominant allele. | Gene for albinism |
Co-dominant Gene | Both alleles have equal power of expression, so both are expressed together. | IA and IB alleles of the ABO blood group |
Incompletely dominant Gene | An allele that cannot completely suppress the expression of its alternative allele, so the heterozygote shows an intermediate phenotype. | The allele for red flower colour in Mirabilis jalapa is incompletely dominant over the allele for white flower colour |
Pseudogene | A gene that has been inactivated, usually by mutation. It is so changed that it no longer functions like the original gene. | Globin pseudogene |
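Dominance, recessiveness, and co-dominance can all be seen in one gene, the ABO blood group gene listed in the table above. The sketch below maps a pair of alleles to a blood group. It assumes the usual three-allele model (IA, IB, and a recessive allele written here as i, which the table does not mention), and the function name is hypothetical.

```python
VALID_ALLELES = {"IA", "IB", "i"}

def abo_blood_group(allele1, allele2):
    """Return the ABO blood group for a pair of alleles.

    IA and IB are co-dominant with each other; both are dominant
    over the recessive allele i (three-allele model, assumed here).
    """
    alleles = {allele1, allele2}
    if not alleles <= VALID_ALLELES:
        raise ValueError(f"unknown allele in {sorted(alleles)}")
    if alleles == {"IA", "IB"}:
        return "AB"  # co-dominance: both alleles are expressed
    if "IA" in alleles:
        return "A"   # IA masks i
    if "IB" in alleles:
        return "B"   # IB masks i
    return "O"       # only the recessive allele is present

for pair in [("IA", "IB"), ("IA", "i"), ("IB", "IB"), ("i", "i")]:
    print(pair, "->", abo_blood_group(*pair))
```

The heterozygote IA IB shows both characters at once, whereas IA i and IB i show only the character of the dominant allele.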
The classification given above gives an idea of the identity of the genes present in living systems, but neither the definition nor the classification gives a clear idea of the structural dimension of a gene. In this context, the discoveries of Avery, MacLeod, and McCarty and of Hershey and Chase must be mentioned. These discoveries, which proved that DNA is the genetic material, were milestones in the development of the science of genetics. Three later advances helped investigators to propose a concept of the structural dimension of the gene in action: the formulation of the double helix model, which elucidated the structure of DNA; the discovery by Yanofsky and his colleagues of the co-linearity of the amino acid sequence of a protein and the nucleotide sequence of DNA; and the discovery of the genetic codons along mRNA.
Genes thus represent segments of the DNA polynucleotide chain that are responsible for the synthesis of a polypeptide chain or a functional RNA molecule. The polynucleotide sequence varies from one gene to another because gene products differ in nature and structure. In this context, the history of ideas on the structure and function of the hereditary unit, from Mendel to the present, should be kept in view. Before that, however, some fundamental properties of the gene as currently known must be presented.
(f) On the basis of Structural Configuration
Based on their structural organization in the genome, genes may be categorized into the types shown below.
Genes on the basis of their Structural Organization:
Type | Brief Description | Example |
Single Copy Gene | The gene is present as a single copy in the organism. The gene may be altered by mutation. | 60-70% of the functional genes of the organisms are of this category. |
Interrupted or split gene | Sometimes a gene remains in the genome as fragmented coding sequences separated by intervening sequences. Such an arrangement is called a split gene. The gene may extend over a long stretch of the genome; during transcription the whole region is transcribed, and later the intervening sequences are cleaved out and the transcribed coding segments are joined to form a mature mRNA. The intervening sequences are known as introns and the genic segments are called exons. | Human insulin-producing gene. |
Overlapping | When a single stretch of DNA codes for portions of two separate proteins, the genes are said to be in an overlapping arrangement. | Genes B, E, and K overlap in bacteriophage φX174. |
Jumping gene | A gene that can change its position on a chromosome is known as a jumping gene. | Ac/Ds system in maize. |
Cryptic gene | A gene representing a silent DNA sequence is known as a cryptic gene. | Cryptic genes are found in microorganisms. |
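The split-gene entry in the table above describes transcription of the whole region followed by removal of the introns and joining of the exons. A minimal Python sketch of that idea is given here. The primary transcript and the exon coordinates are made up for illustration and do not come from any real gene.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments (0-based, end-exclusive) in order; the rest is intron."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))

# Hypothetical primary transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuag" + "UUUGGA" + "gucaag" + "UAA"

# Hypothetical exon coordinates within the transcript above.
exons = [(0, 6), (16, 22), (28, 31)]

mature = splice(pre_mrna, exons)
print(mature)
```

Writing the introns in lower case makes it easy to see that only the genic segments survive in the mature mRNA.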
Fundamental Properties of the Gene
- The hereditary unit, the gene, is the fundamental unit within the genetic material and has a high potential to maintain its integrity in all situations.
- The gene is highly stable and seldom undergoes change under external influence.
- It is represented by a stretch of the polynucleotide in DNA responsible for the synthesis of a polypeptide chain or functional RNA.
- It is indivisible and cannot be changed by recombination and crossing over with a gene segment.
- The unit of heredity also acts as a unit of function and this may be detected by the cis-trans complementation test.
- The genes remain linearly arranged over the chromosomes.
- Except for the mobile genetic elements, all the genes occupy a specific position on a chromosome.
- A gene may be available in an alternative form and the number of alternative forms sometimes may be more than two.
- Alternative forms of a gene may differ in their potential for expression and therefore, a gene may be dominant, recessive, or incompletely dominant in nature.
- Normally a gene is very stable in nature, but under the influence of certain external agents, a gene may undergo change causing alternative expression in phenotype.
- All the activities in the life of an organism are controlled by the activities of the genes present in the organism.
- Any gene may be regarded as a storehouse of information that is expressed through the production of some transcript.
- The genes have the power of replication in the cell for transmission of the information from the mother cell to the progeny cells or from the parental organism to the progeny organisms.
- Gene expression in any organism is highly regulated and its expression is always need-based.
- The genes in an organism act in a fashion that gives adaptiveness to the organism for the maintenance of life in nature, and they also promote continuity of life.
Gene Concept
Developmental Sequence:
Nothing was known to the scientific world about the genes present in living organisms before Mendel. Although people of the pre-Mendelian period were aware of the phenomenon of inheritance of characters, their notions of the mechanism of transmission were based entirely on abstract ideas. Mendel proposed the existence of some factors controlling the expression of the features of a character. According to him, these factors were very stable and non-miscible; they were later named genes. Mendel’s hybridization experiments on the garden pea, Pisum sativum, considering different characters, clearly revealed the existence of particulate genetic elements controlling the mechanism of heredity. These particulate elements may have alternative forms with different potentialities for the expression of features.
According to Mendel the particulate non-miscible genetic elements may remain together in an organism and are segregated during gamete formation. The features are transmitted to the progeny through the gametes of the organisms. According to Mendel these hereditary factors are independent and may combine at random appearing in all possible combinations in the gametes. Such concepts on the genetic elements as the hereditary units were the first and basic idea about the present-day gene. The Mendelian idea about the character and the presence of some unit factors behind the character was very simple but received valid recognition and wide acceptance after the rediscovery of Mendel in 1900. Following this, scientists tried to investigate the actual nature and molecular organization of genes in animals as well as in other organisms.
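Mendel’s statement that independent factors combine at random and appear in all possible combinations in the gametes can be illustrated with a short Python sketch. The allele symbols R/r and Y/y are hypothetical labels chosen for this example, and the function name is an assumption, not part of any library.

```python
from itertools import product

def gametes(genotype: list[str]) -> list[str]:
    """genotype holds one allele pair per gene, e.g. ["Rr", "Yy"].
    Each gamete receives exactly one allele from every pair."""
    combos = product(*(sorted(set(pair)) for pair in genotype))
    return ["".join(c) for c in combos]

# A double heterozygote gives four kinds of gametes.
print(gametes(["Rr", "Yy"]))

# A parent homozygous for the first gene gives only two kinds.
print(gametes(["RR", "Yy"]))
```

The sketch only enumerates the possible gamete types; it assumes the genes assort independently, which holds for unlinked genes.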
At the present state of knowledge, a gene is a fundamental unit in the genetic material that stores information that flows from DNA to a protein or an RNA. That protein or RNA acts under the influence of many other conditions, either within the cell itself or from the environment, to produce an effect on the cell. The cumulative effects on many tissues give an easily detectable impact, which ultimately appears as a phenotype in the organism. Therefore, the concept of a gene and its function may be presented through the following diagram.
Mendelian single factor single phenotype concept has been modified because there may be many genes that contribute to one phenotypic trait. The expression of the gene at the individual level has a little or insignificant impact unless it is considered at the level of population of species.
Inborn error of metabolism: Garrod’s one gene one metabolic block concept:
Sir Archibald Garrod, a British physician, observed that due to the accumulation of homogentisic acid (or alcapton) in urine of the newborn babies, a disease is developed called alkaptonuria in which the urine of the babies turns black upon exposure to air. Besides the excess accumulation of homogentisic acid in the body dark pigmentation occurs in the cartilage upon exposure to light. On account of this, discoloration of the ears, the tip of the nose, and white of the eyes become prominent and such deposition in cartilage develops a form of arthritis in later life. Garrod believed that this is due to a block in the pathway of the metabolism of the amino acid tyrosine. He also analyzed the pedigree of the affected families and he came to the conclusion that the disease is caused due to a recessive genetic defect.
The affected condition develops in the homozygous state for a single-gene disorder with a metabolic block, and this has been postulated as one mutation one metabolic block. Garrod’s idea of a metabolic block was confirmed many years after his initial findings: the enzyme homogentisic acid oxidase, which converts homogentisic acid into maleylacetoacetic acid, is lacking in affected individuals. According to Garrod, alkaptonuria is a type of inborn error of metabolism, a class that includes many other diseases. Several such diseases are phenylketonuria, tyrosinosis, and albinism (OCA). In all these defects the metabolic block appears in the pathway of metabolism of an essential amino acid, i.e., phenylalanine. There are nine essential amino acids, namely histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine. How a metabolic block in the pathway of phenylalanine metabolism may lead to defective conditions may be shown in the following diagram.
Phenylalanine obtained from dietary proteins is converted into tyrosine which then is converted into DOPA that may be converted into dopamine before synthesizing melanin in the skin. The first blockage is found at the stage of conversion of phenylalanine to tyrosine. At this stage, lack of phenylalanine hydroxylase results in high-level accumulation in the blood of phenylalanine as well as of a metabolite of the amino acid, phenylpyruvic acid. This results in impairment of brain development, leading to the condition of phenylketonuria (PKU). In PKU the affected individual exhibits many neurological disorders including convulsive seizures, enhanced reflexes, and mental retardation. Because melanin is also a product of the pathway of phenylalanine metabolism, PKU patients usually show lighter hair and skin colour. The disease is inherited as an autosomal recessive genetic disorder and was first described by Ivar Asbjørn Følling in 1934.
However, the deficiency of phenylalanine hydroxylase as the cause of this disease was first identified by George Jervis in 1953. The incidence of this genetic defect in the population is about 1 in 12000 births. If the mother of a PKU-affected baby is normal, intrauterine development of the baby’s brain is not affected during pregnancy, because the maternal enzyme may cross the placenta and help the baby in metabolizing phenylalanine. Such a baby becomes severely affected after its birth. Conversely, if the mother is homozygous for the recessive gene of PKU, the baby remains at high risk of neurological damage irrespective of its genotype. In this case, the phenylalanine accumulated in the mother may cross the placental barrier and affect the development of the baby.
Neurological development is thought to be completed by the age of 6 years and therefore, if the affected baby is supplied with restricted diets free from phenylalanine the baby may show normal development. Therefore, screening of newborn babies for PKU should be done by the states so that untreated cases may be very low. The affected newborn may develop normally without mental retardation if they are screened and treated from the early days of their birth.
In the metabolic pathway of phenylalanine, the absence of tyrosinase, which is required for the oxidation of tyrosine to produce DOPA and for the conversion of DOPA into dopaquinone, the precursor of melanin, results in the most common form of albinism, called oculocutaneous albinism (OCA). The condition appears as a recessive autosomal genetic disorder and in this condition melanin synthesis does not occur in the skin, hair, and eyes. Garrod (1908) predicted that albinism might be a genetic disorder like alkaptonuria.
One Gene One Enzyme Concept of Beadle and Tatum:
George Beadle and Edward Tatum carried out some pioneering experiments on Neurospora crassa and isolated many biochemical mutations in this fungus. Concomitant biochemical analysis along with a genetic screening of the mutations led them to propose the one gene one enzyme concept for which they were awarded Nobel Prize in 1958.
The model organism, Neurospora crassa, is the common bread mold. It can grow in a medium that contains only inorganic salts, a simple sugar as a carbon source, and the vitamin biotin. Such a growth medium is known as a minimal medium, and from it the fungus can synthesize all the essential metabolites, such as amino acids, purines and pyrimidines, and other vitamins, during its growth. Beadle and Tatum were of the opinion that the biosynthesis of these essential ingredients must be under the genetic control of the fungus, and that a mutant strain able to grow in the minimal medium only when a particular supplement is added must be deficient in the synthesis of that supplement. With these speculative ideas, Beadle and Tatum started experiments with the wild-type variety. They first exposed the asexual spores, the conidia, of the fungus to X-rays or UV light with the idea that irradiation might cause mutations in the wild-type spores.
To test whether radiation had caused mutation or not, they tried to grow the treated spores in the minimal medium. They observed that the treated spores failed to grow in the minimal medium, which supported the validity of the concept behind their experiments. To grow the mutant strains in the laboratory, they then provided the spores with a complete medium that contained all amino acids, all vitamins, purines, and pyrimidines in addition to the ingredients of the minimal medium. The treated spores, as expected, showed normal growth in the complete medium. Then the necessity arose for biochemical and genetic analysis for screening and isolation of the biochemical mutations in Neurospora. To make their study more precise, they screened the mutants for single gene disorders.
In this screening test, they crossed a mutant variety with the wild-type Neurospora. Through sexual reproduction between the opposite strains of the fungus, they produced spores, and analysis of the spores from an ascus could easily indicate a single gene disorder. In the case of a single-gene difference, wild-type and mutant spores would segregate in a 1 : 1 ratio. This type of screening test helped Beadle and Tatum to identify several mutant strains, all of which required arginine as the supplement in the minimal medium for growth. However, the mutant strains could be distinguished by their ability to grow on supplements other than arginine.
One mutant strain (marked as arginine H) could grow only with arginine as a supplement. Another strain, G, could grow with argininosuccinate as a supplement even without the arginine supplement. Another strain, marked as strain F, could grow with the supplement citrulline; this strain could also grow with either argininosuccinate or arginine as a supplement. Further, they were able to isolate a fourth mutant that could grow with ornithine as a supplement in the minimal medium. This last mutant could also grow with any of the other three supplements in the minimal medium. A matrix of the nutritional requirements of these related Neurospora mutants may be prepared in the following manner.
Different mutants of Neurospora shared a requirement for arginine but could also grow in the minimal medium with some supplement other than arginine:
Growth response of different mutant strains in supplemented media (+ = growth, – = no growth; the columns give the supplement added to the minimal medium):
Strain | None | Ornithine | Citrulline | Argininosuccinate | Arginine |
Wild type | + | + | + | + | + |
Arg E- | – | + | + | + | + |
Arg F- | – | – | + | + | + |
Arg G- | – | – | – | + | + |
Arg H- | – | – | – | – | + |
From the above table it appears that the mutant forms isolated in this series were interrelated and were indicative of the pathway of arginine biosynthesis, in which the intermediates appear in the following order:
Precursor → Ornithine → Citrulline → Argininosuccinate → Arginine
Further biochemical analysis indicated that each of the mutant forms was lacking one enzyme. The genetic analysis of the mutants indicated that each of the mutations in the fungus was related to a single gene disorder.
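The reasoning that leads from the growth table to the order of the pathway can be sketched in Python. The sketch rests on one assumption that should be stated plainly: a compound that lies later in the pathway bypasses more blocks and therefore rescues more mutant strains. The variable and function names are illustrative only.

```python
# Growth matrix copied from the table above: True = growth, False = no growth.
supplements = ["Ornithine", "Citrulline", "Argininosuccinate", "Arginine"]
growth = {
    "Arg E": [True, True, True, True],
    "Arg F": [False, True, True, True],
    "Arg G": [False, False, True, True],
    "Arg H": [False, False, False, True],
}

def pathway_order(supplements, growth):
    """Order intermediates from earliest to latest by the number of
    mutant strains that each supplement rescues."""
    rescued = {
        s: sum(row[i] for row in growth.values())
        for i, s in enumerate(supplements)
    }
    return sorted(supplements, key=lambda s: rescued[s])

print(" -> ".join(pathway_order(supplements, growth)))
```

Ornithine rescues only one strain and arginine rescues all four, so the sorted list reproduces the order of intermediates given in the text.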
From the analysis of about 80000 spores of Neurospora, Beadle and Tatum identified that in each case of deficient mutation, one enzyme failed to be produced by the mutant strain. In other words, one mutation correlates with one enzyme deficiency. Based on these findings Beadle and Tatum hypothesized that one gene specifies one enzyme. Thus the experiment of Beadle and Tatum on the isolation of biochemical mutations in Neurospora and their biochemical analysis for the deficiency of enzymes gave the foundation of the one gene-one enzyme hypothesis.
Other similar investigations were also made on Neurospora to test the validity of the hypothesis of Beadle and Tatum. One such investigation relates to the biosynthesis of methionine in Neurospora crassa. The steps in methionine synthesis and the genes involved in the production of the related enzymes are as follows.
Matrix for different mutations with relation to methionine biosynthesis in Neurospora and their supplements for growth in the minimal medium and the related enzyme deficiency may be indicated in the following table.
Matrix for mutations related to methionine biosynthesis in Neurospora and the enzyme deficiency in each case of mutants:
Fungus Type/Supplement | Homoserine | O-acetyl homoserine | Cystathionine | Homocysteine | Methionine | Enzyme deficiency |
Wildtype | + | + | + | + | + | No deficiency |
Met5+ | – | + | + | + | + | Homoserine transacetylase |
Met3+ | – | – | + | + | + | Cystathionine synthetase |
Met2+ | – | – | – | + | + | Cystathionase II |
Met8+ | – | – | – | – | + | Methyl tetrahydrofolate homocysteine transmethylase |
Note that each mutant type is deficient in one enzyme and needs a specific supplement, namely the first one marked + in its row of the matrix. Therefore, a direct relationship between the mutation and the enzyme deficiency may be established.
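The rule stated above, that the first supplement marked + identifies the deficiency, can be turned into a small Python sketch that reports the blocked reaction for each mutant. The strain labels are shortened for convenience and the function name is hypothetical; the growth responses are those of the matrix.

```python
# Intermediates in pathway order, as in the column headers of the matrix.
intermediates = [
    "Homoserine", "O-acetyl homoserine", "Cystathionine",
    "Homocysteine", "Methionine",
]

# Growth responses from the matrix: True = growth on that supplement.
growth = {
    "Met5": [False, True, True, True, True],
    "Met3": [False, False, True, True, True],
    "Met2": [False, False, False, True, True],
    "Met8": [False, False, False, False, True],
}

def blocked_step(row):
    """Return (substrate, product) of the reaction the mutant cannot perform."""
    first_rescue = row.index(True)  # earliest supplement that restores growth
    return intermediates[first_rescue - 1], intermediates[first_rescue]

for strain, row in growth.items():
    substrate, product = blocked_step(row)
    print(f"{strain}: block between {substrate} and {product}")
```

The sketch does not handle the wild type, which grows on every supplement and has no blocked step.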
The emergence of One Gene One Polypeptide Concept
Though the one gene one enzyme concept of Beadle and Tatum was a landmark discovery in the early 1940s, it created a lot of confusion among scientists, who hesitated to accept the hypothesis as a universal concept for explaining the mechanism of gene expression or gene action. Such confusion arose for several reasons:
- There are many enzymes that are formed of more than one type of polypeptide chain, i.e., they are heteromeric proteins.
- One gene may produce one enzyme, but the production of a heteromeric protein by a single gene appears to be ridiculous.
- In many cases when a gene undergoes mutation it gives an altered character, then how does a defective enzyme from a gene give an altered phenotype?
- There are many genes that show their expression by a protein in which the involvement of a single enzyme may not be found. Therefore, one gene one enzyme concept is contradictory to these facts.
However, in the period following the proposal of the one gene one enzyme hypothesis, chemical analysis of human haemoglobin cleared up much of the confusion about the mode of gene action and about the phraseology one gene one enzyme. The phrase was initially changed to the one gene one protein hypothesis. However, this changed phraseology also could not stand for long and was in turn changed to the one gene one polypeptide hypothesis. Vernon Ingram’s protein fingerprint analysis of human sickle cell hemoglobin (HbS) during the period 1954 to 1957 led to the origin of this concept of genic function, and it has been accepted universally by the scientific world.
Analysis of Human Haemoglobin and its Variants in the Development of One Gene One Polypeptide Concept
Haemoglobin is an important metalloprotein present within the RBC of man and helps in the transportation of oxygen in the body. The protein in its quaternary structure contains four polypeptide chains with iron. Of the four polypeptide chains, two are alpha polypeptide chains each containing 141 amino acids and two are beta polypeptide chains each containing 146 amino acids. Hence, it is a heteromeric protein. Normal hemoglobin is known as HbA. There are some variants of HbA, such as HbS, HbC, HbE, and HbG, in the human population, and these lead to abnormal conditions in man. HbS is the most important one and causes the fatal disease sickle cell anaemia in man.
In normal conditions, the RBC is disc-shaped and contains HbA. In the case of sickle cell anemia, however, the RBC contains HbS; because of this it becomes sickle-shaped and clogs the capillaries, preventing smooth blood supply through the capillary system. This deprives the tissues of the body of oxygen, resulting in the effect of anaemia. This is called sickle cell anemia and, if untreated, the condition may be fatal. Along with this crisis, the affected individual becomes anaemic because of the rapid destruction of the erythrocytes. Some other associated abnormalities are dilation of the heart, abnormal bone size and shape, and damage to the kidney, brain, muscles, lungs, and joints. Because of all these abnormalities, the affected individual cannot survive for a long period.
James Neel and E. A. Beet in 1949 revealed that this disease is inherited in man according to Mendelian pattern and appears due to a single gene disorder. They also denoted that the defective gene is almost equally expressive with its normal allele. The defective gene and its normal allele may be designated as HbS and HbA. From the pedigree analysis, it could be revealed that there may be three genotypes for these two alleles with the corresponding phenotypes. Three genotypes and the corresponding phenotypes in man may be presented in the following table.
The Phenotypes and Associated Symptoms for HbA and HbS:
Genotype | Phenotype | Features |
HbA/HbA | Normal | |
HbA/HbS | Sickle Cell Trait | Less sickling of RBC, so the sickling crisis is less severe; the individual may lead a normal life with less suffering. |
HbS/HbS | Sickle Cell Anaemia | Sickling of RBC, Rapid destruction of RBC, severe sickling crisis, anaemia, bone deformity, etc. |
The heterozygotes (HbA/HbS) showing sickle cell trait are carriers of the defective gene (HbS) and may transmit the defective gene to 50% of their progeny. Individuals having sickle cell hemoglobin get some adaptive advantage because they are more resistant to malarial infection. In particular, the parasite Plasmodium falciparum, which causes malignant malaria, can hardly develop an infection in persons carrying HbS. RBCs carrying HbS do not permit the malarial parasites to complete their life cycle in man. Further, a rise in acidity in the red blood cells due to parasite entry promotes more sickling and lysis of the erythrocytes. In addition, the concentration of K+ becomes lower in the sickling RBC, which is unfavourable for the survival of the parasite.
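The statement that a carrier transmits HbS to 50% of the progeny follows from simple Mendelian segregation. The Python sketch below enumerates the expected genotype proportions for a cross, assuming each parent passes either of its two alleles with equal chance. The function name is an assumption made for this illustration.

```python
from collections import Counter
from itertools import product

def cross(parent1: tuple[str, str], parent2: tuple[str, str]) -> dict[str, float]:
    """Expected genotype proportions among the progeny of two parents."""
    counts = Counter(
        "/".join(sorted(pair)) for pair in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}

carrier = ("HbA", "HbS")
normal = ("HbA", "HbA")

print(cross(carrier, normal))    # carrier x normal
print(cross(carrier, carrier))   # carrier x carrier
```

In the carrier x normal cross half of the progeny receive HbS, while in the carrier x carrier cross one quarter are expected to be HbS/HbS.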
Concomitant with the revelation of the mode of transmission of the gene for sickle cell anaemia, chemical analysis of hemoglobin obtained from normal, sickle cell trait, and sickle cell anaemic persons was done by Linus Pauling and coworkers. When hemoglobins from three categories of men were subjected to starch gel electrophoresis, they exhibited differential migration patterns between cathode (-) and anode (+). The haemoglobin from normal individuals showed a faster movement than the haemoglobin from sickle cell anaemic individuals. Both HbA and HbS migrate towards the +ve pole in the electric field suggesting a -ve charge of both the haemoglobins.
However, HbA is more negative in comparison to HbS, suggesting a chemical difference between the two types of haemoglobin. Further, this study of Pauling and others also revealed that the haemoglobin of sickle cell trait individuals contained both HbA and HbS in almost equal amounts. The one gene one protein hypothesis got support from this study because the altered chemical property of HbS is the main cause of the genetic disorder sickle cell anaemia.
However, the study still left open the question of how a single-gene disorder can be reconciled with the heteromeric nature of haemoglobin. The answer to this question came from the investigation of Vernon Ingram (1957) on human haemoglobin. When Ingram analysed normal haemoglobin and sickle cell haemoglobin with the protein fingerprinting technique, he found that the two haemoglobins differed by one amino acid in the beta-polypeptide chain. The amino acid glutamic acid at a particular position in the primary structure of the beta-polypeptide chain of normal haemoglobin is replaced by valine in HbS.
Ingram’s fingerprint analysis involved the following steps in sequence: enzymatic digestion of HbA and HbS separately with trypsin; placement of the resulting mixture of peptide fragments on an absorbent paper; exposure of the paper to an electric field to separate the fragments by electrophoresis; and, after turning the paper through a right angle, chromatography in a solvent to make the already separated fragments migrate a second time. The end result was a two-dimensional separation of the peptide fragments on the paper, which appeared as a distinct pattern of spots after special treatment. Ingram’s fingerprints of the peptide fragments from HbA and HbS showed that the difference lay in the 4th peptide fragment from the β-polypeptide of hemoglobin. The 4th peptide fragments from HbA and HbS took different positions on the absorbent paper.
Further analysis exhibited that the 4th peptide fragment contained eight amino acids and a change of amino acid appeared at the 6th position in the fragment obtained from HbS. This change was valine in place of glutamic acid in the normal haemoglobin (HbA). This discovery of Ingram clearly revealed that the single gene disorder of sickle cell anaemia is concerned with the alteration in one polypeptide chain, i.e., the beta polypeptide chain of the two polypeptide chains of hemoglobin. This formed the basis for the change of phraseology one gene one enzyme or one gene one protein into one gene one polypeptide.
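The logic of the fingerprint comparison can be imitated in Python by digesting the two chains in silico and looking for the fragment that differs. This is only a sketch under stated assumptions: trypsin is modelled as cutting after every lysine (K) or arginine (R), the sequence used is the first 30 residues of the human beta chain in one-letter code as generally reported (it should be checked against a sequence database before serious use), and the fragment numbering here follows sequence order, not Ingram’s numbering of the spots.

```python
def trypsin_digest(protein: str) -> list[str]:
    """Cut after every K or R (the proline exception is ignored in this sketch)."""
    peptides, current = [], ""
    for residue in protein:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides

# First 30 residues of the beta chain (assumed from the reported sequence).
hba = "VHLTPEEKSAVTALWGKVNVDEVGGEALGR"
# HbS: glutamic acid (E) at position 6 replaced by valine (V).
hbs = hba[:5] + "V" + hba[6:]

for a, s in zip(trypsin_digest(hba), trypsin_digest(hbs)):
    marker = "  <- differs" if a != s else ""
    print(f"{a:15s}{s:15s}{marker}")
```

Only one fragment differs between the two digests. It contains eight amino acids with the change at its 6th position, in agreement with the description in the text.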
The genetic basis for sickle cell anaemia may now be predicted. Under the current state of knowledge, one gene produces one mRNA, which directs the production of a polypeptide chain for a protein. The polypeptide chain determines the primary structure of the protein. In this context, two genes are said to be responsible for the formation of the α and β polypeptide chains of haemoglobin. Under this consideration the genotype for normal haemoglobin (HbA) production may be designated as α2Aβ2A and for production of sickle cell hemoglobin (HbS) the genotype of the affected person should be α2Aβ2S; whereas the same for the heterozygote (sickle cell trait) individual may be designated as α2Aβ2AβS, meaning that the heterozygote may produce both HbA and HbS in almost equal amounts.
The amino acid sequence in a protein is determined by the sequence of codons in the mRNA. Therefore, the change in amino acid sequence in the beta polypeptide chain of haemoglobin is due to the change in the code word for glutamic acid in normal haemoglobin. The codon sequence for glutamic acid is GAA and that for valine is GUA, indicating a replacement of A by U in the codon of the mRNA for the Beta polypeptide chain. This sort of nitrogen base replacement in the codon of mRNA may come through site-specific base replacement in the DNA region meaning the gene for a beta-polypeptide chain of haemoglobin. The DNA region bearing the message for amino acid glutamic acid incorporates a defect when a message for valine occupies the position of glutamic acid. Hence, a conversion from A=T to T=A at a particular site in DNA causes the development of sickle cell anaemia. This type of point mutation in the gene for a beta polypeptide is one transversion type of base substitution.
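The classification of the base change as a transversion can be checked with a few lines of Python. A transition replaces a purine by a purine or a pyrimidine by a pyrimidine, while a transversion exchanges a purine for a pyrimidine or the reverse. The function names are chosen for this illustration only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U", "T"}

def substitution_type(old: str, new: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    same_class = ({old, new} <= PURINES) or ({old, new} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"

def compare_codons(normal: str, mutant: str):
    """Return (position, old base, new base, type) for every differing base."""
    return [
        (i + 1, a, b, substitution_type(a, b))
        for i, (a, b) in enumerate(zip(normal, mutant))
        if a != b
    ]

# Codons quoted in the text for glutamic acid and valine.
print(compare_codons("GAA", "GUA"))
```

The single difference lies at the second position of the codon, where the purine A is replaced by the pyrimidine U.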
Sickle cell anaemia is more prevalent in the black population of Africa. Among African Americans, about 1 in 625 births is affected by sickle cell anaemia. Some other variants of human haemoglobin are also known that support the one gene one polypeptide hypothesis. Over one hundred varieties of haemoglobin have been reported from humans and they differ in their electrophoretic mobility. Defective haemoglobins may be produced because of gene mutation, and they involve the replacement of amino acids at various positions. Several variant forms of hemoglobin may be presented in the following table.
Several hemoglobin variants of man developed due to the substitution of amino acid in the beta polypeptide chain:
Haemoglobin Type | Amino Acid from the β Chain | Position of amino acid | Amino acid after replacement | Effect due to altered condition |
Haemoglobin C | Glutamic acid | 6 | Lysine | Mild anaemia |
Haemoglobin D | Glutamic acid | 121 | Glutamine | No known abnormality |
Haemoglobin E | Glutamic acid | 26 | Lysine | No clinical manifestation |
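Since the variants are said to differ in electrophoretic mobility, it may help to see how each substitution alters the net charge of the beta chain. The Python sketch below uses assumed textbook side-chain charges near neutral pH (glutamic acid -1, lysine +1, valine and glutamine 0); the numbers are an approximation for illustration, not measured values.

```python
# Assumed side-chain charges near neutral pH (textbook approximation).
SIDE_CHAIN_CHARGE = {"Glu": -1, "Lys": +1, "Val": 0, "Gln": 0}

# (variant, position in beta chain, original residue, replacement),
# taken from the text (HbS) and from the table above.
variants = [
    ("HbS", 6, "Glu", "Val"),
    ("HbC", 6, "Glu", "Lys"),
    ("HbD", 121, "Glu", "Gln"),
    ("HbE", 26, "Glu", "Lys"),
]

def charge_shift(original: str, replacement: str) -> int:
    """Net charge change per beta chain caused by the substitution."""
    return SIDE_CHAIN_CHARGE[replacement] - SIDE_CHAIN_CHARGE[original]

for name, pos, old, new in variants:
    shift = charge_shift(old, new)
    print(f"{name}: {old}{pos}{new} shifts the charge by {shift:+d} per beta chain")
```

Every substitution listed removes a negative charge, and the lysine replacements additionally introduce a positive one. This is consistent with the observation that HbS is less negative than HbA.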
Genes in Terms of Structural Analysis
In the previous section of this chapter, it has been discussed how a gene acts as a functional unit of genetic material. Naturally, a question may arise: what should be the structural configuration of a gene when classically a gene is defined as a structural and functional unit of heredity? To account for the origin of ideas on this structural unit of heredity, the Mendelian indivisible and non-miscible factor for phenotypic features, the concept of the beads-on-a-string structure that appeared prior to 1940 is noteworthy. Along with this concept came the idea that a gene is indivisible by recombination or mutation. It means that recombination is only possible between two genetic units, and that a mutation may affect only one gene or genetic unit to alter a characteristic feature.
However, this concept of the structure of the genetic unit is contradicted by the discovery of the complex locus. A complex locus may be divisible into several sub-loci, all of which are concerned with the same phenotypic feature. Recombination between these sub-loci, though extremely low in frequency, is possible, giving the impression that recombination can occur within a gene. A study on the garnet locus on the X chromosome of Drosophila melanogaster by Chovnick may be cited in this regard. There are two mutations available at the garnet locus, namely garnet1 and garnet2, both affecting the eye colour of the flies.
Chovnick considered two marker genes, sable (s) and pleated (pl), located to the left and right of this locus at distances of 1.4 and 3.5 units respectively (the garnet locus being at 44.4 units on the X chromosome). He constructed heterozygous females carrying different combinations of genes on the two X chromosomes, s g1 + / + g2 pl, so that the heterozygotes were wild type for the sable and pleated characters. These heterozygous females also produced s + pl (sable, normal eye colour, pleated) progeny and + g1 g2 + (only garnet) progeny through crossing over at a very low frequency, i.e., about 0.003%.
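The figures in this experiment lend themselves to a little arithmetic, sketched here in Python. The map positions of the two markers follow from the stated distances, and the recombination frequency shows why a very large number of progeny must be scored. The progeny count of 100000 is a hypothetical figure chosen for illustration.

```python
garnet = 44.4            # map position of the garnet locus (units), from the text
sable = garnet - 1.4     # sable lies 1.4 units to the left
pleated = garnet + 3.5   # pleated lies 3.5 units to the right

recomb_percent = 0.003   # recombination within the garnet locus, in percent

def expected_recombinants(progeny: int, percent: float) -> float:
    """Expected number of recombinant progeny among the flies scored."""
    return progeny * percent / 100

print(f"sable at {sable:.1f}, pleated at {pleated:.1f}")
print(expected_recombinants(100_000, recomb_percent))
```

At this frequency only about three recombinants are expected per hundred thousand progeny, which explains why intragenic recombination escaped notice in small-scale crosses.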
A similar finding was reported by C. P. Oliver for the X-linked lozenge locus of Drosophila melanogaster. Two mutations, namely lzs (spectacle eye) and lzg (glossy eye), are considered to be two mutant forms of the same gene on the X chromosome. Oliver found that lzs/lzg females showed the lozenge phenotype rather than wild-type eye features. However, a cross between such heterozygous females and lzs or lzg males resulted in about 0.2 percent wild-type progeny. When the lzs/lzg heterozygous female carrying marker genes outside the lozenge locus was involved in a cross, progeny with wild-type eyes were produced at an extremely low frequency. Surprisingly, the rare normal progeny also carried the marker features in expressed form. This result could not occur unless crossing over had taken place between the mutant sites within the lozenge locus. M. M. Green, a student of Oliver, carried out a detailed study of the lozenge locus, which contains four groups of mutations. According to their analysis, the recombination between the lozenge alleles ranges from 0.03 to 0.09 percent. The arrangement of some of the lozenge alleles along the X chromosome may be shown in the following diagram.
In this context, it should be mentioned that these alleles in many cases fail to respond to the cis-trans position effect. When two mutations of the same locus are present on one chromosome, it is said to be the cis-combination of the mutant genes. Therefore, one heterozygote showing cis-orientation of two mutant genes obviously shows a wild phenotype because one of the two homologous chromosomes carried wild-type genes for the mutations. On the contrary two mutant alleles being present on either of the homologous chromosomes exhibit their trans arrangement. In trans association, if the mutant genes show a wild phenotype, it is said to be a complementation.
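The cis-trans comparison described above can be expressed as a simple rule in Python. The sketch assumes that each mutation inactivates exactly one functional unit and that a heterozygote is wild type as long as every unit is intact on at least one homologue. The unit labels "A" and "B" and the function names are hypothetical.

```python
def phenotype(homologue1: set[str], homologue2: set[str]) -> str:
    """Each homologue is given as the set of functional units it carries in
    mutant form. The phenotype is mutant only if some unit is defective on both."""
    return "mutant" if homologue1 & homologue2 else "wild type"

def cis_trans_test(unit_of_m1: str, unit_of_m2: str) -> dict[str, str]:
    """Predict phenotypes of the cis (m1 m2 / + +) and trans (m1 + / + m2)
    heterozygotes for two mutations m1 and m2."""
    cis = phenotype({unit_of_m1, unit_of_m2}, set())
    trans = phenotype({unit_of_m1}, {unit_of_m2})
    return {"cis": cis, "trans": trans}

print(cis_trans_test("A", "A"))   # both mutations in the same functional unit
print(cis_trans_test("A", "B"))   # mutations in different functional units
```

The cis heterozygote is wild type in both cases. The trans heterozygote is wild type only when the two mutations affect different units, which is the condition called complementation.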
Many of the mutant alleles of the lozenge locus fail to show implementation. The mutations of the lozenge locus are divisible units within a gene locus separable by recombination. It is now suggested that when a gene contains many forms of alleles, they may show recombination if a sufficiently large number of progeny are raised. On the basis of the cis-trans position effect of mutant alleles belonging to complex locus, Lewis hypothesized that alleles are arranged in linear order over the chromosome. The mutant allele in cis-position over a chromosome gives a defective linear product not favourable for the development of the normal feature.
On the contrary, the wild-type segments for these alleles on the other homologue in the heterozygote are capable of producing normal products that promote the expression of wild-type features. In the trans arrangement, the mutations on both homologues interfere with normal gene expression. It is, therefore, logical to propose that a gene may be formed of several sub-loci arranged linearly, and that all the segments give products sequentially in linear order, which together give rise to the ultimate product for a phenotype.
Therefore, the structure of a gene may be determined on the basis of its function. When a gene gives one polypeptide chain as its final product, it denotes a stretch of nucleotides in the DNA molecule of the cell or organism. This stretch of DNA is transcribed into a stretch of ribonucleotides in the mRNA, which is read as successive triplets, each specifying an amino acid, in linear order. The fact that a gene whose final product is a polypeptide must first produce an mRNA as an intermediate through transcription is itself indicative of the linear structural organization of the gene in DNA.
However, sometimes the product of a gene is only an RNA molecule, as in the case of rRNA and tRNA genes and some catalytic RNAs. In all these cases, the RNA is itself a linear stretch of ribonucleotides, which supports the linear organization of a gene as many nucleotides in a linear array. Therefore, a gene may be defined as a stretch of deoxyribonucleotides in the DNA of the cell that has the potential to form one RNA, either for producing a polypeptide or for serving other cellular functions.
Complementation Test and Genetic Analysis
The complementation test appears as a useful method for detecting the functional genetic unit in the living organism. The discovery of the cis-trans position effect laid the foundation for complementation tests in genetics. Two mutant genes contributing to one phenotypic character sometimes create confusion about their allelic nature and a complementation test may help to recognize their identity with certainty. By definition, complementation represents the potentiality of two mutant genes to produce wild phenotypes in their trans arrangement. This may be explained with the help of an example.
Suppose there are two mutations, m1 and m2, contributing to the same phenotype, and their allelic identity needs to be examined. For this purpose, a trans-heterozygote for these mutations is raised and its phenotype is observed. If the trans-heterozygote for the two mutations (m1 and m2) exhibits the wild-type feature, the mutations are said to complement each other; they are taken to be non-allelic and to belong to different functional units in the genome.
It is to be pointed out here that the mutations in their cis arrangement exhibit wild-type features. In the cis arrangement, both mutations are present on one of the two homologous chromosomes, while the other homologue contains the wild-type alleles of the mutant genes. The chromosome carrying the wild-type alleles can produce the wild-type feature. But in the trans arrangement, the expression of the wild-type feature is considered to be due to a complementary interaction of the genes located on the two homologous chromosomes. The interaction may be shown in the following way.
To produce a heterozygote for the complementation test, two homozygotes (m1/m1 and m2/m2) for the two mutations are crossed. The resultant progeny are observed for phenotypic expression. If the trans-heterozygote exhibits the mutant feature, the two mutations may be considered allelic and belong to the same functional unit. On the other hand, if the mutations in the trans-heterozygote produce wild-type expression, they are taken to belong to two functional units. As only the heterozygote with the trans orientation of the mutations is considered for testing, it is called a trans test.
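The logic of the cis and trans arrangements can be sketched in a few lines of Python. This is only an illustrative model, assuming recessive loss-of-function mutations; the names `phenotype`, `trans_test`, `gene1`, and `gene2` are hypothetical and do not come from the text.

```python
def phenotype(homolog_a, homolog_b):
    """Phenotype of a diploid given the mutated functional units on each homolog.

    Each argument is the set of functional units (genes) carrying a mutation
    on that chromosome. Assumption: mutations are recessive, so a unit still
    works if at least one homolog carries an intact copy of it.
    """
    knocked_out_on_both = set(homolog_a) & set(homolog_b)
    return "mutant" if knocked_out_on_both else "wild-type"


def trans_test(unit_of_m1, unit_of_m2):
    """Trans test: m1 sits on one homolog and m2 on the other."""
    result = phenotype({unit_of_m1}, {unit_of_m2})
    if result == "wild-type":
        verdict = "complement: different functional units (non-allelic)"
    else:
        verdict = "no complementation: same functional unit (allelic)"
    return result, verdict


# Cis arrangement: both mutations on one homolog, the other homolog intact.
print("cis:", phenotype({"gene1", "gene2"}, set()))
# Trans arrangement, mutations in different functional units.
print("trans, different units:", trans_test("gene1", "gene2"))
# Trans arrangement, mutations in the same functional unit.
print("trans, same unit:", trans_test("gene1", "gene1"))
```

In this model the cis heterozygote is always wild type, so only the trans arrangement is informative, which is why the test is called a trans test.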
Allelism and Complementation
From the previous discussion, it is clear that whether two mutations are allelic or not can be determined by a complementation test. Non-allelic genes belonging to two functional units complement each other, and they also recombine through crossing over between them. But some allelic forms of a gene do not complement each other, yet they may recombine at a very low frequency. This is possible because a gene represents a stretch of nucleotides and mutations may occur at different sites within this stretch. Such functional alleles affect the same functional unit but arise from changes of base sequence at different sites, so they may recombine although they fail to complement. Failure to complement denotes their allelic nature. Allelic forms of a gene that fail to complement and also do not recombine are called homoalleles. On the other hand, mutant forms of a gene that fail to complement but may recombine at low frequency are called heteroalleles. The occurrence of recombination between such alleles is at odds with their structural identity as alleles, i.e., they appear to be structurally non-allelic.
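The three outcomes described in this paragraph can be summarized as a small decision rule. The sketch below is illustrative only; the function name `classify_pair` and its boolean inputs are hypothetical, and real experiments would need enough progeny to detect very rare recombinants.

```python
def classify_pair(complements, recombines):
    """Classify a pair of mutations from two experimental observations.

    complements: True if the trans-heterozygote is wild type.
    recombines:  True if wild-type recombinants are recovered, however rarely.
    """
    if complements:
        # Different functional units: the mutations are non-allelic.
        return "non-allelic"
    if recombines:
        # Same functional unit, but different sites within it.
        return "heteroalleles"
    # Same functional unit and no recombination detected between them.
    return "homoalleles"


for observed in [(True, True), (False, True), (False, False)]:
    print(observed, "->", classify_pair(*observed))
```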
Complementation and Gene Mapping
Because the complementation test helps to identify the functional alleles of a gene, it may also help to locate an allelic form of a gene at a specific site on a chromosome. How the complementation test helps to locate a gene in relation to other genes on a chromosome may be illustrated with an example. In the bread mold Neurospora crassa, some mutants lack the ability to produce the amino acid histidine and therefore cannot grow on a minimal medium. When a heterokaryon is formed by bringing the nuclei of two such mutants together in one cell, complementation is observed if the heterokaryon is wild type in expression. In the his-3 locus of the first chromosome of Neurospora, five mutations, namely CD 16, 245, 261, D 566, and 1438, have been observed. By forming heterokaryons that combine two mutant nuclei at a time, the response of the mutations to complementation was detected. On the basis of this response, a complementation matrix for the five mutations may be framed as shown below.
Complementation matrix for five histidine-deficient mutations (CD 16, 245, 261, D 566, and 1438) of Neurospora crassa:
Mutant | CD 16 | 245 | 261 | D 566 | 1438
CD 16 | – | – | – | – | –
245 | | – | + | + | +
261 | | | – | + | +
D 566 | | | | – | –
1438 | | | | | –
'–' denotes no complementation; '+' denotes complementation.
The complementation relationships in the matrix indicate that, except for CD 16 and for the pair D 566 and 1438, the mutant forms complement one another. Therefore, the mutation CD 16 probably overlaps with all the other mutant forms under consideration, and mutations D 566 and 1438 affect the same functional site. On the basis of this relationship, a complementation map for these mutations may be drawn as shown in the figure.
The complementation map obtained from the analysis of the complementation matrix indicates a linear order of functional units as 245 - 261 - D 566 - 1438, while CD 16 affects all the functional regions of the his-3 locus. Further, the five mutations belong to three functional sites, which may be indicated as I, II, and III. It is to be pointed out here that the genetic map for these closely linked functional sites, as revealed by recombination analysis, indicated the same order as that obtained from complementation analysis. This suggests that the genetic map and the complementation map are colinear. This type of complementation map may also be constructed for many closely linked genes in other organisms from the analysis of their complementation matrix.
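The step from matrix to functional sites can be sketched in Python. This is an illustrative grouping, not the method used in the original study: it assumes that mutants which fail to complement every other mutant span all sites, and that among the remaining mutants failure to complement is transitive. The pairwise scores are the off-diagonal entries of the matrix; the names `COMPLEMENTING`, `complements`, and `functional_sites` are hypothetical.

```python
# Mutants of the his-3 locus as named in the matrix.
MUTANTS = ["CD 16", "245", "261", "D 566", "1438"]

# Pairs scored '+' (complementation) in the matrix; every other pair is '-'.
COMPLEMENTING = {
    frozenset(pair)
    for pair in [("245", "261"), ("245", "D 566"), ("245", "1438"),
                 ("261", "D 566"), ("261", "1438")]
}


def complements(a, b):
    """True if mutants a and b complement each other in a heterokaryon."""
    return frozenset((a, b)) in COMPLEMENTING


def functional_sites(mutants):
    """Split mutants into site-spanning mutants and groups sharing one site."""
    # A mutant that complements nothing is assumed to overlap every site.
    spanning = [m for m in mutants
                if not any(complements(m, other) for other in mutants if other != m)]
    groups = []
    for m in mutants:
        if m in spanning:
            continue
        for group in groups:
            # Assumption: non-complementation with one member places m in that group.
            if not complements(m, group[0]):
                group.append(m)
                break
        else:
            groups.append([m])
    return spanning, groups


spanning, groups = functional_sites(MUTANTS)
print("Overlaps every site:", spanning)
for label, group in zip(["I", "II", "III"], groups):
    print("Site", label, ":", group)
```

The grouping recovers three functional sites and one mutation that overlaps all of them. The left-to-right order of the sites cannot be derived from complementation data alone in this sketch; it simply follows the order in which the mutants are listed.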
The rII locus of T4 bacteriophage and its Genetic Dissection
The rII locus of T4 bacteriophage is responsible for developing infection in bacteria by the virus and promotes lysis of the bacterial cell with the production of about 300 progeny viruses from one infected cell. Seymour Benzer (1955) observed about 2400 mutant varieties of the virus that resulted from alterations at 308 sites along the rII locus of the T4 bacteriophage. With the help of a cis-trans complementation test, Benzer discovered that the rII locus of the T4 bacteriophage contains two functional sites, designated rIIA and rIIB. He called these functional sites of the rII locus cistrons, and he observed that a mutation in the functional site rIIA can complement a mutation in the site rIIB.
On the other hand, two mutations of the same functional site, either rIIA or rIIB, fail to complement each other. Benzer’s analysis of the rII mutants may be regarded as an extension of the findings of Oliver, Green, Lewis, and others in Drosophila melanogaster on the fine structure of a gene. Benzer's genetic analysis of the rII locus of T4 revealed that a gene is divisible by mutation and recombination. However, viral gene expression is based on a single genomic DNA molecule. Benzer therefore designed his experiments in such a way that gene expression mimicked the condition observed in diploid organisms.
Before discussing Benzer's genetic analysis of the T4 bacteriophage, some important information about the phage and its mode of infection of the bacterium E. coli should be taken into consideration. The wild-type T4, carrying the rII locus as r+, develops easily in E. coli and causes lysis of the bacterial cell. On a lawn of bacteria on an agar culture plate, clear plaques develop as a phenotypic expression of bacterial lysis. Therefore, for the normal rII locus (r+), the phenotype is about 300 progeny per cell released by lysis of the bacterial cell, with the formation of plaques on the bacterial lawn. But if a virus becomes mutant at its rII locus, it cannot develop an infection in the bacterial strain E. coli K12(λ). The same mutant virus can, however, develop an infection in the E. coli B strain and lyse it.
Therefore, the mutant T4 virus shows conditional lethality. This property of the mutant virus became a powerful tool for Benzer in his genetic crosses. The E. coli B strain, which can be infected by a mutant virus, is called the permissive host. Besides this property, the plaques developed by the wild-type (r+) virus and the mutant virus also differ in morphology. The plaques developed by wild-type viruses are small and turbid in appearance, while those developed by mutant viruses appear large and clear. Hence, plaque morphology and host range specificity help in identifying wild-type and mutant T4 phages.
Benzer and his colleagues isolated a large number of mutant forms and were uncertain whether all of them were true functional alleles of the same gene. To resolve this question, they tried to develop infection in E. coli K12(λ) by simultaneous infection with two mutants. Surprisingly, they observed that in many cases simultaneous infection of E. coli K12(λ) with two rII mutants produced wild-type expression in the same fashion as observed in diploids during complementation.
Benzer explained this observation in the rII mutants of T4 bacteriophage as complementation. It is to be pointed out that not all simultaneous infections produced wild-type expression in the bacterial strain. On the basis of this complementation analysis, Benzer was able to group all the mutant forms he obtained for the rII locus into two batches. He considered these two mutant groups as mutations of two contiguous genes, designated rIIA and rIIB.
He also coined the term cistron for each of the functional sites of the rII locus. Thus, according to Benzer, the rII locus is formed of two cistrons, rIIA and rIIB. At present the term cistron is taken to be equivalent to gene, and therefore it is better to denote a functional site as a gene rather than a cistron.
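The outcome of a mixed infection of E. coli K12(λ) can be modelled in the same spirit. The sketch below is a simplified illustration, assuming that lysis requires a working copy of both cistrons supplied by at least one of the infecting phages; the function name `lyses_k12` and the encoding of each phage as a set of defective cistrons are hypothetical.

```python
CISTRONS = ("A", "B")


def lyses_k12(infecting_phages):
    """Predict lysis of E. coli K12(lambda) after (co)infection.

    infecting_phages: list of sets, each naming the cistrons ('A', 'B') that
    are defective in one infecting phage. A wild-type phage is an empty set.
    Assumption: a cistron functions if any infecting phage has it intact.
    """
    for cistron in CISTRONS:
        if all(cistron in defects for defects in infecting_phages):
            return False  # no phage supplies a working copy of this cistron
    return True


print("wild type alone:        ", lyses_k12([set()]))
print("rIIA mutant alone:      ", lyses_k12([{"A"}]))
print("rIIA mutant + rIIB mutant:", lyses_k12([{"A"}, {"B"}]))
print("two rIIA mutants:       ", lyses_k12([{"A"}, {"A"}]))
```

Only the mixed infection with one mutant from each cistron restores lysis, which is the pattern that allowed the mutants to be sorted into two groups.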
Mapping of mutations of rII locus of bacteriophage
Benzer observed about 2400 mutations in the rII locus of the T4 bacteriophage. Through the complementation test, it was found that all these mutations may be grouped into two functional units, which he called cistron A and cistron B. Merely determining that a mutation lies within cistron A or cistron B does not define its actual location within the genome. This can be obtained by genetic mapping. One method for such mapping involves genetic crosses between two mutant forms; the frequency of wild-type virus produced through such a cross gives an estimate of the frequency of recombinants.
As in linkage mapping, the recombination percentage may be taken as the distance between two mutations located in linear order within the rII locus. To observe recombination, the permissive host E. coli B is infected simultaneously with two different mutants. Recombination between the two mutant genomes yields wild-type and double-mutant viruses. The wild-type recombinants are detected when the progeny viruses are allowed to infect E. coli K12(λ), since only progeny that are wild type can develop an infection in this strain. The double mutants cannot grow on this strain and are assumed to arise as often as the wild-type recombinants, so the number of wild-type plaques is doubled. The recombination frequency may then be obtained by applying the formula: recombination frequency (%) = 2 × (number of wild-type plaques on E. coli K12(λ)) / (total number of plaques on E. coli B) × 100.
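The recombination estimate can be sketched as a small calculation. It assumes the commonly used form in which wild-type plaques counted on E. coli K12(λ) are doubled to account for the undetected double mutants and divided by the total progeny counted on E. coli B, with both counts corrected to the same dilution. The function name and the plaque counts below are hypothetical and chosen only for illustration.

```python
def recombination_percent(wild_type_plaques_k12, total_plaques_b):
    """Estimate recombination frequency (%) between two rII mutations.

    wild_type_plaques_k12: plaques on E. coli K12(lambda), i.e. wild-type recombinants.
    total_plaques_b:       plaques on E. coli B, i.e. all progeny phage.
    Both counts are assumed to refer to the same volume and dilution.
    """
    if total_plaques_b <= 0:
        raise ValueError("total progeny count must be positive")
    # Factor 2: double-mutant recombinants are as frequent but cannot be seen.
    return 2 * wild_type_plaques_k12 / total_plaques_b * 100


# Hypothetical counts: 10 wild-type plaques among 100,000 progeny.
print(round(recombination_percent(10, 100_000), 4), "percent")
```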
The crossing-over percentage is taken as the map distance between two mutant sites of the rII locus. However, the recombination frequency that Benzer obtained was very low, about 10^-6, and he was able to characterize only 60 independent mutant forms relating to mutations in the rII locus. Further, mapping mutations by this technique requires much labour. Therefore, Benzer devised a new technique to reduce the labour of mapping all the mutations of the rII locus. This technique is called deletion mapping, in which point mutations are crossed with deletion mutations. The principle behind this method is that if a point mutation fails to produce wild-type virus by recombination in a cross with a deletion mutant, the point mutation falls within the deleted region of the rII locus. Based on this method, Benzer was able to construct a detailed map of the rII locus within a few years.
The principle of gene mapping by this technique is that if a point mutation produces wild-type virus in a cross with a deletion mutation, the point mutation must lie outside the deleted segment. If no recombinant wild-type virus is produced, the mutation lies within the deleted segment. Benzer utilized seven large deletions (called the big seven) covering overlapping segments of the rII region for mapping every new point mutation. He also characterized 47 small deletion mutations; after locating a mutation with the help of the big seven deletions, he determined its precise location by crossing it with the small deletions.
An example may be described to explain this method of mapping. In T4, r548 represents a point mutation in the rII locus. The big seven deletions are r1272, r1241, rJ3, rPT1, rPB242, rA105, and r638. When Benzer crossed r548 with each of the big seven, it showed recombination, with the production of wild-type virus, only with rA105 and r638. Based on this observation, Benzer was able to locate the mutation in the interval A5. Further, on crossing r548 with deletions such as r1605, r1589, rPB230, and r1993 (all deletions having endpoints within A5), Benzer found that it showed recombination with rPB230 and r1993, producing wild-type progeny. From this finding, it could be concluded that r548 is located in the sub-interval A5c2. It is to be mentioned here that r548 is a mutation within the rIIA segment of the rII locus. The fine structure of this segment could then be worked out by Benzer by crossing r548 with other mutations of interval A5.
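The first stage of this example, locating r548 with the big seven deletions, can be sketched in Python. The extents of the deletions are not given in the text; the sketch assumes the nested layout commonly drawn in textbook figures, in which each deletion starts at a different segment (A1 to A6, then B) and extends to the end of the locus. The names `SEGMENTS`, `DELETION_START`, `deleted_segments`, and `locate` are hypothetical.

```python
# Segments of the rII locus defined by the big seven deletions, left to right.
SEGMENTS = ["A1", "A2", "A3", "A4", "A5", "A6", "B"]

# ASSUMPTION (not stated in the text): each deletion removes everything from
# its starting segment through the end of the locus.
DELETION_START = {
    "r1272": "A1", "r1241": "A2", "rJ3": "A3", "rPT1": "A4",
    "rPB242": "A5", "rA105": "A6", "r638": "B",
}


def deleted_segments(deletion):
    """Set of segments removed by a deletion under the assumed layout."""
    start = SEGMENTS.index(DELETION_START[deletion])
    return set(SEGMENTS[start:])


def locate(recombining_deletions):
    """Segments consistent with the crosses of one point mutation.

    recombining_deletions: deletions that gave wild-type recombinants.
    """
    candidates = set(SEGMENTS)
    for deletion in DELETION_START:
        if deletion in recombining_deletions:
            # Wild-type recombinants: the mutation lies outside the deletion.
            candidates -= deleted_segments(deletion)
        else:
            # No recombinants: the mutation lies inside the deletion.
            candidates &= deleted_segments(deletion)
    return sorted(candidates, key=SEGMENTS.index)


# r548 recombined only with rA105 and r638.
print(locate({"rA105", "r638"}))
```

Under the assumed layout the crosses leave A5 as the only consistent segment, in agreement with the interval reported for r548. The same elimination could be repeated with the small deletions inside A5 once their endpoints are known.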
Benzer’s experiments to work out the genetic fine structure of the rII locus include complementation tests, recombination analysis, three-point crosses involving mutant forms, and crosses of point mutations with deletion-type mutations along the rII locus. With the help of these techniques, he was able to obtain a high-resolution anatomy of the rII locus. The complementation analysis initially recognizes two functional sites in the rII locus, rIIA and rIIB. All the 2400 mutant forms of T4 with changes along the rII locus may be included in these two functional sites, which he designated as cistrons. The actual mapping of the mutations is assisted by recombination analysis, using recombination frequencies and recombination tests of the point mutants against different known deletion-type mutations. The smallest recombination frequency observed by Benzer was 0.02%, which corresponds to about 2.3 base pairs. From sequence analysis, it has been found that the rII locus contains two coding regions, rIIA and rIIB, of 2175 and 936 bp respectively.
Further investigation also suggests that there are some small sites within the rII locus that are more prone to mutation. These sites are called mutational hot spots. One hot spot in the rIIB region exhibited about 500 spontaneous mutations. On the basis of all these observations, it may be concluded that the rII locus contains two genes, rIIA and rIIB, and that rIIB is the smaller, being about half the size of the rIIA gene. Each of the genetic segments contains many sites of independent mutation. Some sites in both the rIIA and rIIB genes exhibit multiple mutations, and these are the hot spots along the genetic segments.