Since the discovery of DNA, and more recently with the rise of biotechnology and projects to sequence the human genome, common usage of the word "gene" has increasingly echoed its use in molecular biology. In the primary, molecular sense, genes are segments of DNA within chromosomes. In particular, they are the subset of such DNA which cells transcribe into RNAs and translate, at least in part, into proteins.
A gene, in this primary sense, specifies a protein by way of its chemical structure. A DNA molecule or "strand" is a chain of sequentially linked nucleotides, each of which is one of four types (more at DNA). These four represent the genetic alphabet, while the various possible sequences of three, called codons, represent the genetic vocabulary. The sequence in which different codons appear in a gene specifies the amino-acid sequence of a protein, and the genetic code describes which amino acid corresponds to which codon. This code is more or less the same from bacteria to humans; in other words, it is common to all cellular life.
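The mapping from codons to amino acids can be illustrated with a short Python sketch. It assumes the standard genetic code, written compactly as a 64-letter string of one-letter amino-acid abbreviations ("*" marks a stop codon); the names CODON_TABLE and translate are chosen here for illustration, and the function reads only the coding strand from its first base onward.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons (e.g. "ATG") to a one-letter amino-acid code.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # ATG -> M, GCC -> A, TAA -> stop
```

Because several codons map to the same amino acid, the table is many-to-one: a protein sequence does not determine a unique DNA sequence.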
Through the proteins they "encode," genes govern the cells in which they reside. In multicellular organisms they control development of the individual from the fertilized egg and the day-to-day functions of the cells that make up tissues and organs. The instrumental roles of their protein products range from mechanical support of the cell structure to the transportation and manufacture of other molecules to the regulation of other proteins' activities.
Because it is through proteins that genes exert their effects, and because gene transcripts (which are a prerequisite for protein synthesis) degrade rapidly, genes are in a sense inactive when they are not actively being transcribed. Cells appear to regulate the activity of genes primarily by increasing or decreasing their rate of transcription. Over the short term, this regulation occurs through the binding or unbinding of proteins known as transcription factors, which attach to specific "non-coding" DNA sequences called regulatory elements. Over longer periods of time, genes may be "silenced" through DNA methylation or changes in the DNA packing of the chromosomes (see histone).
In many species, very little of the DNA in the chromosomes encodes proteins. Rather, the genes are separated by often vast sequences of so-called junk DNA, and they are sometimes fragmented internally by "non-coding" sequences called introns, which may be many times longer than the coding sequences they interrupt. Introns are removed on the heels of transcription by splicing. In the primary molecular sense, however, they represent parts of a gene.
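As a rough illustration of splicing, the following Python sketch removes introns from a transcript and joins what remains. It assumes the intron positions are already known and supplied as (start, end) index pairs; a cell instead recognizes introns through sequence signals in the transcript. The function name splice and the toy sequence are hypothetical.

```python
def splice(transcript, introns):
    """Remove introns, given as half-open (start, end) index pairs, and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(transcript[position:start])  # keep the exon before this intron
        position = end                            # skip over the intron itself
    exons.append(transcript[position:])           # keep the final exon
    return "".join(exons)


# Toy transcript: exon + intron + exon. The intron begins with GU and ends
# with AG, as introns typically do.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "AAAUGA"
print(splice(pre_mrna, [(6, 18)]))  # AUGGCCAAAUGA
```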
All the genes and intervening DNA together make up the genome of an organism, which in many species is divided among several chromosomes and typically present in two or more copies. The locus of a gene, and the chromosome on which it is situated, are in a sense arbitrary. Genes that appear together on the chromosomes of one species, such as humans, may appear on separate chromosomes in another species, such as mice. Two genes sited close together on a chromosome may encode proteins that figure either in the same cellular process or in completely unrelated processes. As an example of the former, many of the genes responsible for human sexual characteristics reside together on the Y chromosome.
Mutations arise through, for example, rare, spontaneous errors in DNA replication, and they produce variations in the sequence of a gene within a species population. Variants of a single gene are known as alleles, and differences in alleles may give rise to differences in traits, for example eye color.
In the many species that carry more than one copy of their genome within each of their somatic cells, these copies are in effect never identical. With respect to each gene, the copies that an individual possesses are liable to be distinct alleles, which may act either synergistically or antagonistically to generate a trait or phenotype (more at genetics, allele).
In common speech, "gene" is often used to refer to the hereditary cause of a trait, disease or condition--as in "the gene for obesity." A biologist, in contrast, might refer to an allele or a mutation that had been implicated in or correlated with obesity. Based on the incidence of obesity across parents and offspring, not to mention common sense, biologists know that not only genes but factors such as upbringing, culture and the availability of food decide whether or not a person is obese. To continue with the same example, it also appears unlikely that variations within a single gene--or single genetic locus--determine one's genetic predisposition for obesity. These aspects of inheritance--the interplay between genes and environment, the influence of many genes--appear to be the norm with regard to many and perhaps most traits. The term phenotype refers to the characteristics that result from this interplay, along with the effects of chance in the migration and division of cells during development.
Natural variations within regulatory sequences appear also to underlie many of the heritable characteristics seen in organisms. The influence of such variations on the trajectory of evolution through natural selection may be as large as or larger than variation in sequences that encode proteins. Thus, though regulatory elements are often distinguished from genes in molecular biology, in effect they satisfy the shared and historical sense of the word. Indeed, a breeder or geneticist, in following the inheritance pattern of a trait, has no immediate way to know whether this pattern arises from coding sequences or regulatory sequences. Typically, he or she will simply attribute it to variations within a "gene."
RNA is always the intermediary between genes and proteins, but for some gene sequences RNA molecules are actually the end products. These molecules may be capable of enzymatic function, such as the RNAs known as ribozymes, or they may engage in regulatory base pairing, as in the case of "small interfering RNAs".
The DNA sequences from which such RNAs are transcribed are known as "RNA genes." RNA genes are much harder to locate in genome sequences than conventional genes are. Because cells do not translate them, they lack the distinctive ATG codon that heads nearly all protein-coding sequences and guides bioinformatics searches (more at reading frame).
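To show how a start codon can guide a search, here is a minimal Python sketch that scans one strand of DNA for stretches beginning with ATG and ending at the first in-frame stop codon. It is a simplification under stated assumptions: it ignores the opposite strand and introns, and the function name find_orfs and the min_codons threshold are hypothetical. Practical gene finders use considerably more evidence than this.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return half-open (start, end) spans that begin at ATG and end after
    the first in-frame stop codon; min_codons counts codons before the stop."""
    dna = dna.upper()
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the reading frame set by this ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs


print(find_orfs("CCATGAAATAGCC"))  # [(2, 11)]: ATG AAA TAG
```

An RNA gene offers no such landmark, since nothing in it needs to be read as a run of codons between a start and a stop.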
For various reasons, the relationship between genes and proteins is not so simple as "one nucleotide sequence-->one amino-acid sequence." For example, cells may splice the transcripts of a gene in alternative ways to produce not one but a variety of proteins (alternative splicing). On the chromosome meanwhile, a single DNA sequence may contain overlapping genes. In addition, accidents over the course of evolution may lead to the duplication of a gene to a second locus, where it may fall under different regulation. Though the two sequences may remain the same or be only slightly altered, they are typically regarded as separate genes (i.e. not as alleles of the same gene). The same is true when duplicate sequences appear in different species. Yet, though the alleles of a gene differ in sequence, nevertheless they are seen to represent one gene.
Finally, molecular biologists will often use "gene" to refer to just the nucleotide sequence of a gene, and at times to the sequence of only its coding regions without the introns. This more abstract sense of gene underlies the sense of genes as information. It also means that, by way of its sequence, not only DNA but RNA may be said either to be or to carry a gene (see below).
Although all cell-based organisms carry their genes and transmit them to offspring as DNA, many of the viruses that parasitize and reproduce in them carry only RNA. Because these viruses use RNA, their cellular hosts may synthesize the viral proteins as soon as they are infected, without the delay of waiting for transcription. RNA "retroviruses", on the other hand, require reverse transcription ("retrotranscription") of their genome from RNA into DNA.
The genes that exist today are those that have reproduced successfully in the past. This is the basis of the selfish gene view, publicised by Richard Dawkins. He points out in his book, The Selfish Gene, that all DNA exists with no other purpose than to propagate itself, even at the expense of the host organism's welfare. The possibly disappointing answer to the question "what is the meaning of life?" may be "the survival and perpetuation of ribonucleic acids and their associated proteins".
The existence of genes was first suggested by Gregor Mendel, who studied inheritance in pea plants and hypothesized a factor that conveys traits from parent to offspring. Although he did not use the term "gene", he explained his results in terms of inherited characteristics. Mendel was also the first to hypothesize independent assortment, the distinction between dominant and recessive traits, the distinction between a heterozygote and homozygote, and the difference between what would later be described as genotype and phenotype.
|organism||# of genes||base pairs|