Chimpanzee genome project

The Chimpanzee Genome Project was an effort to determine the DNA sequence of the chimpanzee genome. A draft sequence was published in 2005, and by 2013 twenty-four individual chimpanzees had been sequenced. The project was folded into the Great Ape Genome Project. [1]

Two juvenile central chimpanzees, the nominate subspecies

In 2013 high resolution sequences were published from each of the four recognized [2] [3] chimpanzee subspecies: Central chimpanzee, Pan troglodytes troglodytes, 10 sequences; Western chimpanzee, Pan troglodytes verus, 6 sequences; Nigeria-Cameroon chimpanzee, Pan troglodytes ellioti, 4 sequences; and Eastern chimpanzee, Pan troglodytes schweinfurthii, 4 sequences. They were all sequenced to a mean of 25-fold coverage per individual. [1]

The research showed considerable genome diversity in chimpanzees with many population-specific traits. The central chimpanzees retain the highest diversity in the chimpanzee lineage, whereas the other subspecies demonstrate signs of population bottlenecks. [4]

Background

Human and chimpanzee chromosomes are very similar. The primary difference is that humans have one fewer pair of chromosomes than other great apes: humans have 23 pairs, whereas other great apes have 24. In the human evolutionary lineage, two ancestral ape chromosomes fused at their telomeres, producing human chromosome 2. [5] There are nine other major chromosomal differences between chimpanzees and humans: chromosome segment inversions on human chromosomes 1, 4, 5, 9, 12, 15, 16, 17, and 18.

After the completion of the Human Genome Project, a common chimpanzee genome project was initiated. In December 2003, a preliminary analysis of 7600 genes shared between the two genomes confirmed that certain genes, such as the forkhead-box P2 transcription factor, which is involved in speech development, are different in the human lineage. Several genes involved in hearing were also found to have changed during human evolution, suggesting selection involving human language-related behavior. Differences between individual humans and common chimpanzees are estimated to be about 10 times the typical difference between pairs of humans. [6]

Another study showed that patterns of DNA methylation, which are a known regulation mechanism for gene expression, differ in the prefrontal cortex of humans versus chimpanzees, and implicated this difference in the evolutionary divergence of the two species. [7]

Chimpanzee-human chromosome differences. A major structural difference is that human chromosome 2 (green color code) was derived from two smaller chromosomes that are found in other great apes (now called 2A and 2B). Parts of human chromosome 2 are scattered among parts of several cat and rat chromosomes in these species, which are more distantly related to humans (more ancient common ancestors; about 85 million years since the human/rodent common ancestor).

Draft genome sequence of the common chimpanzee

An analysis of the chimpanzee genome sequence was published in Nature on September 1, 2005, in an article produced by the Chimpanzee Sequencing and Analysis Consortium, a group of scientists supported in part by the National Human Genome Research Institute, one of the National Institutes of Health. The article marked the completion of the draft genome sequence. [6]

A database now exists containing the genetic differences between human and chimpanzee genes, with about thirty-five million single-nucleotide changes, five million insertion/deletion events, and various chromosomal rearrangements. [10] Gene duplications account for most of the sequence differences between humans and chimpanzees. Single-base-pair substitutions account for about half as much genetic change as gene duplication does.
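
As a rough illustration of how such differences can be tallied, the following Python sketch counts single-nucleotide substitutions and insertion/deletion events in a pairwise alignment. The function name `count_differences`, the toy sequences, and the convention that `-` marks an alignment gap are assumptions made for this example; they are not taken from the consortium's analysis pipeline.

```python
def count_differences(aligned_a: str, aligned_b: str) -> dict:
    """Count substitutions and indel events between two aligned sequences.

    Both strings must have the same length, with '-' marking a gap.
    A run of consecutive gap columns in the same sequence is counted
    as a single insertion/deletion event.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    substitutions = 0
    indel_events = 0
    previous_gap = None  # sequence ("a" or "b") gapped in the previous column

    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" and base_b == "-":
            continue  # a column that is gapped in both sequences is uninformative
        if base_a == "-" or base_b == "-":
            current_gap = "a" if base_a == "-" else "b"
            if current_gap != previous_gap:
                indel_events += 1  # a new gap run starts here
            previous_gap = current_gap
            continue
        previous_gap = None
        if base_a != base_b:
            substitutions += 1

    return {"substitutions": substitutions, "indel_events": indel_events}


# Toy alignment (hypothetical sequences, for illustration only).
human = "ACGTTGCA--GTAC"
chimp = "ACGATGCATTGT-C"
print(count_differences(human, chimp))
# {'substitutions': 1, 'indel_events': 2}
```

Counting a gap run as one event, rather than one event per gapped base, mirrors the distinction the text draws between single-nucleotide changes and insertion/deletion events.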

Typical human and chimpanzee homologs of proteins differ by an average of only two amino acids. About 30 percent of all human proteins are identical in sequence to the corresponding chimpanzee protein. As mentioned above, gene duplications are a major source of differences between human and chimpanzee genetic material: about 2.7 percent of the genome now consists of differences produced by gene duplications or deletions during the approximately 6 million years [11] since humans and chimpanzees diverged from their common evolutionary ancestor. The comparable variation within human populations is 0.5 percent. [12]

About 600 genes were identified that may have been undergoing strong positive selection in the human and chimpanzee lineages; many of these genes are involved in immune system defense against microbial disease (example: granulysin is protective against Mycobacterium tuberculosis [13]) or are targeted receptors of pathogenic microorganisms (example: glycophorin C and Plasmodium falciparum). By comparing human and chimpanzee genes to the genes of other mammals, it has been found that genes coding for transcription factors, such as forkhead-box P2 (FOXP2), have often evolved faster in the human lineage than in the chimpanzee lineage; relatively small changes in these genes may account for the morphological differences between humans and chimpanzees. A set of 348 transcription factor genes codes for proteins with an average of about 50 percent more amino acid changes in the human lineage than in the chimpanzee lineage.
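
Attributing a change to one lineage or the other relies on an outgroup: where human and chimpanzee differ, the state seen in a more distantly related mammal is taken as ancestral, and the change is assigned to the lineage that departs from it. A minimal parsimony sketch of that idea follows. The function name `assign_changes`, the toy amino-acid sequences, and the use of a single outgroup are illustrative assumptions, not the consortium's actual method.

```python
def assign_changes(human: str, chimp: str, outgroup: str) -> dict:
    """Assign human/chimpanzee differences to a lineage using one outgroup.

    All three sequences must be aligned and of equal length.
    A site where the outgroup matches neither species is left ambiguous.
    """
    if not (len(human) == len(chimp) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    counts = {"human": 0, "chimp": 0, "ambiguous": 0}
    for h, c, o in zip(human, chimp, outgroup):
        if h == c:
            continue  # no difference between the two species at this site
        if c == o:
            counts["human"] += 1   # chimpanzee keeps the ancestral state
        elif h == o:
            counts["chimp"] += 1   # human keeps the ancestral state
        else:
            counts["ambiguous"] += 1
    return counts


# Hypothetical protein fragments, for illustration only.
print(assign_changes(human="MKTLSPQRA", chimp="MKSLAPQKA", outgroup="MKNLAPQRA"))
# {'human': 1, 'chimp': 1, 'ambiguous': 1}
```

Summing such counts over a gene set for each lineage gives the kind of per-lineage comparison quoted above, although real analyses typically use several outgroups and explicit substitution models.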

Six human chromosomal regions were found that may have been under particularly strong and coordinated selection during the past 250,000 years. These regions contain at least one marker allele that seems unique to the human lineage, while the entire chromosomal region shows lower than normal genetic variation. This pattern suggests that one or a few strongly selected genes in the chromosome region may have been preventing the random accumulation of neutral changes in other nearby genes. One such region on chromosome 7 contains the FOXP2 gene (mentioned above) and also includes the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which is important for ion transport in tissues such as the salt-secreting epithelium of sweat glands. Human mutations in the CFTR gene might be selected for as a way to survive cholera. [14]
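
One simple way to look for regions with lower than normal variation is to compute nucleotide diversity (the mean number of pairwise differences per site) in consecutive windows along a set of aligned haplotypes. The sketch below assumes equal-length, gap-free haplotype strings and non-overlapping windows; the function names and toy data are hypothetical and are not drawn from the study.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes: list) -> float:
    """Mean number of pairwise differences per site among aligned haplotypes."""
    if len(haplotypes) < 2:
        raise ValueError("at least two haplotypes are required")
    n_sites = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * n_sites)


def windowed_diversity(haplotypes: list, window: int) -> list:
    """Nucleotide diversity in consecutive non-overlapping windows."""
    n_sites = len(haplotypes[0])
    return [
        nucleotide_diversity([h[start:start + window] for h in haplotypes])
        for start in range(0, n_sites - window + 1, window)
    ]


# Four hypothetical haplotypes; the middle window carries no variation.
haps = [
    "ACGTAAAATGCA",
    "ACGAAAAATGCC",
    "TCGTAAAATGAA",
    "ACGTAAAACGCA",
]
print(windowed_diversity(haps, window=4))
# [0.25, 0.0, 0.375]
```

A window whose diversity falls well below that of its neighbors, like the middle window here, is the kind of signal the paragraph above describes; distinguishing selection from other causes of low diversity requires further statistical tests.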

Another such region on chromosome 4 may contain elements regulating the expression of a nearby protocadherin gene that may be important for brain development and function. Although changes in the expression of genes that are expressed in the brain tend, on average, to be smaller than in other organs (such as the liver), gene expression changes in the brain have been more dramatic in the human lineage than in the chimpanzee lineage. [15] This is consistent with the dramatic divergence of human brain development from the ancestral great ape pattern. The protocadherin-beta gene cluster on chromosome 5 also shows evidence of possible positive selection. [16]

Results from the human and chimpanzee genome analyses should help in understanding some human diseases. Humans appear to have lost a functional Caspase 12 gene, which in other primates codes for an enzyme that may protect against Alzheimer's disease.

Human and chimpanzee genomes. M stands for Mitochondrial DNA

Genes of the chromosome 2 fusion site

Diagrammatic representation of the location of the fusion site of chromosomes 2A and 2B and the genes inserted at this location.

The results of the chimpanzee genome project suggest that when ancestral chromosomes 2A and 2B fused to produce human chromosome 2, no genes were lost from the fused ends of 2A and 2B. At the site of fusion, there are approximately 150,000 base pairs of sequence not found in chimpanzee chromosomes 2A and 2B. Additional linked copies of the PGML/FOXD/CBWD genes exist elsewhere in the human genome, particularly near the p end of chromosome 9. This suggests that a copy of these genes may have been added to the end of the ancestral 2A or 2B prior to the fusion event. It remains to be determined if these inserted genes confer a selective advantage.

References

  1. Prado-Martinez, J.; et al. (2013). "Great ape genetic diversity and population history". Nature. 499 (7459): 471–475. Bibcode:2013Natur.499..471P. doi:10.1038/nature12228. PMC 3822165. PMID 23823723.
  2. Groves, Colin P. (2001). Primate Taxonomy. Washington, DC: Smithsonian Institution Press. pp. 303–307. ISBN 978-1-56098-872-4.
  3. Hof, J.; Sommer, V. (2010). Apes Like Us: Portraits of a Kinship. Mannheim: Panorama. p. 114. ISBN 978-3-89823-435-1.
  4. de Manuel, M.; et al. (2016). "Chimpanzee genomic diversity reveals ancient admixture with bonobos". Science. 354 (6311): 477–481. Bibcode:2016Sci...354..477D. doi:10.1126/science.aag2602. PMC 5546212. PMID 27789843.
  5. De Grouchy J (August 1987). "Chromosome phylogenies of man, great apes, and Old World monkeys". Genetica. 73 (1–2): 37–52. doi:10.1007/bf00057436. PMID 3333352. S2CID 1098866.
  6. Chimpanzee Sequencing and Analysis Consortium (2005). "Initial sequence of the chimpanzee genome and comparison with the human genome". Nature. 437 (7055): 69–87. Bibcode:2005Natur.437...69.. doi:10.1038/nature04072. PMID 16136131.
  7. Zeng, J.; Konopka, G.; Hunt, B.G.; Preuss, T.M.; Geschwind, D.; Yi, S.V. (2012). "Divergent Whole-Genome Methylation Maps of Human and Chimpanzee Brains Reveal Epigenetic Basis of Human Regulatory Evolution". The American Journal of Human Genetics. 91 (3): 455–465. doi:10.1016/j.ajhg.2012.07.024. PMC 3511995. PMID 22922032.
  8. McConkey EH (2004). "Orthologous numbering of great ape and human chromosomes is essential for comparative genomics". Cytogenet. Genome Res. 105 (1): 157–8. doi:10.1159/000078022. PMID 15218271. S2CID 11571357.
  9. Springer MS, Murphy WJ, Eizirik E, O'Brien SJ (February 2003). "Placental mammal diversification and the Cretaceous-Tertiary boundary". Proc. Natl. Acad. Sci. U.S.A. 100 (3): 1056–61. Bibcode:2003PNAS..100.1056S. doi:10.1073/pnas.0334222100. PMC 298725. PMID 12552136.
  10. "Chimpanzee genome database (Genome Data Viewer Pan troglodytes (chimpanzee))".
  11. Caswell JL, Mallick S, Richter DJ, Neubauer J, Schirmer C, Gnerre S, Reich D (April 2008). "Analysis of chimpanzee history based on genome sequence alignments". PLOS Genet. 4 (4): e1000057. doi:10.1371/journal.pgen.1000057. PMC 2278377. PMID 18421364.
  12. Cheng Z, Ventura M, She X, Khaitovich P, Graves T, Osoegawa K, et al. (September 2005). "A genome-wide comparison of recent chimpanzee and human segmental duplications". Nature. 437 (7055): 88–93. Bibcode:2005Natur.437...88C. doi:10.1038/nature04000. PMID 16136132. S2CID 4420359.
  13. Stenger S, Hanson DA, Teitelbaum R, Dewan P, Niazi KR, Froelich CJ, et al. (October 1998). "An antimicrobial activity of cytolytic T cells mediated by granulysin". Science. 282 (5386): 121–5. Bibcode:1998Sci...282..121S. doi:10.1126/science.282.5386.121. PMID 9756476.
  14. Goodman BE, Percy WH (June 2005). "CFTR in cystic fibrosis and cholera: from membrane transport to clinical practice". Adv Physiol Educ. 29 (2): 75–82. doi:10.1152/advan.00035.2004. PMID 15905150.
  15. Khaitovich P, Hellmann I, Enard W, Nowick K, Leinweber M, Franz H, Weiss G, Lachmann M, Pääbo S (September 2005). "Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees". Science. 309 (5742): 1850–4. Bibcode:2005Sci...309.1850K. doi:10.1126/science.1108296. PMID 16141373. S2CID 16674740.
  16. Miki R, Hattori K, Taguchi Y, Tada MN, Isosaka T, Hidaka Y, Hirabayashi T, Hashimoto R, Fukuzako H, Yagi T (April 2005). "Identification and characterization of coding single-nucleotide polymorphisms within human protocadherin-alpha and -beta gene clusters". Gene. 349: 1–14. doi:10.1016/j.gene.2004.11.044. PMID 15777644.
  17. Fan Y, Newman T, Linardopoulou E, Trask BJ (November 2002). "Gene content and function of the ancestral chromosome fusion site in human chromosome 2q13-2q14.1 and paralogous regions". Genome Res. 12 (11): 1663–72. doi:10.1101/gr.338402. PMC 187549. PMID 12421752.