Earth BioGenome Project

Duration: November 1, 2018 – 2028
Website: www.earthbiogenome.org

The Earth BioGenome Project (EBP) is an initiative that aims to sequence and catalog the genomes of all of Earth's currently described eukaryotic species over a period of ten years. [1] The initiative would produce an open DNA database of biological information that provides a platform for scientific research and supports environmental and conservation initiatives. [2] A scientific paper presenting the vision for the project was published in PNAS in April 2018, [3] and the project officially launched November 1, 2018. [4]


The initiative was inspired by the Human Genome Project and emerged during a November 2015 meeting between Harris Lewin (UCD), Gene E. Robinson (IGB) and W. John Kress (Smithsonian Institution's National Museum of Natural History). [3] [5] In February 2017, at a major conference on genomics and biodiversity organized by the Smithsonian Institution and BGI in Washington, D.C., the project's 10-year plan and organizational structure were supported. [3]

Summary

The project is projected to cost US$4.7 billion. [1] It includes already ongoing projects such as i5K (insects), [6] B10K (birds), 10KP (plants), [7] [8] and the Darwin Tree of Life, which aims to sequence the estimated 66,000 eukaryotic species in the United Kingdom. [1] The project aims to sequence and annotate the roughly 1.5 million known eukaryotic species in three phases, the first of which is to create "annotated chromosome-scale reference assemblies for at least one representative species of each of the ~9,000 eukaryotic taxonomic families". [3] [8]

According to the PNAS paper, several sequencing centers are supporting the project, including BGI (China), Baylor College of Medicine (US), the Wellcome Sanger Institute (UK), and Rockefeller University (US), with an additional center to be established for the project in South America by the São Paulo Research Foundation. [3] Examples of bio-observatories that use genomics and meet the project's needs include the National Ecological Observatory Network, the Chinese Ecological Research Network, ForestGEO, and MarineGEO. [3] To provide insight into the feasibility and technical requirements of "planetary scale" projects such as this, the 10,000 Plant Genome Project has published a pilot project, "Digitalization of Ruili Botanical Garden", which sampled and sequenced 761 vascular plant specimens growing in a botanical garden in southwest China. [9]


References

  1. "Life on Earth to have its DNA analysed in the name of conservation". Nature. 563 (7730): 155–156. November 2018. doi:10.1038/d41586-018-07323-y. PMID 30401859.
  2. "Sequencing the world". The Economist. January 23, 2018. Archived from the original on January 24, 2018. Retrieved February 3, 2018.
  3. Lewin HA, Robinson GE, Kress WJ, Baker WJ, Coddington J, Crandall KA, et al. (April 2018). "Earth BioGenome Project: Sequencing life for the future of life". Proceedings of the National Academy of Sciences of the United States of America. 115 (17): 4325–4333. doi:10.1073/pnas.1720115115. PMC 5924910. PMID 29686065.
  4. "Scientists Launch Effort to Map DNA of Every Species". The Presidential Daily Brief: Intriguing. OZY. November 2, 2018. Retrieved November 2, 2018.
  5. Daley J (5 November 2018). "Ambitious Project to Sequence Genomes of 1.5 Million Species Kicks Off". Smithsonian. Retrieved 3 December 2018.
  6. i5K Consortium (2013-09-01). "The i5K Initiative: advancing arthropod genomics for knowledge, human health, agriculture, and the environment". The Journal of Heredity. 104 (5): 595–600. doi:10.1093/jhered/est050. PMC 4046820. PMID 23940263.
  7. Cheng S, Melkonian M, Smith SA, Brockington S, Archibald JM, Delaux PM, et al. (March 2018). "10KP: A phylodiverse genome sequencing plan". GigaScience. 7 (3): 1–9. doi:10.1093/gigascience/giy013. PMC 5869286. PMID 29618049.
  8. Exposito-Alonso, Moises; Drost, Hajk-Georg; Burbano, Hernán; Weigel, Detlef (2020-04-01). "The Earth BioGenome project: Opportunities and Challenges for Plant Genomics and Conservation". The Plant Journal. 102 (2): 222–229. doi:10.1111/tpj.14631. PMID 31788877.
  9. Liu H, Wei J, Yang T, Mu W, Song B, Yang T, et al. (January 2019). "Molecular digitization of a botanical garden: high-depth whole genome sequencing of 689 vascular plant species from the Ruili Botanical Garden". GigaScience. 8 (4). doi:10.1093/gigascience/giz007. PMC 6441391. PMID 30689836.