Integrated Microbial Genomes System

IMG
Figure: Genome analysis tools in IMG 2.9
Content
Description: Integrated microbial genomes database and comparative analysis system
Contact
Authors: Victor M Markowitz
Primary citation: Markowitz et al. (2012) [1]
Access
Website: img.jgi.doe.gov

The Integrated Microbial Genomes (IMG) system is a genome browsing and annotation platform developed by the U.S. Department of Energy (DOE) Joint Genome Institute (JGI). [2] [3] IMG contains all the draft and complete microbial genomes sequenced by the DOE-JGI, integrated with other publicly available genomes (including Archaea, Bacteria, Eukarya, viruses and plasmids). IMG provides users with a set of tools for comparative analysis of microbial genomes along three dimensions: genes, genomes and functions. Users can select genes, genomes and functions based on a variety of criteria and transfer them to the comparative analysis carts. IMG also includes a genome annotation pipeline that integrates information from several resources, including KEGG, Pfam, InterPro, and the Gene Ontology, among others. Users can also type or upload their own gene annotations (called MyIMG gene annotations), and the IMG system allows them to generate GenBank or EMBL format files containing these annotations. [citation needed]

In successive releases, IMG has expanded to include several domain-specific systems. The Integrated Microbial Genomes with Microbiome Samples (IMG/M) system extends IMG by providing a comparative analysis context for assembled metagenomic data alongside the publicly available isolate genomes. [4] [5] The Integrated Microbial Genomes Expert Review (IMG/ER) system supports individual scientists or groups of scientists in the functional annotation and curation of their microbial genomes of interest. [2] Users can submit their annotated genomes to IMG/ER (or request that the IMG automated annotation pipeline be applied first) and then proceed with manual curation and comparative analysis through secure, password-protected access. IMG-HMP focuses on the analysis of genomes related to the Human Microbiome Project (HMP) in the context of all publicly available genomes in IMG. [6] IMG-ABC supports bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery. [7] IMG-VR, updated to IMG/VR v.2.0 in 2019, is the largest publicly available database for viral genomes and metagenomes. [8] [9]

Related Research Articles

Metagenomics: Study of genes found in the environment

Metagenomics is the study of genetic material recovered directly from environmental or clinical samples by a method called sequencing. The broad field may also be referred to as environmental genomics, ecogenomics, community genomics or microbiomics.

Ensembl genome database project: Scientific project at the European Bioinformatics Institute

The Ensembl genome database project is a scientific project at the European Bioinformatics Institute that provides a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of our own species and other vertebrates and model organisms. Ensembl is one of several well-known genome browsers for the retrieval of genomic information.

KEGG: Collection of bioinformatics databases

KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.

InterPro is a database of protein families, protein domains and functional sites in which identifiable features found in known proteins can be applied to new protein sequences in order to functionally characterise them.

MicrobesOnline

MicrobesOnline is a publicly and freely accessible website that hosts multiple comparative genomic tools for comparing microbial species at the genomic, transcriptomic and functional levels. MicrobesOnline was developed by the Virtual Institute for Microbial Stress and Survival, which is based at the Lawrence Berkeley National Laboratory in Berkeley, California. The site was launched in 2005, with regular updates until 2011.

Human Microbiome Project: Former research initiative

The Human Microbiome Project (HMP) was a United States National Institutes of Health (NIH) research initiative to improve understanding of the microbiota involved in human health and disease. Launched in 2007, the first phase (HMP1) focused on identifying and characterizing human microbiota. The second phase, known as the Integrative Human Microbiome Project (iHMP), launched in 2014 with the aim of generating resources to characterize the microbiome and elucidating the roles of microbes in health and disease states. The program received $170 million in funding from the NIH Common Fund from 2007 to 2016.

Phylogenetic profiling is a bioinformatics technique in which the joint presence or joint absence of two traits across large numbers of species is used to infer a meaningful biological connection, such as involvement of two different proteins in the same biological pathway. Along with examination of conserved synteny, conserved operon structure, or "Rosetta Stone" domain fusions, comparing phylogenetic profiles is a designated "post-homology" technique, in that the computation essential to this method begins after it is determined which proteins are homologous to which. A number of these techniques were developed by David Eisenberg and colleagues; phylogenetic profile comparison was introduced in 1999 by Pellegrini, et al.
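The idea of comparing phylogenetic profiles can be illustrated with a minimal Python sketch. It assumes each protein is represented by a presence/absence vector over the same ordered set of species, and it scores a pair by the fraction of species in which the two profiles agree (joint presence or joint absence). The protein names, the profile values, the `profile_agreement` and `candidate_links` helpers and the 0.8 threshold are all hypothetical and chosen only for illustration; this simple agreement score is not the scoring used in the published method.

```python
from itertools import combinations

# Hypothetical presence/absence profiles: 1 means a homolog is present in
# that species, 0 means it is absent. Species order is the same for all rows.
PROFILES = {
    "protA": (1, 0, 1, 1, 0, 1),
    "protB": (1, 0, 1, 1, 0, 1),
    "protC": (0, 1, 0, 0, 1, 0),
    "protD": (1, 1, 1, 0, 0, 1),
}


def profile_agreement(p, q):
    """Fraction of species where both profiles are 1 or both are 0."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same species")
    return sum(a == b for a, b in zip(p, q)) / len(p)


def candidate_links(profiles, threshold=0.8):
    """Return (score, name1, name2) for pairs meeting the threshold, best first."""
    scored = [
        (profile_agreement(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    ]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


if __name__ == "__main__":
    for score, a, b in candidate_links(PROFILES):
        print(f"{a} and {b} share a profile in {score:.0%} of species")
```

In this toy data set only `protA` and `protB` have identical profiles, so they are the only pair reported as a candidate functional link. In practice, profiles are built only after homology has been established, which is why the technique is described as "post-homology".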

The Genomes OnLine Database (GOLD) is a web-based resource for comprehensive information regarding genome and metagenome sequencing projects, and their associated metadata, around the world. Since 2011, the GOLD database has been run by the DOE Joint Genome Institute.

MG-RAST is an open-source web application server that provides automatic phylogenetic and functional analysis of metagenomes. It is also one of the biggest repositories for metagenomic data. The name is an abbreviation of Metagenomic Rapid Annotations using Subsystems Technology. The pipeline automatically produces functional assignments for the sequences that belong to the metagenome by performing sequence comparisons against databases at both the nucleotide and amino-acid levels. The application supplies phylogenetic and functional assignments of the metagenome being analysed, as well as tools for comparing different metagenomes. It also provides a RESTful API for programmatic access.

Viral metagenomics

Viral metagenomics is the metagenomic study of viral genetic material obtained from environmental DNA samples or clinical DNA samples obtained from a host or natural reservoir. Metagenomic methods can be applied to study viruses in any system and have been used to describe various viruses associated with cancerous tumors, extreme environments, terrestrial ecosystems, and the blood and feces of humans. The term virome is also used to refer to viruses investigated by metagenomic sequencing of viral nucleic acids and is frequently used to describe environmental shotgun metagenomes. Viral metagenomics is a culture-independent methodology that provides insights into the diversity, abundance, and functional potential of viruses within the environment. Viruses lack a universal phylogenetic marker, making metagenomics the only way to assess the genetic diversity of viruses in an environmental sample. With the advancement of techniques that exploit next-generation sequencing, viruses can now be studied outside of culturable virus-host pairs. This approach has improved molecular epidemiology and accelerated the discovery of novel viruses.

Deinococcus frigens is a species of low-temperature- and drought-tolerant, UV-resistant bacteria from Antarctica. It is Gram-positive, non-motile and coccoid-shaped. Its type strain is AA-692. Individual D. frigens cells range in size from 0.9 to 2.0 μm, and colonies appear orange or pink in color. Liquid-grown cells viewed using phase-contrast light microscopy and transmission electron microscopy on agar-coated slides show that isolated D. frigens appear to produce buds. Comparison of the genomes of Deinococcus radiodurans and D. frigens has predicted that no flagellar assembly exists in D. frigens.

METAGENassist is a freely available web server for comparative metagenomic analysis. Comparative metagenomic studies involve the large-scale comparison of genomic or taxonomic census data from bacterial samples across different environments. Historically this has required a sound knowledge of statistics, computer programming, genetics and microbiology. As a result, only a small number of researchers are routinely able to perform comparative metagenomic studies. To circumvent these limitations, METAGENassist was developed to allow metagenomic analyses to be performed by non-specialists, easily and intuitively over the web. METAGENassist is particularly notable for its rich graphical output and its extensive database of bacterial phenotypic information.

Treponema socranskii was isolated from gum swabs of people with periodontitis and clinically induced periodontitis. It is a motile, helically coiled, obligate anaerobe that grows best at 37 °C, and is a novel member of its genus because of its ability to ferment molecules that other Treponema species cannot. The growth of T. socranskii is positively correlated with gingival inflammation, which suggests that it may contribute to gingivitis and periodontitis.

Longhurst code

Longhurst code refers to a set of geospatial four-letter geocodes for referencing geographic regions in oceanography.

Haladaptatus paucihalophilus is a halophilic archaeal species, originally isolated from a spring in Oklahoma. It uses a new pathway to synthesize glycine, and contains unique physiological features for osmoadaptation.

Virome

Virome refers to the assemblage of viruses that is often investigated and described by metagenomic sequencing of viral nucleic acids that are found associated with a particular ecosystem, organism or holobiont. The word is frequently used to describe environmental viral shotgun metagenomes. Viruses, including bacteriophages, are found in all environments, and studies of the virome have provided insights into nutrient cycling, the development of immunity, and the role of viruses as a major source of genes through lysogenic conversion. The human virome has also been characterized in nine organs of 31 Finnish individuals using qPCR and NGS methodologies.

Model organism databases (MODs) are biological databases, or knowledgebases, dedicated to the provision of in-depth biological data for intensively studied model organisms. MODs allow researchers to easily find background information on large sets of genes, plan experiments efficiently, combine their data with existing knowledge, and construct novel hypotheses. They allow users to analyse results and interpret datasets, and the data they generate are increasingly used to describe less well studied species. Where possible, MODs share common approaches to collect and represent biological information. For example, all MODs use the Gene Ontology (GO) to describe functions, processes and cellular locations of specific gene products. Projects also exist to enable software sharing for curation, visualization and querying between different MODs. Organismal diversity and varying user requirements, however, mean that MODs are often required to customize the capture, display, and provision of data.

Xenophilus azovorans is a bacterium from the genus Xenophilus which has been isolated from soil in Switzerland.

Nikos Kyrpides is a Greek-American bioscientist who has worked on the origins of life, information processing, bioinformatics, microbiology, metagenomics and microbiome data science. He is a senior staff scientist at the Lawrence Berkeley National Laboratory, where he heads the Prokaryote Super Program, and he leads the Microbiome Data Science program at the US Department of Energy Joint Genome Institute.

Genome mining

Genome mining describes the exploitation of genomic information for the discovery of biosynthetic pathways of natural products and their possible interactions. It depends on computational technology and bioinformatics tools. The mining process relies on a huge amount of data accessible in genomic databases. By applying data mining algorithms, the data can be used to generate new knowledge in several areas of medicinal chemistry, such as discovering novel natural products.

References

  1. Markowitz, Victor M; Chen I-Min A; Palaniappan Krishna; Chu Ken; Szeto Ernest; Grechkin Yuri; Ratner Anna; Jacob Biju; Huang Jinghua; Williams Peter; Huntemann Marcel; Anderson Iain; Mavromatis Konstantinos; Ivanova Natalia N; Kyrpides Nikos C (Jan 2012). "IMG: the integrated microbial genomes database and comparative analysis system". Nucleic Acids Res. England. 40 (1): D115-22. doi:10.1093/nar/gkr1044. PMC 3245086. PMID 22194640.
  2. Markowitz, V. M.; Chen, I. M. A.; Palaniappan, K.; Chu, K.; Szeto, E.; Grechkin, Y.; Ratner, A.; Anderson, I.; Lykidis, A.; Mavromatis, K.; Ivanova, N. N.; Kyrpides, N. C. (2009). "The integrated microbial genomes system: An expanding comparative analysis resource". Nucleic Acids Research. 38 (Database issue): D382–D390. doi:10.1093/nar/gkp887. PMC 2808961. PMID 19864254.
  3. Hadjithomas, Michalis; Chen, I.-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter (2015-07-14). "IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites". mBio. 6 (4): e00932. doi:10.1128/mBio.00932-15. ISSN 2150-7511. PMC 4502231. PMID 26173699.
  4. Markowitz, V. M.; Chen, I.-M. A.; Chu, K.; Szeto, E.; Palaniappan, K.; Grechkin, Y.; Ratner, A.; Jacob, B.; Pati, A.; Huntemann, M.; Liolios, K.; Pagani, I.; Anderson, I.; Mavromatis, K.; Ivanova, N. N.; Kyrpides, N. C. (2011). "IMG/M: The integrated metagenome data management and comparative analysis system". Nucleic Acids Research. 40 (Database issue): D123–D129. doi:10.1093/nar/gkr975. PMC 3245048. PMID 22086953.
  5. Chen, I.-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan (2017-01-04). "IMG/M: integrated genome and metagenome comparative data analysis system". Nucleic Acids Research. 45 (D1): D507–D516. doi:10.1093/nar/gkw929. ISSN 1362-4962. PMC 5210632. PMID 27738135.
  6. Markowitz, Victor M.; Chen, I.-Min A.; Chu, Ken; Szeto, Ernest; Palaniappan, Krishna; Jacob, Biju; Ratner, Anna; Liolios, Konstantinos; Pagani, Ioanna (2012). "IMG/M-HMP: a metagenome comparative analysis system for the Human Microbiome Project". PLOS ONE. 7 (7): e40151. Bibcode:2012PLoSO...740151M. doi:10.1371/journal.pone.0040151. ISSN 1932-6203. PMC 3390314. PMID 22792232.
  7. Hadjithomas, Michalis; Chen, I.-Min A.; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C. (2017-01-04). "IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes". Nucleic Acids Research. 45 (D1): D560–D565. doi:10.1093/nar/gkw1103. ISSN 1362-4962. PMC 5210574. PMID 27903896.
  8. Paez-Espino, David; Chen, I.-Min A.; Palaniappan, Krishna; Ratner, Anna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Huang, Jinghua; Markowitz, Victor M. (2017-01-04). "IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses". Nucleic Acids Research. 45 (D1): D457–D465. doi:10.1093/nar/gkw1030. ISSN 1362-4962. PMC 5210529. PMID 27799466.
  9. Paez-Espino D, Roux S, Chen IA, Palaniappan K, Ratner A, Chu K, et al. (2019). "IMG/VR v.2.0: an integrated data management and analysis system for cultivated and environmental viral genomes". Nucleic Acids Res. 47 (D1): D678–D686. doi:10.1093/nar/gky1127. PMC 6323928. PMID 30407573.