Bgee

Content
Description: Gene expression across species and conditions
Data types captured: scRNA-Seq, RNA-Seq, microarray, in situ and EST data
Organisms: Human, mouse, rat, fruit fly, chicken, roundworm, wild boar, zebrafish, cow and others
Contact
Research center: Swiss Institute of Bioinformatics; University of Lausanne
Primary citation: PMID 33037820
Release date: June 2007
Access
Website: www.bgee.org
Download URL: www.bgee.org
SPARQL endpoint: www.bgee.org/sparql/
Tools
Web: Gene search; TopAnat: Gene Expression Enrichment; Expression comparison; Anatomical homology; Raw data interface
Standalone: bioconductor.org/packages/BgeeDB/; bioconductor.org/packages/BgeeCall/
Miscellaneous
License: CC0 1.0 Universal
Version: 15.2
Curation policy: Manual curation of every study

Bgee is a database maintained by the SIB Swiss Institute of Bioinformatics and the University of Lausanne for the retrieval and comparison of gene expression patterns from RNA-Seq, scRNA-Seq, microarray, in situ hybridization and EST studies across multiple animal species. [1] [2] Bgee provides an intuitive answer to the question "where is a gene expressed?" and supports research in cancer and agriculture, as well as evolutionary biology.

Bgee is based exclusively on curated expression data from healthy wild-type samples (i.e., no gene knock-out, no treatment, no disease), so that it provides a comparable reference of healthy wild-type gene expression.

Bgee produces calls of presence/absence of expression and of differential over-/under-expression, integrated with information on gene orthology and on homology between organs. This allows expression patterns to be compared between species.
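As a rough illustration of how presence/absence calls, gene orthology and organ homology can be combined for a cross-species comparison, the following minimal Python sketch uses toy data. The gene identifiers, organ names, calls and the function compare_expression are all hypothetical placeholders; this is not Bgee's data model, pipeline or API.

```python
# Toy illustration only: every identifier and call below is a made-up
# placeholder, not data retrieved from Bgee.

# Presence/absence calls per species: (gene, organ) -> True if expressed.
calls = {
    "human": {
        ("HUMAN_GENE_A", "brain"): True,
        ("HUMAN_GENE_A", "liver"): False,
    },
    "mouse": {
        ("MOUSE_GENE_A", "brain"): True,
        ("MOUSE_GENE_A", "liver"): True,
    },
}

# Hypothetical orthology mapping: human gene -> mouse ortholog.
orthologs = {"HUMAN_GENE_A": "MOUSE_GENE_A"}

# Hypothetical organ homology mapping: human organ -> homologous mouse organ.
homologous_organs = {"brain": "brain", "liver": "liver"}


def compare_expression(calls, orthologs, homologous_organs,
                       species_a="human", species_b="mouse"):
    """Label each comparable (gene, organ) pair as conserved or divergent."""
    result = {}
    for (gene_a, organ_a), present_a in calls[species_a].items():
        gene_b = orthologs.get(gene_a)
        organ_b = homologous_organs.get(organ_a)
        if gene_b is None or organ_b is None:
            continue  # no ortholog or no homologous organ: not comparable
        present_b = calls[species_b].get((gene_b, organ_b))
        if present_b is None:
            continue  # no call available in the second species
        same = present_a == present_b
        result[(gene_a, organ_a)] = "conserved" if same else "divergent"
    return result


comparison = compare_expression(calls, orthologs, homologous_organs)
for (gene, organ), status in sorted(comparison.items()):
    print(gene, organ, status)
```

The sketch only compares pairs for which both an ortholog and a homologous organ are known, which mirrors the idea that orthology and anatomical homology are what make calls from different species comparable.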

Bgee allows searches by gene, by organ, tissue or cell type, and by developmental stage.

Bgee is one of the Global Core Biodata Resources (GCBRs), which represent "critical components for ensuring the reproducibility and integrity of life sciences research." Bgee is also an ELIXIR Recommended Interoperability Resource, a category of resources that facilitate FAIR-supporting activities in scientific research.

Related Research Articles

In genetics, an expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence. ESTs may be used to identify gene transcripts, and were instrumental in gene discovery and in gene-sequence determination. The identification of ESTs has proceeded rapidly, with approximately 74.2 million ESTs now available in public databases. EST approaches have largely been superseded by whole genome and transcriptome sequencing and metagenome sequencing.

Swiss Institute of Bioinformatics

The SIB Swiss Institute of Bioinformatics is an academic not-for-profit foundation which federates bioinformatics activities throughout Switzerland.

The Rat Genome Database (RGD) is a database of rat genomics, genetics, physiology and functional data, as well as data for comparative genomics between rat, human and mouse. RGD is responsible for attaching biological information to the rat genome via structured vocabulary, or ontology, annotations assigned to genes and quantitative trait loci (QTL), and for consolidating rat strain data and making it available to the research community. RGD is also developing a suite of tools for mining and analyzing genomic, physiologic and functional data for the rat, and comparative data for rat, mouse, human, and five other species.

The Open Biological and Biomedical Ontologies (OBO) Foundry is a group of people dedicated to building and maintaining ontologies related to the life sciences. The OBO Foundry establishes a set of principles for ontology development, with the aim of creating a suite of interoperable reference ontologies in the biomedical domain. Currently, there are more than a hundred ontologies that follow the OBO Foundry principles.

Expasy is an online bioinformatics resource operated by the SIB Swiss Institute of Bioinformatics. It is an extensible and integrative portal which provides access to over 160 databases and software tools and supports a range of life science and clinical research areas, from genomics, proteomics and structural biology, to evolution and phylogeny, systems biology and medical chemistry. The individual resources are hosted in a decentralized way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions.

PHI-base

The Pathogen-Host Interactions database (PHI-base) is a biological database that contains manually curated information on genes experimentally proven to affect the outcome of pathogen-host interactions. The database has been maintained by researchers at Rothamsted Research and external collaborators since 2005. PHI-base has been part of the UK node of ELIXIR, the European life-science infrastructure for biological information, since 2016.

The Reference Sequence (RefSeq) database is an open access, annotated and curated collection of publicly available nucleotide sequences and their protein products. RefSeq was introduced in 2000. This database is built by the National Center for Biotechnology Information (NCBI) and, unlike GenBank, provides only a single record for each natural biological molecule for major organisms ranging from viruses to bacteria to eukaryotes.

GeneCards is a database of human genes that provides genomic, proteomic, transcriptomic, genetic and functional information on all known and predicted human genes. It is being developed and maintained by the Crown Human Genome Center at the Weizmann Institute of Science, in collaboration with LifeMap Sciences.

DNA annotation: The process of describing the structure and function of a genome

In molecular biology and genetics, DNA annotation or genome annotation is the process of describing the structure and function of the components of a genome, by analyzing and interpreting them in order to extract their biological significance and understand the biological processes in which they participate. Among other things, it identifies the locations of genes and all the coding regions in a genome and determines what those genes do.

OrthoDB

OrthoDB presents a catalog of orthologous protein-coding genes across vertebrates, arthropods, fungi, plants, and bacteria. Orthology refers to the last common ancestor of the species under consideration, and thus OrthoDB explicitly delineates orthologs at each major radiation along the species phylogeny. The database of orthologs presents available protein descriptors, together with Gene Ontology and InterPro attributes, which serve to provide general descriptive annotations of the orthologous groups, and facilitate comprehensive orthology database querying. OrthoDB also provides computed evolutionary traits of orthologs, such as gene duplicability and loss profiles, divergence rates, sibling groups, and gene intron-exon architectures.

BPIFA3: Protein-coding gene in the species Homo sapiens

BPI fold containing family A, member 3 (BPIFA3) is a protein that in humans is encoded by the BPIFA3 gene. The gene is also known as SPLUNC3 and C20orf71 in humans, and the orthologous gene in mice is 1700058C13Rik. There are multiple variants of BPIFA3, which is predicted to be a secreted protein. It is very highly expressed in testis, with little or no expression in other tissues. The Human Protein Atlas project and Mouse ENCODE Consortium report RNA-Seq expression at RPKM levels of 29.1 for human testis and 69.4 for mouse, but 0 for all other tissues. Similarly, the Bgee consortium, using multiple techniques in addition to RNA-Seq, reports a relative Expression Score of 95.8 out of 100 for testis and 99.0 for sperm in humans; however, low levels of BPIFA3 expression, with scores between 20 and 30, were seen for a variety of tissues such as muscle, glands, prostate, nervous system, and skin.

Experimental factor ontology

Experimental factor ontology, also known as EFO, is an open-access ontology of experimental variables, particularly those used in molecular biology. The ontology covers variables which include aspects of disease, anatomy, cell type, cell lines, chemical compounds and assay information. EFO is developed and maintained at the EMBL-EBI as a cross-cutting resource for the purposes of curation, querying and data integration in resources such as Ensembl, ChEMBL and Expression Atlas.

Alicia Oshlack: Australian bioinformatician

Alicia Yinema Kate Nungarai Oshlack is an Australian bioinformatician and is Co-Head of Computational Biology at the Peter MacCallum Cancer Centre in Melbourne, Victoria, Australia. She is best known for her work developing methods for the analysis of transcriptome data as a measure of gene expression. She has characterized the role of gene expression in human evolution by comparisons of humans, chimpanzees, orangutans, and rhesus macaques, and works collaboratively in data analysis to improve the use of clinical sequencing of RNA samples by RNAseq for human disease diagnosis.

The Expression Atlas is a database maintained by the European Bioinformatics Institute that provides information on gene expression patterns from RNA-Seq and microarray studies, and on protein expression from proteomics studies. The Expression Atlas allows searches by gene, splice variant, protein attribute, disease, treatment or organism part. Individual genes or gene sets can be searched for. All datasets in Expression Atlas have their metadata manually curated and their data analysed through standardised analysis pipelines. There are two components to the Expression Atlas, the Baseline Atlas and the Differential Atlas.

The International Society for Biocuration (ISB) is a non-profit organisation that promotes the field of biocuration and was founded in early 2009. It provides a forum for information exchange through meetings and workshops. The society's conference, the International Biocuration Conference, has been held in Pacific Grove, California (2005), San José, CA (2007), Berlin (2009), Tokyo, Japan (2010), Washington, DC (2012), Cambridge, UK (2013), Toronto, Canada (2014), Beijing, China (2015) and Geneva, Switzerland (2016). The meeting in 2017 will be held in Stanford, California.

Cellosaurus is an online knowledge base on cell lines, which attempts to document all cell lines used in biomedical research. It is provided by the Swiss Institute of Bioinformatics (SIB). It is an ELIXIR Core Data Resource as well as an IRDiRC Recognized Resource. It is the contributing resource for cell lines on the Resource Identification Portal. As of December 2022, it contains information for more than 144,000 cell lines.

Transcriptomics technologies are the techniques used to study an organism's transcriptome, the sum of all of its RNA transcripts. The information content of an organism is recorded in the DNA of its genome and expressed through transcription. Here, mRNA serves as a transient intermediary molecule in the information network, whilst non-coding RNAs perform additional diverse functions. A transcriptome captures a snapshot in time of the total transcripts present in a cell. Transcriptomics technologies provide a broad account of which cellular processes are active and which are dormant. A major challenge in molecular biology is to understand how a single genome gives rise to a variety of cells. Another is how gene expression is regulated.

Christophe Dessimoz

Christophe Dessimoz is a Swiss National Science Foundation (SNSF) Professor at the University of Lausanne, Associate Professor at University College London and a group leader at the Swiss Institute of Bioinformatics. He was awarded the Overton Prize in 2019 for his contributions to computational biology. Starting in April 2022, he will be joint executive director of the SIB Swiss Institute of Bioinformatics, along with Ron Appel.

Biocuration is the field of life sciences dedicated to organizing biomedical data, information and knowledge into structured formats, such as spreadsheets, tables and knowledge graphs. The biocuration of biomedical knowledge is made possible by the cooperative work of biocurators, software developers and bioinformaticians and is at the base of the work of biological databases.

References

  1. Bastian FB, Parmentier G, Roux J, Moretti S, Laudet V, Robinson-Rechavi M (2008). "Bgee: Integrating and Comparing Heterogeneous Transcriptome Data Among Species". Data Integration in the Life Sciences. Lecture Notes in Computer Science. Vol. 5109. pp. 124–131. doi:10.1007/978-3-540-69828-9_12. ISBN 978-3-540-69827-2.
  2. Bastian FB, Roux J, Niknejad A, Comte A, Fonseca Costa SS, Mendes de Farias T, et al. (Jan 2021). "The Bgee suite: integrated curated expression atlas and comparative transcriptomics in animals". Nucleic Acids Research. 49 (D1): D831–D847. doi:10.1093/nar/gkaa793. PMC 7778977. PMID 33037820.

Further reading