Mammalian-wide interspersed repeat

Mammalian-wide interspersed repeats (MIRs) are transposable elements found in mammalian genomes. They belong to the group of short interspersed nuclear elements (SINEs).

Incidence

MIRs are found in all mammals (including marsupials). [1]

In humans

It is estimated that there are around 368,000 MIRs in the human genome. [2]

Structure

The MIR consensus sequence is 260 base pairs long and has an A/T-rich 3' end. [1]
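As a rough illustration of what "A/T-rich 3' end" means, the following minimal sketch computes the fraction of A and T bases in the last few bases of a sequence. The function names, the 20-base window, and the toy sequence are assumptions made for illustration only; the toy sequence is invented and is not the MIR consensus.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A or T bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AT" for base in seq) / len(seq)


def three_prime_at_fraction(seq: str, tail_length: int = 20) -> float:
    """A/T fraction of the last tail_length bases of a sequence written 5'->3'.

    tail_length is an arbitrary illustrative window, not a published threshold.
    """
    if tail_length <= 0:
        raise ValueError("tail_length must be positive")
    return at_fraction(seq[-tail_length:])


# Invented toy sequence (NOT the MIR consensus): a G/C-rich body
# followed by an A/T-only tail at the 3' end.
toy = "GCGCGGCCGCGGCTGCCGGC" + "ATTAAATATTTAAATATTAA"

print(three_prime_at_fraction(toy))  # A/T fraction of the final 20 bases
print(at_fraction(toy))              # A/T fraction of the whole sequence
```

Comparing the tail value with the whole-sequence value gives a simple indication of whether the 3' end is enriched in A and T relative to the rest of the element.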

Propagation

Like other short interspersed nuclear elements (SINEs), MIR elements relied on the machinery of LINE elements to propagate in the genome; this amplification took place around 130 million years ago. They can no longer retrotranspose because the reverse transcriptase they required has lost its activity. [3]

History of discovery

MIR elements were first described in the human genome in 1989–1991 [4] [5] [6] and were initially referred to as MB1 family repeats (mirror to sequences of the mouse B1 repeat). Repeats of this family were subsequently found in other mammalian genomes. [7] The family was renamed "Mammalian interspersed repeats" in 1992, [8] and was later shown to be common to vertebrate genomes. [9]

References

  1. Smit, Arian F. A.; Riggs, Arthur D. (1995). "MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation". Nucleic Acids Research. 23 (1): 98–102. doi:10.1093/nar/23.1.98. PMC 306635. PMID 7870595.
  2. Lander; et al. (2001). "Initial sequencing and analysis of the human genome". Nature. 409 (6822): 860–921. Bibcode:2001Natur.409..860L. doi:10.1038/35057062. PMID 11237011.
  3. Krull, M; Petrusma, M; Makalowski, W; Brosius, J; Schmitz, J (August 2007). "Functional persistence of exonized mammalian-wide interspersed repeat elements (MIRs)". Genome Research. 17 (8): 1139–45. doi:10.1101/gr.6320607. PMC 1933517. PMID 17623809.
  4. Donehower, Lawrence A.; Slagle, Betty L.; Wilde, Margaret; Darlington, Gretchen; Butel, Janet S. (1989). "Identification of a conserved sequence in the non-coding regions of many human genes". Nucleic Acids Research. 17 (2): 699–722. doi:10.1093/nar/17.2.699. PMC 331613. PMID 2536922.
  5. Korotkov, Eugene V. (1990). "A family of mirror B1-like sequences from human genome". Dokl. Akad. Nauk SSSR (in Russian). 311 (1): 238–242. PMID 2357927.
  6. Korotkov, Eugene V. (1991). "A new family of widely propagated MB1-repeats in the human genome". Mol Biol (Mosk) (in Russian). 25 (1): 250–263. PMID 1896037.
  7. Korotkov, Eugene V. (1992). "The MB1 family of repeats in clones from the genomes of mammals". Izv Akad Nauk SSSR Biol (in Russian). Jul–Aug (4): 546–557. PMID 1452902.
  8. Jurka, Jerzy; Walichiewicz, Jolanta; Milosavljevic, Aleksandar (October 1992). "Prototypic sequences for human repetitive DNA". Journal of Molecular Evolution. 35 (4): 286–291. Bibcode:1992JMolE..35..286J. doi:10.1007/BF00161166. PMID 1404414. S2CID 22946894.
  9. Korotkov, Eugene V.; Korotkova, Maria A.; Rudenko, Valentina M. (2000). "MIR--family of repeats common for vertebrate genomes". Mol Biol (Mosk) (in Russian). 34 (4): 553–559. doi:10.1007/BF02759556. PMID 11042848. S2CID 9524833.