Cleavage and polyadenylation specificity factor

Cleavage and polyadenylation specificity factor (CPSF) is involved in the cleavage of the 3' signaling region from a newly synthesized pre-messenger RNA (pre-mRNA) molecule in the process of gene transcription. In eukaryotes, pre-mRNAs are transcribed from DNA in the nucleus by the enzyme RNA polymerase II. A pre-mRNA must undergo post-transcriptional modifications to form a mature mRNA before it can be transported into the cytoplasm for translation into protein. These modifications are the addition of a 5' m7G cap, the splicing of intronic sequences, and 3' cleavage and polyadenylation. [1]

According to Schönemann et al., "CPSF recognizes the polyadenylation signal (PAS), providing sequence specificity in pre-mRNA cleavage and polyadenylation, and catalyzes pre-mRNA cleavage." [2] CPSF is also required to induce RNA polymerase pausing once it recognizes a functional PAS. [3] It is the first protein to bind to the signaling region near the cleavage site of the pre-mRNA, to which the poly(A) tail will be added by polynucleotide adenylyltransferase. The PAS lies 10-30 nucleotides upstream of the cleavage site and has the canonical nucleotide sequence AAUAAA, which is highly conserved across the vast majority of pre-mRNAs. The cleavage site itself is usually marked by a cytosine-adenine (CA) dinucleotide, the preferred sequence immediately 5' to the site of endonucleolytic cleavage. [2] [4] A second signaling region, a U/GU-rich element required for efficient processing, is located approximately 40 nucleotides downstream of the cleavage site, on the portion of the pre-mRNA that is cleaved off before polyadenylation. This downstream fragment is degraded. The mature mRNAs are transported into the cytoplasm, where they are translated into proteins. [4] [5]
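The spacing rules described above can be illustrated with a short script. This is only a minimal sketch, not a validated poly(A) site predictor: the function name, the choice to measure the 10-30 nucleotide distance from the 3' end of the hexamer to the cleavage position, and the example sequence are all assumptions made for illustration.

```python
import re

def find_candidate_cleavage_sites(rna, min_gap=10, max_gap=30):
    """Return (pas_start, cleavage_pos) pairs for an RNA string.

    pas_start is the 0-based index of an AAUAAA hexamer. cleavage_pos is the
    index immediately after a CA dinucleotide, i.e. the position where the
    transcript would be cut. The gap is measured (as an assumption) from the
    3' end of the hexamer to cleavage_pos.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    # The lookahead lets overlapping hexamers be reported separately.
    for match in re.finditer(r"(?=AAUAAA)", rna):
        pas_start = match.start()
        pas_end = pas_start + 6
        for cleavage_pos in range(pas_end + min_gap, pas_end + max_gap + 1):
            if rna[cleavage_pos - 2:cleavage_pos] == "CA":
                hits.append((pas_start, cleavage_pos))
    return hits

# Made-up sequence: hexamer, 13-nt spacer, CA dinucleotide, GU-rich tail.
example = "GG" + "AAUAAA" + "G" * 13 + "CA" + "UUUUGUGUGU"
print(find_candidate_cleavage_sites(example))
```

A real pre-mRNA usually contains several hexamer and CA combinations, so a scan of this kind only lists candidates; it says nothing about which site the processing machinery actually uses.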

Protein Structure & Interactions

In mammals, CPSF is a protein complex consisting of six subunits: CPSF-160 (CPSF1), CPSF-100 (CPSF2), CPSF-73 (CPSF3), CPSF-30 (CPSF4), WDR33, and Fip1 (FIP1L1). The number in each of the first four names is the subunit's approximate molecular mass in kDa.

Cleavage and polyadenylation specificity factor quaternary complex

The subunits form two components: mammalian polyadenylation specificity factor (mPSF) and mammalian cleavage factor (mCF). The mPSF is made up of CPSF-160, WDR33, CPSF-30, and Fip1. It is necessary for PAS recognition and polyadenylation. The mCF is made up of CPSF-73, CPSF-100, and symplekin. It catalyzes the cleavage reaction and is also involved in histone mRNA 3' processing. [4] [5]

CPSF-73 is a zinc-dependent hydrolase that cleaves the mRNA precursor at a CA dinucleotide just downstream of the polyadenylation signal sequence AAUAAA. [6] [7]

CPSF-100 contributes to the endonuclease activity of CPSF-73. [2]

CPSF-160 (160 kDa) is the largest subunit of CPSF and was originally reported to bind the AAUAAA polyadenylation signal directly. [8] CPSF-160 has three β-propeller domains and a C-terminal domain.

CPSF-30 (30 kDa) has five Cys-Cys-Cys-His (CCCH) zinc-finger motifs near the N terminus and a CCHC zinc knuckle at the C terminus. Two isoforms of CPSF-30 exist and can be found in CPSF complexes. The RNA-binding activity of CPSF-30 is mediated by zinc fingers 2 and 3. CPSF-30 recognizes the AU-rich hexamer region by a cooperative, metal-dependent binding mechanism. [4] [5] [9] [10]

WD repeat domain 33 (WDR33, 146 kDa) has a WD40 domain near the N terminus, which interacts with RNA. Together, WDR33 and CPSF-30 recognize the polyadenylation signal (PAS) in pre-mRNA, which helps define the position of RNA cleavage. [4] [5] [9] [10]

Although CPSF-160 is the largest subunit of CPSF, a study by Schönemann et al. argued that WDR33, and not CPSF-160 as previously believed, is responsible for recognizing the PAS. The study concluded that PAS recognition had been attributed to CPSF-160 because the WDR33 subunit had not yet been discovered when that claim was made. [2]

Fip1 binds to U-rich RNAs through its arginine-rich C-terminus. In vitro, it binds to RNA sequences upstream of the AAUAAA hexamer region. Fip1 and CPSF-160 recruit poly(A) polymerase (PAP) to the 3' processing site. [4] PAP is stimulated by nuclear poly(A)-binding protein 1 (PABPN1) to add the poly(A) tail, a stretch of non-templated adenosine residues, at the cleavage site. [3] [7]

CPSF-160, CPSF-30, Fip1, and WDR33 are necessary and sufficient to form a CPSF subcomplex that is active in AAUAAA-dependent polyadenylation. CPSF-73 and CPSF-100 are dispensable for this activity. [2]
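The two modules and the "necessary and sufficient" statement above can be written down as simple sets. The sketch below is purely illustrative bookkeeping based on the subunit lists given in this section; the function names are invented for the example and do not model any biochemistry.

```python
# Subunit composition as listed in this section.
MPSF = frozenset({"CPSF-160", "WDR33", "CPSF-30", "Fip1"})
MCF = frozenset({"CPSF-73", "CPSF-100", "symplekin"})

def supports_polyadenylation(subunits):
    """True if every mPSF subunit is present (hypothetical helper)."""
    return MPSF <= set(subunits)

def missing_for_cleavage(subunits):
    """Return the mCF subunits that are absent, sorted by name."""
    return sorted(MCF - set(subunits))

# A reconstituted complex lacking the cleavage module.
reconstituted = ["CPSF-160", "WDR33", "CPSF-30", "Fip1"]
print(supports_polyadenylation(reconstituted))
print(missing_for_cleavage(reconstituted))
```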

CPSF recruits other proteins to the 3' region. Proteins coordinated by CPSF activity include cleavage stimulatory factor and the two poorly understood cleavage factors. Binding of the polynucleotide adenylyltransferase that synthesizes the tail is a prerequisite for cleavage, which ensures that cleavage and polyadenylation are tightly coupled.

Genes

Alternative Polyadenylation (APA)

Alternative polyadenylation (APA) is a regulatory mechanism that generates multiple distinct 3' ends on the mRNAs transcribed from a single gene. [7]

APA isoforms from the same gene can encode different proteins and/or contain different 3' untranslated regions (UTRs). Because longer UTRs have more binding sites for microRNAs and/or RNA-binding proteins than shorter UTRs, APA isoforms can differ in stability, translation efficiency, and/or intracellular localization. Deregulation of APA has been associated with a number of human diseases. [4]
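The point about UTR length can be made concrete by counting binding-site motifs in a short and a long isoform of the same 3' UTR. The following is a minimal sketch under stated assumptions: the UTR sequence, the two motif strings, and the isoform end positions are arbitrary placeholders, not real microRNA or protein binding sites.

```python
def count_sites(utr, motifs):
    """Count occurrences (overlaps allowed) of each motif in a UTR string."""
    counts = {}
    for motif in motifs:
        n = 0
        start = utr.find(motif)
        while start != -1:
            n += 1
            start = utr.find(motif, start + 1)
        counts[motif] = n
    return counts

def isoform_site_counts(full_utr, isoform_ends, motifs):
    """Map each isoform end position to the motif counts in full_utr[:end]."""
    return {end: count_sites(full_utr[:end], motifs) for end in isoform_ends}

# Placeholder data: one UTR, cut at a proximal (10) or distal (28) site.
full_utr = "GGACCUAAAACUGGACUUUUGGACCUCC"
motifs = ["GGACCU", "CUGGAC"]  # arbitrary example motifs
print(isoform_site_counts(full_utr, [10, len(full_utr)], motifs))
```

In this toy case the isoform ending at the distal site retains every motif of the shorter one plus the sites located between the two ends, which is the reason the two isoforms can be regulated differently.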

Mammalian PASs have a number of key cis elements.

PAS sequences are variable, and many PASs lack one or more cis elements. PAS recognition is accomplished by protein-RNA interactions.

CPSF binds to the AAUAAA hexamer and CstF binds to the downstream element (DSE), and the two interactions are synergistic. The CFI complex binds to the UGUA motifs. CPSF, CstF, and CFI bind directly to RNA. They also recruit other proteins, such as CFII, symplekin, and poly(A) polymerase (PAP), to assemble the mRNA 3' processing complex, also known as the cleavage and polyadenylation complex. The assembly of these factors is facilitated by the C-terminal domain (CTD) of the RNA polymerase II (RNAP II) large subunit, which provides a landing pad for mRNA processing factors. [4] [11]
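The pairing of cis elements with the factors that bind them directly can be sketched as a lookup table. This is an illustrative sketch only: the regular expression used for the downstream element (two or more GU repeats, or a run of four or more U residues) is an assumption chosen to keep the example short, and the code ignores the relative positions of the elements.

```python
import re

# cis element -> (pattern, factor that binds it directly, per the text above).
CIS_ELEMENTS = {
    "AAUAAA hexamer": (re.compile(r"AAUAAA"), "CPSF"),
    "UGUA motif": (re.compile(r"UGUA"), "CFI"),
    # Assumed approximation of a GU-rich or U-rich downstream element.
    "downstream element (DSE)": (re.compile(r"(?:GU){2,}|U{4,}"), "CstF"),
}

def factor_binding_positions(rna):
    """Map each factor to the start positions of its candidate elements."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return {
        factor: [m.start() for m in pattern.finditer(rna)]
        for pattern, factor in CIS_ELEMENTS.values()
    }

# Made-up sequence containing all three elements.
example = "CC" + "UGUA" + "CC" + "AAUAAA" + "C" * 16 + "A" + "CC" + "GUGUGU" + "CC"
print(factor_binding_positions(example))
```

An empty list for a factor corresponds to a PAS that lacks that cis element, which, as noted above, is common.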

Other Protein Complexes in the Cleavage and Polyadenylation Complex

Symplekin (SYMPK) is a scaffolding protein that mediates the interaction between CPSF and CstF. [2]

In mammals, both cleavage factor I (CFIm) and cleavage and polyadenylation specificity factor (CPSF) are required for cleavage and polyadenylation, whereas cleavage stimulation factor (CstF) is essential only for the cleavage step. [12] CPSF and CstF travel along with RNA polymerase II (RNAP II) during nascent gene transcription in search of the PAS. [3]

Cleavage factor I (CFIm) is made up of 25 kDa (CPSF5), 59 kDa (CPSF7), and 68 kDa (CPSF6) proteins. Cleavage factor II (CFIIm) is made up of Pcf11, Clp1, and cleavage stimulation factor (CstF). CFIIm binds to the RNAP II C-terminal domain and to other cleavage and polyadenylation (CPA) factors. [3] [13]

Cleavage stimulation factor (CstF) has three subunits: CstF77 (CstF3), CstF50 (CstF1), and CstF64 (CstF2 and CstF2T). CstF recognizes the downstream signaling element, a GU-rich sequence motif followed by U-rich sequences, located about 20 nucleotides downstream of the cleavage site. CstF contributes to the selection of the cleavage site, as well as to alternative polyadenylation. [4] [5] [13]

Coupled Processes

Coupling to RNA polymerase II (pol II) transcription can influence mRNA processing reactions in three ways. [11]

  1. localization
    • positions mRNA processing factors at the elongation complex, which raises their local concentration in the vicinity of the nascent transcript
  2. kinetic coupling
    • the rate of transcription can have profound effects on RNA folding and the assembly of RNA-protein complexes
  3. allosteric
    • contact between the pol II elongation complex and mRNA processing factors can allosterically inhibit or activate mRNA processing factors


References

  1. Mandel CR, Bai Y, Tong L (April 2008). "Protein factors in pre-mRNA 3'-end processing". Cellular and Molecular Life Sciences. 65 (7–8): 1099–1122. doi:10.1007/s00018-007-7474-3. PMC 2742908. PMID 18158581.
  2. Schönemann L, Kühn U, Martin G, Schäfer P, Gruber AR, Keller W, et al. (November 2014). "Reconstitution of CPSF active in polyadenylation: recognition of the polyadenylation signal by WDR33". Genes & Development. 28 (21): 2381–2393. doi:10.1101/gad.250985.114. PMC 4215183. PMID 25301781.
  3. Murphy MR, Doymaz A, Kleiman FE (2021-01-01). "Poly(A) tail dynamics: Measuring polyadenylation, deadenylation and poly(A) tail length". In Tian B (ed.). Methods in Enzymology. mRNA 3' End Processing and Metabolism. Vol. 655. Academic Press. pp. 265–290. doi:10.1016/bs.mie.2021.04.005. ISBN 9780128235737. PMC 9015694. PMID 34183126.
  4. Shi Y, Manley JL (May 2015). "The end of the message: multiple protein-RNA interactions define the mRNA polyadenylation site". Genes & Development. 29 (9): 889–897. doi:10.1101/gad.261974.115. PMC 4421977. PMID 25934501.
  5. Sun Y, Zhang Y, Hamilton K, Manley JL, Shi Y, Walz T, Tong L (February 2018). "Molecular basis for the recognition of the human AAUAAA polyadenylation signal". Proceedings of the National Academy of Sciences of the United States of America. 115 (7): E1419–E1428. Bibcode:2018PNAS..115E1419S. doi:10.1073/pnas.1718723115. PMC 5816196. PMID 29208711.
  6. Mandel CR, Kaneko S, Zhang H, Gebauer D, Vethantham V, Manley JL, Tong L (December 2006). "Polyadenylation factor CPSF-73 is the pre-mRNA 3'-end-processing endonuclease". Nature. 444 (7121): 953–956. Bibcode:2006Natur.444..953M. doi:10.1038/nature05363. PMC 3866582. PMID 17128255.
  7. Arora A, Goering R, Lo HY, Lo J, Moffatt C, Taliaferro JM (2022). "The Role of Alternative Polyadenylation in the Regulation of Subcellular RNA Localization". Frontiers in Genetics. 12: 818668. doi:10.3389/fgene.2021.818668. PMC 8795681. PMID 35096024.
  8. Murthy KG, Manley JL (November 1995). "The 160-kD subunit of human cleavage-polyadenylation specificity factor coordinates pre-mRNA 3'-end formation". Genes & Development. 9 (21): 2672–2683. doi:10.1101/gad.9.21.2672. PMID 7590244.
  9. Casañal A, Kumar A, Hill CH, Easter AD, Emsley P, Degliesposti G, et al. (November 2017). "Architecture of eukaryotic mRNA 3'-end processing machinery". Science. 358 (6366): 1056–1059. doi:10.1126/science.aao6535. PMC 5788269. PMID 29074584.
  10. Shimberg GD, Michalek JL, Oluyadi AA, Rodrigues AV, Zucconi BE, Neu HM, et al. (April 2016). "Cleavage and polyadenylation specificity factor 30: An RNA-binding zinc-finger protein with an unexpected 2Fe-2S cluster". Proceedings of the National Academy of Sciences of the United States of America. 113 (17): 4700–4705. Bibcode:2016PNAS..113.4700S. doi:10.1073/pnas.1517620113. PMC 4855568. PMID 27071088.
  11. Bentley DL (June 2005). "Rules of engagement: co-transcriptional recruitment of pre-mRNA processing factors". Current Opinion in Cell Biology. Nucleus and gene expression. 17 (3): 251–256. doi:10.1016/j.ceb.2005.04.006. PMID 15901493.
  12. Stumpf G, Domdey H (November 1996). "Dependence of yeast pre-mRNA 3'-end processing on CFT1: a sequence homolog of the mammalian AAUAAA binding factor". Science. 274 (5292): 1517–1520. Bibcode:1996Sci...274.1517S. doi:10.1126/science.274.5292.1517. JSTOR 2892223. PMID 8929410. S2CID 34840144.
  13. Gruber AR, Martin G, Keller W, Zavolan M (March 2014). "Means to an end: mechanisms of alternative polyadenylation of messenger RNA precursors". Wiley Interdisciplinary Reviews. RNA. 5 (2): 183–196. doi:10.1002/wrna.1206. PMC 4282565. PMID 24243805.

Further reading