Polyadenylation is a step of mRNA processing which is crucial for its expression and stability. The major polyadenylation signal (PAS) represents a nucleotide hexamer that adheres to the AATAAA consensus sequence. Over half of human genes have multiple cleavage and polyadenylation sites, resulting in a great diversity of transcripts differing in function, stability, and translational activity. Here, we use available whole-genome human polymorphism data together with data on interspecies divergence to study the patterns of selection acting on PAS hexamers. Common variants of PAS hexamers are depleted of single nucleotide polymorphisms (SNPs), and SNPs within PAS hexamers have a reduced derived allele frequency (DAF) and increased conservation, indicating prevalent negative selection; at the same time, the SNPs that "improve" the PAS (i.e., those leading to higher cleavage efficiency) have increased DAF, compared to those that "impair" it. SNPs are rarer at PAS of "unique" polyadenylation sites (one site per gene); among alternative polyadenylation sites, they are rarer at the distal PAS and at exonic PAS. Similar trends were observed in DAFs and divergence between species of placental mammals. Thus, selection permits PAS mutations mainly at redundant and/or weakly functional PAS. Nevertheless, a fraction of the SNPs at PAS hexamers likely affect gene functions; in particular, some of the observed SNPs are associated with disease.
- 1000 genomes
- mRNA processing
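
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record, its allele counts, and the helper names are hypothetical. Counting mismatches to AATAAA is only a crude stand-in for the cleavage-efficiency ranking by which the abstract defines "improving" and "impairing" SNPs.

```python
from dataclasses import dataclass

# Major PAS consensus hexamer named in the abstract.
CONSENSUS = "AATAAA"


@dataclass
class Snp:
    """A biallelic SNP with hypothetical allele counts (illustration only)."""
    ancestral: str      # ancestral allele, e.g. inferred from an outgroup
    derived: str        # derived (new) allele
    derived_count: int  # sampled chromosomes carrying the derived allele
    total_count: int    # all sampled chromosomes


def derived_allele_frequency(snp: Snp) -> float:
    """DAF = derived allele copies / all sampled chromosomes."""
    if snp.total_count <= 0:
        raise ValueError("total_count must be positive")
    return snp.derived_count / snp.total_count


def mismatches_to_consensus(hexamer: str) -> int:
    """Number of positions at which a hexamer differs from AATAAA."""
    if len(hexamer) != len(CONSENSUS):
        raise ValueError("expected a 6-nucleotide hexamer")
    return sum(a != b for a, b in zip(hexamer.upper(), CONSENSUS))


def apply_snp(hexamer: str, offset: int, snp: Snp) -> str:
    """Return the hexamer carrying the derived allele at the given offset."""
    if hexamer[offset] != snp.ancestral:
        raise ValueError("ancestral allele does not match the hexamer")
    return hexamer[:offset] + snp.derived + hexamer[offset + 1:]


def moves_toward_consensus(hexamer: str, offset: int, snp: Snp) -> bool:
    """Assumed proxy: True if the derived hexamer is closer to AATAAA."""
    derived_hexamer = apply_snp(hexamer, offset, snp)
    return mismatches_to_consensus(derived_hexamer) < mismatches_to_consensus(hexamer)
```

For example, a hypothetical T-to-A SNP at the second position of ATTAAA, with the derived allele seen on 10 of 100 chromosomes, gives `derived_allele_frequency(...)` of 0.1 and produces the consensus hexamer AATAAA.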