Complex selection on human polyadenylation signals revealed by polymorphism and divergence data

Yaroslav A. Kainov, Vasily N. Aushev, Sergey A. Naumenko, Elena M. Tchevkina, Georgii A. Bazykin

    Research output: Contribution to journal › Article › peer-review

    4 Citations (Scopus)


    Polyadenylation is a step of mRNA processing which is crucial for its expression and stability. The major polyadenylation signal (PAS) represents a nucleotide hexamer that adheres to the AATAAA consensus sequence. Over a half of human genes have multiple cleavage and polyadenylation sites, resulting in a great diversity of transcripts differing in function, stability, and translational activity. Here, we use available whole-genome human polymorphism data together with data on interspecies divergence to study the patterns of selection acting on PAS hexamers. Common variants of PAS hexamers are depleted of single nucleotide polymorphisms (SNPs), and SNPs within PAS hexamers have a reduced derived allele frequency (DAF) and increased conservation, indicating prevalent negative selection; at the same time, the SNPs that "improve" the PAS (i.e., those leading to higher cleavage efficiency) have increased DAF, compared to those that "impair" it. SNPs are rarer at PAS of "unique" polyadenylation sites (one site per gene); among alternative polyadenylation sites, at the distal PAS and at exonic PAS. Similar trends were observed in DAFs and divergence between species of placental mammals. Thus, selection permits PAS mutations mainly at redundant and/or weakly functional PAS. Nevertheless, a fraction of the SNPs at PAS hexamers likely affect gene functions; in particular, some of the observed SNPs are associated with disease.

    Original language: English
    Pages (from-to): 1971-1979
    Number of pages: 9
    Journal: Genome Biology and Evolution
    Issue number: 6
    Publication status: Published - 2016


    • 1000 genomes
    • AATAAA
    • mRNA processing
    • Polyadenylation
    • SNP

