Risk stratification of cervical lesions using capture sequencing and machine learning method based on HPV and human integrated genomic profiles

Rui Tian, Zifeng Cui, Dan He, Xun Tian, Qinglei Gao, Xin Ma, Jian Rong Yang, Jun Wu, Bhudev C. Das, Konstantin Severinov, Inga Isabel Hitzeroth, Priya Ranjan Debata, Wei Xu, Haolin Zhong, Weiwen Fan, Yili Chen, Zhuang Jin, Chen Cao, Miao Yu, Weiling Xie, Zhaoyue Huang, Yuxian Bao, Hongxian Xie, Shuzhong Yao, Zheng Hu

    Research output: Contribution to journal › Article › peer-review

    9 Citations (Scopus)

    Abstract

    From initial human papillomavirus (HPV) infection and precursor stages, the development of cervical cancer takes decades. High-sensitivity HPV DNA testing is currently recommended as the primary screening method for cervical cancer, whereas better triage methodologies are encouraged to provide accurate risk management for HPV-positive women. Given that virus-driven genomic variation accumulates during cervical carcinogenesis, we designed a 39 Mb custom capture panel targeting 17 HPV types and 522 mutant genes related to cervical cancer. Using capture-based next-generation sequencing, HPV integration status, somatic mutation and copy number variation were analyzed on 34 paired samples, including 10 cases of HPV infection (HPV+), 10 cases of cervical intraepithelial neoplasia (CIN) grade 1 (CIN1) and 14 cases of CIN2+ (CIN2: n = 1; CIN2-3: n = 3; CIN3: n = 9; squamous cell carcinoma: n = 1). Finally, a machine learning algorithm (Random Forest) was applied to build the risk stratification model for cervical precursor lesions based on CIN2+ enriched biomarkers. Generally, HPV integration events (11 in HPV+, 25 in CIN1 and 56 in CIN2+), non-synonymous mutations (2 in CIN1, 12 in CIN2+) and copy number variations (19.1 in HPV+, 29.4 in CIN1 and 127 in CIN2+) increased from HPV+ to CIN2+. Interestingly, the 'common' deletion of the mitochondrial chromosome was significantly observed in CIN2+ (P = 0.009). Together, CIN2+ enriched biomarkers, classified as HPV information, mutation, amplification, deletion and mitochondrial change, successfully predicted CIN2+ with an average accuracy probability score of 0.814, and amplification and deletion ranked as the most important features. Our custom capture sequencing combined with a machine learning method effectively stratified the risk of cervical lesions and provided valuable integrated triage strategies.

    Original language: English
    Pages (from-to): 1220-1228
    Number of pages: 9
    Journal: Carcinogenesis
    Volume: 40
    Issue number: 10
    Publication status: Published - 16 Oct 2019
