SUMMARY: We introduce a method for detecting copy number variants (CNVs) from whole exome sequencing (WES) data that combines hierarchical clustering with a Gaussian mixture model fitted by the expectation-maximization (EM) algorithm. We also provide an R Shiny package, 'HCMMCNVs', for processing user-provided BAM files, running the CNV detection algorithm and visualizing the results. Applying our approach to 325 cancer cell lines spanning 22 tumor types from the Cancer Cell Line Encyclopedia (CCLE), we show that the algorithm is competitive with existing methods and that estimating CNVs from multiple cancer cell lines is feasible. In addition, when applied to WES data from 120 oral squamous cell carcinoma (OSCC) samples, our algorithm, which uses tumor samples only, exhibits more power to detect CNVs than methods that use both tumors and matched normal counterparts. AVAILABILITY AND IMPLEMENTATION: The HCMMCNVs R Shiny software is freely available from the GitHub repository https://github.com/lunching/HCMM_CNVs and from Zenodo at https://doi.org/10.5281/zenodo.4593371. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
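For readers unfamiliar with the mixture-model component named in the summary, the following is a minimal, generic sketch of fitting a one-dimensional Gaussian mixture by EM. It is not the HCMMCNVs implementation and does not cover the hierarchical clustering step. The function name `fit_gmm_em`, the synthetic log2 coverage ratios, the choice of three states (loss, neutral, gain) and the starting means are all assumptions made purely for illustration.

```python
import math
import random


def fit_gmm_em(x, means, n_iter=200, tol=1e-8):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    n, k = len(x), len(means)
    mu = list(means)
    mean_x = sum(x) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sigma = [sd_x / k + 1e-3] * k   # crude common starting spread
    pi = [1.0 / k] * k              # equal starting weights
    prev_ll = -math.inf
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point
        resp, ll = [], 0.0
        for v in x:
            dens = [
                pi[j]
                * math.exp(-0.5 * ((v - mu[j]) / sigma[j]) ** 2)
                / (sigma[j] * math.sqrt(2 * math.pi))
                for j in range(k)
            ]
            total = sum(dens) + 1e-300  # guard against log(0)
            resp.append([d / total for d in dens])
            ll += math.log(total)
        # M-step: re-estimate weights, means and standard deviations
        for j in range(k):
            nk = sum(r[j] for r in resp) + 1e-12
            pi[j] = nk / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nk
            var = sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, x)) / nk
            sigma[j] = math.sqrt(var) + 1e-6
        # Stop once the log-likelihood no longer improves
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi, mu, sigma, resp


# Synthetic log2 coverage ratios (assumed values, not real WES data):
# 60 "loss" regions, 300 "neutral" regions and 60 "gain" regions.
random.seed(0)
log_ratios = (
    [random.gauss(-1.0, 0.15) for _ in range(60)]
    + [random.gauss(0.0, 0.15) for _ in range(300)]
    + [random.gauss(0.58, 0.15) for _ in range(60)]
)

weights, centers, spreads, resp = fit_gmm_em(log_ratios, means=[-1.0, 0.0, 0.58])

# Assign each region to the component with the highest responsibility
labels = ["loss", "neutral", "gain"]
calls = [labels[max(range(3), key=lambda j: r[j])] for r in resp]
```

In this sketch each region is called by its most probable component; a real pipeline would derive the input ratios from read depth in the BAM files and would likely need a more careful initialization than fixed starting means.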