Population sequencing data reveal a compendium of mutational processes in the human germ line

By Vladimir B. Seplyarskiy, Ruslan A. Soldatov, Evan Koch, Ryan J. McGinty, Jakob M. Goldmann, Ryan D. Hernandez, Kathleen Barnes, Adolfo Correa, Esteban G. Burchard, Patrick T. Ellinor, Stephen T. McGarvey, Braxton D. Mitchell, Ramachandran S. Vasan, Susan Redline, Edwin Silverman, Scott T. Weiss, Donna K. Arnett, John Blangero, Eric Boerwinkle, Jiang He, Courtney Montgomery, D.C. Rao, Jerome I. Rotter, Kent D. Taylor, Jennifer A. Brody, Yii-Der Ida Chen, Lisa de las Fuentes, Chii-Min Hwu, Stephen S. Rich, Ani W. Manichaikul, Josyf C. Mychaleckyj, Nicholette D. Palmer, Jennifer A. Smith, Sharon L.R. Kardia, Patricia A. Peyser, Lawrence F. Bielak, Timothy D. O’Connor, Leslie S. Emery, TOPMed Population Genetics Working Group, Christian Gilissen, Wendy S. W. Wong, Peter V. Kharchenko, Shamil Sunyaev (ssunyaev@rics.bwh.harvard.edu)
 5 days ago

Biological mechanisms underlying human germline mutations remain largely unknown. We statistically decompose variation in the rate and spectra of mutations along the genome using volume-regularized nonnegative matrix factorization. Analysis of a population sequencing dataset (TOPMed) reveals nine processes that explain the variation in mutation properties between loci, and we provide a biological interpretation for seven of them. We associate one process with bulky DNA lesions that resolve asymmetrically with respect to transcription and replication. Two processes track the direction of the replication fork and replication timing, respectively. We identify a mutagenic effect of active demethylation, acting primarily in regulatory regions, and a mutagenic effect of LINE repeats. We also localize, from population sequencing data, a mutagenic process specific to oocytes; this process appears transcriptionally asymmetric.
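
As a rough illustration of the decomposition idea only, the sketch below factorizes a synthetic matrix of genomic windows by mutation types into nonnegative "process" loadings and spectra. It uses plain nonnegative matrix factorization with multiplicative updates and omits the volume regularization named in the abstract. The function name nmf, the matrix dimensions, the choice of three components, and the data are all assumptions made for illustration; none of this is the authors' implementation or TOPMed data.

```python
import numpy as np

def nmf(V, k, n_iter=500, eps=1e-9, seed=0):
    """Plain NMF (Frobenius loss, multiplicative updates).

    V : (n_windows, n_mutation_types) nonnegative matrix
    Returns W (n_windows, k) and H (k, n_mutation_types), both nonnegative,
    such that V is approximated by W @ H.
    NOTE: no volume regularization here; this is a simplified stand-in.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Update the spectra (rows of H), then the per-window loadings (W).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic toy data (hypothetical): 200 windows, 12 mutation types,
# generated from 3 hidden nonnegative processes.
rng = np.random.default_rng(1)
true_W = rng.random((200, 3))
true_H = rng.random((3, 12))
V = true_W @ true_H

W, H = nmf(V, k=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this toy setting each row of H plays the role of a mutation spectrum and each column of W records how strongly that component acts in each window. Plain NMF solutions are generally not unique, which is one reason a regularized variant can be preferable when the components are to be interpreted.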

