Estimation of a significance threshold for epigenome-wide association studies

Zavattari, Patrizia; Moi, Loredana; Columbano, Amedeo
2018-01-01

Abstract

Epigenome-wide association studies (EWAS) are designed to characterise population-level epigenetic differences across the genome and link them to disease. Most commonly, they assess DNA-methylation status at cytosine-guanine dinucleotide (CpG) sites, using platforms such as the Illumina 450k array that profile a subset of CpGs genome-wide. An important challenge in EWAS is determining a significance threshold for declaring a CpG site differentially methylated while taking multiple testing into account. We used a permutation method to estimate a significance threshold specifically for the 450k array and a simulation extrapolation approach to estimate a genome-wide threshold. These methods were applied to five different EWAS datasets derived from a variety of populations and tissue types. We obtained an estimate of α = 2.4 × 10⁻⁷ for the 450k array, and a genome-wide estimate of α = 3.6 × 10⁻⁸. We further demonstrate the importance of these results by showing that previously recommended sample sizes for EWAS should be adjusted upwards, requiring samples between ∼10% and ∼20% larger in order to maintain the type-1 error rate at the desired level.
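
The abstract does not spell out the permutation procedure, so the following is only a minimal sketch of one common variant (the "minimum p-value" approach), not the authors' implementation. It assumes a binary phenotype, a Welch t statistic with a normal approximation for the p-value, and a target family-wise error rate of 5%; the function names `two_sided_p` and `permutation_threshold` and the simulated data are hypothetical.

```python
import math
import random


def two_sided_p(group_a, group_b):
    """Two-sided p-value for a Welch t statistic (normal approximation, an assumption)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (len(group_a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (len(group_b) - 1)
    t = (mean_a - mean_b) / math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return math.erfc(abs(t) / math.sqrt(2))


def permutation_threshold(methylation, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Estimate a per-site threshold that keeps the family-wise error rate near alpha.

    methylation: list of sites, each a list of per-sample methylation values.
    phenotype:   list of 0/1 labels, one per sample.
    """
    rng = random.Random(seed)
    labels = list(phenotype)
    min_pvalues = []
    for _ in range(n_perm):
        # Shuffling labels breaks any true association but keeps
        # the correlation structure between sites intact.
        rng.shuffle(labels)
        min_p = 1.0
        for site in methylation:
            cases = [v for v, lab in zip(site, labels) if lab == 1]
            controls = [v for v, lab in zip(site, labels) if lab == 0]
            min_p = min(min_p, two_sided_p(cases, controls))
        min_pvalues.append(min_p)
    # Empirical alpha-quantile of the null distribution of the smallest p-value.
    min_pvalues.sort()
    k = max(1, int(alpha * n_perm))
    return min_pvalues[k - 1]


# Toy data (simulated, independent sites; real CpG sites are correlated).
data_rng = random.Random(1)
n_samples, n_sites = 40, 100
phenotype = [0] * 20 + [1] * 20
methylation = [[data_rng.gauss(0.5, 0.1) for _ in range(n_samples)]
               for _ in range(n_sites)]
threshold = permutation_threshold(methylation, phenotype)
print(threshold)
```

In a real analysis the number of permutations would be far larger, and the test statistic would match the EWAS model actually fitted. As a rough reading of the reported array threshold, and again assuming a 5% family-wise error rate, 0.05 / (2.4 × 10⁻⁷) corresponds to about 208,000 effectively independent tests.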
CpG; DNA methylation; Epigenetic epidemiology; EWAS; FWER; GWAS; Permutation; Resampling; Simulation extrapolation; Epidemiology; Genetics (clinical)
Files in this item:
File: 2017GeneticEpidemiologySaffari.pdf
Access: archive managers only
Type: post-print version
Size: 749.38 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/233015
Citations
  • PMC 49
  • Scopus 95
  • Web of Science (ISI) 95