Fri. Nov 22nd, 2024

Ry to carry out a meta-analysis of aging effects across multiple data sets. Given methylation (or other) data from multiple independent data sets, and the corresponding ages, the function was used to calculate Stouffer’s meta-analysis Z statistics (reviewed in [39]), P-values, and corresponding q-values (local false discovery rates) [40]. Briefly, Stouffer’s approach for combining multiple correlation test statistics across the data sets is based on calculating the following meta-analysis Z statistic:

$$\mathrm{metaZ} = \frac{\sum_{s=1}^{\mathrm{no.dataSets}} w_s Z_s}{\sqrt{\sum_{s=1}^{\mathrm{no.dataSets}} w_s^2}}$$

where $w_s$ denotes a weight associated with the s-th data set and $Z_s$ is the Z statistic from that data set. We found the results were similar irrespective of the weights, which is why we focused on the equal weight method ($w_s = 1$).

Consensus network analysis with WGCNA

Batch effects are known to influence DNA methylation levels. In our study, batches can arise due to Illumina plate effects or due to the independent data sets generated by different labs. To protect against spurious artifacts due to batch effects, we used the following approaches. First, our network analysis used a consensus module approach, which implicitly conditions on each data set by aggregating the information of ten individual networks (one for each of the ten data sets). Modules due to plate effects (or other batch effects) in one data set cannot be found in other data sets, that is, they will not give rise to consensus modules. By definition, consensus modules can be observed in the majority of the ten data sets, that is, they are highly reproducible across multiple data sets (generated by different labs). Second, we only considered those consensus modules that could also be found in data generated by the Illumina 450 K array (which we generated in one batch). Thus, the reported modules are highly reproducible in the Illumina 27 K and …

An R software tutorial that describes these methods can be found at the following webpage [32].
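Stouffer’s meta-analysis Z statistic described above can be illustrated with a minimal sketch. This is not the R function used in the study; it is a standard-library Python illustration that assumes the per-data-set Z statistics have already been computed. The function names `stouffer_meta_z` and `two_sided_p` and the example Z values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_meta_z(z_scores, weights=None):
    """Combine per-data-set Z statistics into one Stouffer meta-analysis Z.

    metaZ = sum_s(w_s * Z_s) / sqrt(sum_s(w_s ** 2))
    """
    if weights is None:
        # Equal weight method: w_s = 1 for every data set.
        weights = [1.0] * len(z_scores)
    if len(weights) != len(z_scores):
        raise ValueError("need exactly one weight per data set")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P-value of a Z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical Z statistics for a single CpG site in four data sets.
z_per_data_set = [2.1, 1.4, 2.8, 0.9]
meta_z = stouffer_meta_z(z_per_data_set)
print(meta_z, two_sided_p(meta_z))
```

With equal weights the denominator reduces to the square root of the number of data sets, so consistent moderate effects across data sets add up to a larger combined Z than any single data set provides.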
Co-expression methodology is typically used for studying relationships between gene expression levels [41]. Here we use these techniques for studying relationships between methylation levels. To describe the relationships among methylation profiles, we used WGCNA. Compared to unweighted network methods, WGCNA has the following advantages: first, it preserves the continuous nature of co-methylation information [42,43]; second, weighted networks are particularly useful for consensus module detection since they allow one to calibrate the individual networks; third, they give rise to powerful module preservation statistics (described below). The consensus network analysis was applied to data sets 1 to 10 described in Table 1.

Horvath et al. Genome Biology 2012, 13:R97 http://genomebiology.com/2012/13/10/R

For each data set, a signed weighted network adjacency matrix is defined as:

$$a_{ij} = \left(\frac{1 + \mathrm{cor}(x_i, x_j)}{2}\right)^{\beta}$$

where $x_i$ is the methylation profile of the i-th CpG site (probe on the array), that is, $x_i$ is a numeric vector whose entries report the methylation β values across the individuals. Note that the adjacency $a_{ij}$ is a number between 0 and 1 that is a monotonically increasing function of the correlation coefficient. The power β is a soft-thresholding parameter that can be used to emphasize high positive correlations at the expense of low correlations. We chose the default threshold of β = 12. A major advantage of weighted correlation networks is that they are highly robust with regard to the choice of β [42]. While WGCNA can be applied to one data set at a time.
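The signed weighted adjacency defined above can be sketched in a few lines. The sketch assumes the standard signed WGCNA form, in which the correlation is shifted and halved to lie between 0 and 1 before being raised to the soft-thresholding power. It is a standard-library Python illustration, not the WGCNA R implementation; the helper names `pearson` and `signed_adjacency` and the example methylation profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def signed_adjacency(profiles, power=12):
    """Signed weighted adjacency: a_ij = ((1 + cor(x_i, x_j)) / 2) ** power.

    `profiles` holds one list of methylation values per CpG site,
    measured across the same individuals in the same order.
    """
    n = len(profiles)
    adjacency = [[1.0] * n for _ in range(n)]  # cor(x_i, x_i) = 1, so a_ii = 1
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(profiles[i], profiles[j])
            r = max(-1.0, min(1.0, r))  # guard against floating-point overshoot
            a = ((1.0 + r) / 2.0) ** power
            adjacency[i][j] = adjacency[j][i] = a
    return adjacency


# Hypothetical methylation profiles: three CpG sites, five individuals.
profiles = [
    [0.10, 0.20, 0.30, 0.40, 0.50],
    [0.15, 0.25, 0.35, 0.45, 0.55],  # perfectly correlated with site 0
    [0.50, 0.40, 0.30, 0.20, 0.10],  # perfectly anti-correlated with site 0
]
adj = signed_adjacency(profiles, power=12)
for row in adj:
    print([round(value, 4) for value in row])
```

In this signed form a correlation of 0 maps to 0.5 before the power is applied, so with a power of 12 uncorrelated and negatively correlated pairs receive adjacencies close to 0 while strongly positively correlated pairs stay close to 1.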