Methods for studying the preservation of module networks across independent data sets

Peter Langfelder (1), Rui Luo (1), Michael C. Oldham (2), and Steve Horvath (1,3)


(1) Dept. of Human Genetics, UC Los Angeles
(2) Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, UC San Francisco
(3) Dept. of Biostatistics, UC Los Angeles

Peter (dot) Langfelder (at) gmail (dot) com, SHorvath (at) mednet (dot) ucla (dot) edu

Article reference

Peter Langfelder, Rui Luo, Michael C. Oldham, Steve Horvath (2011), Is My Network Module Preserved and Reproducible? PLoS Comput Biol 7(1): e1001057. doi:10.1371/journal.pcbi.1001057

Article abstract

In many applications, one is interested in determining which of the properties of a network module change across conditions. For example, to validate the existence of a module, it is desirable to show that it is reproducible (or preserved) in an independent test network. Here we study several types of network preservation statistics that do not require a module assignment in the test network. We distinguish network preservation statistics by the type of the underlying network. Some preservation statistics are defined for a general network (defined by an adjacency matrix) while others are only defined for a correlation network (constructed on the basis of pairwise correlations between numeric variables). Our applications show that the correlation structure facilitates the definition of particularly powerful module preservation statistics. We illustrate that evaluating module preservation is in general different from evaluating cluster preservation. We find that it is advantageous to aggregate multiple preservation statistics into summary preservation statistics. We illustrate the use of these methods in six gene co-expression network applications including 1) preservation of cholesterol biosynthesis pathway in mouse tissues, 2) comparison of human and chimpanzee brain networks, 3) preservation of selected KEGG pathways between human and chimpanzee brain networks, 4) sex differences in human cortical networks, 5) sex differences in mouse liver networks. While we find no evidence for sex specific modules in human cortical networks, we find that several human cortical modules are less preserved in chimpanzees. In particular, apoptosis genes are differentially co-expressed between humans and chimpanzees. Our simulation studies and applications show that module preservation statistics are useful for studying differences between the modular structure of networks. Data, R software and accompanying tutorials can be downloaded from this web page.
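To make the idea of a correlation-based preservation statistic concrete, here is a minimal sketch in plain Python. It is an illustration only, not the WGCNA implementation and not the full set of statistics from the article. It scores a module by correlating its within-module pairwise correlations in a reference data set with the same correlations in a test data set. No module assignment in the test data is needed. The function names and the toy expression values are assumptions made up for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(data, genes):
    """Correlations for every pair of module genes, in a fixed pair order."""
    return [pearson(data[g], data[h]) for g, h in combinations(genes, 2)]


def correlation_preservation(reference, test, module_genes):
    """Correlate the module's pairwise correlations across the two data sets.

    Hypothetical helper: values near 1 suggest that the module's correlation
    pattern in the reference data reappears in the test data.
    """
    ref_cors = pairwise_correlations(reference, module_genes)
    test_cors = pairwise_correlations(test, module_genes)
    return pearson(ref_cors, test_cors)


# Toy expression profiles (gene -> values across five samples); invented data.
reference = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 5, 8, 10],
    "g3": [5, 3, 4, 1, 2],
    "g4": [1, 3, 2, 5, 4],
}
module = ["g1", "g2", "g3", "g4"]

# Test set 1: each profile is rescaled, so all pairwise correlations are kept.
test_preserved = {g: [2 * v + 1 for v in vals] for g, vals in reference.items()}

# Test set 2: the relationships between the genes are rearranged.
test_changed = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [10, 2, 8, 4, 5],
    "g3": [1, 2, 2, 4, 5],
    "g4": [3, 1, 5, 2, 4],
}

preserved_score = correlation_preservation(reference, test_preserved, module)
changed_score = correlation_preservation(reference, test_changed, module)
print("preserved test set:", round(preserved_score, 3))
print("changed test set:  ", round(changed_score, 3))
```

In this toy setting the rescaled test set gives a score very close to 1, while the rearranged test set scores much lower. A single statistic like this says nothing about significance on its own; the article addresses that by aggregating several statistics into summary preservation statistics.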

R software

Methods for assessing module preservation have been implemented within the WGCNA package for the freely available R statistical language and environment.

Tutorials and example studies

We have put together a set of tutorials that illustrate the analysis of module preservation in several applications.

New to R or Weighted Gene Co-expression Network Analysis?

We recommend the tutorials written for the WGCNA package as a gentle introduction to R. Readers wishing to learn more about Weighted Gene Co-expression Network Analysis are also invited to visit the main WGCNA page.



