Weighted Gene Co-Expression Network Analysis Software
Systems biologic microarray analysis software for finding important genes and pathways.
WGCNA for Windows (help file and sample files included; please follow the installation guide). A user manual is also available.
Version 22.214.171.124 (5/13/2008, updates log). Please be aware that this software is no longer updated and has compatibility problems with R versions 2.8.x and newer. We recommend that users switch to analysis within the R environment using the R package WGCNA.
Weighted gene co-expression network analysis (WGCNA) is a systems biologic method for analyzing microarray data, gene information data, and microarray sample traits (e.g. case-control status or clinical outcomes). WGCNA can be used for constructing a weighted gene co-expression network, for finding co-expression modules, and for calculating module membership measures (related to intramodular connectivity). WGCNA facilitates a network-based gene screening method that can be used to identify candidate biomarkers or therapeutic targets. The gene screening method integrates gene significance information (e.g. the correlation between gene expression and a clinical outcome) and module membership information to identify biologically and statistically plausible genes. The software has a graphical interface that facilitates straightforward input of microarray and clinical trait data or pre-defined gene information. The software can analyze networks comprising tens of thousands of genes and implements several options for automatic and manual gene selection ("network screening").
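The network construction step described above can be illustrated with a minimal Python sketch. It is not the code of the WGCNA software itself. It assumes an unsigned network in which the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a soft-thresholding power beta, as in Zhang and Horvath (2005). The value beta = 6, the function names, and the toy expression values are assumptions chosen for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def weighted_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta.

    profiles maps a gene name to its expression values across samples.
    beta=6 is an illustrative default, not a recommendation.
    """
    genes = list(profiles)
    adj = {g: {} for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(profiles[gi], profiles[gj])) ** beta
            adj[gi][gj] = a
            adj[gj][gi] = a  # the network is undirected
    return adj


def connectivity(adj):
    """Connectivity of each gene: the sum of its adjacencies to all other genes."""
    return {g: sum(neighbors.values()) for g, neighbors in adj.items()}


# Toy data, made up for illustration: 4 genes measured in 5 samples.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 4.1, 2.9, 2.2, 1.0],
    "geneD": [3.0, 1.0, 4.0, 1.5, 3.2],
}
toy_adj = weighted_adjacency(toy_profiles, beta=6)
toy_k = connectivity(toy_adj)
for gene, k in sorted(toy_k.items(), key=lambda item: -item[1]):
    print(f"{gene}: connectivity {k:.3f}")
```

Raising the correlation to a power keeps every pair of genes connected but shrinks weak correlations toward zero, so a gene such as geneD, whose profile tracks none of the others, ends up with low connectivity. Restricting the sum to genes within one module would give an intramodular connectivity in the same way.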
WGCNA begins with the understanding that the information captured by microarray experiments is far richer than a list of differentially expressed genes. Microarray data are more completely represented by considering the relationships between measured transcripts, which can be assessed by pair-wise correlations between gene expression profiles. In most microarray data analyses, however, these relationships go essentially unexplored. WGCNA starts from the level of thousands of genes, identifies clinically interesting gene modules, and finally uses intramodular connectivity and gene significance (e.g. based on the correlation of a gene expression profile with a sample trait) to identify key genes in the disease pathways for further validation.
WGCNA alleviates the multiple testing problem inherent in microarray data analysis. Instead of relating thousands of genes to a microarray sample trait, it focuses on the relationship between a few (typically fewer than 10) modules and the sample trait. Toward this end, it calculates the eigengene significance (the correlation between the sample trait and the module eigengene) and the corresponding p-value for each module.
The module definition does not make use of a priori defined gene sets. Instead, modules are constructed from the expression data by using hierarchical clustering. Relating the resulting modules to gene ontology information to assess their biological plausibility is advisable but not required. Because the modules may correspond to biological pathways, focusing the analysis on intramodular hub genes (or the module eigengenes) amounts to a biologically motivated data reduction scheme. Because the expression profiles of intramodular hub genes are highly correlated, typically dozens of candidate biomarkers result. Although these candidates are statistically equivalent, they may differ in terms of biological plausibility or clinical utility.
Gene ontology information can be useful for further prioritizing intramodular hub genes. Examples of biological studies that show the importance of intramodular hub genes are reported in Horvath et al. 2006, Carlson et al. 2006, Gargalovic et al. 2006, and Ghazalpour et al. 2006.
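The module-level quantities discussed above can be sketched in a few lines of Python. This is a hedged illustration, not the software's implementation. It assumes that the module eigengene is the first principal component of the standardized module expression profiles (computed here by power iteration), that eigengene significance and gene significance are Pearson correlations with the sample trait, and that module membership is the correlation between a gene and its module eigengene. All function names and the toy data are hypothetical, and p-values are omitted because the Python standard library has no t-distribution.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def standardize(x):
    """Center a profile to mean 0 and scale it to unit sample variance."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return [(v - mean) / sd for v in x]


def module_eigengene(profiles, iterations=200):
    """First principal component across samples, found by power iteration.

    Assumes the mean standardized profile is not all zeros, which holds
    when the module genes are mostly positively correlated.
    """
    rows = [standardize(p) for p in profiles]
    n_genes, n_samples = len(rows), len(rows[0])
    mean_profile = [sum(r[s] for r in rows) / n_genes for s in range(n_samples)]
    v = mean_profile[:]  # starting vector; also used to fix the sign
    for _ in range(iterations):
        # One step of power iteration on X^T X, where X is genes x samples.
        scores = [sum(r[s] * v[s] for s in range(n_samples)) for r in rows]
        v = [sum(scores[g] * rows[g][s] for g in range(n_genes))
             for s in range(n_samples)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    if pearson(v, mean_profile) < 0:  # orient along the average expression
        v = [-c for c in v]
    return v


# Toy data, made up for illustration: one module of 3 genes, 6 samples.
trait = [0, 0, 0, 1, 1, 1]  # e.g. control = 0, case = 1
module = {
    "gene1": [1.0, 1.2, 0.9, 2.1, 2.3, 2.0],
    "gene2": [3.0, 3.3, 2.8, 4.9, 5.2, 5.1],
    "gene3": [0.5, 0.7, 0.4, 1.0, 1.4, 1.1],
}

eigengene = module_eigengene(list(module.values()))
eigengene_significance = pearson(eigengene, trait)
gene_significance = {g: pearson(x, trait) for g, x in module.items()}
module_membership = {g: pearson(x, eigengene) for g, x in module.items()}

print(f"eigengene significance: {eigengene_significance:.3f}")
for g in module:
    print(f"{g}: GS = {gene_significance[g]:.3f}, "
          f"membership = {module_membership[g]:.3f}")
```

With one eigengene per module, only a handful of correlations with the trait need to be tested rather than one per gene. Genes that combine high gene significance with high module membership are the natural candidates for follow-up.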
Snapshots: Step 1. Load Data, Step 2. Preprocess, Step 3. Network Construction, Step 4. Module Detection, Step 5. Gene Selection
To cite the WGCNA software, please use:
1. Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, Laurance MF, Zhao W, Shu Q, Lee Y, Scheck AC, Liau LM, Wu H, Geschwind DH, Febbo PG, Kornblum HI, Cloughesy TF, Nelson SF, Mischel PS (2006) "Analysis of Oncogenic Signaling Networks in Glioblastoma Identifies ASPM as a Novel Molecular Target", PNAS, November 14, 2006, vol. 103, no. 46, 17402-17407
2. Zhang B, Horvath S (2005) "A General Framework for Weighted Gene Co-Expression Network Analysis", Statistical Applications in Genetics and Molecular Biology, Vol. 4, No. 1, Article 17
3. Langfelder P, Zhang B, Horvath S (2007) "Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut library for R", Bioinformatics, November, btm563
Additional material on WGCNA can be found here.
Bioinformatics Team Members
UCLA: Steve Horvath, Lin Wang, Wei Zhao, Peter Langfelder, Jun Dong, Tova Fuller, Mike Oldham, Paul Mischel, Stan Nelson, Jake Lusis, Tom Drake, Dan Geschwind, Jenny Papp, Anja Presson
Acknowledgement: Dan Salomon (Scripps), Sunil Kurian (Scripps), Pui-Yan Kwok (UCSF)
Supported by the Transplant Genomics Collaborative Group 1U19AI063603-01, NINDS/NIMH 1U24NS043562-01
Supported in part by the UCLA Specialized Program of Research Excellence (SPORE) in Prostate Cancer (P50CA092131) and by the Jonsson Comprehensive Cancer Center Core grant (5P30CA016042-28)
Please register with us if you plan to download any of the programs on this web page, so that we can notify you of software updates. Email us with your name, your affiliation, and the programs you plan to download. Contact us with suggestions and bug reports.