WGCNA: an R package for weighted correlation network analysis


Peter Langfelder and Steve Horvath
with the help of many other contributors


Dept. of Human Genetics, UC Los Angeles (PL, SH), and Dept. of Biostatistics, UC Los Angeles (SH)

Peter (dot) Langfelder (at) gmail (dot) com, SHorvath (at) mednet (dot) ucla (dot) edu

BMC Bioinformatics, 2008 9:559


Abstract

Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial.

The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings.
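The steps named in the abstract (network construction, module detection, eigengene summary, module membership) can be illustrated with a minimal base-R sketch. It deliberately does not call any WGCNA function; the simulated data, the soft-thresholding power of 6, and the use of plain hierarchical clustering with a fixed number of clusters are assumptions made for illustration only, and the package itself offers more refined choices at each step.

```r
# Base-R sketch of a weighted correlation network (not the WGCNA API).
set.seed(1)
nSamples <- 20

# Simulated expression data: rows are samples, columns are genes.
# Genes 1-5 follow one hidden signal, genes 6-10 another.
signalA <- rnorm(nSamples)
signalB <- rnorm(nSamples)
noise <- function() rnorm(nSamples, sd = 0.3)
expr <- cbind(replicate(5, signalA + noise()),
              replicate(5, signalB + noise()))
colnames(expr) <- paste0("gene", 1:10)

# Weighted adjacency: absolute correlation raised to a soft-thresholding power.
beta <- 6  # assumed value, chosen for illustration only
adjacency <- abs(cor(expr))^beta

# Module detection: average-linkage clustering on the dissimilarity 1 - adjacency.
tree <- hclust(as.dist(1 - adjacency), method = "average")
modules <- cutree(tree, k = 2)

# Module eigengene: first principal component of the module's scaled expression.
eigengene <- function(m) {
  prcomp(expr[, modules == m], scale. = TRUE)$x[, 1]
}
eigengenes <- sapply(sort(unique(modules)), eigengene)

# Module membership: correlation of every gene with every module eigengene.
membership <- cor(expr, eigengenes)
```

Here `modules` holds one module label per gene, `eigengenes` has one column per module, and `membership` has one row per gene and one column per module.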

R Tutorials

A comprehensive set of tutorials that illustrate various aspects of WGCNA is available. We offer not only introductory tutorials that cover the basic functionality of the package, but also more advanced analyses in which we used the WGCNA package in our own research.

Click here to access the tutorial page.

Automatic installation from CRAN

The WGCNA package is now available from the Comprehensive R Archive Network (CRAN), the standard repository for R add-on packages. Currently, one of the required packages is only available from Bioconductor and needs to be installed separately. To install the required packages and WGCNA, simply type


source("http://bioconductor.org/biocLite.R")
biocLite("impute")
install.packages("WGCNA")


This will install the WGCNA package and all necessary dependencies. The catch is that this only installs the newest version of WGCNA if your R version is also the newest (minor) version. Users of older versions of R will need to follow the manual download and installation instructions below.
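On recent R installations the biocLite script has been superseded by the BiocManager package. The following is a sketch of an equivalent installation under that assumption; BiocManager is not part of the original instructions, and the wrapper function name installWGCNA is hypothetical.

```r
# Sketch only: BiocManager is an assumption, not part of the original instructions.
# installWGCNA is a hypothetical helper name used for illustration.
installWGCNA <- function() {
  # Install BiocManager from CRAN if it is not yet available.
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # BiocManager::install() resolves both CRAN and Bioconductor packages.
  BiocManager::install(c("impute", "WGCNA"))
}
```

Calling `installWGCNA()` at the R prompt then performs the installation; defining the function alone does not contact any repository.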

Note for Mac users: CRAN occasionally fails to compile the WGCNA package for Mac OS X. This leads to the error message "Package WGCNA is not available..." when calling install.packages(). If this occurs, please download the binary version from here and follow the installation instructions (or, if you are able to compile packages locally, download the source and install that).

Note of caution: The newest version of WGCNA is available from CRAN only for the current R version. For example, if your R version is 3.0.1 and the current R version on CRAN is 3.1.0, the automatic installation and update will not use the newest version of WGCNA. Please update your R to the newest version or use the manual download below.

Problems installing or using the package? Please see our list of frequently asked questions. Your problem and the solution may already be posted there.

Manual download and installation

Please follow these steps only if the automatic package installation above does not work.

Prerequisites:

The current version of the WGCNA package will only work with R version 2.14.0 and higher. If you have an older version of R, please upgrade your R.
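A quick way to check this prerequisite from within R is sketched below. The variable names are chosen for illustration; getRversion() is part of base R.

```r
# Minimum R version stated in the prerequisites above.
minimumR <- "2.14.0"

# getRversion() returns the running R version as a comparable version object.
rIsRecentEnough <- getRversion() >= minimumR

if (!rIsRecentEnough) {
  stop("R ", getRversion(), " is older than ", minimumR, "; please upgrade R.")
}
```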

The WGCNA package requires the following packages to be installed: stats, grDevices, dynamicTreeCut (1.20 or higher), cluster, utils, flashClust, reshape, and impute (from Bioconductor); the commands below additionally install Hmisc, foreach, and doParallel. If your system does not have them installed, the easiest way to install them is to issue the following commands at the R prompt:


install.packages(c("dynamicTreeCut", "cluster", "flashClust", "Hmisc", "reshape", "foreach", "doParallel") )
source("http://bioconductor.org/biocLite.R")
biocLite("impute")
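To see which of these packages are still missing before or after running the commands above, a small check along the following lines may help. It is a sketch that only inspects the local library and installs nothing; the variable names are illustrative.

```r
# Packages named in the install commands above.
required <- c("dynamicTreeCut", "cluster", "flashClust", "Hmisc",
              "reshape", "foreach", "doParallel", "impute")

# Names of all packages found in the local R libraries.
installed <- rownames(installed.packages())

# Packages that still need to be installed (empty if everything is present).
missingPackages <- setdiff(required, installed)
print(missingPackages)
```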

If you plan on using annotation capabilities (such as GOenrichmentAnalysis), we also recommend installing Bioconductor annotation packages. This can be accomplished by the following R commands:

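# Organism codes and matching package suffixes (".sgd" for Sc, ".eg" for all others)
orgCodes = c("Hs", "Mm", "Rn", "Pf", "Sc", "Dm", "Bt", "Ce", "Cf", "Dr", "Gg");
orgExtensions = c(rep(".eg", 4), ".sgd", rep(".eg", 6));
# Assemble package names of the form org.<code><suffix>.db, e.g. org.Hs.eg.db
packageNames = paste("org.", orgCodes, orgExtensions, ".db", sep="");

# Install the organism packages together with the other annotation packages
biocLite(c("GO.db", "KEGG.db", "topGO", packageNames, "hgu133a.db", "hgu95av2.db", "annotate", "hgu133plus2.db", "SNPlocs.Hsapiens.dbSNP.20100427", "minet", "OrderedList"))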

If you run an older version of R, the above may not install the flashClust package and the newest version of the dynamicTreeCut package. Should you encounter this problem, please manually download and install flashClust from this web page, and dynamicTreeCut from this web page.

R package download and installation: Package WGCNA_1.41 (last updated 2014/06/13) is available here as source code and several pre-compiled versions for various platforms. In general it is preferable to download the source and compile the package locally; however, if this is not practical, please select an appropriate compiled version.

If you require a compiled version, please make sure you select the correct version. We are unable to provide compiled binaries for other versions of R; please upgrade your R if you are running an old version not listed here.

The package version numbers follow the format packageName_major.minor-revision. Minor versions typically add or change some functionality; revisions typically contain bugfixes or minor enhancements.
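As an illustration of this naming format, the sketch below splits a name such as WGCNA_1.41 into its parts. The function parseVersion is hypothetical and not part of WGCNA; it treats the revision as optional, since the name quoted above carries none.

```r
# Hypothetical helper: split "packageName_major.minor-revision" into components.
parseVersion <- function(fileStem) {
  pattern <- "^([A-Za-z][A-Za-z0-9.]*)_([0-9]+)\\.([0-9]+)(-([0-9]+))?$"
  m <- regmatches(fileStem, regexec(pattern, fileStem))[[1]]
  if (length(m) == 0) {
    stop("not in packageName_major.minor-revision format: ", fileStem)
  }
  # The revision group is empty when the name has no "-revision" suffix.
  hasRevision <- !is.na(m[6]) && nzchar(m[6])
  list(package  = m[2],
       major    = as.integer(m[3]),
       minor    = as.integer(m[4]),
       revision = if (hasRevision) as.integer(m[6]) else NA_integer_)
}

v <- parseVersion("WGCNA_1.41")
```

For an installed package, base R's `packageVersion()` returns the version as a comparable object, which is usually more convenient than parsing file names.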

Installation instructions: Short installation instructions, including other required and recommended packages, are available here. Should you discover bugs (of which there are most likely plenty), please report them to Peter Langfelder.

Problems installing or using the package

Please see our list of Frequently Asked Questions (and frequently given answers); the solution to your problem may lie there. In particular, you can find answers about spurious Mac errors, compatibility problems when upgrading WGCNA, and others. If you still cannot solve the problem, email Peter Langfelder.

In-depth information about certain WGCNA features

Curious about some of the deep secrets WGCNA hides? Interested in learning what some of the calculation options mean? We have put together several technical reports that discuss deeply technical aspects of the WGCNA methodology. Only recommended for die-hard geeks!

Getting started with R and Weighted Gene Co-expression Network Analysis

The package described here is an add-on for the statistical language and environment R (free software). Our tutorials, described above, contain step-by-step instructions so that even complete novice users should be able to get started in R immediately.

Lastly, readers wishing to learn about the theory and published applications of WGCNA are invited to visit the WGCNA main page.

Old versions of R package WGCNA

Older versions of the package presented on this page are available here.

Citing the WGCNA package

If you use WGCNA in published work, please cite it to properly credit people who have created it.

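WGCNA is in general described in the article Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics, 2008 9:559.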

If you use any q-value (FDR) calculations, please also cite at least one of the following articles:

If you use the collapseRows function to summarize/convert probe-level data to gene-level data, please cite

If you use module preservation calculations, please cite

If you use functions rgcolors.func, plotCor, plotMat, stat.bwss, or stat.diag.da, please also cite the article

Acknowledgments

The core of the functions and other code was written by Peter Langfelder and Steve Horvath, partly based on older code written by Steve Horvath and Bin Zhang. Multiple people contributed additional code, most prominently Jeremy Miller, Chaochao (Ricky) Cai, Lin Song, Jun Dong, and Andy Yip. The package also contains code adapted from external packages that were either orphaned (such as package sma) or whose development has made the code difficult to use in WGCNA (such as package qvalue). A big thanks goes out to people who continue to report the many bugs in the package.

The package is currently maintained by Peter Langfelder.



