Is human blood a good surrogate for brain tissue in transcriptional studies?

R Software Tutorial & Data used

Chaochao Cai, Peter Langfelder, Tova F Fuller, Michael C Oldham, Rui Luo, Leonard H van den Berg, Roel A Ophoff, Steve Horvath

Human Genetics and Department of Biostatistics

University of California, Los Angeles


BACKGROUND: Since human brain tissue is often unavailable for transcriptional profiling studies, blood expression data are frequently used as a substitute. The underlying hypothesis in such studies is that genes expressed in brain tissue leave a transcriptional footprint in blood. We tested this hypothesis by relating three human brain expression data sets (from cortex, cerebellum and caudate nucleus) to two large human blood expression data sets (comprising 1463 individuals).


RESULTS: We found mean expression levels were weakly correlated between the brain and blood data (r range: [0.24, 0.32]). Further, we tested whether co-expression relationships were preserved between the three brain regions and blood. Only a handful of brain co-expression modules showed strong evidence of preservation and these modules could be combined into a single large blood module. We also identified highly connected intramodular "hub" genes inside preserved modules. These preserved intramodular hub genes had the following properties: first, their expression profiles tended to be significantly more heritable than those from non-preserved intramodular hub genes (p < 10^-90); second, they had highly significant positive correlations with the following cluster of differentiation genes: CD58, CD47, CD48, CD53 and CD164; third, a significant number of them were known to be involved in infection mechanisms, post-transcriptional and post-translational modification and other basic processes.


CONCLUSIONS: Overall, we find transcriptome organization is poorly preserved between brain and blood. However, the subset of preserved co-expression relationships characterized here may aid future efforts to identify blood biomarkers for neurological and neuropsychiatric diseases when brain tissue samples are unavailable.


1. R tutorials (Word)

2. Data used

    Blood data

           SAFHS data

           Dutch Data

    Brain data

           CTX data

           CN data

           CB data

Other material regarding weighted gene co-expression network analysis

             Weighted Gene Co-Expression Network Page



Please send your suggestions and comments to: