Gene Network Interconnectedness and the Generalized Topological Overlap Measure

Andy M. Yip (1,2) and Steve Horvath (3,4)
1 Dept. of Mathematics, UCLA
2 Dept. of Mathematics, National University of Singapore
3 Dept. of Human Genetics, David Geffen School of Medicine, UCLA
4 Dept. of Biostatistics, School of Public Health, UCLA

Department of Human Genetics and Department of Biostatistics
University of California, Los Angeles, CA 90095



Network methods are increasingly used to represent the interactions of genes and/or proteins. Genes or proteins that are directly linked may have a similar biological function or may be part of the same biological pathway. Since the information on the connection (adjacency) between two nodes may be noisy or incomplete, it can be desirable to consider alternative measures of pairwise interconnectedness. Here we study a class of measures that are proportional to the number of neighbors that a pair of nodes share. For example, the topological overlap measure by Ravasz et al. [1] can be interpreted as a measure of agreement between the m=1 step neighborhoods of two nodes. Several studies have shown that two proteins with a higher topological overlap are more likely to belong to the same functional class than proteins with a lower topological overlap. Here we ask whether a measure of topological overlap based on higher-order neighborhoods could give rise to a more robust and sensitive measure of interconnectedness.


We generalize the topological overlap measure from m=1 step neighborhoods to m ≥ 2 step neighborhoods. This allows us to define the m-th order generalized topological overlap measure (GTOM) by (i) counting the number of m-step neighbors that a pair of nodes share and (ii) normalizing this count to take a value between 0 and 1. Using theoretical arguments, a yeast co-expression network application, and a fly protein network application, we illustrate the usefulness of the proposed measure for module detection and gene neighborhood analysis.
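The two steps above (count the shared m-step neighbors, then normalize) can be illustrated with a minimal Python sketch for an unweighted, undirected network. The normalization used here is an assumption modeled on the Ravasz-style m=1 overlap: (shared + a_ij) / (min(|N_m(i)|, |N_m(j)|) + 1 - a_ij), where N_m(i) is the set of nodes other than i reachable within m steps and a_ij is the direct adjacency. The abstract itself only states that the count is normalized to lie between 0 and 1. The function names `m_step_neighbors` and `gtom` and the toy network `path` are hypothetical and are not taken from the tutorials linked below.

```python
def m_step_neighbors(adj, m):
    """For each node, return the set of OTHER nodes reachable in at most m steps."""
    n = len(adj)
    direct = [{j for j in range(n) if j != i and adj[i][j]} for i in range(n)]
    reach = [set(s) for s in direct]
    for _ in range(m - 1):
        # Extend every neighborhood by one step, never counting the node itself.
        reach = [
            (reach[i] | {k for j in reach[i] for k in direct[j]}) - {i}
            for i in range(n)
        ]
    return reach


def gtom(adj, m=1):
    """m-th order topological overlap matrix (assumed Ravasz-style normalization)."""
    n = len(adj)
    nbrs = m_step_neighbors(adj, m)
    t = [[1.0] * n for _ in range(n)]  # a node overlaps fully with itself
    for i in range(n):
        for j in range(i + 1, n):
            a = 1 if adj[i][j] else 0
            shared = len(nbrs[i] & nbrs[j])  # step (i): shared m-step neighbors
            denom = min(len(nbrs[i]), len(nbrs[j])) + 1 - a  # step (ii): normalize
            t[i][j] = t[j][i] = (shared + a) / denom
    return t


# Toy network (hypothetical): a path 0 - 1 - 2 - 3.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
tom1 = gtom(path, m=1)
tom2 = gtom(path, m=2)
```

On this toy path, nodes 0 and 3 share no direct neighbors, so `tom1[0][3]` is 0, while their 2-step neighborhoods are both {1, 2}, so `tom2[0][3]` is 2/3. This is the kind of sensitivity gain a higher-order neighborhood is meant to provide, at the cost of linking more distant pairs.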


Topological overlap can serve as an important filter to counter the effects of spurious or missing connections between network nodes. The m-th order topological overlap measure allows one to trade off sensitivity against specificity when defining pairwise interconnectedness and network modules.




Yip A, Horvath S (2007) Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinformatics 8:22

    Link to journal article

Yeast Gene Co-Expression Network Tutorial and Data Set

         Tutorial in Microsoft Word Format

Drosophila (Fly) Protein-Protein Interaction Network Tutorial and Data Set

         Tutorial in Microsoft Word Format
         Data Annotation File

Presentation Slides

         PowerPoint version   PDF version

Other material regarding weighted gene co-expression network analysis

             Weighted Gene Co-Expression Network Page


The old webpage has been moved here.


Please send your suggestions and comments to: