TY - GEN

T1 - On triangular versus edge representations - Towards scalable modeling of networks

AU - Ho, Qirong

AU - Yin, Junming

AU - Xing, Eric P.

PY - 2012/12/1

Y1 - 2012/12/1

N2 - In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N2) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(Σ i Di2) (where Di is the degree of vertex i), which is much smaller than N2 for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280, 000-node network, which is infeasible for network models with (N2) inference cost.

AB - In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N2) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(Σ i Di2) (where Di is the degree of vertex i), which is much smaller than N2 for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280, 000-node network, which is infeasible for network models with (N2) inference cost.

UR - http://www.scopus.com/inward/record.url?scp=84877735494&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84877735494&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84877735494

SN - 9781627480031

T3 - Advances in Neural Information Processing Systems

SP - 2132

EP - 2140

BT - Advances in Neural Information Processing Systems 25

T2 - 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012

Y2 - 3 December 2012 through 6 December 2012

ER -