An effective and efficient subpopulation extraction method in large social networks

Bin Zhang, David Krackhardt, Ramayya Krishnan, Patrick Doreian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

With the help of information technologies, we have access to very large networks, some with billions of nodes. Their sheer size limits our ability to analyze the whole network and to provide theoretically compelling explanations of it. One solution is to extract connected subgraphs and analyze them as subpopulations. We propose a method for extracting such subpopulations that achieves two desirable properties: 1) it is effective, yielding subpopulations with more ties within them than to the external network; and 2) it is fast, so that it scales well to large networks. We develop a method called the "Transitive Clustering and Pruning" (T-CLAP) algorithm. We compare the speed and effectiveness of this algorithm with those of two other popular community detection algorithms, Newman's and Clauset's. We find that T-CLAP is orders of magnitude faster than Newman's algorithm and is superior to Clauset's algorithm in returning effective, useful subpopulations.
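The effectiveness property stated in the abstract (more ties within a subpopulation than to the external network) can be illustrated with a minimal Python sketch. This is not the T-CLAP algorithm, whose steps the abstract does not describe; it only checks the stated criterion for a given candidate set. The function names `tie_counts` and `is_effective`, the undirected edge-list representation, and the toy network are all assumptions made for illustration.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def tie_counts(edges: Iterable[Edge], members: Set[Hashable]) -> Tuple[int, int]:
    """Return (internal, external) tie counts for a candidate subpopulation.

    internal: ties with both endpoints inside `members`
    external: ties with exactly one endpoint inside `members`
    Ties with no endpoint in `members` are ignored.
    """
    internal = 0
    external = 0
    for u, v in edges:
        endpoints_inside = (u in members) + (v in members)
        if endpoints_inside == 2:
            internal += 1
        elif endpoints_inside == 1:
            external += 1
    return internal, external


def is_effective(edges: Iterable[Edge], members: Set[Hashable]) -> bool:
    """Hypothetical check: more ties inside the group than leaving it."""
    internal, external = tie_counts(edges, members)
    return internal > external


# Toy undirected network (illustrative data, not from the paper):
# a triangle a-b-c with a tail c-d-e.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]

if __name__ == "__main__":
    print(tie_counts(EDGES, {"a", "b", "c"}), is_effective(EDGES, {"a", "b", "c"}))
    print(tie_counts(EDGES, {"c", "d"}), is_effective(EDGES, {"c", "d"}))
```

In this toy network the triangle {a, b, c} has three internal ties and one external tie, so it satisfies the criterion, whereas {c, d} has one internal tie and three external ties and does not.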

Original language: English (US)
Title of host publication: International Conference on Information Systems 2011, ICIS 2011
Pages: 477-493
Number of pages: 17
Volume: 1
State: Published - 2011
Externally published: Yes
Event: 32nd International Conference on Information System 2011, ICIS 2011 - Shanghai, China
Duration: Dec 4 2011 to Dec 7 2011

Keywords

  • Large scale data
  • Social network
  • Subpopulation extraction

ASJC Scopus subject areas

  • Information Systems

Cite this

Zhang, B., Krackhardt, D., Krishnan, R., & Doreian, P. (2011). An effective and efficient subpopulation extraction method in large social networks. In International Conference on Information Systems 2011, ICIS 2011 (Vol. 1, pp. 477-493)

@inproceedings{564d54b65f884b7cb95aa9f9ffce8ab5,
title = "An effective and efficient subpopulation extraction method in large social networks",
abstract = "With the help of information technologies, we have access to very large networks, some with billions of nodes. Their sheer size limits our ability to analyze the whole network and to provide theoretically compelling explanations of it. One solution is to extract connected subgraphs and analyze them as subpopulations. We propose a method for extracting such subpopulations that achieves two desirable properties: 1) it is effective, yielding subpopulations with more ties within them than to the external network; and 2) it is fast, so that it scales well to large networks. We develop a method called the {"}Transitive Clustering and Pruning{"} (T-CLAP) algorithm. We compare the speed and effectiveness of this algorithm with those of two other popular community detection algorithms, Newman's and Clauset's. We find that T-CLAP is orders of magnitude faster than Newman's algorithm and is superior to Clauset's algorithm in returning effective, useful subpopulations.",
keywords = "Large scale data, Social network, Subpopulation extraction",
author = "Bin Zhang and David Krackhardt and Ramayya Krishnan and Patrick Doreian",
year = "2011",
language = "English (US)",
isbn = "9781618394729",
volume = "1",
pages = "477--493",
booktitle = "International Conference on Information Systems 2011, ICIS 2011",

}

TY - GEN

T1 - An effective and efficient subpopulation extraction method in large social networks

AU - Zhang, Bin

AU - Krackhardt, David

AU - Krishnan, Ramayya

AU - Doreian, Patrick

PY - 2011

Y1 - 2011

AB - With the help of information technologies, we have access to very large networks, some with billions of nodes. Their sheer size limits our ability to analyze the whole network and to provide theoretically compelling explanations of it. One solution is to extract connected subgraphs and analyze them as subpopulations. We propose a method for extracting such subpopulations that achieves two desirable properties: 1) it is effective, yielding subpopulations with more ties within them than to the external network; and 2) it is fast, so that it scales well to large networks. We develop a method called the "Transitive Clustering and Pruning" (T-CLAP) algorithm. We compare the speed and effectiveness of this algorithm with those of two other popular community detection algorithms, Newman's and Clauset's. We find that T-CLAP is orders of magnitude faster than Newman's algorithm and is superior to Clauset's algorithm in returning effective, useful subpopulations.

KW - Large scale data

KW - Social network

KW - Subpopulation extraction

UR - http://www.scopus.com/inward/record.url?scp=84884631542&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84884631542&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781618394729

VL - 1

SP - 477

EP - 493

BT - International Conference on Information Systems 2011, ICIS 2011

ER -