HiWalk: Learning node embeddings from heterogeneous networks

Jie Bai, Linjing Li, Daniel Zeng

Research output: Contribution to journal › Article

Abstract

Heterogeneous networks, such as bibliographical networks and online business networks, are ubiquitous in everyday life. Nevertheless, analyzing them for high-level semantic understanding still poses a great challenge for modern information systems. In this paper, we propose HiWalk to learn distributed vector representations of the nodes in heterogeneous networks. HiWalk is inspired by state-of-the-art representation learning algorithms for both homogeneous and heterogeneous networks that are based on word embedding learning models. Unlike existing methods in the literature, HiWalk aims to learn vector representations of a targeted set of nodes by leveraging the other nodes as “background knowledge” which maximizes the structural correlations of contiguous nodes. HiWalk decomposes the adjacent probabilities of the nodes and adopts a hierarchical random walk strategy, which makes it more effective, efficient, and concentrated when applied to practical large-scale heterogeneous networks. HiWalk can be widely applied in heterogeneous network environments to analyze targeted types of nodes. We further validate the effectiveness of HiWalk through multiple tasks conducted on two real-world datasets.
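
The abstract does not spell out the walk procedure, so the following is only a generic illustration of the two ideas it names, not HiWalk itself: a transition probability decomposed into "pick a neighbor type, then pick a neighbor of that type", and walks whose non-target nodes serve only as context. The toy graph, the uniform choices at both levels, and the names `two_level_walk` and `keep_target_nodes` are all assumptions made for this sketch.

```python
import random
from collections import defaultdict


def two_level_walk(adj, node_type, start, length, rng):
    """Walk up to `length` steps; each step picks a neighbor type, then a neighbor.

    Assumed decomposition (not taken from the paper):
    P(next | current) = P(type | current) * P(next | type, current),
    with both factors uniform.
    """
    walk = [start]
    current = start
    for _ in range(length):
        # Group the current node's neighbors by their node type.
        by_type = defaultdict(list)
        for nbr in adj[current]:
            by_type[node_type[nbr]].append(nbr)
        if not by_type:  # dead end: stop the walk early
            break
        chosen_type = rng.choice(sorted(by_type))    # level 1: choose a type
        current = rng.choice(by_type[chosen_type])   # level 2: choose a node
        walk.append(current)
    return walk


def keep_target_nodes(walk, node_type, target_type):
    """Keep only nodes of the targeted type, preserving walk order."""
    return [n for n in walk if node_type[n] == target_type]


# Hypothetical bibliographic toy network: authors, papers, one venue.
adj = {
    "a1": ["p1", "p2"],
    "a2": ["p2"],
    "p1": ["a1", "v1"],
    "p2": ["a1", "a2", "v1"],
    "v1": ["p1", "p2"],
}
node_type = {
    "a1": "author", "a2": "author",
    "p1": "paper", "p2": "paper",
    "v1": "venue",
}

rng = random.Random(0)
walk = two_level_walk(adj, node_type, "a1", 10, rng)
target_sequence = keep_target_nodes(walk, node_type, "author")
```

Under these assumptions, sequences such as `target_sequence` could be passed to a skip-gram style word embedding model to produce vectors for the targeted node type; how HiWalk actually builds and consumes its walks is described in the full paper, not here.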

Original language: English (US)
Pages (from-to): 82-91
Number of pages: 10
Journal: Information Systems
Volume: 81
State: Published - Mar 2019

Keywords

  • Behavioral analysis
  • Heterogeneous network
  • Network analysis
  • Random walk
  • Representation learning

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
