A gossip-based system for fast approximate score computation in multinomial Bayesian networks

Arun Zachariah, Praveen Rao, Anas Katib, Monica Senapati, Jacobus J Barnard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present a system for fast approximate score computation, a fundamental task for score-based structure learning of multinomial Bayesian networks. Our work is motivated by the fact that exact score computation on large datasets is very time-consuming. Our system enables approximate score computation on large datasets in an efficient and scalable manner, with probabilistic error bounds on the statistics required for score computation. Our system has several novel features, including gossip-based decentralized computation of statistics, lower resource consumption via a probabilistic approach to maintaining statistics, and effective distribution of tasks for score computation using hashing techniques. The demo will provide a real-time and interactive experience to a user on how our system employs the principle of gossiping and hashing techniques in a novel way for fast approximate score computation. The user will be able to control different aspects of our system's execution on a cluster with up to 32 nodes. The approximate scores output by our system can then be used by existing score-based structure learning algorithms.
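The abstract does not say which gossip protocol the system uses. As a generic illustration of gossip-based decentralized computation of a statistic, the sketch below uses the well-known push-sum protocol to let every node estimate the average of per-node counts without a coordinator. The function name `push_sum`, the round count, and the sample counts are assumptions made for this example, not details taken from the paper.

```python
import random


def push_sum(values, rounds=60, seed=0):
    """Estimate the mean of `values` at every node via push-sum gossip.

    Illustrative only: a generic push-sum sketch, not the paper's protocol.
    Each node holds a (sum, weight) pair. Every round it keeps half of the
    pair and pushes the other half to a uniformly random peer. The ratio
    sum/weight at each node converges to the global mean.
    """
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    n = len(values)
    sums = [float(v) for v in values]
    weights = [1.0] * n

    for _ in range(rounds):
        new_sums = [0.0] * n
        new_weights = [0.0] * n
        for i in range(n):
            peer = rng.randrange(n)  # the peer may be the node itself
            half_s, half_w = sums[i] / 2.0, weights[i] / 2.0
            # Keep one half locally ...
            new_sums[i] += half_s
            new_weights[i] += half_w
            # ... and push the other half to the chosen peer.
            new_sums[peer] += half_s
            new_weights[peer] += half_w
        sums, weights = new_sums, new_weights

    # Every node keeps half of its own weight each round, so weights stay > 0.
    return [s / w for s, w in zip(sums, weights)]


# Hypothetical local counts held by 32 nodes (e.g. occurrences of one
# variable configuration in each node's data partition).
local_counts = [(3 * i) % 17 for i in range(32)]
estimates = push_sum(local_counts)
true_mean = sum(local_counts) / len(local_counts)
```

Multiplying a node's estimate by the number of nodes gives an approximate global count, which is the kind of statistic a score computation would consume. The approximation error shrinks as the number of gossip rounds grows.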

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
Publisher: IEEE Computer Society
Pages: 1968-1971
Number of pages: 4
ISBN (Electronic): 9781538674741
DOI: https://doi.org/10.1109/ICDE.2019.00216
State: Published - Apr 1 2019
Event: 35th IEEE International Conference on Data Engineering, ICDE 2019 - Macau, China
Duration: Apr 8 2019 to Apr 11 2019

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2019-April
ISSN (Print): 1084-4627

Conference

Conference: 35th IEEE International Conference on Data Engineering, ICDE 2019
Country: China
City: Macau
Period: 4/8/19 to 4/11/19

Fingerprint

Bayesian networks
Statistics
Learning algorithms

Keywords

  • Approximate score computation
  • Bayesian networks
  • Gossip algorithms
  • Large scale data

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Zachariah, A., Rao, P., Katib, A., Senapati, M., & Barnard, J. J. (2019). A gossip-based system for fast approximate score computation in multinomial Bayesian networks. In Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019 (pp. 1968-1971). [8731481] (Proceedings - International Conference on Data Engineering; Vol. 2019-April). IEEE Computer Society. https://doi.org/10.1109/ICDE.2019.00216

@inproceedings{1109b583d3354c58b56ead52d534133a,
title = "A gossip-based system for fast approximate score computation in multinomial Bayesian networks",
abstract = "In this paper, we present a system for fast approximate score computation, a fundamental task for score-based structure learning of multinomial Bayesian networks. Our work is motivated by the fact that exact score computation on large datasets is very time consuming. Our system enables approximate score computation on large datasets in an efficient and scalable manner with probabilistic error bounds on the statistics required for score computation. Our system has several novel features including gossip-based decentralized computation of statistics, lower resource consumption via a probabilistic approach of maintaining statistics, and effective distribution of tasks for score computation using hashing techniques. The demo will provide a real-time and interactive experience to a user on how our system employs the principle of gossiping and hashing techniques in a novel way for fast approximate score computation. The user will be able to control different aspects of our system's execution on a cluster with up to 32 nodes. The approximate scores output by our system can be then used by existing score-based structure learning algorithms.",
keywords = "Approximate score computation, Bayesian networks, Gossip algorithms, Large scale data",
author = "Arun Zachariah and Praveen Rao and Anas Katib and Monica Senapati and Barnard, {Jacobus J}",
year = "2019",
month = "4",
day = "1",
doi = "10.1109/ICDE.2019.00216",
language = "English (US)",
series = "Proceedings - International Conference on Data Engineering",
publisher = "IEEE Computer Society",
pages = "1968--1971",
booktitle = "Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019",

}

TY - GEN

T1 - A gossip-based system for fast approximate score computation in multinomial Bayesian networks

AU - Zachariah, Arun

AU - Rao, Praveen

AU - Katib, Anas

AU - Senapati, Monica

AU - Barnard, Jacobus J

PY - 2019/4/1

Y1 - 2019/4/1

AB - In this paper, we present a system for fast approximate score computation, a fundamental task for score-based structure learning of multinomial Bayesian networks. Our work is motivated by the fact that exact score computation on large datasets is very time consuming. Our system enables approximate score computation on large datasets in an efficient and scalable manner with probabilistic error bounds on the statistics required for score computation. Our system has several novel features including gossip-based decentralized computation of statistics, lower resource consumption via a probabilistic approach of maintaining statistics, and effective distribution of tasks for score computation using hashing techniques. The demo will provide a real-time and interactive experience to a user on how our system employs the principle of gossiping and hashing techniques in a novel way for fast approximate score computation. The user will be able to control different aspects of our system's execution on a cluster with up to 32 nodes. The approximate scores output by our system can be then used by existing score-based structure learning algorithms.

KW - Approximate score computation

KW - Bayesian networks

KW - Gossip algorithms

KW - Large scale data

UR - http://www.scopus.com/inward/record.url?scp=85067932574&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067932574&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2019.00216

DO - 10.1109/ICDE.2019.00216

M3 - Conference contribution

AN - SCOPUS:85067932574

T3 - Proceedings - International Conference on Data Engineering

SP - 1968

EP - 1971

BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019

PB - IEEE Computer Society

ER -