Generalized estimating equations in cluster randomized trials with a small number of clusters: Review of practice and simulation study

Shuang Huang, Mallorie H. Fiero, Melanie L. Bell

Research output: Contribution to journal › Review article

18 Citations (Scopus)

Abstract

Background/aims: Generalized estimating equations are a common modeling approach used in cluster randomized trials to account for within-cluster correlation. It is well known that the sandwich variance estimator is biased when the number of clusters is small (≤40), resulting in an inflated type I error rate. Various bias correction methods have been proposed in the statistical literature, but how adequately they are utilized in current practice for cluster randomized trials is not clear. The aim of this study is to evaluate the use of generalized estimating equation bias correction methods in recently published cluster randomized trials and demonstrate the necessity of such methods when the number of clusters is small. Methods: Review of cluster randomized trials published between August 2013 and July 2014 and using generalized estimating equations for their primary analyses. Two independent reviewers collected data from each study using a standardized, pre-piloted data extraction template. A two-arm cluster randomized trial was simulated under various scenarios to show the potential effect of a small number of clusters on type I error rate when estimating the treatment effect. The nominal level was set at 0.05 for the simulation study. Results: Of the 51 included trials, 28 (54.9%) analyzed 40 or fewer clusters with a minimum of four total clusters. Of these 28 trials, only one trial used a bias correction method for generalized estimating equations. The simulation study showed that with four clusters, the type I error rate ranged between 0.43 and 0.47. Even though the type I error rate moved closer to the nominal level as the number of clusters increased, it still ranged between 0.06 and 0.07 with 40 clusters. Conclusions: Our results showed that statistical issues arising from a small number of clusters in generalized estimating equations are currently inadequately handled in cluster randomized trials. The potential for type I error inflation could be very high when the sandwich estimator is used without bias correction.
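
The simulation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design: it assumes a continuous outcome with an identity link, an independence working correlation, equal cluster sizes, balanced arms, and a normal critical value of 1.96. Under those assumptions the generalized estimating equation estimate of the treatment effect is the difference in arm means, and the uncorrected sandwich variance has a closed form. The function names, the cluster size of 20, and the intracluster correlation of 0.05 are hypothetical choices made for illustration.

```python
import math
import random


def simulate_trial(n_clusters, cluster_size, icc, rng):
    """Simulate one two-arm trial under the null; return the Wald z statistic."""
    sd_between = math.sqrt(icc)        # cluster-level random effect (total variance = 1)
    sd_within = math.sqrt(1.0 - icc)   # individual-level noise
    half = n_clusters // 2             # balanced allocation (assumption)
    arm_means, arm_vars = [], []
    for _arm in (0, 1):
        clusters = []
        for _ in range(half):
            u = rng.gauss(0.0, sd_between)
            clusters.append([u + rng.gauss(0.0, sd_within)
                             for _ in range(cluster_size)])
        n_obs = half * cluster_size
        mean = sum(sum(c) for c in clusters) / n_obs
        # Uncorrected sandwich variance of the arm mean:
        # sum over clusters of (cluster residual total)^2, divided by N^2.
        meat = sum(sum(y - mean for y in c) ** 2 for c in clusters)
        arm_means.append(mean)
        arm_vars.append(meat / n_obs ** 2)
    effect = arm_means[1] - arm_means[0]          # true effect is zero
    se = math.sqrt(arm_vars[0] + arm_vars[1])     # robust standard error
    return effect / se


def type_one_error(n_clusters, cluster_size=20, icc=0.05, n_sims=2000, seed=1):
    """Share of null trials rejected at the nominal 0.05 level."""
    rng = random.Random(seed)
    rejections = sum(
        abs(simulate_trial(n_clusters, cluster_size, icc, rng)) > 1.96
        for _ in range(n_sims)
    )
    return rejections / n_sims


if __name__ == "__main__":
    for k in (4, 10, 40):
        print(k, "clusters:", type_one_error(k))
```

Running the script prints an empirical rejection rate for 4, 10, and 40 total clusters. Because no bias correction is applied, the rate should sit well above 0.05 for very few clusters and approach the nominal level as clusters are added; the exact values depend on the assumed outcome model and will not match the figures reported in the abstract.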

Original language: English (US)
Pages (from-to): 445-449
Number of pages: 5
Journal: Clinical Trials
Volume: 13
Issue number: 4
DOI: 10.1177/1740774516643498
State: Published - Aug 1 2016

Keywords

  • bias correction
  • Cluster randomized trials
  • generalized estimating equations
  • sandwich estimator
  • small number of clusters

ASJC Scopus subject areas

  • Medicine(all)
  • Pharmacology

Cite this

Generalized estimating equations in cluster randomized trials with a small number of clusters: Review of practice and simulation study. / Huang, Shuang; Fiero, Mallorie H.; Bell, Melanie L.

In: Clinical Trials, Vol. 13, No. 4, 01.08.2016, p. 445-449.

@article{cce2cd1eb511460b9a45f185b081ba63,
title = "Generalized estimating equations in cluster randomized trials with a small number of clusters: Review of practice and simulation study",
abstract = "Background/aims: Generalized estimating equations are a common modeling approach used in cluster randomized trials to account for within-cluster correlation. It is well known that the sandwich variance estimator is biased when the number of clusters is small (≤40), resulting in an inflated type I error rate. Various bias correction methods have been proposed in the statistical literature, but how adequately they are utilized in current practice for cluster randomized trials is not clear. The aim of this study is to evaluate the use of generalized estimating equation bias correction methods in recently published cluster randomized trials and demonstrate the necessity of such methods when the number of clusters is small. Methods: Review of cluster randomized trials published between August 2013 and July 2014 and using generalized estimating equations for their primary analyses. Two independent reviewers collected data from each study using a standardized, pre-piloted data extraction template. A two-arm cluster randomized trial was simulated under various scenarios to show the potential effect of a small number of clusters on type I error rate when estimating the treatment effect. The nominal level was set at 0.05 for the simulation study. Results: Of the 51 included trials, 28 (54.9{\%}) analyzed 40 or fewer clusters with a minimum of four total clusters. Of these 28 trials, only one trial used a bias correction method for generalized estimating equations. The simulation study showed that with four clusters, the type I error rate ranged between 0.43 and 0.47. Even though the type I error rate moved closer to the nominal level as the number of clusters increased, it still ranged between 0.06 and 0.07 with 40 clusters. Conclusions: Our results showed that statistical issues arising from a small number of clusters in generalized estimating equations are currently inadequately handled in cluster randomized trials. The potential for type I error inflation could be very high when the sandwich estimator is used without bias correction.",
keywords = "bias correction, Cluster randomized trials, generalized estimating equations, sandwich estimator, small number of clusters",
author = "Huang, Shuang and Fiero, {Mallorie H.} and Bell, {Melanie L.}",
year = "2016",
month = "8",
day = "1",
doi = "10.1177/1740774516643498",
language = "English (US)",
volume = "13",
pages = "445--449",
journal = "Clinical Trials",
issn = "1740-7745",
publisher = "SAGE Publications Ltd",
number = "4"
}

TY - JOUR

T1 - Generalized estimating equations in cluster randomized trials with a small number of clusters

T2 - Review of practice and simulation study

AU - Huang, Shuang

AU - Fiero, Mallorie H.

AU - Bell, Melanie L

PY - 2016/8/1

Y1 - 2016/8/1

AB - Background/aims: Generalized estimating equations are a common modeling approach used in cluster randomized trials to account for within-cluster correlation. It is well known that the sandwich variance estimator is biased when the number of clusters is small (≤40), resulting in an inflated type I error rate. Various bias correction methods have been proposed in the statistical literature, but how adequately they are utilized in current practice for cluster randomized trials is not clear. The aim of this study is to evaluate the use of generalized estimating equation bias correction methods in recently published cluster randomized trials and demonstrate the necessity of such methods when the number of clusters is small. Methods: Review of cluster randomized trials published between August 2013 and July 2014 and using generalized estimating equations for their primary analyses. Two independent reviewers collected data from each study using a standardized, pre-piloted data extraction template. A two-arm cluster randomized trial was simulated under various scenarios to show the potential effect of a small number of clusters on type I error rate when estimating the treatment effect. The nominal level was set at 0.05 for the simulation study. Results: Of the 51 included trials, 28 (54.9%) analyzed 40 or fewer clusters with a minimum of four total clusters. Of these 28 trials, only one trial used a bias correction method for generalized estimating equations. The simulation study showed that with four clusters, the type I error rate ranged between 0.43 and 0.47. Even though the type I error rate moved closer to the nominal level as the number of clusters increased, it still ranged between 0.06 and 0.07 with 40 clusters. Conclusions: Our results showed that statistical issues arising from a small number of clusters in generalized estimating equations are currently inadequately handled in cluster randomized trials. The potential for type I error inflation could be very high when the sandwich estimator is used without bias correction.

KW - bias correction

KW - Cluster randomized trials

KW - generalized estimating equations

KW - sandwich estimator

KW - small number of clusters

UR - http://www.scopus.com/inward/record.url?scp=84979080307&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979080307&partnerID=8YFLogxK

U2 - 10.1177/1740774516643498

DO - 10.1177/1740774516643498

M3 - Review article

C2 - 27094487

AN - SCOPUS:84979080307

VL - 13

SP - 445

EP - 449

JO - Clinical Trials

JF - Clinical Trials

SN - 1740-7745

IS - 4

ER -