On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models

Feifei Zheng, Holger R. Maier, Wenyan Wu, Graeme C. Dandy, Hoshin Vijai Gupta, Tuqiao Zhang

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.
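The data-allocation effect the abstract describes can be illustrated with a minimal sketch. This is not the paper's actual methodology; the two splitting functions and the synthetic skewed "streamflow" record below are illustrative assumptions, contrasting an ad hoc sequential split with a simple systematic split that samples the full range of a skewed distribution:

```python
import random
import statistics

def sequential_split(values, cal_fraction=0.8):
    """Allocate the first cal_fraction of the record to calibration,
    the remainder to evaluation (a common ad hoc choice)."""
    n_cal = int(len(values) * cal_fraction)
    return values[:n_cal], values[n_cal:]

def systematic_split(values, cal_fraction=0.8):
    """Rank the data by magnitude and deal every k-th point to evaluation,
    so both subsets sample the full range of a skewed distribution."""
    ranked = sorted(values)
    k = round(1 / (1 - cal_fraction))  # every k-th ranked point -> evaluation
    evaluation = ranked[::k]
    calibration = [v for i, v in enumerate(ranked) if i % k != 0]
    return calibration, evaluation

# Synthetic, highly skewed "streamflow" record (lognormal-like).
random.seed(42)
flows = [random.lognormvariate(0, 1.5) for _ in range(1000)]

# Compare how representative the evaluation subset is under each method:
for name, splitter in [("sequential", sequential_split),
                       ("systematic", systematic_split)]:
    cal, ev = splitter(flows)
    print(f"{name:10s} cal mean={statistics.mean(cal):6.2f} "
          f"eval mean={statistics.mean(ev):6.2f}")
```

For strongly skewed records, the sequential evaluation subset's statistics can diverge noticeably from the calibration subset's, which is the kind of evaluation-performance bias the abstract refers to; the systematic split keeps the two subsets statistically closer by construction.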

Original language: English (US)
Pages (from-to): 1013-1030
Number of pages: 18
Journal: Water Resources Research
Volume: 54
Issue number: 2
DOI: 10.1002/2017WR021470
State: Published - Feb 1 2018

Keywords

  • artificial neural networks (ANNs)
  • calibration and evaluation
  • data allocation
  • data splitting
  • hydrological models
  • model evaluation bias

ASJC Scopus subject areas

  • Water Science and Technology

Cite this

On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models. / Zheng, Feifei; Maier, Holger R.; Wu, Wenyan; Dandy, Graeme C.; Gupta, Hoshin Vijai; Zhang, Tuqiao.

In: Water Resources Research, Vol. 54, No. 2, 01.02.2018, p. 1013-1030.

@article{25b59a7760e34d6f8600a06a5e8e620e,
title = "On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models",
abstract = "Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.",
keywords = "artificial neural networks (ANNs), calibration and evaluation, data allocation, data splitting, hydrological models, model evaluation bias",
author = "Feifei Zheng and Maier, {Holger R.} and Wenyan Wu and Dandy, {Graeme C.} and Gupta, {Hoshin Vijai} and Tuqiao Zhang",
year = "2018",
month = "2",
day = "1",
doi = "10.1002/2017WR021470",
language = "English (US)",
volume = "54",
pages = "1013--1030",
journal = "Water Resources Research",
issn = "0043-1397",
publisher = "American Geophysical Union",
number = "2",

}

TY - JOUR

T1 - On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data

T2 - Demonstration for Data-Driven Models

AU - Zheng, Feifei

AU - Maier, Holger R.

AU - Wu, Wenyan

AU - Dandy, Graeme C.

AU - Gupta, Hoshin Vijai

AU - Zhang, Tuqiao

PY - 2018/2/1

Y1 - 2018/2/1

N2 - Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.

AB - Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.

KW - artificial neural networks (ANNs)

KW - calibration and evaluation

KW - data allocation

KW - data splitting

KW - hydrological models

KW - model evaluation bias

UR - http://www.scopus.com/inward/record.url?scp=85044400485&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044400485&partnerID=8YFLogxK

U2 - 10.1002/2017WR021470

DO - 10.1002/2017WR021470

M3 - Article

AN - SCOPUS:85044400485

VL - 54

SP - 1013

EP - 1030

JO - Water Resources Research

JF - Water Resources Research

SN - 0043-1397

IS - 2

ER -