Role of model selection criteria in geostatistical inverse estimation of statistical data- and model-parameters

Monica Riva, Marco Panzeri, Alberto Guadagnini, Shlomo P Neuman

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

We analyze theoretically the ability of model quality (sometimes termed information or discrimination) criteria such as the negative log likelihood NLL, Bayesian criteria BIC and KIC, and information theoretic criteria AIC, AICc, and HIC to estimate (1) the parameter vector θ of the variogram of hydraulic log conductivity (Y = ln K), and (2) statistical parameters σ²_hE and σ²_YE proportional to head and log conductivity measurement error variances, respectively, in the context of geostatistical groundwater flow inversion. Our analysis extends the work of Hernandez et al. (2003, 2006) and Riva et al. (2009), who developed nonlinear stochastic inverse algorithms that allow conditioning estimates of steady state and transient hydraulic heads, fluxes and their associated uncertainty on information about conductivity and head data collected in a randomly heterogeneous confined aquifer. Their algorithms are based on recursive numerical approximations of exact nonlocal conditional equations describing the mean and (co)variance of groundwater flow. Log conductivity is parameterized geostatistically based on measured values at discrete locations and unknown values at discrete "pilot points." Optionally, the maximum likelihood function on which the inverse estimation of Y at pilot points is based may include a regularization term reflecting prior information about Y. The relative weight λ = σ²_hE/σ²_YE assigned to this term and its components σ²_hE and σ²_YE, as well as θ, are evaluated separately from other model parameters to avoid bias and instability. This evaluation is done on the basis of criteria such as NLL, KIC, BIC, HIC, AIC, and AICc. We demonstrate theoretically that, whereas all these six criteria make it possible to estimate σ²_hE, KIC alone allows one to estimate validly θ and σ²_YE (and thus λ). We illustrate this discriminatory power of KIC numerically by using a differential evolution genetic search algorithm to minimize it in the context of a two-dimensional steady state groundwater flow problem. We find that whereas σ²_hE, σ²_YE, and the integral scale of Y can be estimated on the basis of a zero-order mean flow equation, the sill of the Y-variogram is estimated more accurately by a second-order approximation of flow. This notwithstanding, KIC prefers the simpler zero-order moment over the more complex second-order version.
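
For orientation, the penalized-likelihood criteria named in the abstract can be sketched as follows. This is a minimal illustration using the commonly cited textbook forms of AIC, AICc, BIC, and HIC (Hannan-Quinn), not the authors' implementation. It assumes NLL is defined as -2 ln(maximized likelihood); the function name and arguments (information_criteria, nll, n_params, n_obs) are hypothetical. KIC is left out because it additionally needs a Fisher-information term whose exact form is not given in the abstract.

```python
import math


def information_criteria(nll, n_params, n_obs):
    """Compute AIC, AICc, BIC and HIC from a negative log likelihood.

    Assumption: nll = -2 * ln(maximized likelihood), n_params is the number
    of estimated parameters, n_obs is the number of observations.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    if n_obs <= 1:
        raise ValueError("HIC requires n_obs > 1 so that ln(ln(n_obs)) exists")

    # Akaike criterion: fixed penalty of 2 per parameter.
    aic = nll + 2 * n_params
    # Small-sample correction of AIC.
    aicc = aic + 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    # Bayesian (Schwarz) criterion: penalty grows with ln(n_obs).
    bic = nll + n_params * math.log(n_obs)
    # Hannan-Quinn criterion: penalty grows with ln(ln(n_obs)).
    hic = nll + 2 * n_params * math.log(math.log(n_obs))

    return {"NLL": nll, "AIC": aic, "AICc": aicc, "BIC": bic, "HIC": hic}


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the paper.
    for name, value in information_criteria(100.0, 3, 50).items():
        print(f"{name}: {value:.3f}")
```

Under these forms, a lower value indicates a preferred model, and the criteria differ only in how strongly they penalize the number of parameters.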

Original language: English (US)
Article number: W07502
Journal: Water Resources Research
Volume: 47
Issue number: 7
DOIs: https://doi.org/10.1029/2011WR010480
State: Published - 2011

Fingerprint

statistical data
groundwater flow
conductivity
variogram
confined aquifer
hydraulic head
sill
conditioning
hydraulic conductivity
parameter

ASJC Scopus subject areas

  • Water Science and Technology

Cite this

Role of model selection criteria in geostatistical inverse estimation of statistical data- and model-parameters. / Riva, Monica; Panzeri, Marco; Guadagnini, Alberto; Neuman, Shlomo P.

In: Water Resources Research, Vol. 47, No. 7, W07502, 2011.


@article{f36e3c03ef0544629762047b87461b25,
title = "Role of model selection criteria in geostatistical inverse estimation of statistical data- and model-parameters",
abstract = "We analyze theoretically the ability of model quality (sometimes termed information or discrimination) criteria such as the negative log likelihood NLL, Bayesian criteria BIC and KIC, and information theoretic criteria AIC, AICc, and HIC to estimate (1) the parameter vector θ of the variogram of hydraulic log conductivity (Y = ln K), and (2) statistical parameters σ²_hE and σ²_YE proportional to head and log conductivity measurement error variances, respectively, in the context of geostatistical groundwater flow inversion. Our analysis extends the work of Hernandez et al. (2003, 2006) and Riva et al. (2009), who developed nonlinear stochastic inverse algorithms that allow conditioning estimates of steady state and transient hydraulic heads, fluxes and their associated uncertainty on information about conductivity and head data collected in a randomly heterogeneous confined aquifer. Their algorithms are based on recursive numerical approximations of exact nonlocal conditional equations describing the mean and (co)variance of groundwater flow. Log conductivity is parameterized geostatistically based on measured values at discrete locations and unknown values at discrete {"}pilot points.{"} Optionally, the maximum likelihood function on which the inverse estimation of Y at pilot points is based may include a regularization term reflecting prior information about Y. The relative weight λ = σ²_hE/σ²_YE assigned to this term and its components σ²_hE and σ²_YE, as well as θ, are evaluated separately from other model parameters to avoid bias and instability. This evaluation is done on the basis of criteria such as NLL, KIC, BIC, HIC, AIC, and AICc. We demonstrate theoretically that, whereas all these six criteria make it possible to estimate σ²_hE, KIC alone allows one to estimate validly θ and σ²_YE (and thus λ). We illustrate this discriminatory power of KIC numerically by using a differential evolution genetic search algorithm to minimize it in the context of a two-dimensional steady state groundwater flow problem. We find that whereas σ²_hE, σ²_YE, and the integral scale of Y can be estimated on the basis of a zero-order mean flow equation, the sill of the Y-variogram is estimated more accurately by a second-order approximation of flow. This notwithstanding, KIC prefers the simpler zero-order moment over the more complex second-order version.",
author = "Riva, Monica and Panzeri, Marco and Guadagnini, Alberto and Neuman, {Shlomo P}",
year = "2011",
doi = "10.1029/2011WR010480",
language = "English (US)",
volume = "47",
journal = "Water Resources Research",
issn = "0043-1397",
publisher = "American Geophysical Union",
number = "7",

}

TY - JOUR

T1 - Role of model selection criteria in geostatistical inverse estimation of statistical data- and model-parameters

AU - Riva, Monica

AU - Panzeri, Marco

AU - Guadagnini, Alberto

AU - Neuman, Shlomo P

PY - 2011

Y1 - 2011

AB - We analyze theoretically the ability of model quality (sometimes termed information or discrimination) criteria such as the negative log likelihood NLL, Bayesian criteria BIC and KIC, and information theoretic criteria AIC, AICc, and HIC to estimate (1) the parameter vector θ of the variogram of hydraulic log conductivity (Y = ln K), and (2) statistical parameters σ²_hE and σ²_YE proportional to head and log conductivity measurement error variances, respectively, in the context of geostatistical groundwater flow inversion. Our analysis extends the work of Hernandez et al. (2003, 2006) and Riva et al. (2009), who developed nonlinear stochastic inverse algorithms that allow conditioning estimates of steady state and transient hydraulic heads, fluxes and their associated uncertainty on information about conductivity and head data collected in a randomly heterogeneous confined aquifer. Their algorithms are based on recursive numerical approximations of exact nonlocal conditional equations describing the mean and (co)variance of groundwater flow. Log conductivity is parameterized geostatistically based on measured values at discrete locations and unknown values at discrete "pilot points." Optionally, the maximum likelihood function on which the inverse estimation of Y at pilot points is based may include a regularization term reflecting prior information about Y. The relative weight λ = σ²_hE/σ²_YE assigned to this term and its components σ²_hE and σ²_YE, as well as θ, are evaluated separately from other model parameters to avoid bias and instability. This evaluation is done on the basis of criteria such as NLL, KIC, BIC, HIC, AIC, and AICc. We demonstrate theoretically that, whereas all these six criteria make it possible to estimate σ²_hE, KIC alone allows one to estimate validly θ and σ²_YE (and thus λ). We illustrate this discriminatory power of KIC numerically by using a differential evolution genetic search algorithm to minimize it in the context of a two-dimensional steady state groundwater flow problem. We find that whereas σ²_hE, σ²_YE, and the integral scale of Y can be estimated on the basis of a zero-order mean flow equation, the sill of the Y-variogram is estimated more accurately by a second-order approximation of flow. This notwithstanding, KIC prefers the simpler zero-order moment over the more complex second-order version.

UR - http://www.scopus.com/inward/record.url?scp=79960034053&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79960034053&partnerID=8YFLogxK

U2 - 10.1029/2011WR010480

DO - 10.1029/2011WR010480

M3 - Article

AN - SCOPUS:79960034053

VL - 47

JO - Water Resources Research

JF - Water Resources Research

SN - 0043-1397

IS - 7

M1 - W07502

ER -