Analyzing and handling local bias for calibrating parametric cost estimation models

Ye Yang, Zhimin He, Ke Mao, Qi Li, Vu Nguyen, Barry Boehm, Ricardo Valerdi

Research output: Contribution to journal > Article

14 Citations (Scopus)

Abstract

Context: Parametric cost estimation models must be continuously calibrated and improved to keep software estimates accurate and to reflect changing software development contexts. Local calibration, in which a subset of model parameters is tuned, is a frequent practice when software organizations adopt parametric estimation models and want to increase model usability and accuracy. However, the cumulative effects of such local calibration practices on the evolution of general parametric models over time are poorly understood.

Objective: This study aims to quantitatively analyze and effectively handle the local bias associated with historical cross-company data, thereby improving the usability of cross-company datasets for calibrating and maintaining parametric estimation models.

Method: We design and conduct three empirical studies to measure, analyze, and address local bias in a cross-company dataset: (1) defining a method for measuring the local bias associated with each organization's data subset within the overall dataset; (2) analyzing the impact of local bias on the performance of an estimation model; and (3) proposing a weighted sampling approach to handle local bias. The studies are conducted on the latest COCOMO II calibration dataset.

Results: Local bias is prevalent in the cross-company dataset and negatively impacts the performance of the parametric model. The local-bias-based weighted sampling technique helps reduce these negative impacts on model performance.

Conclusion: Local bias in cross-company data harms model calibration and adds noise to model maintenance. The proposed local bias measure offers a means to quantify the degree of local bias associated with a cross-company dataset and to assess its influence on parametric model performance. The local-bias-based weighted sampling technique can be applied to trade off and mitigate the potential risk of significant local bias, which limits the usability of cross-company data for general parametric model calibration and maintenance.
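The abstract does not give the paper's formulas, so the following is only an illustrative sketch of the three ideas it names: local calibration, a per-organization bias measure, and bias-weighted sampling. It assumes the standard COCOMO II effort form, PM = A * Size^E * product(EM) with E = B + 0.01 * sum(SF), and the published COCOMO II.2000 constants A = 2.94 and B = 0.91. The bias score |ln(A_org) - ln(A_all)|, the weight 1 / (1 + score), and all function and field names are hypothetical choices made for this example, not the authors' definitions.

```python
import math
import random

# COCOMO II.2000 published nominal constants.
A_NOMINAL = 2.94
B_NOMINAL = 0.91


def exponent(scale_factors, b=B_NOMINAL):
    """COCOMO II effort exponent: E = B + 0.01 * sum of scale factors."""
    return b + 0.01 * sum(scale_factors)


def estimate_effort(ksloc, scale_factors, effort_multipliers, a=A_NOMINAL):
    """Effort in person-months: A * Size^E * product of effort multipliers."""
    return a * ksloc ** exponent(scale_factors) * math.prod(effort_multipliers)


def calibrate_a(projects):
    """Local calibration of A only: the geometric mean of the ratio between
    actual effort and the estimate computed with A = 1."""
    logs = [
        math.log(p["actual_pm"])
        - math.log(estimate_effort(p["ksloc"], p["sf"], p["em"], a=1.0))
        for p in projects
    ]
    return math.exp(sum(logs) / len(logs))


def bias_scores(projects_by_org):
    """Hypothetical bias score per organization: |ln(A_org) - ln(A_all)|.
    This is an assumption for illustration, not the paper's measure."""
    pooled = [p for ps in projects_by_org.values() for p in ps]
    a_all = calibrate_a(pooled)
    return {
        org: abs(math.log(calibrate_a(ps)) - math.log(a_all))
        for org, ps in projects_by_org.items()
    }


def weighted_sample(projects_by_org, k, seed=0):
    """Draw k projects with replacement, giving projects from high-bias
    organizations a lower selection weight (assumed scheme: 1 / (1 + score))."""
    scores = bias_scores(projects_by_org)
    pool, weights = [], []
    for org, ps in projects_by_org.items():
        for p in ps:
            pool.append(p)
            weights.append(1.0 / (1.0 + scores[org]))
    return random.Random(seed).choices(pool, weights=weights, k=k)
```

Each project here is a dictionary with the keys `ksloc`, `sf` (scale factor ratings), `em` (effort multiplier ratings), and `actual_pm`. A sample drawn with `weighted_sample` could then be passed to `calibrate_a` to obtain a general calibration that is less dominated by any single organization's locally tuned data.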

Original language: English (US)
Pages (from-to): 1496-1511
Number of pages: 16
Journal: Information and Software Technology
Volume: 55
Issue number: 8
DOI: 10.1016/j.infsof.2013.03.002
State: Published - Aug 2013

Fingerprint

Costs
Calibration
Industry
Sampling
Software engineering
Tuning

Keywords

  • COCOMO II
  • Effort estimation
  • Local bias
  • Model maintenance
  • Parametric model
  • Weighted sampling

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Computer Science Applications

Cite this

Analyzing and handling local bias for calibrating parametric cost estimation models. / Yang, Ye; He, Zhimin; Mao, Ke; Li, Qi; Nguyen, Vu; Boehm, Barry; Valerdi, Ricardo.

In: Information and Software Technology, Vol. 55, No. 8, 08.2013, p. 1496-1511.

Yang, Ye ; He, Zhimin ; Mao, Ke ; Li, Qi ; Nguyen, Vu ; Boehm, Barry ; Valerdi, Ricardo. / Analyzing and handling local bias for calibrating parametric cost estimation models. In: Information and Software Technology. 2013 ; Vol. 55, No. 8. pp. 1496-1511.
@article{0a98a14f3bf448aca43aa8f7b1758a14,
title = "Analyzing and handling local bias for calibrating parametric cost estimation models",
keywords = "COCOMO II, Effort estimation, Local bias, Model maintenance, Parametric model, Weighted sampling",
author = "Ye Yang and Zhimin He and Ke Mao and Qi Li and Vu Nguyen and Barry Boehm and Ricardo Valerdi",
year = "2013",
month = "8",
doi = "10.1016/j.infsof.2013.03.002",
language = "English (US)",
volume = "55",
pages = "1496--1511",
journal = "Information and Software Technology",
issn = "0950-5849",
publisher = "Elsevier",
number = "8",

}

TY - JOUR

T1 - Analyzing and handling local bias for calibrating parametric cost estimation models

AU - Yang, Ye

AU - He, Zhimin

AU - Mao, Ke

AU - Li, Qi

AU - Nguyen, Vu

AU - Boehm, Barry

AU - Valerdi, Ricardo

PY - 2013/8

Y1 - 2013/8

KW - COCOMO II

KW - Effort estimation

KW - Local bias

KW - Model maintenance

KW - Parametric model

KW - Weighted sampling

UR - http://www.scopus.com/inward/record.url?scp=84878335183&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878335183&partnerID=8YFLogxK

U2 - 10.1016/j.infsof.2013.03.002

DO - 10.1016/j.infsof.2013.03.002

M3 - Article

VL - 55

SP - 1496

EP - 1511

JO - Information and Software Technology

JF - Information and Software Technology

SN - 0950-5849

IS - 8

ER -