How to find big-oh in your data set (and how not to)

C. C. McGeoch, D. Precup, Paul R Cohen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

The empirical curve bounding problem is defined as follows. Suppose data vectors X, Y are presented such that E(Y[i]) = f(X[i]) where f(x) is an unknown function. The problem is to analyze X, Y and obtain complexity bounds O(gu(x)) and Ω(gl(x)) on the function f(x). As no algorithm for empirical curve bounding can be guaranteed correct, we consider heuristics. Five heuristic algorithms are presented here, together with analytical results guaranteeing correctness for certain families of functions. Experimental evaluations of the correctness and tightness of bounds obtained by the rules for several constructed functions f(x) and real datasets are described. A hybrid method is shown to have very good performance on some kinds of functions, suggesting a general, iterative refinement procedure in which diagnostic features of the results of applying particular methods can be used to select additional methods.
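To make the problem statement concrete, here is a minimal Python sketch of two generic ways one might probe data vectors X, Y for a growth bound. It is an illustration under stated assumptions, not a reproduction of the five heuristics in the paper, which the abstract does not specify. The function names `fit_power_law` and `ratios_nonincreasing` are hypothetical. The first assumes f(x) is roughly c·x^b and fits a line in log-log space; the second checks whether Y[i] / g(X[i]) fails to grow for a guessed bound g, which is only suggestive evidence that f(x) is O(g(x)).

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(c) + b*log(x); returns (c, b).

    Assumption for this sketch: all x and y are positive and f(x) ~ c * x**b.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    c = math.exp(mean_y - b * mean_x)
    return c, b


def ratios_nonincreasing(xs, ys, guess, tol=1e-9):
    """Return True if y / guess(x) never increases as x grows.

    A non-increasing ratio is heuristic evidence (not proof) that the
    data are consistent with an O(guess(x)) upper bound.
    """
    pairs = sorted(zip(xs, ys))
    ratios = [y / guess(x) for x, y in pairs]
    return all(b <= a * (1 + tol) for a, b in zip(ratios, ratios[1:]))


if __name__ == "__main__":
    # Synthetic, noise-free data from a constructed function f(x) = 3x^2 + 10x.
    xs = [2 ** k for k in range(4, 12)]
    ys = [3 * x ** 2 + 10 * x for x in xs]
    c, b = fit_power_law(xs, ys)
    print(f"fitted exponent b = {b:.3f}, coefficient c = {c:.3f}")
    print("consistent with O(x^2)?", ratios_nonincreasing(xs, ys, lambda x: x ** 2))
    print("consistent with O(x)?  ", ratios_nonincreasing(xs, ys, lambda x: x))
```

Neither check can be guaranteed correct on finite data, which matches the abstract's point that only heuristics are available: a lower-order term can dominate over the sampled range and mislead either function.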

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 41-52
Number of pages: 12
Volume: 1280
ISBN (Print): 9783540633464
State: Published - 1997
Externally published: Yes
Event: 2nd International Symposium on Intelligent Data Analysis, IDA 1997 - London, United Kingdom
Duration: Aug 4 1997 to Aug 6 1997

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1280
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 2nd International Symposium on Intelligent Data Analysis, IDA 1997
Country: United Kingdom
City: London
Period: 8/4/97 to 8/6/97

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

McGeoch, C. C., Precup, D., & Cohen, P. R. (1997). How to find big-oh in your data set (and how not to). In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1280, pp. 41-52). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1280). Springer Verlag.

@inproceedings{644cc927108b46f987b3facf9e7e1be5,
title = "How to find big-oh in your data set (and how not to)",
abstract = "The empirical curve bounding problem is defined as follows. Suppose data vectors X, Y are presented such that E(Y[i]) = f(X[i]) where f(x) is an unknown function. The problem is to analyze X, Y and obtain complexity bounds O(gu(x)) and Ω(gl(x)) on the function f(x). As no algorithm for empirical curve bounding can be guaranteed correct, we consider heuristics. Five heuristic algorithms are presented here, together with analytical results guaranteeing correctness for certain families of functions. Experimental evaluations of the correctness and tightness of bounds obtained by the rules for several constructed functions f(x) and real datasets are described. A hybrid method is shown to have very good performance on some kinds of functions, suggesting a general, iterative refinement procedure in which diagnostic features of the results of applying particular methods can be used to select additional methods.",
author = "McGeoch, {C. C.} and Precup, D. and Cohen, {Paul R}",
year = "1997",
language = "English (US)",
isbn = "9783540633464",
volume = "1280",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "41--52",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY - GEN

T1 - How to find big-oh in your data set (and how not to)

AU - McGeoch, C. C.

AU - Precup, D.

AU - Cohen, Paul R

PY - 1997

Y1 - 1997

AB - The empirical curve bounding problem is defined as follows. Suppose data vectors X, Y are presented such that E(Y[i]) = f(X[i]) where f(x) is an unknown function. The problem is to analyze X, Y and obtain complexity bounds O(gu(x)) and Ω(gl(x)) on the function f(x). As no algorithm for empirical curve bounding can be guaranteed correct, we consider heuristics. Five heuristic algorithms are presented here, together with analytical results guaranteeing correctness for certain families of functions. Experimental evaluations of the correctness and tightness of bounds obtained by the rules for several constructed functions f(x) and real datasets are described. A hybrid method is shown to have very good performance on some kinds of functions, suggesting a general, iterative refinement procedure in which diagnostic features of the results of applying particular methods can be used to select additional methods.

UR - http://www.scopus.com/inward/record.url?scp=84880369193&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880369193&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84880369193

SN - 9783540633464

VL - 1280

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 41

EP - 52

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

PB - Springer Verlag

ER -