### Abstract

The empirical curve bounding problem is defined as follows. Suppose data vectors X, Y are presented such that E(Y[i]) = f(X[i]), where f(x) is an unknown function. The problem is to analyze X, Y and obtain complexity bounds O(g_u(x)) and Ω(g_l(x)) on the function f(x). As no algorithm for empirical curve bounding can be guaranteed correct, we consider heuristics. Five heuristic algorithms are presented here, together with analytical results guaranteeing correctness for certain families of functions. Experimental evaluations of the correctness and tightness of the bounds obtained by these heuristics, for several constructed functions f(x) and for real datasets, are described. A hybrid method is shown to perform very well on some kinds of functions, suggesting a general, iterative refinement procedure in which diagnostic features of the results of applying particular methods are used to select additional methods.
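To make the problem statement concrete, the following is a minimal Python sketch of one way a candidate bound could be checked against data. It is an illustration only and is not claimed to be any of the five heuristics from the paper. The idea is to divide each Y[i] by a guessed function g(X[i]) and look at the trend of the ratios: a shrinking ratio suggests g grows faster than f (a plausible upper bound), a growing ratio suggests g grows more slowly (a plausible lower bound). The names `make_data`, `ratio_trend`, `classify_guess`, the test function 3x^2 + 10x, the 1% noise level, and the tolerance `tol` are all assumptions introduced for this example.

```python
import math
import random


def make_data(seed=0):
    """Build hypothetical data with E(Y[i]) = f(X[i]), f(x) = 3x^2 + 10x."""
    rng = random.Random(seed)
    xs = [2 ** k for k in range(4, 13)]  # 16, 32, ..., 4096
    # Zero-mean multiplicative noise keeps the expectation equal to f(x).
    ys = [(3 * x * x + 10 * x) * (1 + rng.gauss(0, 0.01)) for x in xs]
    return xs, ys


def ratio_trend(xs, ys, guess):
    """Relative least-squares slope of Y[i] / guess(X[i]) against log X[i]."""
    ratios = [y / guess(x) for x, y in zip(xs, ys)]
    logs = [math.log(x) for x in xs]
    n = len(xs)
    mean_log = sum(logs) / n
    mean_ratio = sum(ratios) / n
    cov = sum((l - mean_log) * (r - mean_ratio) for l, r in zip(logs, ratios))
    var = sum((l - mean_log) ** 2 for l in logs)
    # Normalise by the mean ratio so the threshold does not depend on scale.
    return (cov / var) / mean_ratio


def classify_guess(xs, ys, guess, tol=0.05):
    """Label a guess as a plausible 'upper', 'lower', or 'tight' bound.

    tol is an arbitrary threshold chosen for this sketch, not a value
    taken from the paper.
    """
    trend = ratio_trend(xs, ys, guess)
    if trend < -tol:
        return "upper"  # ratio shrinks: the guess outgrows the data
    if trend > tol:
        return "lower"  # ratio grows: the data outgrows the guess
    return "tight"      # ratio roughly flat over the sampled range


if __name__ == "__main__":
    xs, ys = make_data()
    for name, g in [("x", lambda x: x),
                    ("x^2", lambda x: x ** 2),
                    ("x^3", lambda x: x ** 3)]:
        print(name, classify_guess(xs, ys, g))
```

A check like this can only describe the sampled range of X. Low-order terms such as the 10x above bend the ratio curve at small x, so the label depends on the chosen tolerance, which is one reason a procedure of this kind cannot be guaranteed correct.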

Field | Value |
---|---|
Original language | English (US) |
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Publisher | Springer Verlag |
Pages | 41-52 |
Number of pages | 12 |
Volume | 1280 |
ISBN (Print) | 9783540633464 |
State | Published - 1997 |
Externally published | Yes |
Event | 2nd International Symposium on Intelligent Data Analysis, IDA 1997, London, United Kingdom (Aug 4 1997 → Aug 6 1997) |

### Publication series

Field | Value |
---|---|
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 1280 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

### Other

Field | Value |
---|---|
Other | 2nd International Symposium on Intelligent Data Analysis, IDA 1997 |
Country | United Kingdom |
City | London |
Period | 8/4/97 → 8/6/97 |

### ASJC Scopus subject areas

- Computer Science (all)
- Theoretical Computer Science

### Cite this

McGeoch, C. C., Precup, D., & Cohen, P. R. (1997). **How to find big-oh in your data set (and how not to).** In *Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)* (Vol. 1280, pp. 41-52). Springer Verlag. Event: 2nd International Symposium on Intelligent Data Analysis, IDA 1997, London, United Kingdom, 8/4/97.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - How to find big-oh in your data set (and how not to)

AU - McGeoch, C. C.

AU - Precup, D.

AU - Cohen, Paul R

PY - 1997

Y1 - 1997

UR - http://www.scopus.com/inward/record.url?scp=84880369193&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880369193&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84880369193

SN - 9783540633464

VL - 1280

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 41

EP - 52

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

PB - Springer Verlag

ER -