A practical approach for efficiently answering top-k relational queries

Anteneh Ayanso, Paulo B. Goes, Kumar Mehta

Research output: Contribution to journal, Article

6 Scopus citations

Abstract

An increasing number of application areas now rely on obtaining the "best matches" to a given query as opposed to the exact matches sought by traditional transactions. This type of exploratory querying (also called top-k querying) can significantly improve the performance of web-based applications such as consumer reviews, price comparisons and recommendations for products/services. Due to the lack of support for specialized indexes and/or data structures in relational database management systems (RDBMSs), recent research has focused on utilizing summary statistics (histograms) maintained by RDBMSs for translating the top-k request into a traditional range query. Because RDBMS query engines are already optimized for the execution of range queries, such an approach has both practical and efficiency advantages. In this paper, we review the strengths and weaknesses of common histogram construction techniques with regard to their structural characteristics, accuracy in approximating the true distribution of the underlying data, and implications for top-k retrieval. We also present our top-k retrieval strategy (Query-Level Optimal Cost Strategy, QLOCS) and demonstrate its "histogram-independent" performance. Based on comparative experimental and statistical analyses with the best-known histogram-based strategy in the literature, we show that QLOCS is not only more efficient but also provides more consistent performance across commonly used histogram types in RDBMSs.
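The general idea of translating a top-k request into a range query with the help of a histogram can be illustrated with a minimal sketch. This is not QLOCS and not the strategy it is compared against; it is a generic illustration under stated assumptions: a single numeric attribute, absolute difference as the distance measure, an equi-width histogram, and uniform spread of values within each bucket. All function names (`estimated_count`, `radius_for_top_k`, `top_k`) are hypothetical, and the list comprehension stands in for a range query that an RDBMS would execute.

```python
def estimated_count(edges, counts, lo, hi):
    """Estimate how many tuples fall in [lo, hi] from a histogram.

    edges has len(counts) + 1 bucket boundaries. Values are ASSUMED to be
    spread uniformly inside each bucket (a simplifying assumption).
    """
    total = 0.0
    for i, c in enumerate(counts):
        b_lo, b_hi = edges[i], edges[i + 1]
        overlap = min(hi, b_hi) - max(lo, b_lo)
        if overlap > 0:
            total += c * overlap / (b_hi - b_lo)
    return total


def radius_for_top_k(edges, counts, q, k, tol=1e-6):
    """Bisect for the smallest radius r whose range [q - r, q + r]
    is estimated to hold at least k tuples."""
    lo_r, hi_r = 0.0, max(q - edges[0], edges[-1] - q)
    if estimated_count(edges, counts, q - hi_r, q + hi_r) < k:
        return hi_r  # histogram holds fewer than k tuples: cover everything
    while hi_r - lo_r > tol:
        mid = (lo_r + hi_r) / 2
        if estimated_count(edges, counts, q - mid, q + mid) >= k:
            hi_r = mid
        else:
            lo_r = mid
    return hi_r


def top_k(values, edges, counts, q, k):
    """Answer a top-k query through a range query, widening on a miss."""
    r = radius_for_top_k(edges, counts, q, k)
    while True:
        # Stand-in for: SELECT ... WHERE attr BETWEEN q - r AND q + r
        candidates = [v for v in values if q - r <= v <= q + r]
        if len(candidates) >= k or len(candidates) == len(values):
            break
        r = max(r * 2, 1e-9)  # estimate was too optimistic: restart wider
    # Rank the candidates by distance to the query point and keep k.
    return sorted(candidates, key=lambda v: abs(v - q))[:k]


values = [1, 2, 3, 10, 11, 12, 13, 20, 30, 40]
edges = [0, 10, 20, 30, 40]   # four equi-width buckets
counts = [3, 4, 1, 2]         # tuples per bucket
result = top_k(values, edges, counts, q=12, k=3)
```

The sketch also shows the trade-off such strategies have to manage: a range that is too narrow forces a restart with a wider range, while a range that is too wide retrieves more tuples than needed. How well the histogram approximates the true data distribution determines how often each case occurs.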

Original language: English (US)
Pages (from-to): 326-349
Number of pages: 24
Journal: Decision Support Systems
Volume: 44
Issue number: 1
State: Published - Nov 2007

Keywords

  • RDBMSs
  • Similarity search
  • Top-k query
  • Uncertainty modeling

ASJC Scopus subject areas

  • Management Information Systems
  • Information Systems
  • Developmental and Educational Psychology
  • Arts and Humanities (miscellaneous)
  • Information Systems and Management

