Ke Yi, Hai Yu, Jun Yang, Gangqiang Xia, and Yuguo Chen
In Proceedings of the 19th International Conference on Data Engineering (ICDE '03), Bangalore, India, March 2003, 189-200.
We tackle the problem of maintaining materialized top-k
views in this paper. Top-k queries, including MIN and MAX as
important special cases, occur frequently in common database
workloads. A top-k view can be materialized to improve query
performance, but in general it is not self-maintainable unless it
contains all tuples in the base table. Deletions and updates on the
base table may cause tuples to leave the top-k view, resulting in
expensive queries over the base table to "refill'' the view. In this
paper, we propose an algorithm that reduces the frequency of refills
by maintaining a top-k' view instead of a top-k view,
where
k' changes at runtime between k and some
kmax >= k. We show that in most practical
cases,
our algorithm can reduce the expected amortized cost of refill queries
to
O(1) while still keeping the view small. The optimal value of
kmax depends on the update pattern and the costs of
querying the base table and updating the view. Compared with the
simple
approach of maintaining either the top-k view itself or a copy
of
the base table, our algorithm can provide orders-of-magnitude
improvements
in performance with appropriate kmax values. We
show how
to choose kmax dynamically to adapt to the actual
system
workload and performance at runtime, without requiring accurate prior
knowledge.
Paper is available in PDF formats.