Ken Goldberg on Learning to Optimize Join Queries with Deep Reinforcement Learning
Check out this great paper on arXiv, co-authored by Ken Goldberg! "Learning to Optimize Join Queries With Deep Reinforcement Learning" was lead-authored by Sanjay Krishnan while he was working with Ken at UC Berkeley!
Abstract:
Exhaustive enumeration of all possible join orders is often avoided, and most optimizers leverage heuristics to prune the search space. The design and implementation of heuristics are well-understood when the cost model is roughly linear, but we find that these heuristics can be significantly suboptimal when there are non-linearities in cost. Ideally, instead of a fixed heuristic, we would want a strategy to guide the search in a more data-driven way, tailoring the search to a specific dataset and query workload. Recent work in deep reinforcement learning (Deep RL) may provide a new perspective on this problem. Deep RL poses sequential problems, like join optimization, as a series of 1-step prediction problems that can be learned from data. We present our deep RL-based DQ optimizer, which currently optimizes select-project-join blocks, and we evaluate DQ on the Join Order Benchmark. We found that DQ achieves plan costs within a factor of 2 of the optimal solution on all cost models and improves on the next best heuristic by up to 3×. Furthermore, DQ executes 10,000× faster than exhaustive enumeration and more than 10× faster than left/right-deep enumeration on the largest queries in the benchmark.
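To make the core idea concrete, here is a minimal, hypothetical sketch of framing join ordering as a series of 1-step decisions scored by a Q-function, as the abstract describes. This is not the authors' DQ implementation: the cardinalities, selectivity, and `q_value` scoring below are toy stand-ins for illustration, where DQ would instead use a neural network trained on observed plan costs.

```python
from itertools import combinations
from math import prod

# Hypothetical base-table cardinalities for a 4-relation join query.
CARD = {"A": 1_000, "B": 10_000, "C": 100, "D": 5_000}
SELECTIVITY = 0.001  # assumed uniform join selectivity, for illustration only

def est_rows(subplan):
    """Naive cardinality estimate for a joined set of relations."""
    return prod(CARD[r] for r in subplan) * SELECTIVITY ** (len(subplan) - 1)

def q_value(state, action):
    """Stand-in for a learned Q-network: score joining the two subplans
    in `action`. Here we simply negate the estimated output cardinality
    (cheaper joins score higher); DQ learns this mapping from data."""
    left, right = action
    return -est_rows(left | right)

def greedy_plan(relations):
    """Build a join order as a sequence of 1-step predictions: at each
    step, greedily pick the pair of subplans with the highest Q-value."""
    state = [frozenset([r]) for r in relations]
    plan = []
    while len(state) > 1:
        left, right = max(combinations(state, 2),
                          key=lambda a: q_value(state, a))
        plan.append((sorted(left), sorted(right)))
        state = [s for s in state if s not in (left, right)]
        state.append(left | right)
    return plan

if __name__ == "__main__":
    for step, (l, r) in enumerate(greedy_plan(CARD), 1):
        print(f"step {step}: join {l} with {r}")
```

Because each step is a single greedy choice over the remaining subplans, the planner avoids exhaustive enumeration of all join orders, which is the source of the speedups the abstract reports.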
Authors: Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph Hellerstein, Ion Stoica
Read the paper on arXiv: https://arxiv.org/abs/1808.03196