Query Result Cache

RFC PR: datafuselabs/databend#10014
Tracking Issue: datafuselabs/databend#10011

Summary

Support query result cache for faster query response.

Motivation

For expensive queries over data that does not change frequently, we can cache the results to speed up the query response. For the same query with the same underlying data, we can return the cached result directly, which greatly improves query efficiency.

For example, suppose we want to execute the following query every 10 seconds to get the top 5 best-selling products:

SELECT product, count(product) AS sales_count
FROM sales_log
GROUP BY product
ORDER BY sales_count DESC
LIMIT 5;

If we execute the full query pipeline every time, the cost may be very expensive, but the result is very small (5 rows). As the data of sales_log may not change very frequently, we can cache the result of the query and return the cached result directly for the same query.

Detail design

Lifecycle of query result cache

Each result cache has a time-to-live (TTL). Each access to the result cache refreshes the TTL. Once the TTL has expired, the result cache is no longer used.

Besides TTL, when the underlying data is changed (we can infer this by snapshot ids, segment ids or partition locations), the result cache will also be invalidated.
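
The lifecycle rules above can be sketched as follows. This is a minimal illustration, not Databend's implementation: the CacheEntry type, its field names, and the use of a single snapshot id as the change signal are assumptions made for the example, and times are plain seconds.

```rust
/// Hypothetical lifecycle state of one result cache entry (not a Databend type).
struct CacheEntry {
    /// Time-to-live in seconds.
    ttl_secs: u64,
    /// Time of the last successful access, in seconds.
    last_access_secs: u64,
    /// Assumed change signal: snapshot id of the table when the result was cached.
    snapshot_id: u64,
}

impl CacheEntry {
    /// The entry is usable only while the TTL has not elapsed
    /// and the underlying data is unchanged.
    fn is_valid(&self, now_secs: u64, current_snapshot_id: u64) -> bool {
        now_secs.saturating_sub(self.last_access_secs) < self.ttl_secs
            && self.snapshot_id == current_snapshot_id
    }

    /// Returns true on a cache hit. A hit refreshes the TTL;
    /// a miss leaves the entry untouched.
    fn access(&mut self, now_secs: u64, current_snapshot_id: u64) -> bool {
        if self.is_valid(now_secs, current_snapshot_id) {
            self.last_access_secs = now_secs;
            true
        } else {
            false
        }
    }
}
```

With a TTL of 300 seconds, accesses at 200 and 450 seconds both hit, because the first access moves the reference point to 200. An access 300 or more seconds after the last hit misses.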

Result cache storage

Databend uses key-value pairs to record query result caches. For every query, Databend will construct a key to represent the query, and store related information in the value.

Databend will not store the query result directly in the key-value storage. Instead, Databend only stores the location of the result cache file in the value. The actual result cache will be stored in the storage layer (local fs, s3, ...).

Key

Query result cache is indexed by the query's abstract syntax tree (AST). Databend serializes the AST to a string and uses the hash of that string as the key.

The key generation is like:

let ast_str = ast.to_string();
let key = format!("_cache/{}/{}", tenant, hash(ast_str.as_bytes()));
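
A self-contained version of the same idea is sketched below. The RFC does not name the hash function, so 64-bit FNV-1a is used here purely as a stand-in that yields the same key across runs. The function names and the hexadecimal formatting are assumptions for illustration.

```rust
/// 64-bit FNV-1a. Chosen only as a deterministic stand-in;
/// the RFC does not specify which hash Databend uses.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Builds a key of the form `_cache/<tenant>/<hash>` from the serialized AST.
/// The zero-padded hex rendering of the hash is an assumption.
fn gen_result_cache_key(tenant: &str, ast_str: &str) -> String {
    format!("_cache/{}/{:016x}", tenant, fnv1a_64(ast_str.as_bytes()))
}
```

Because the key depends on the serialized AST rather than the raw SQL text, two queries that differ only in whitespace or letter case of keywords would map to the same key, provided the AST serialization normalizes them.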

Value structure

A query result cache value's structure is like:

pub struct ResultCacheValue {
    /// The query SQL.
    pub sql: String,
    /// The query time.
    pub query_time: DateTime<Utc>,
    /// Time-to-live of this query.
    pub ttl: usize,
    /// The size of the result cache (bytes).
    pub result_size: usize,
    /// The location of the result cache file.
    pub location: String,

    // May be other information
    // ...
}

Key-value storage

databend-meta can store and query key-value pairs. Databend uses it to store query result cache key-value pairs.

Garbage collection

Databend will cache every query result if query result cache is enabled. Once a result cache has expired, it will not be used any more. To save disk or object storage space, Databend needs a daemon thread that periodically scans all query result caches and removes the expired ones.
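
One pass of such a scan could look like the sketch below. It is an illustration under assumptions: the key-value store is modeled as an in-memory HashMap, CacheMeta is a hypothetical subset of the cache value, and an entry counts as expired once query_time + ttl has been reached.

```rust
use std::collections::HashMap;

/// Hypothetical subset of the cache value needed for garbage collection.
struct CacheMeta {
    /// Last query time, in seconds.
    query_time: u64,
    /// Time-to-live, in seconds.
    ttl: u64,
    /// Location of the result cache file.
    location: String,
}

/// Removes expired entries from the key-value map and returns the locations
/// of the result cache files that should be deleted from storage.
fn collect_expired(entries: &mut HashMap<String, CacheMeta>, now: u64) -> Vec<String> {
    let mut removed = Vec::new();
    entries.retain(|_key, meta| {
        let expired = now >= meta.query_time.saturating_add(meta.ttl);
        if expired {
            removed.push(meta.location.clone());
        }
        !expired // keep only entries that are still alive
    });
    removed.sort(); // stable order for callers, since HashMap iteration order is arbitrary
    removed
}
```

A daemon thread would call a function like this on a timer and then delete each returned file from the storage layer.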

Related configurations

enable_query_result_cache: whether to enable query result cache (default: false).
max_result_cache_bytes: the maximum size of the result cache for one query (default: 1048576 bytes, 1MB).
result_cache_ttl: the time-to-live of the result cache (default: 300 seconds).
tolerate_inconsistent_result_cache: whether to tolerate inconsistent query result cache (default: false). If the underlying data is changed, the result cache will be invalidated. If this configuration is set to true, Databend will still use the result cache even if the underlying data is changed.
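
To make the interaction between these settings concrete, here is a small sketch that mirrors the four configurations and their documented defaults. The ResultCacheSettings struct and the may_use_cached_result helper are hypothetical names, not Databend's settings API.

```rust
/// Hypothetical container for the four configurations listed above.
#[derive(Debug, Clone, PartialEq)]
struct ResultCacheSettings {
    enable_query_result_cache: bool,
    /// Maximum size of the result cache for one query, in bytes.
    max_result_cache_bytes: usize,
    /// Time-to-live of the result cache, in seconds.
    result_cache_ttl: u64,
    tolerate_inconsistent_result_cache: bool,
}

impl Default for ResultCacheSettings {
    /// Defaults as documented in this RFC.
    fn default() -> Self {
        ResultCacheSettings {
            enable_query_result_cache: false,
            max_result_cache_bytes: 1_048_576,
            result_cache_ttl: 300,
            tolerate_inconsistent_result_cache: false,
        }
    }
}

/// A cached result may be served only if the cache is enabled, and either
/// the underlying data is unchanged or inconsistency is tolerated.
/// TTL expiry is checked separately and is not modeled here.
fn may_use_cached_result(settings: &ResultCacheSettings, data_changed: bool) -> bool {
    settings.enable_query_result_cache
        && (!data_changed || settings.tolerate_inconsistent_result_cache)
}
```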

Write result cache

TransformWriteResultCache is used to handle query result cache writing:

pub struct TransformWriteResultCache {
    ctx: Arc<QueryContext>,
    cache_key: String,
    cache_writer: ResultCacheWriter,
}

When constructing the query pipeline, Databend will add TransformWriteResultCache to the end of the pipeline:

impl Interpreter for SelectInterpreterV2 {
    async fn execute2(&self) -> Result<PipelineBuildResult> {
        // `mut` is required because add_transform modifies the pipeline in place.
        let mut build_res = self.build_pipeline().await?;
        if self.ctx.get_settings().get_query_result_cache().enable_query_result_cache {
            build_res.main_pipeline.add_transform(TransformWriteResultCache::try_create)?;
        }
        Ok(build_res)
    }
}

The process of TransformWriteResultCache is like:

1. If upstream is finished, use cache_writer to generate and write the result to a cache file. Go to 6.
2. Read a DataBlock from the input port.
3. If cache_writer is full (it has reached max_result_cache_bytes), go to 5 (do not write to cache).
4. Push the DataBlock into cache_writer.
5. Output the DataBlock to the output port. Go to 1.
6. Finish.
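
The loop above can be sketched without the pipeline machinery. In this illustration, byte buffers stand in for DataBlock, and BoundedCacheWriter is a hypothetical stand-in for ResultCacheWriter. One assumption goes beyond the text: a result that exceeds the size limit is not cached at all, since a partial cache file would return wrong results.

```rust
/// Hypothetical stand-in for ResultCacheWriter with a byte limit.
struct BoundedCacheWriter {
    max_bytes: usize,
    buffered: Vec<u8>,
    full: bool,
}

impl BoundedCacheWriter {
    fn new(max_bytes: usize) -> Self {
        BoundedCacheWriter { max_bytes, buffered: Vec::new(), full: false }
    }

    fn is_full(&self) -> bool {
        self.full
    }

    /// Buffers the block, or marks the writer as full and drops what was
    /// buffered if the block would push the total past the limit.
    fn push(&mut self, block: &[u8]) {
        if self.buffered.len() + block.len() > self.max_bytes {
            self.full = true;
            self.buffered.clear();
        } else {
            self.buffered.extend_from_slice(block);
        }
    }

    /// Returns the bytes for the cache file, or None if the limit was exceeded.
    fn finish(self) -> Option<Vec<u8>> {
        if self.full { None } else { Some(self.buffered) }
    }
}

/// Passes every block through unchanged while caching a copy when possible.
/// Returns (blocks sent downstream, cache file content if any).
fn run_write_transform(input: Vec<Vec<u8>>, max_bytes: usize) -> (Vec<Vec<u8>>, Option<Vec<u8>>) {
    let mut writer = BoundedCacheWriter::new(max_bytes);
    let mut output = Vec::new();
    for block in input {         // step 2: read a block from the input
        if !writer.is_full() {   // step 3: skip caching once full
            writer.push(&block); // step 4: buffer the block
        }
        output.push(block);      // step 5: always forward downstream
    }
    (output, writer.finish())    // steps 1 and 6: upstream finished
}
```

The key property is that the query output is never affected by the cache: every block reaches the output port whether or not it was buffered.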

Read result cache

Before constructing the select interpreter, Databend will check if the query result cache is available.

Databend first fetches the ResultCacheValue from databend-meta using the cache key (derived from the AST) and validates it. If the result cache is available and valid, Databend reads the query result from the result cache file; otherwise, Databend continues to build and execute the original query pipeline.
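
The decision described here can be sketched as a single lookup. This is an illustration under assumptions: the HashMap models databend-meta, ResultCacheValue is reduced to two fields, ReadDecision and decide_read_path are hypothetical names, and the validity rule (TTL, underlying data) is passed in as a closure rather than spelled out.

```rust
use std::collections::HashMap;

/// Reduced version of the cache value, keeping only what this sketch needs.
struct ResultCacheValue {
    ttl: usize,
    location: String,
}

/// Hypothetical outcome of the check made before building the interpreter.
#[derive(Debug, PartialEq)]
enum ReadDecision {
    /// Serve the result from the cache file at this location.
    ReadCacheFile(String),
    /// Build and execute the original query pipeline.
    ExecutePipeline,
}

/// Looks up the key and falls back to normal execution when the entry
/// is missing or fails validation.
fn decide_read_path(
    meta: &HashMap<String, ResultCacheValue>,
    key: &str,
    is_valid: impl Fn(&ResultCacheValue) -> bool,
) -> ReadDecision {
    match meta.get(key) {
        Some(value) if is_valid(value) => ReadDecision::ReadCacheFile(value.location.clone()),
        _ => ReadDecision::ExecutePipeline,
    }
}
```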

System table `system.query_cache`

System table system.query_cache is used to look up query result cache information.

The table contains such information:

sql: the cached SQL.
query_time: the last query time.
expired_time: the expired time of the result cache.
result_size: the size of the result cache (bytes).
location: the location of the result cache file.

Non-deterministic functions

Some functions are non-deterministic, such as now(), rand(), uuid(), etc. If these functions are used in the query, the result will not be cached.
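
A minimal sketch of this check is shown below. It assumes the function names used by the query have already been collected from the AST, and it lists only the three functions named above; a real list would be longer. The constant and the is_cacheable helper are hypothetical names.

```rust
/// Only the functions named in this RFC; a real list would contain more.
const NON_DETERMINISTIC_FUNCTIONS: [&str; 3] = ["now", "rand", "uuid"];

/// Returns false if any function used by the query is non-deterministic.
/// `function_names` is assumed to have been extracted from the AST beforehand.
fn is_cacheable(function_names: &[&str]) -> bool {
    !function_names.iter().any(|name| {
        NON_DETERMINISTIC_FUNCTIONS
            .iter()
            .any(|nd| (*name).eq_ignore_ascii_case(*nd))
    })
}
```

The comparison ignores ASCII case, so NOW() and now() are treated the same way.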
