Kusto Hints & Strategies
Azure Data Explorer (ADX) offers hints and strategies that influence how a query is executed and can optimise its performance. Several are listed below:
hint.strategy = shuffle
:Distributes data across cluster nodes to optimise performance.
Ideal for high-cardinality keys in the join, summarize, make-series, and partition operators. Each node processes its own share of the keys, which balances the load across the cluster and improves query efficiency.
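As a minimal sketch, the hint is written directly after the operator name. The table and column names used here (Events, Devices, DeviceId) are hypothetical and stand in for your own schema.

```kusto
// Shuffle an aggregation whose group-by key has many distinct values.
Events
| summarize hint.strategy = shuffle EventCount = count() by DeviceId

// Shuffle a join on the same high-cardinality key.
Events
| join hint.strategy = shuffle (Devices) on DeviceId
```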
hint.shufflekey = <key>
:Specifies a key for shuffling data.
Shuffles data on one specific key rather than on all keys of the operator. This is particularly useful when the operator works on a compound key and only one of its columns has high cardinality.
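A sketch of the compound-key case, assuming a hypothetical Events table in which DeviceId has many distinct values and EventType only a few. The data is shuffled on DeviceId alone, while the aggregation still groups by both columns.

```kusto
// Group by two columns, but shuffle only on the high-cardinality one.
Events
| summarize hint.shufflekey = DeviceId EventCount = count() by DeviceId, EventType
```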
hint.num_partitions = <number>
:Controls the number of partitions for data processing.
Adjusts the number of partitions to balance load and improve performance. More partitions can help distribute the workload but may consume more resources.
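The following sketch combines the hint with the shuffle strategy on a summarize operator. The table, the column, and the value 16 are illustrative assumptions; a suitable number depends on the cluster size and the data.

```kusto
// Shuffle the aggregation and request 16 partitions for it.
Events
| summarize hint.strategy = shuffle hint.num_partitions = 16 EventCount = count() by DeviceId
```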
hint.concurrency = <number>
:Sets the level of concurrency for query execution.
Increases or decreases the number of concurrent operations to optimise performance. Higher concurrency can speed up query execution but may require more resources.
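One place this hint is accepted is the partition operator, where it suggests how many partitions to process in parallel. The sketch below assumes a hypothetical Events table, a Region column with only a handful of distinct values, and the operator's default strategy; the value 4 is illustrative.

```kusto
// Run the subquery once per Region, processing up to 4 partitions at a time.
Events
| partition hint.concurrency = 4 by Region
(
    top 3 by Duration desc
)
```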
hint.materialized_view = <view_name>
:Uses a materialized view to speed up query execution.
Leverages precomputed results from a materialized view for faster query responses. This is useful for frequently run queries that can benefit from cached results.
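hint.remote = <auto | left | local | right>
:Chooses which cluster executes a cross-cluster join.
Useful for joins whose two sides live on different clusters, for example when data is spread across different geographical locations. The values left and right run the join on the cluster of the corresponding side, local runs it on the cluster that received the query, and auto lets the engine decide.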
hint.distribution = <strategy>
:Specifies the distribution strategy for data.
Can be set to hash or round_robin to control how data is distributed across nodes. This helps in balancing the load and optimising query performance.
hint.max_memory_consumption_per_query = <size>
:Limits the maximum memory consumption for a query.
Helps prevent queries from consuming excessive memory, which can impact overall system performance.
hint.max_result_rows = <number>
:Limits the number of rows returned by a query.
Useful for controlling the size of the result set, especially in scenarios where only a subset of data is needed.
hint.query_timeout = <time>
:Sets a timeout for query execution.
Ensures that queries do not run indefinitely, which can help in managing system resources and preventing long-running queries from affecting performance.