Kusto Hints & Strategies

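Azure Data Explorer (ADX) accepts query hints, written as hint.<name> = <value> immediately after the operator they apply to, that influence how a query is executed. Here are some of the more common ones: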

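hint.strategy = shuffle:
- Repartitions the data across cluster nodes by the operator's key, so that each node processes a disjoint subset of the keys.
- Supported by join, summarize, make-series, and partition. It pays off when the join or group-by key has high cardinality and the default strategy runs into memory or time limits. For low-cardinality keys the extra data movement usually outweighs the gain.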
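As a minimal sketch of where the hint goes, the query below applies hint.strategy = shuffle to a summarize and to a join. The Events and Devices tables and their columns are hypothetical stand-ins built inline with datatable so that the query is self-contained; on data this small the hint has no measurable effect.

```kusto
// Hypothetical sample data; real candidates for shuffling have a very large number of distinct keys.
let Events = datatable(DeviceId:string, Value:long)
[
    "dev-1", 10,
    "dev-2", 20,
    "dev-1", 30
];
let Devices = datatable(DeviceId:string, Region:string)
[
    "dev-1", "westeurope",
    "dev-2", "northeurope"
];
Events
// The hint is written directly after the operator name.
| summarize hint.strategy = shuffle Total = sum(Value) by DeviceId
| join hint.strategy = shuffle (Devices) on DeviceId
| project DeviceId, Region, Total
```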
hint.shufflekey = <key>:
- Shuffles the data by one named key, whereas hint.strategy = shuffle shuffles by all keys of the operator.
- Use it when the operator has a compound key (several join or by columns) and one of those columns has high cardinality on its own.
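A minimal sketch of shuffling on one column of a compound key. The Events table and its columns are hypothetical; DeviceId stands in for the high-cardinality column and Level for a column with only a handful of values.

```kusto
// Hypothetical sample data built inline so the query is self-contained.
let Events = datatable(DeviceId:string, Level:string, Value:long)
[
    "dev-1", "info",  10,
    "dev-2", "error", 20,
    "dev-1", "info",  30
];
Events
// Group by two columns, but shuffle only on the high-cardinality one.
| summarize hint.shufflekey = DeviceId Total = sum(Value) by DeviceId, Level
```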
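hint.num_partitions = <number>:
- Sets the number of partitions used by a shuffled operator. It is used together with hint.strategy = shuffle or hint.shufflekey.
- By default the number of partitions equals the number of cluster nodes. Raising it can help on a small cluster where the default leaves each partition too large, but many partitions consume more cluster resources and can degrade performance.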
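A minimal sketch of combining the partition count with a shuffle. The Events table is hypothetical and the value 8 is an arbitrary illustration, not a recommendation; a sensible number depends on the cluster size and the data.

```kusto
// Hypothetical sample data built inline so the query is self-contained.
let Events = datatable(DeviceId:string, Value:long)
[
    "dev-1", 10,
    "dev-2", 20,
    "dev-3", 30
];
Events
// Several hints can follow the same operator, separated by spaces.
| summarize hint.strategy = shuffle hint.num_partitions = 8 Total = sum(Value) by DeviceId
```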
hint.concurrency = <number>:
- Applies to the partition operator: suggests how many partitions (subqueries) the system runs in parallel.
- The default is 16. A higher value can shorten execution time when there are many partitions, at the cost of more resources used at once.
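A minimal sketch of the concurrency hint on the partition operator, assuming the legacy partition strategy. The Events table and its columns are hypothetical, and the value 2 is only for illustration.

```kusto
// Hypothetical sample data built inline so the query is self-contained.
let Events = datatable(Region:string, DeviceId:string, Value:long)
[
    "westeurope",  "dev-1", 10,
    "westeurope",  "dev-2", 20,
    "northeurope", "dev-3", 30
];
Events
// One subquery runs per distinct Region; at most 2 of them run in parallel.
| partition hint.strategy = legacy hint.concurrency = 2 by Region
(
    top 1 by Value desc
)
```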
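materialized_view("<view_name>"):
- A function rather than a hint (there appears to be no hint.materialized_view). It queries only the materialized, already aggregated part of a materialized view.
- Referencing the view by its name combines the materialized part with records that have not been processed yet, so the result is always current. The materialized_view() function skips that step, which is faster but may return slightly stale data. The similarly named hint.materialized = true belongs to the as and partition operators and caches a subquery result within a single query.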
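hint.remote = <auto | left | local | right>:
- Chooses the cluster on which a cross-cluster join is executed. It takes one of these four values, not true or false.
- With auto (the default) the system decides. left and right run the join on the cluster of the left or right operand, and local runs it on the cluster that received the query. This helps when one side of the join is much larger than the other and moving it between clusters would be expensive.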
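hint.distribution = <single | per_node | per_shard>:
- Applies to the evaluate operator and controls how a plugin is distributed.
- single runs one instance of the plugin over all the data, per_node runs an instance on each node over the data that node holds, and per_shard runs an instance over each data shard. The values hash and round_robin are not among the documented options.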
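set maxmemoryconsumptionperiterator = <bytes>:
- Memory limits are request properties rather than hints; they are set with a set statement placed before the query.
- maxmemoryconsumptionperiterator limits the memory a single query operator may allocate, and max_memory_consumption_per_query_per_node limits the memory a query may use on one node. Both keep a heavy query from starving the rest of the cluster.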
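A memory limit is normally applied with a set statement before the query rather than with a hint. prefix. A minimal sketch using the maxmemoryconsumptionperiterator request property; the 8 GiB value is an arbitrary illustration, and the range expression only generates hypothetical rows so that the query is self-contained.

```kusto
// Set the per-operator memory limit to 8 GiB (value in bytes) for this query only.
set maxmemoryconsumptionperiterator = 8589934592;
// Hypothetical data: 1000 generated rows grouped into 10 buckets.
range x from 1 to 1000 step 1
| summarize Rows = count() by Bucket = x % 10
```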
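set truncationmaxrecords = <number>:
- A request property rather than a hint. It overrides the default result truncation limit of 500,000 records; a separate property, truncationmaxsize, overrides the default 64 MB limit on result size.
- A query that exceeds the limit fails instead of returning a partial result. When only a subset of the data is needed, cap the result explicitly with the take (or limit) operator.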
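The size of a result set is normally capped with the take operator or with the truncationmaxrecords request property rather than with a hint. A minimal sketch of both, using range to generate hypothetical rows; the numbers are arbitrary.

```kusto
// Lower the truncation limit for this query only; a result larger than this fails the query.
set truncationmaxrecords = 1000;
// Hypothetical data: 5000 generated rows.
range x from 1 to 5000 step 1
// take keeps the result well below the truncation limit.
| take 100
```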
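servertimeout = <timespan>:
- A client request property rather than a hint. It sets how long the service lets a query run before aborting it.
- The default is four minutes for queries and the value can be raised to at most one hour. It is set by the client, for example in the SDK's client request properties or in the connection options of the query tool, and keeps long-running queries from tying up cluster resources.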

Documentation:

Azure Data Explorer - Shuffle Query

Last updated 4 months ago
