Statistic objects in SQL Server are light-weight objects containing statistical information about the distribution of values in one or more columns. Statistics are used to calculate cardinality estimates during optimisation allowing the query optimiser to generate efficient query execution plans.
Since SQL Server 2005, statistic objects can either be updated synchronously(default) or asynchronously. This setting is configured at the database level only. The ‘Auto Update Statistics’ database option controls whether statistics are updated automatically or not. The ‘Auto Update Statistics Asynchronously’ database option controls whether statistics are automatically updated synchronously or asynchronously:
- For the synchronous update of statistics; ‘Auto Update Statistics’ must be set to ‘True’ and ‘Auto Update Statistics Asynchronously’ must be set to ‘False’.
- For the asynchronous update of statistics; ‘Auto Update Statistics’ and ‘Auto Update Statistics Asynchronously’ must both be set to ‘True’.
Synchronous Vs Asynchronous – The Differences
The query optimiser looks at how many changes have occurred for a given statistic object as part of query optimisation. If the number of rows that have changed exceed certain thresholds, SQL Server will either update the statistic object synchronously or asynchronously (depending on the database settings), then optimise the query.
Note that the statistics don’t auto-update when the rows are changed. It’s only when a query is being compiled and the statistics would actually be used to make an optimisation decision that the update is kicked off.
Synchronous: If the query optimiser identifies out-of-date statistics prior to compiling a query or before executing a cached query plan, a statistic update is initiated and the query optimiser waits for updated statistics before compiling and executing the query. This effectively causes query execution to stall while the statistic object is automatically updated synchronously, not only for the session that issued the initial query, but for other sessions too that have issued queries dependent on the same statistic object. Depending on the size of the object the statistic is created on, your underlying hardware and I/O subsystem, the statistic update operation could take anywhere from milliseconds to minutes.
Synchronous statistic updates execute under the context of the principal who submitted the query that initiated the statistic update.
Asynchronous: With asynchronous statistics updates, queries compile with existing statistics even if they are considered as being out-of-date. This potentially could cause the query optimiser to choose a suboptimal query plan based on the stale statistics affecting currently running queries and only queries that compile after the asynchronous updates have completed will benefit from using the updated statistics.
Asynchronous statistic updates are handled in the background by a system thread under the context of sa. You can check if there are asynchronous statistic updates currently executing using the sys.dm_exec_background_job_queue DMV:
SELECT time_queued, des.session_id, des.login_name, OBJECT_NAME(object_id1) AS [object], ss.name as [statistic_name], sc.name AS [statistic_columns], auto_created, user_created, no_recompute, has_filter, is_temporary, is_incremental FROM sys.dm_exec_background_job_queue bq JOIN sys.stats ss ON ss.object_id = bq.object_id1 AND ss.stats_id = bq.object_id2 JOIN sys.stats_columns stc ON stc.object_id = ss.object_id AND stc.stats_id = ss.stats_id JOIN sys.columns sc ON sc.column_id = stc.column_id AND sc.object_id = stc.object_id JOIN sys.dm_exec_sessions des ON des.session_id = bq.session_id WHERE in_progress = 1 OPTION(RECOMPILE);
Official & Industry Recommendations
Microsoft: Microsoft’s official recommendations are:
Synchronous: Consider using synchronous statistics when you perform operations that change the distribution of data, such as truncating a table or performing a bulk update of a large percentage of the rows. If you do not manually update the statistics after completing the operation, using synchronous statistics will ensure statistics are up-to-date before executing queries on the changed data.
Asynchronous: Consider using asynchronous statistics to achieve more predictable query response times for the following scenarios:
- Your application frequently executes the same query, similar queries, or similar cached query plans. Your query response times might be more predictable with asynchronous statistics updates than with synchronous statistics updates because the query optimizer can execute incoming queries without waiting for up-to-date statistics. This avoids delaying some queries and not others.
- Your application has experienced client request time outs caused by one or more queries waiting for updated statistics. In some cases, waiting for synchronous statistics could cause applications with aggressive time outs to fail.
Brent Ozar: Brent’s popular sp_Blitz stored procedure checks for databases that have automatic asynchronous statistics updates enabled and recommends “setting it back to the default (disabled) unless you’ve got evidence that delays caused by statistics updates are causing you problems.”
Glenn Berry (SQLSkills): Glenn states here that he believes asynchronous statistic updates is beneficial for most workloads.
Jos de Bruijn: The new Microsoft sample database WideWorldImporters is configured to use the asynchronous update of statistics. When I asked Jos if there was any specific reason for this, his response was “enabling async statistics is best practice in most cases”.
My Two Cents & Conclusion
I think the best approach is proactive management of the statistics on your largest and most important tables. Generally speaking, statistic updates are usually a pretty quick operation and only potentially become problematic on very large tables (hundreds of millions of rows).
Consider using synchronous auto-update of statistics when:
- The data distribution and density of your columns changes frequently, statistics become stale quickly and you have evidence that this affects query performance negatively.
Consider using asynchronous auto-update of statistics when:
- The synchronous updating of statistics on very large tables is impacting the system, taking a long duration and in extreme cases causing unpredictable query performance or even query timeouts.
- The data distribution and density of your columns is fairly static and query execution plans do not change after an update to statistics.
With that said, it would seem that asynchronous statistics are better suited to OLTP environments and synchronous statistics are better suited to OLAP environments. As synchronous statistics are the default though, people are reluctant to change this setting without good reason to. I’ve worked exclusively in OLTP environments over the last few years and have never seen asynchronous statistics rolled out as the default. I have personally been bitten by synchronous statistic updates on large tables causing query timeouts which I resolved by switching to asynchronous statistic updates.
Asynchronous statistic updates may not be well suited for data warehouses / OLAP environments where large scale data modification is common through various ETL processes. The proactive SQL Server DBA would manually include statistic update logic at the end of their ETL process and use the synchronous update of statistics database setting as a safety measure.
If you are considering switching to asynchronous statistic updates in your OLTP environments my advice would be use tools like Distributed Replay or RML Utilities to capture your actual workload which you could then replay in a test environment under the different settings.Follow @SQLScotsman