Query Store captures query text, execution plans, and runtime statistics over time — persisting across restarts unlike the plan cache. It is the primary tool for detecting plan regressions after upgrades, statistics updates, or parameter sniffing issues.

Enable and Configure

  ALTER DATABASE MyApp SET QUERY_STORE = ON (
    OPERATION_MODE = READ_WRITE,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
    DATA_FLUSH_INTERVAL_SECONDS = 900,
    MAX_STORAGE_SIZE_MB = 2048,
    INTERVAL_LENGTH_MINUTES = 60,
    QUERY_CAPTURE_MODE = AUTO,  -- AUTO, ALL, NONE, CUSTOM
    SIZE_BASED_CLEANUP_MODE = AUTO
);
  
Setting Recommendation
MAX_STORAGE_SIZE_MB 1–4 GB production; monitor sys.query_store_options
INTERVAL_LENGTH_MINUTES 60 for stable baselines; 15 for active tuning
QUERY_CAPTURE_MODE AUTO for most; ALL briefly during incidents
WAIT_STATS_CAPTURE_MODE ON (2017+) — correlate waits with queries

Verify status:

  SELECT actual_state_desc, desired_state_desc, current_storage_size_mb, max_storage_size_mb
FROM sys.database_query_store_options;
  

What Query Store Captures

  • Query text and query_id (parameterized)
  • Multiple plans per query (plan_id)
  • Runtime stats per plan per interval (duration, CPU, reads, executions)
  • Wait categories per plan (2017+)
  • Plan forcing and compatibility level context

Data lives in internal tables — query via catalog views only.

Find Top Resource Consumers

  SELECT TOP 20
    qsq.query_id,
    qsp.plan_id,
    SUM(qsrs.count_executions) AS executions,
    SUM(qsrs.avg_duration * qsrs.count_executions) / NULLIF(SUM(qsrs.count_executions), 0) / 1000.0 AS avg_ms,
    SUM(qsrs.avg_logical_io_reads * qsrs.count_executions) / NULLIF(SUM(qsrs.count_executions), 0) AS avg_reads,
    LEFT(qt.query_sql_text, 300) AS query_text
FROM sys.query_store_query qsq
JOIN sys.query_store_query_text qt ON qsq.query_text_id = qt.query_text_id
JOIN sys.query_store_plan qsp ON qsq.query_id = qsp.query_id
JOIN sys.query_store_runtime_stats qsrs ON qsp.plan_id = qsrs.plan_id
JOIN sys.query_store_runtime_stats_interval qsrsi
    ON qsrs.runtime_stats_interval_id = qsrsi.runtime_stats_interval_id
WHERE qsrsi.start_time > DATEADD(day, -7, GETUTCDATE())
GROUP BY qsq.query_id, qsp.plan_id, qt.query_sql_text
ORDER BY avg_ms * executions DESC;
  

Weight by total impact (avg × executions), not avg alone.

Detect Plan Regressions

A new plan with worse performance after a change (upgrade, stats update, index drop):

  SELECT
    qsq.query_id,
    qsp.plan_id,
    qsp.last_execution_time,
    qsrs.avg_duration / 1000.0 AS avg_ms,
    qsp.is_forced_plan,
    qp.query_plan
FROM sys.query_store_query qsq
JOIN sys.query_store_plan qsp ON qsq.query_id = qsp.query_id
JOIN sys.query_store_runtime_stats qsrs ON qsp.plan_id = qsrs.plan_id
CROSS APPLY (
    SELECT query_plan FROM sys.query_store_plan AS p WHERE p.plan_id = qsp.plan_id
) qp
WHERE qsq.query_id = 42  -- Known problematic query
ORDER BY qsp.plan_id, qsrs.last_execution_time;
  

Compare plans in SSMS — Query Store → Track Query → Plan Summary.

Automated regression view (simplified):

  -- Queries where latest plan avg duration > 2x older plan
WITH plan_stats AS (
    SELECT
        qsq.query_id,
        qsp.plan_id,
        AVG(qsrs.avg_duration) AS avg_dur,
        MAX(qsrs.last_execution_time) AS last_run
    FROM sys.query_store_query qsq
    JOIN sys.query_store_plan qsp ON qsq.query_id = qsp.query_id
    JOIN sys.query_store_runtime_stats qsrs ON qsp.plan_id = qsrs.plan_id
    GROUP BY qsq.query_id, qsp.plan_id
)
SELECT a.query_id, a.plan_id AS slow_plan, b.plan_id AS fast_plan,
       a.avg_dur / NULLIF(b.avg_dur, 0) AS slowdown_ratio
FROM plan_stats a
JOIN plan_stats b ON a.query_id = b.query_id AND a.plan_id <> b.plan_id
WHERE a.avg_dur > b.avg_dur * 2 AND a.last_run > b.last_run;
  

Force and Unforce Plans

When a known-good plan exists:

  -- SSMS: Right-click plan → Force Plan
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 3;

-- Verify
SELECT query_id, plan_id, is_forced_plan, forced_plan_failure_count
FROM sys.query_store_plan WHERE query_id = 42;

-- Remove force
EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 3;
  

Forced plans can fail if schema changes invalidate plan — monitor forced_plan_failure_count.

Wait Stats in Query Store (2017+)

  SELECT
    qsq.query_id,
    qsws.wait_category_desc,
    SUM(qsws.avg_query_wait_time_ms * qsws.count_executions) AS total_wait_ms
FROM sys.query_store_query qsq
JOIN sys.query_store_plan qsp ON qsq.query_id = qsp.query_id
JOIN sys.query_store_wait_stats qsws ON qsp.plan_id = qsws.plan_id
GROUP BY qsq.query_id, qsws.wait_category_desc
ORDER BY total_wait_ms DESC;
  

Correlate CPU vs I/O vs Lock waits per query — faster than instance-wide wait stats alone.

Query Store and Upgrades

Microsoft recommends:

  1. Enable Query Store before upgrading SQL Server version
  2. Capture baseline for 1–2 weeks
  3. Upgrade
  4. Compare plan counts and duration — use Query Store with Compatibility Level insights
  ALTER DATABASE MyApp SET COMPATIBILITY_LEVEL = 160;  -- SQL 2022
-- Query Store tracks plans per CE version
  

Capture Mode Tuning

Mode Behavior
ALL Every query — high overhead, short incidents only
AUTO Threshold-based — default
NONE Stops capture; existing data retained
CUSTOM QUERY_CAPTURE_POLICY with time/CPU thresholds

During firefighting:

  ALTER DATABASE MyApp SET QUERY_STORE (QUERY_CAPTURE_MODE = ALL);
-- After incident
ALTER DATABASE MyApp SET QUERY_STORE (QUERY_CAPTURE_MODE = AUTO);
  

Storage and Cleanup

  -- Manual cleanup old stats
EXEC sp_query_store_flush_db;

ALTER DATABASE MyApp SET QUERY_STORE CLEAR ALL;  -- Nuclear — dev only

-- Read-only mode for investigation
ALTER DATABASE MyApp SET QUERY_STORE (OPERATION_MODE = READ_ONLY);
  

When current_storage_size_mb approaches max, auto cleanup drops oldest data — may lose pre-upgrade baseline if undersized.

Common Mistakes

Mistake Effect
Query Store off in production No regression history after incidents
Storage too small Auto purge deletes evidence
Force plan permanently Schema drift causes failures
Ignoring AUTO capture gaps Rare queries missing — brief ALL mode
Tuning on dev with empty QS No representative plans

Integration with SSMS Reports

Database → Reports → Standard Reports → Query Store:

  • Top resource consuming queries
  • Regressed queries
  • Queries with forced plans
  • Tracked queries

Pin tracked queries for critical stored procedures and API queries.

Production Workflow: Post-Deploy Regression

  1. Deploy application release Friday 6 PM
  2. Alert: P95 latency doubled on GetOrders API
  3. Query Store → regressed queries report
  4. query_id 881 — new plan_id 12 (scan vs seek)
  5. Root cause: statistics auto-update skewed on Status column
  6. Short-term: sp_query_store_force_plan to plan_id 8
  7. Long-term: filtered index on active orders + update stats with FULLSCAN
  8. Unforce plan after index deployed; verify plan_id 12 gone or harmless

Document in postmortem with query_id and plan_id references.

Query Store vs Plan Cache DMVs

Aspect Query Store dm_exec_query_stats
Persistence Survives restart Cleared on restart
Historical intervals Yes Single cumulative
Plan forcing Built-in Manual plan guides
Overhead Moderate None

Use both — Query Store for trends, DMVs for live snapshot.

Troubleshooting

  -- Query Store not capturing
SELECT actual_state_desc, readonly_reason, current_storage_size_mb
FROM sys.database_query_store_options;
-- readonly_reason: 65536 = size limit, 8192 = single user mode, etc.

-- Flush stuck
ALTER DATABASE MyApp SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

-- Compatibility with AG secondaries
-- Query Store on primary only for write workload DBs; readable secondaries may lag
  

Enable Query Store on every production database before you need it — plan regressions are inevitable; Query Store makes them diagnosable instead of mysterious.