to navigate

to select

to close

Always On Deep Dive

Always On Availability Groups (AG) are SQL Server’s primary enterprise HA/DR mechanism. This deep dive covers architecture decisions, edge cases, and operational patterns beyond basic AG setup — for DBAs running multi-replica production clusters.

Architecture Layers

  Application
    ↓ (AG Listener + MultiSubnetFailover)
WSFC / Pacemaker (quorum)
    ↓
Availability Group (replicas)
    ↓
Database (synchronized / synchronizing)

Each layer must be healthy — a working AG with broken quorum still blocks production failover.

Quorum Deep Dive

Voting Configuration

Node	Votes	Typical Setup
SQL-PRIMARY	1	Production site
SQL-SECONDARY	1	Same site sync replica
File share / cloud witness	1	Breaks 2-2 ties

Two-node clusters require a witness — without it, losing one node may remove quorum entirely.

  # Add cloud witness (Windows)
Set-ClusterQuorum -CloudWitness -AccountName "StorageAccount" -AccessKey "key"

Dynamic Quorum

Modern WSFC adjusts votes when nodes go offline — reduces accidental quorum loss. Still document expected vote count after planned maintenance.

Replica Modes in Production

Mode	RPO	RTO	Use Case
Synchronous + automatic	~0	Seconds	Same-rack HA
Asynchronous + manual	Minutes	Minutes–hours	Cross-region DR
Sync + manual	~0	Manual approval	Avoid auto split-brain across WAN

Never use synchronous commit across high-latency WAN — transactions wait for remote log harden, destroying throughput.

Distributed Availability Groups

Chain AGs across geographic sites without merging WSFC clusters:

  Site A: AG1 (Primary + Local Secondary)
           ↓ async forward
Site B: AG2 (Primary + Local Secondary)

  CREATE AVAILABILITY GROUP [GlobalAG]
WITH (DISTRIBUTED)
AVAILABILITY GROUP ON
    N'AG_SiteA' WITH (LISTENER_URL = N'TCP://ag1-listener:5022', AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT),
    N'AG_SiteB' WITH (LISTENER_URL = N'TCP://ag2-listener:5022', AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);

Use for multi-datacenter DR without stretching a single WSFC across regions.

Seeding vs Manual Initial Sync

Automatic Seeding

  -- Requires MATCHING SERVICE ACCOUNTS or certificates between replicas
ALTER AVAILABILITY GROUP [AppAG] GRANT CREATE ANY DATABASE;
-- Secondary receives streamed backup

Monitor seeding progress:

  SELECT local_database_name, role_desc, internal_state_desc, transferred_size_bytes, database_size_bytes
FROM sys.dm_hadr_physical_seeding_stats;

Manual Backup/Restore

When seeding fails (firewall, size, version mismatch):

  -- Primary
BACKUP DATABASE MyApp TO DISK = N'\\share\MyApp_full.bak' WITH COPY_ONLY;

-- Secondary
RESTORE DATABASE MyApp FROM DISK = N'\\share\MyApp_full.bak' WITH NORECOVERY;
ALTER DATABASE MyApp SET HADR = ON;

Listener and Multi-Subnet Failover

  CREATE AVAILABILITY GROUP LISTENER [AppAG-Listener]
WITH (
    IP ((N'10.0.1.50', N'255.255.255.0'), (N'10.0.2.50', N'255.255.255.0')),
    PORT = 1433,
    DNS = N'appag-sql.contoso.com'
);

Connection string for .NET:

  Server=tcp:appag-sql.contoso.com,1433;Database=MyApp;MultiSubnetFailover=True;Application Name=MyApp;

MultiSubnetFailover=True — parallel connection attempts on failover (critical for multi-subnet AG).

Read-Scale Routing Advanced

Define routing priorities per replica:

  ALTER AVAILABILITY GROUP [AppAG]
MODIFY REPLICA ON N'SQL-REPORT1' WITH (
    SECONDARY_ROLE(READ_ONLY_ROUTING_URL = N'TCP://sql-report1:1433', READ_ONLY_ROUTING_PRIORITY = 50)
);
ALTER AVAILABILITY GROUP [AppAG]
MODIFY REPLICA ON N'SQL-REPORT2' WITH (
    SECONDARY_ROLE(READ_ONLY_ROUTING_URL = N'TCP://sql-report2:1433', READ_ONLY_ROUTING_PRIORITY = 100)
);

Higher priority receives read traffic first. Requires ApplicationIntent=ReadOnly.

Secondary Lag and redo_queue

  SELECT
    ar.replica_server_name,
    drs.database_name,
    drs.synchronization_state_desc,
    drs.log_send_queue_size AS send_kb,
    drs.redo_queue_size AS redo_kb,
    drs.last_redone_time,
    drs.last_commit_time
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_replicas ar ON drs.replica_id = ar.replica_id;

Symptom	Cause	Action
Growing send queue	Network bandwidth, primary log generation rate	Check network, index maintenance on primary
Growing redo queue	Secondary CPU/disk slower than primary	Match hardware, limit read workload
NOT SYNCHRONIZING	Endpoint down, login failure	Verify 5022, service accounts

Failover Scenarios

Planned Maintenance (Zero Data Loss)

  -- Verify sync healthy
SELECT synchronization_health_desc FROM sys.dm_hadr_database_replica_states;

ALTER AVAILABILITY GROUP [AppAG] FAILOVER;
-- Patch former primary, verify SYNCHRONIZED, fail back if desired

Disaster — Primary Site Lost

Confirm primary truly unavailable (avoid split-brain)
If async DR replica: FORCE_FAILOVER_ALLOW_DATA_LOSS
Repoint applications to DR listener
Rebuild former site as new secondary when restored

Document decision tree with executive approval for forced failover.

SQL logins with SIDs don’t auto-sync to secondaries:

  -- Compare SID mismatch (orphan detection after failover)
SELECT name FROM sys.server_principals WHERE sid NOT IN (
    SELECT sid FROM sys.sql_logins
);

-- Transfer login script via sp_help_revlogin (deprecated) or dbatools Copy-DbaLogin

Agent jobs must exist on new primary — automate with multi-server administration or IaC.

Extended Events for AG

  CREATE EVENT SESSION [AG_Health] ON SERVER
ADD EVENT sqlserver.always_on_ddl_executed,
ADD EVENT sqlserver.hadr_ag_wsfc_cluster_quorum_state,
ADD EVENT sqlserver.hadr_db_manager_state_change
ADD TARGET package0.event_file(SET filename = N'D:\XEvents\AG_Health.xel');
ALTER EVENT SESSION [AG_Health] ON SERVER STATE = START;

Common Expert Mistakes

Mistake	Outcome
AG as backup replacement	Still need off-site backups — AG replicates corruption too
Readable secondary without RCSI awareness	Blocking on primary during redo conflicts
Same AG listener DNS TTL too high	Slow client failover
Forgetting `DB_FAILOVER = ON`	Manual DB join after instance failover
Mixing SQL versions in AG	Unsupported — blocks upgrade path

Performance Tuning for AG

Compression on backups — reduces seeding time
Instant File Initialization on all replicas
Matching trace flags — inconsistent flags cause hard-to-debug sync issues
Limit concurrent readable queries — max read-only connections workload governance
Use delayed durability never on sync replicas — breaks durability assumptions

Production Architecture Example

                      [Cloud Witness]
                          |
        +-----------------+-----------------+
        |                                   |
   [Site A WSFC]                      [Site B WSFC]
   SQL-PROD (sync)                     SQL-DR (async)
   SQL-PROD2 (sync)                   SQL-DR2 (async)
        |                                   |
   Listener: prod-sql                   Listener: dr-sql
   RPO: 0 (site A)                      RPO: ~5 min (manual failover)

Quarterly: failover drill to DR, restore app connectivity, measure RTO, fail back.

Troubleshooting Commands

  -- Endpoint state
SELECT name, state_desc FROM sys.database_mirroring_endpoints;

-- AG cluster state
SELECT cluster_name, quorum_type_desc, quorum_state_desc
FROM sys.dm_hadr_cluster;

-- Last failover reason
SELECT ag.name, ar.replica_server_name, rh.failover_reason, rh.failover_timestamp
FROM sys.availability_replicas ar
JOIN sys.availability_groups ag ON ar.group_id = ag.group_id
CROSS APPLY (
    SELECT TOP 1 failover_reason, failover_timestamp
    FROM sys.dm_hadr_automatic_seeding_history
) rh;  -- Use error logs + XEvents for failover detail

Always On rewards meticulous quorum, listener, and runbook design — treat AG as a system spanning Windows/Linux clustering, networking, and SQL Server—not just a wizard click.

Performance Tuning

Query Store Deep Dive

Always On Deep Dive

Architecture Layers link

Quorum Deep Dive link

Voting Configuration link

Dynamic Quorum link

Replica Modes in Production link

Distributed Availability Groups link

Seeding vs Manual Initial Sync link

Automatic Seeding link

Manual Backup/Restore link

Listener and Multi-Subnet Failover link

Read-Scale Routing Advanced link

Secondary Lag and redo_queue link

Failover Scenarios link

Planned Maintenance (Zero Data Loss) link

Disaster — Primary Site Lost link

Login and Job Synchronization link

Extended Events for AG link

Common Expert Mistakes link

Performance Tuning for AG link

Production Architecture Example link

Troubleshooting Commands link