In one sentence, what is Windows Server Failover Clustering (WSFC)?
WSFC is a Windows Server feature that groups two or more servers (nodes) into a cluster, so each clustered workload is owned by one node at a time and can fail over to another node when a node or dependency fails.
What does WSFC provide that prevents split-brain for clustered resources (especially FCI storage)?
WSFC provides membership, health monitoring, resource orchestration, and quorum voting, so only the partition that holds a majority of votes stays “in charge” after a split. That stops two sides from both believing they own the shared resources.
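The majority rule behind that answer can be sketched in a few lines of Python. This is a simplified static model, not how WSFC is implemented: it ignores dynamic quorum and witness handling, and the function name is hypothetical.

```python
def partitions_with_quorum(partition_votes):
    """Return the indexes of partitions that hold a strict majority of all votes.

    partition_votes is a list with one vote count per network partition.
    Static model only: real clusters may adjust votes dynamically.
    """
    total = sum(partition_votes)
    majority = total // 2 + 1
    return [i for i, votes in enumerate(partition_votes) if votes >= majority]


# Five votes split 3/2: only the first partition keeps quorum.
print(partitions_with_quorum([3, 2]))
# Four votes split 2/2 with no tie-breaker: neither side has a majority.
print(partitions_with_quorum([2, 2]))
```

Because a strict majority is more than half of the votes, at most one partition can ever satisfy the check, which is the property that prevents split-brain.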
For Always On availability groups on Windows, what does WSFC do vs what does SQL Server do?
WSFC hosts the AG resource, applies the failover policy, and usually hosts the listener (cluster network name + IPs). SQL Server does log send/redo, synchronization, read-only routing, and owns the database mirroring (HADR) endpoint. Data replication is not WSFC’s job.
For a failover cluster instance (FCI), what does WSFC do vs SQL Server?
WSFC moves one clustered SQL instance between nodes; shared storage is presented to whoever owns the instance. SQL Server runs that single instance against shared database files — no per-replica copy of data for that instance.
What is a distributed availability group, in terms of clusters?
A distributed AG spans two availability groups, each hosted in its own WSFC cluster (e.g. one per site). Each side is a normal cluster that governs its own membership and quorum; SQL Server replicates between the two AGs for DR/geo.
On Linux, does SQL Server AG use WSFC?
No. AG on Linux uses Pacemaker (and related components), not WSFC. The ideas overlap (membership, resource agents, quorum), but the stack is different.
Define node, cluster network, and why NIC layout matters.
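A node is a Windows Server machine that is a member of the cluster. A cluster network is a network (typically one per subnet) that WSFC discovers across the nodes’ NICs and assigns a role: client access, cluster (heartbeat) communication, or both. NIC layout matters because redundant paths and correct network metrics keep heartbeat traffic reliable.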
What are cluster core resources vs a role / resource group?
Cluster core resources keep the cluster itself viable (e.g. cluster name, quorum/witness resources). A role/resource group is the clustered application footprint (e.g. SQL Server AG, SQL Server FCI) with dependencies like IP, name, and storage (for FCI).
What are CNO and VCO, and why do interviewers care about AD and DNS?
CNO = Cluster Name Object (the cluster’s identity in AD). VCO = Virtual Computer Object, the AD computer object behind a clustered name (e.g. listener, FCI network name). The CNO needs permission to create and manage VCOs, and the names must register in DNS, so AD permissions and DNS are common root causes when resources “don’t come online.”
What problem does quorum solve, and what is a witness for?
Quorum is the vote model that ensures only the side of a partition holding a majority of votes stays online. A witness (disk, file share, or cloud) adds one vote, which breaks ties when the node count is even. For deeper witness math, defer to your clustering note.
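To make the tie-breaking point concrete, here is a small Python sketch that counts how many votes a cluster can lose and still keep a majority, with and without a witness. It is a static model under stated assumptions: one vote per node, one vote for the witness, and no dynamic quorum or dynamic witness adjustments. The function name is hypothetical.

```python
def tolerated_vote_losses(nodes, witness=False):
    """Return how many votes can be lost while a strict majority remains.

    Assumes one vote per node plus one for the witness (static model).
    """
    total = nodes + (1 if witness else 0)
    majority = total // 2 + 1
    return total - majority


# Columns: node count, losses tolerated without a witness, with a witness.
for nodes in (2, 3, 4):
    print(nodes, tolerated_vote_losses(nodes), tolerated_vote_losses(nodes, witness=True))
```

In this model the witness raises the tolerated losses for 2 and 4 nodes but not for 3, which is why the witness is usually described as a tie-breaker for even node counts.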
How does automatic failover of the primary in an AG on Windows involve both SQL and WSFC?
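It’s a cooperation: SQL Server reports instance and AG health, and WSFC applies the resource failover policy to move the AG resource. Automatic failover also requires a synchronous-commit secondary that is configured for automatic failover and is synchronized. On Windows, every replica of an AG runs on a node of the same WSFC cluster.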
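The SQL Server side of that cooperation can be sketched as a precondition check. This is an illustrative model only: the Replica class and can_auto_failover function are hypothetical, the string values are modeled on the mode and state descriptions SQL Server reports, and the WSFC side (quorum, failure thresholds on the role) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    availability_mode: str  # "SYNCHRONOUS_COMMIT" or "ASYNCHRONOUS_COMMIT"
    failover_mode: str      # "AUTOMATIC" or "MANUAL"
    sync_state: str         # e.g. "SYNCHRONIZED" or "SYNCHRONIZING"


def can_auto_failover(primary, secondary):
    """Return True if the pair meets the SQL-side automatic failover conditions."""
    return (
        primary.availability_mode == "SYNCHRONOUS_COMMIT"
        and secondary.availability_mode == "SYNCHRONOUS_COMMIT"
        and primary.failover_mode == "AUTOMATIC"
        and secondary.failover_mode == "AUTOMATIC"
        and secondary.sync_state == "SYNCHRONIZED"
    )


primary = Replica("node1", "SYNCHRONOUS_COMMIT", "AUTOMATIC", "SYNCHRONIZED")
sync_partner = Replica("node2", "SYNCHRONOUS_COMMIT", "AUTOMATIC", "SYNCHRONIZED")
dr_replica = Replica("node3", "ASYNCHRONOUS_COMMIT", "MANUAL", "SYNCHRONIZING")

print(can_auto_failover(primary, sync_partner))
print(can_auto_failover(primary, dr_replica))
```

Even when this check passes, the failover only happens if WSFC still has quorum and the role's failover policy allows another move.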
Why is the listener usually a cluster resource, and what do clients use on multi-subnet deployments?
The listener is typically a cluster resource (network name + IP(s)) so it moves with the AG primary. Multi-subnet deployments register one IP per subnet; clients set MultiSubnetFailover=True so the driver tries the listener’s IP addresses in parallel instead of waiting on each one in turn.
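As an illustration, a client connection string for a multi-subnet listener might be assembled like this. The sketch only builds the string and does not connect. It assumes the Microsoft ODBC Driver for SQL Server, where the keyword takes Yes/No (SqlClient uses True/False); the listener name, database name, and helper function are hypothetical.

```python
def listener_connection_string(listener, database, port=1433):
    """Build an ODBC-style connection string that targets an AG listener."""
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",
        "Server": f"tcp:{listener},{port}",
        "Database": database,
        "Trusted_Connection": "Yes",
        # Ask the driver to try all listener IPs in parallel.
        "MultiSubnetFailover": "Yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical listener and database names, for illustration only.
print(listener_connection_string("ag-listener.example.com", "SalesDb"))
```

Clients should connect to the listener name rather than to an individual node, so the same string keeps working after a failover.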
Contrast FCI vs AG using storage and WSFC’s role.
FCI: WSFC moves the instance to a node that sees the same disks (shared storage). AG: WSFC does not replicate data; it orchestrates roles and the listener while SQL Server ships and redoes the log, and each node’s SQL Server has its own storage for AG databases (shared-nothing replicas).
Give a panel-ready one-liner: WSFC vs SQL Server for Always On.
“WSFC is the Windows cluster — membership, quorum, and moving clustered roles. SQL Server still does replication and redo for AG; WSFC orchestrates which node owns the AG/primary and the listener.”
Give a panel-ready contrast: FCI vs AG on the same WSFC.
“FCI is shared-storage instance failover; AG is shared-nothing replicas — same WSFC, different SQL HA pattern.”
Name high-level operational concerns: AD, patching, and cloud.
AD: traditional WSFC for SQL is usually domain-joined; CNO/VCO and DNS depend on AD health (workgroup clusters exist but are uncommon in enterprise SQL stories). Patching/reboots: drain roles or fail over deliberately, and confirm the remaining votes still form quorum while a node is down. Cloud (e.g. AWS): witness placement, AZ layout, and stretch networking still use the same quorum ideas; platform detail lives in your cloud note.
When adding or restoring a node, what order do you think in (WSFC first vs SQL)?
Touch WSFC first (membership, quorum), then SQL (Always On, endpoint, replica) — your add-node/restore-node notes spell out the steps.