Hey Checkyourlogs fans,
Over the past few weeks I had the pleasure of working with a customer that had recently deployed Windows Server 2019 and Storage Spaces Direct. With any early deployment we expect to hit some bumps in the road, and we found a good one this week.
Microsoft has identified a bug in the SDDC Management Resource inside Failover Clustering. When Windows Admin Center makes calls to this resource, it times out, which causes the RHS (Resource Hosting Subsystem) process to terminate. That in turn makes the running Highly Available Virtual Machines in the cluster crash and restart on other nodes. It is a hard outage for the Virtual Machines and causes many problems, as you can imagine.
To be clear, the SDDC Management Resource is what Windows Admin Center uses to work with Storage Spaces Direct.
You can see what is happening here in the output from Get-ClusterLog -UseLocalTime, run from one of the Storage Spaces Direct nodes. Further down in the log, you can see the Cluster Roles (Virtual Machines) crashing, moving between nodes, and eventually restarting.
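If you want to check your own cluster for the same pattern, a minimal sketch along these lines may help. It assumes the FailoverClusters module is installed and that you run it on a cluster node. The destination folder, the 60-minute window, and the search string are illustrative choices, not values from the original case; the exact wording of the timeout entries in your log may differ.

```powershell
# Sketch only: collect recent cluster logs and search them for the SDDC Management resource.
Import-Module FailoverClusters

# Hypothetical output folder; pick any path with enough free space.
$destination = "C:\Temp\ClusterLogs"
New-Item -ItemType Directory -Path $destination -Force | Out-Null

# Generate one log per node, in local time, covering the last 60 minutes (assumed window).
Get-ClusterLog -Destination $destination -UseLocalTime -TimeSpan 60 | Out-Null

# Search the generated logs for lines that mention the resource by name.
Get-ChildItem -Path $destination -Filter *.log |
    Select-String -Pattern "SDDC Management" -SimpleMatch |
    Select-Object -First 50 Filename, LineNumber, Line
```

Widen the -TimeSpan value if the crash happened earlier than the last hour.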
This is a different issue from the one discovered previously, which could be fixed by setting the SDDC Management Resource for Windows Admin Center to run in a separate resource monitor. For that earlier issue, you would run:
(Get-ClusterResource -Name "SDDC Management").SeparateMonitor = 1
In the past, this had fixed the issue.
Microsoft has now confirmed that a fix is coming around next week, with an ETA of January 20th to 21st. Until then, they have recommended that we stop the SDDC Management Resource. In essence, this will kill your hyper-converged Storage Spaces Direct management via Windows Admin Center until the hotfix is applied and the SDDC Management Resource is restarted.
Get-ClusterResource "SDDC Management"
Get-ClusterResource "SDDC Management" | Stop-ClusterResource
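As a slightly fuller sketch of the same workaround, the snippet below checks the resource, takes it offline, and confirms the result. It assumes the FailoverClusters module is available and that the resource carries the default name "SDDC Management". The commented-out restart line is my assumption about how you would bring it back once the hotfix is installed; it is not a step Microsoft spelled out.

```powershell
# Sketch only: stop the SDDC Management resource and confirm it is offline.
Import-Module FailoverClusters

$resourceName = "SDDC Management"   # default resource name; adjust if yours differs

# Show the current state and owner before changing anything.
Get-ClusterResource -Name $resourceName |
    Format-Table Name, State, OwnerNode, OwnerGroup

# Take the resource offline.
Stop-ClusterResource -Name $resourceName

# Confirm the new state.
(Get-ClusterResource -Name $resourceName).State

# After the hotfix is installed on every node, bring the resource back online:
# Start-ClusterResource -Name $resourceName
```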
So, for now, folks, it is best to stop using Windows Admin Center with Storage Spaces Direct on Windows Server 2019 until next week. It hurts me to have to say this, but it is the only fix out there for this issue right now.
It is unclear at this time if the issue impacts Windows Server 2016 SDDC Management Resources.
I hope this helps save you some pain with your Storage Spaces Direct Clusters.
Dave
What about a Windows Server 2016 S2D cluster?
I found your site searching for a similar issue. I am managing a new Windows Server 2019 Essentials server from a Windows 10 Notebook. In Admin Center I click “storage” on this server and it reboots. Event viewer says the computer restarted from a bugcheck.
Hi. We are now half a year further on. Did Microsoft publish a fix, and if so, which KB should we use?
We ran into the same issue on Hyper-V 2019, resulting in heavy instability in the cluster.
Yes, it seems good now. Just make sure you are on the latest Servicing Stack and CUs.