Add Environment Validator TSG: AzStackHci_Software_IsNotPartofDomain (Domain Membership) #303
…(Domain Membership) Adds a public remediation guide for the pre-deployment Software validator AzStackHci_Software_IsNotPartofDomain (display name "Domain Membership"). The check fails when a machine is already joined to an Active Directory domain before deployment; Azure Local requires each machine to start in a workgroup and joins it to the domain itself during deployment. The TSG covers detection (the deployment Validation step, the targeted validator Invoke-AzStackHciSoftwareValidation -Include Test-IsNotPartofDomain, and the on-machine Event ID 17205), where the failure appears, the affected-machine detail line, the consequence, the remediation (unjoin with Remove-Computer -UnjoinDomainCredential and restart), and verification. The check name, display name, severity, description, the failure and success detail strings, and the remediation text are taken from the validator source. The guidance was validated end to end on a live lab cluster (baseline workgroup, inject a domain join, confirm the real check reports FAILURE with the production signature, run the documented unjoin and restart, confirm the check returns to SUCCESS). Tracked by ADO 38564291. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR adds a new public troubleshooting guide (TSG) for the Environment Validator check AzStackHci_Software_IsNotPartofDomain (“Domain Membership”) and indexes it in the EnvironmentValidator README, improving self-service remediation for pre-deployment failures caused by nodes being domain-joined.
Changes:
- Adds `Troubleshooting-Software-IsNotPartofDomain.md`, documenting symptom location, remediation (unjoin + reboot), and re-validation steps.
- Updates `TSG/EnvironmentValidator/README.md` to include the new TSG in the list.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| TSG/EnvironmentValidator/Troubleshooting-Software-IsNotPartofDomain.md | New TSG for the “Domain Membership” validator failure, including remediation and verification steps. |
| TSG/EnvironmentValidator/README.md | Adds an index entry pointing to the new TSG. |
- Add a pre-unjoin step to confirm a working local administrator sign-in before Remove-Computer + restart, so an operator is not locked out of a previously domain-joined machine (review finding, MEDIUM).
- Reframe the single-validator instruction: -Include runs only this check; drop the inaccurate "excluded from the default Software run" claim. A bare Invoke-AzStackHciSoftwareValidation runs all checks; the exclude lives only in the deployment orchestrator (Test-AzStackHciSoftware) and is conditional.
- Use Restart-Computer -Force in the remediation to avoid a hang.
- README: surface the "Domain Membership" display name in the link text.
- Related: add the canonical Learn deployment-local-identity link.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addressed the review feedback (commit b598478):
Merge note: findings 1 and 4 change on-box commands, so per our embedded-test standard the live VM-cluster loop will be re-run (the |
Re-validation complete: the domain-membership loop re-validated Grade A on a live masonenode VM cluster (Azure Local build 2607) on 2026-06-25.
This re-exercises the two material changes from the review mitigations: the
Drop -Force from the Remove-Computer unjoin command; the restart is already explicit via Restart-Computer -Force. Make the "Azure Local deployment prerequisites" Related reference a clickable Markdown link. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add prompt guidance before the unjoin command (the credential dialog and the confirmation prompt now shown after -Force was removed), equate "machine" and "node" once in the Overview, and use the same illustrative node name (AzL-Node-01) in the verify-step detail line for consistency. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1008covingtonlane left a comment
Re-reviewed at 79d4026 (unchanged since the last pass). Still clean: the three customer-understandability items from the earlier review are in place (the expected credential/confirm prompts in the unjoin step, the machine/node equivalence note, and the consistent AzL-Node-01 example), the structure matches the sibling EnvironmentValidator TSGs, and there is no internal or PII content. No new findings. Ready for maintainer review.
…light From a 10-persona usability read of this TSG, the highest-leverage single change: a top 'Before you start' box that (a) routes ownership (customer AD/server-admin or deployment partner; not a network task; not a hardware-vendor/OEM issue), (b) gates scope with an explicit STOP if the machine is already a deployed cluster member (this is a pre-deployment check), and (c) foregrounds the restart + local-sign-in requirement before unjoin. Resolves the majority of the personas' 'wants improved' comments (not-network / not-OEM ownership, deployed-member prohibition, local-login proof) without changing the remediation steps. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Independent validation — verified accurate
I independently validated this TSG against the validator behavior and on a live lab cluster node. It's accurate and the remediation is correct.
Confirmed:
- Check name `AzStackHci_Software_IsNotPartofDomain`, display name Domain Membership, severity Critical.
- On a domain-joined node, `Invoke-AzStackHciSoftwareValidation -Include Test-IsNotPartofDomain -PassThru` returns `AzStackHci_Software_IsNotPartofDomain` / FAILURE / Critical with the exact detail string the doc quotes (`'<NODE>' is part of a domain. Please remove '<NODE>' from the domain.`); a workgroup node returns the documented success string.
- The remediation (`Remove-Computer -UnjoinDomainCredential ... -Force` + restart) matches the validator's own recommended remediation text.
- The Event ID 17205 / `AzStackHciEnvironmentChecker` log read is correct.
Minor (non-blocking): Remove-Computer is Windows PowerShell 5.1 only (removed in PowerShell 7+) — correct for the node context here; just noting in case the snippet is ever modernized.
Thanks for closing this gap.
@AlBurns-MSFT thank you for the independent validation, especially confirming the check name, the exact FAILURE detail string, the One small process note: your validation came through as a Comment review rather than an Approve, so GitHub still reports the PR as |
Reviewers (deep-systems persona) noted that Remove-Computer unjoins the machine but leaves its computer account in Active Directory. Add a note in step 3 that the object remains until an AD admin removes it: harmless for deployment, but worth cleaning up, especially if the machine will not rejoin the domain. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
What this adds
A public remediation TSG for the pre-deployment Environment Validator check AzStackHci_Software_IsNotPartofDomain (display name Domain Membership), plus its entry in the EnvironmentValidator README.
The check fails when a machine is already joined to an Active Directory domain before deployment. Azure Local requires each machine to start in a workgroup, and the deployment process joins it to the domain itself. There was no public remediation guide for this validator.
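For readers who want to see the condition directly on a machine, a minimal sketch of a local domain-membership check follows. It assumes Windows PowerShell on the machine itself and uses the standard `Win32_ComputerSystem` CIM class; it is an illustration only, not the validator's own implementation, and the output property names (`Machine`, `DomainOrWorkgroup`) are chosen here for readability.

```powershell
# Illustrative check only; this is NOT how the validator itself is implemented.
# Win32_ComputerSystem.PartOfDomain is $true when the machine is domain-joined.
$cs = Get-CimInstance -ClassName Win32_ComputerSystem

[pscustomobject]@{
    Machine           = $cs.Name
    PartOfDomain      = $cs.PartOfDomain
    # Holds the domain name when joined, otherwise the workgroup name.
    DomainOrWorkgroup = $cs.Domain
}
```

A machine that is ready for deployment would be expected to show `PartOfDomain` as `False`.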
What the TSG covers
- `Invoke-AzStackHciSoftwareValidation -Include Test-IsNotPartofDomain`, and the on-machine Event ID 17205, with the exact failure detail line.
- `Remove-Computer -UnjoinDomainCredential` and restart, then re-validate. This is the remediation the validator itself recommends.

Accuracy and validation
Tracked by ADO 38564291. Follows the same structure as the System Drive Free Space TSG (#302).
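The remediation flow discussed in this thread can be sketched roughly as below. This is a hedged outline, not the TSG text: it assumes Windows PowerShell 5.1 on the affected machine (`Remove-Computer` is not available in PowerShell 7+), that the machine is not yet a deployed cluster member, and that a working local administrator sign-in has already been confirmed. The guard on `PartOfDomain` is an addition for illustration.

```powershell
# Sketch only: run on the affected machine BEFORE deployment.
# Assumption: a local administrator sign-in has been verified, because domain
# accounts will no longer be able to sign in after the unjoin.
$cs = Get-CimInstance -ClassName Win32_ComputerSystem

if ($cs.PartOfDomain) {
    # Get-Credential prompts for a domain account permitted to unjoin the machine.
    # Without -Force, Remove-Computer also asks for confirmation before proceeding.
    Remove-Computer -UnjoinDomainCredential (Get-Credential)

    # The unjoin takes effect after a restart; -Force avoids waiting on open sessions.
    Restart-Computer -Force
}
else {
    Write-Output "$($cs.Name) is not domain-joined; no unjoin is needed."
}
```

After the restart, signing in with the local administrator account and re-running `Invoke-AzStackHciSoftwareValidation -Include Test-IsNotPartofDomain -PassThru` should show whether the check has returned to SUCCESS. The machine's computer account remains in Active Directory until an AD administrator removes it.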