I was troubleshooting an issue for a customer where Operating System Deployment (OSD) was failing intermittently at one of their remote offices.
Working from head office, I had no information at this point beyond the intermittent nature of the issue, so I created a Status Message Query to see what I could learn from the build process. An example of the query I used is below; DEV2001D is the deployment ID of the task sequence being used.
select SMS_StatusMessage.*, SMS_StatMsgInsStrings.*, SMS_StatMsgAttributes.*, SMS_StatMsgAttributes.AttributeTime from SMS_StatusMessage left join SMS_StatMsgInsStrings on SMS_StatMsgInsStrings.RecordID = SMS_StatusMessage.RecordID left join SMS_StatMsgAttributes on SMS_StatMsgAttributes.RecordID = SMS_StatusMessage.RecordID where SMS_StatMsgAttributes.AttributeID = 401 and SMS_StatMsgAttributes.AttributeValue = "DEV2001D" and SMS_StatMsgAttributes.AttributeTime >= ##PRM:SMS_StatMsgAttributes.AttributeTime## order by SMS_StatMsgAttributes.AttributeTime DESC
By filtering the query results in the “Status Message Viewer” to the name of the system in question, I was able to see all the status messages returned by that specific client during the build process.
An error was being reported after the first “Install Application” task was attempted; the error description is:
“The task sequence failed to install application XXXXX in the group () with exit code 615. The operating system reported error 615: The password provided is too short to meet the policy of your user account. Please choose a longer password”
The application itself was not at fault: other systems in the same office were installing it successfully from the same OSD task sequence.
I asked for a copy of the logs for the client in question and within the DataTransferService.log spotted the following error:
“Failed to send request to /Content_Path at host DPName.domain.com, error 0x2f8f”
0x2f8f (12175 in decimal) translates to “A security error occurred”.
The next line in the log states:
“[CCMHTTP] ERROR: URL =https://DPName.domain.com:443/Content_Path, Port=443, Options=63, Code=12175, Text=ERROR_WINHTTP_SECURE_FAILURE”
The key details I spotted here were Options=63, which means that CRL checking is enabled, and the text ERROR_WINHTTP_SECURE_FAILURE.
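When scanning a large DataTransferService.log it can help to pull these fields out programmatically. The following is only a minimal Python sketch, not something used in the original investigation: the function name `parse_ccmhttp_fields` is my own, and it assumes the error line has the comma-separated key=value shape quoted above. It also confirms that the hexadecimal error 0x2f8f and Code=12175 are the same value.

```python
import re

# The [CCMHTTP] line as quoted above (host and path are the post's placeholders).
LOG_LINE = (
    "[CCMHTTP] ERROR: URL =https://DPName.domain.com:443/Content_Path, "
    "Port=443, Options=63, Code=12175, Text=ERROR_WINHTTP_SECURE_FAILURE"
)

# Assumption: fields are "Key=Value" pairs separated by commas.
FIELD_PATTERN = re.compile(r"(\w+)\s*=\s*([^,]+)")


def parse_ccmhttp_fields(line):
    """Return the key=value pairs of a [CCMHTTP] error line as a dict of strings."""
    return {key: value.strip() for key, value in FIELD_PATTERN.findall(line)}


fields = parse_ccmhttp_fields(LOG_LINE)

# The hex error from the previous log line and the decimal Code are the same number.
print(hex(int(fields["Code"])))      # 0x2f8f
print(fields["Options"], fields["Text"])
```

The sketch only extracts the values; interpreting Options still relies on knowing what the client sets for CRL checking, as described above.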
The environment is running SCCM build 1511 and the site is configured for HTTPS only, with “Clients check the certificate revocation list (CRL) for site systems” enabled. This suggests that certificate validation sometimes fails from this remote site.
Checking an installed client certificate, I could see that its CRL path refers to a URL that is part of a split-brain DNS configuration. From the failed client machine it was not possible to access this URL, which was resolving incorrectly to a public IP address.
Due to an internal DNS misconfiguration some DNS servers were resolving the CRL path to an external IP address. This meant that the internal client in question was unable to download the CRL and therefore unable to validate the certificate presented by the distribution point.
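A quick way to spot this kind of split-brain problem from an affected client is to resolve the CRL host and check whether any of the answers fall outside private address space. The Python sketch below is illustrative only: the function names are my own, and it assumes the internal CRL endpoint lives on private (RFC 1918 style) addresses, which may not hold in every environment.

```python
import ipaddress
import socket


def resolve_host(hostname):
    """Return the sorted, de-duplicated IP addresses the local resolver gives."""
    infos = socket.getaddrinfo(hostname, None)
    # Drop any IPv6 scope suffix such as "%eth0" before returning the address.
    return sorted({info[4][0].split("%")[0] for info in infos})


def external_addresses(addresses):
    """Return the addresses that are NOT in private (internal) ranges.

    Assumption: internal hosts use private address space.
    """
    return [a for a in addresses if not ipaddress.ip_address(a).is_private]


# Example with literal addresses (no DNS lookup involved):
# one internal-looking answer and one public answer.
print(external_addresses(["10.20.30.40", "8.8.8.8"]))  # ['8.8.8.8']
```

On a client, `external_addresses(resolve_host("crl.example.com"))` would return a non-empty list if the resolver handed back a public address; `crl.example.com` is a hypothetical stand-in for the host name found in the certificate's CRL distribution point. Running the same check against each internal DNS server is what identifies the misconfigured ones.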
Once internal DNS resolution was corrected, the issue was resolved and OSD began functioning as expected.