We recently had an issue after configuring Azure Backup against multiple Linux VMs: a few of them were failing to back up. In the backup job details in the Azure portal, the failures were showing:
Error Code | GuestAgentPluginProcessingError |
Error Message | VM agent failed in processing Extension command. |
Recommended Action | Please restart the VM Agent service |
Restarting the VM agent service had no effect, and I noticed the snapshot extension had a status of ‘Transitioning’, whereas working VMs had a status of ‘Provisioning Succeeded’.
Attempting to uninstall the extension failed with a timeout error:
"statusMessage": "{\"status\":\"Failed\",\"error\":{\"code\":\"ResourceOperationFailure\",\"message\":\"The resource operation completed with terminal provisioning state 'Failed'.\",\"details\":[{\"code\":\"VMExtensionProvisioningTimeout\",\"message\":\"Provisioning of VM extension 'VMSnapshotLinux' has timed out. Extension installation may be taking too long, or extension status could not be obtained.\"}]}}",
Following this, I raised the issue with Microsoft and they recommended upgrading the agent to the latest version, but this was only possible by manually pulling it from GitHub. The latest version was 2.2.25, so I ran the following commands to download, unzip and install it:
cd /var/lib/waagent
wget https://github.com/Azure/WALinuxAgent/archive/v2.2.25.zip
unzip v2.2.25.zip
cd WALinuxAgent-2.2.25
sudo python setup.py install
sudo service walinuxagent restart
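Before and after an upgrade like this, it can help to confirm which agent version is actually installed. The following is a minimal sketch, not an official procedure: the helper names version_ge and parse_agent_version are hypothetical, and the parsing assumes the first line of `waagent --version` looks like "WALinuxAgent-2.2.25 running on ...", which may differ between agent builds.

```shell
#!/usr/bin/env bash
# Sketch: compare the installed agent version against a target version.

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# parse_agent_version: reads `waagent --version` output on stdin and prints
# the version number. ASSUMPTION: the first line has the form
# "WALinuxAgent-<version> running on ...".
parse_agent_version() {
    sed -n '1s/^WALinuxAgent-\([0-9][0-9.]*\).*/\1/p'
}

target="2.2.25"
if command -v waagent >/dev/null 2>&1; then
    current="$(waagent --version 2>/dev/null | parse_agent_version)"
    if [ -n "$current" ] && version_ge "$current" "$target"; then
        echo "Agent $current is at or above $target"
    else
        echo "Agent version '${current:-unknown}' is below $target or could not be read"
    fi
else
    echo "waagent not found on PATH"
fi
```

If the parsed version comes back empty, the output format on that distribution probably differs from the assumption above, and reading the raw `waagent --version` output directly is the safer check.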
Unfortunately the issue remained, so our next step was to clean up the agent directory from previous install attempts. After listing the contents of the /var/lib/waagent directory, you can use some of the commands below to remove any old extension references that may be causing conflicts.
rm -f /var/lib/waagent/Prod.*.agentsManifest
rm -f /var/lib/waagent/Microsoft.Azure.RecoveryServices.VMSnapshotLinux.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.OSTCExtensions.LinuxDiagnostic.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.Extensions.CustomScript.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.OSTCExtensions.VMAccessForLinux.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.NetworkWatcher.NetworkWatcherAgentLinux.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.OSTCExtensions.CustomScriptForLinux.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.RecoveryServices.SiteRecovery.Linux.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.Diagnostics.LinuxDiagnostic.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.RecoveryServices.SiteRecovery.LinuxRHEL7.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.Extensions.LinuxAsm.[0-9]*.manifest.xml
rm -f /var/lib/waagent/Microsoft.Azure.RecoveryServices__VMSnapshotLinux__*.zip
rm -rf /var/lib/waagent/Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9111.0
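Because these deletions are destructive, a dry run first can be reassuring. The sketch below wraps the same idea in a hypothetical helper, cleanup_waagent, which only lists matches unless --delete is passed. Note the assumptions: its patterns are broader than the individual commands above (any Microsoft.*.[0-9]*.manifest.xml file and any VMSnapshotLinux-<version> directory, rather than the specific names listed), so review the preview output before deleting anything.

```shell
#!/usr/bin/env bash
# cleanup_waagent DIR [--delete]
# Lists (default) or deletes (--delete) stale agent artifacts under DIR.
# ASSUMPTION: the globs below are a generalisation of the explicit rm
# commands in this post, not a list taken from Microsoft guidance.
cleanup_waagent() {
    local dir="$1"
    local mode="${2:-}"
    local path
    for path in \
        "$dir"/Prod.*.agentsManifest \
        "$dir"/Microsoft.*.[0-9]*.manifest.xml \
        "$dir"/Microsoft.Azure.RecoveryServices__VMSnapshotLinux__*.zip \
        "$dir"/Microsoft.Azure.RecoveryServices.VMSnapshotLinux-[0-9]*
    do
        # An unmatched glob is left as a literal pattern; skip it.
        [ -e "$path" ] || continue
        if [ "$mode" = "--delete" ]; then
            rm -rf -- "$path"
            echo "removed: $path"
        else
            echo "would remove: $path"
        fi
    done
}

# Example (run from a root shell):
#   cleanup_waagent /var/lib/waagent            # preview only
#   cleanup_waagent /var/lib/waagent --delete   # actually remove
```

Defaulting to a preview keeps an accidental invocation harmless, and files that do not match the patterns, such as the agent's own configuration, are left alone.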
Once these commands had been executed and the agent restarted, I initiated a full backup of the VM. This triggered installation of the new version of the VMSnapshotLinux extension, which now provisioned successfully, and the backup job completed.