Backups not running at scheduled times but manual consistency checks work

I wasn't sure where to post this, as the problem is affecting everything, but we mostly back up Hyper-V. Nothing has been backed up since 14th August, and no alerts appeared in the monitoring area in that period to indicate that backups weren't running.

What I have seen is a warning alert that the database had reached its 2GB size threshold. Even though there was enough disk space on the database server, it looks like DPM stopped trying to write to the database because this limit had been reached. The DPM Alerts Windows event log didn't have any events since the 17th, and the DPM Backup Events log didn't have anything since the 14th. As this is a configurable threshold, I increased it to 3GB and performed a consistency check on some VMs that hadn't been backed up since 14th August. These backups worked, and new events were written to the DPM event logs as normal.

I expected recovery points to be created on schedule after this, as normal. However, even though the consistency checks I initiate manually work, DPM doesn't appear to even try to back anything up at the times scheduled in the PG config.

August 26th, 2015 6:16am

Hi,

DPM jobs are initiated by SQL Server Agent, so make sure the SQL Server Agent service is running. Check the Application event log for events from SQLServerAgent on the system hosting the DPM database.

August 26th, 2015 2:00pm

I found the SQL Server Agent jobs consistently failing on step 1.

The SQL Agent log has a lot of entries like the following:
Message
[136] Job 90870dc4-532f-4baf-b584-14d218d96ea2 reported: The process could not be created for step 1 of job 0x8E8FEACF379F1C4BA9F61E1C02F4A1C9 (reason: %1 is not a valid Win32 application)

The agent is running the command:
c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe <GUID here> <FQDN of backup server>

I can't see any NTFS errors in the System event log. What might be causing this?

Edit: I restored TriggerJob.exe from a backup taken a couple of days before the problem started and retried a failing job, but got exactly the same error.

August 27th, 2015 4:56am

  • Edited by SysAdminITL Thursday, August 27, 2015 12:41 PM

Hi,

If you open an administrative command prompt and run c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe <GUID here> <FQDN of backup server> manually, does that succeed?

August 27th, 2015 2:26pm

Don't know why I didn't think of trying that yesterday. Oddly, copying the command to an admin command prompt just returned me to the prompt after a second without any output. I'm not sure what it was supposed to do or how to tell whether it worked, but I can't see any errors from that time. However, it only did this after I wrapped the full path to TriggerJob.exe in quotes (because of the spaces in the path); the job step did not have these quotes. The job worked after I made the same change in the job step.

It appears, then, that all the SQL Agent jobs are failing because of this. Why have they changed? Is there any way to update them all, or am I going to have to make the same change to each one manually (there are dozens!)?

Edit: For anyone with the same issue I updated the job steps by running the following:

begin transaction
update msdb.dbo.sysjobsteps
set command = replace(command, 'c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe', '"c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe"')
where command like '%TriggerJob%'
  and command not like '"%' -- skip steps that are already quoted so the update can be re-run safely
commit transaction
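
One thing to watch with a bulk string replace like this is that it is only safe to run once unless commands that are already quoted get skipped; a second pass would wrap the path in another pair of quotes. A minimal Python sketch of the idea, for illustration only (the function name quote_executable and the sample GUID/server values are made up, and the comparison here is case-sensitive, unlike most SQL Server collations):

```python
def quote_executable(command: str, exe_path: str) -> str:
    """Wrap exe_path in double quotes inside command, unless it is already quoted."""
    quoted = f'"{exe_path}"'
    if quoted in command:
        # Already quoted: return the command unchanged so repeated runs are harmless.
        return command
    return command.replace(exe_path, quoted)


# Hypothetical job step command, modelled on the one in this thread.
EXE = r'c:\Program Files\Microsoft Data Protection Manager\DPM2012R2\SQLPrep\TriggerJob.exe'
step = EXE + ' 00000000-0000-0000-0000-000000000000 dpm01.example.local'

once = quote_executable(step, EXE)
twice = quote_executable(once, EXE)

# A naive replace applied twice nests the quotes instead.
naive_twice = step.replace(EXE, f'"{EXE}"').replace(EXE, f'"{EXE}"')
```

Here once and twice are identical, while naive_twice starts with two double quotes, which is the failure mode to avoid when re-running the update.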

August 28th, 2015 4:38am

  • Edited by SysAdminITL Friday, August 28, 2015 10:26 AM

Hi,

SQL Agent does not require quotes around the path to TriggerJob.exe the way a command prompt does. You should be able to run a SQL job simply by right-clicking it and selecting "Start Job at Step...", and then look at the running job in DPM.

To associate a SQL job number with DPM, you can run the query below in SQL Server Management Studio.

use DPMDB --change db name accordingly
select 
      sche.ScheduleId as 'SQL agent Schedule Job Name', 
      sche.JobDefinitionId,
      prot.FriendlyName as 'Protection Group' ,
      case
            when jobd.Type = 'C9B259D2-6402-486D-8E36-C6C1ADAE0912' then 'Maintenance job that runs @ midnight'
            when jobd.Type = '3D859D8C-D0BB-4142-8696-C0D215203E0D' then 'Synchronization (file/volume) / Express Full (application)'
            when jobd.Type = '84021B5E-B4DC-9B27-2B7E-3B99BB1225FF' then 'Volume/Share/System State Recovery Point'
            when jobd.Type = '913afd2d-ed74-47bd-b7ea-d42055e5c2f1' then 'Backup to tape (D-T)'
            when jobd.Type = 'B5A3D25C-8EB2-4032-9428-C852DA5CE2C5' then 'Backup to tape (D-D-T)'
            when jobd.Type = 'C4CAE2F7-F068-4A37-914E-9F02991868DA' then 'Consistency Check'
            when jobd.Type = '5ECC82D0-3475-4E81-8ADD-55B1C1D23DB1' then 'Sharepoint catalog generation'
            when jobd.Type = '6E7C76F4-A832-4418-A772-8E58FD7466CB' then 'Azure Online backup'
      end as Operation,
      jobd.Type as VerbID
from tbl_SCH_ScheduleDefinition sche
left join dbo.tbl_JM_JobDefinition jobd
join tbl_IM_ProtectedGroup prot
on jobd.ProtectedGroupId = prot.ProtectedGroupId
on sche.JobDefinitionId = jobd.JobDefinitionId
where sche.IsDeleted = '0' and jobd.ProtectedGroupId is not null
order by prot.FriendlyName

 

August 28th, 2015 2:43pm

Further investigation revealed that another application installed on the DPM server host had created a folder called 'Program' in the root of the C drive. This folder caused problems launching applications under Program Files: any path to an executable under Program Files that was not wrapped in quotes would resolve to C:\Program first, which is not a valid Win32 executable. That is what was causing our SQL Agent jobs to fail.

Although wrapping the path to TriggerJob.exe in quotes in the SQL Agent jobs gave us a workaround, removing the C:\Program folder is the proper fix.

September 1st, 2015 4:42am

This topic is archived. No further replies will be accepted.
