Protection Agent failure error during backup to tape with DPM 2010.

DPM TechNet Forum,

We are experiencing an intermittent error when backing up to tape using DPM 2010 (with latest QFE roll-up applied).

"The operation failed because of a protection agent failure. (ID 998 Details: The device is not connected (0x8007048F))"

Any idea what could be causing this error?

Thanks in advance,

Joe

November 11th, 2013 7:17pm

Hi,

Basically, what is happening is the library / drive is disappearing from the system.  I would suspect that you will see event ID 12s in the system event logs stating something similar to <Library or drive Name> disappeared from the system without first being prepared for removal.

The error code 0X8007048f is returned from Windows and means ERROR_DEVICE_NOT_CONNECTED
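
For anyone wondering how 0x8007048F maps to a Windows error, here is a minimal Python sketch (my own illustration, not part of any DPM tooling) that splits the HRESULT into its fields. The function name is hypothetical. Facility 7 is FACILITY_WIN32, and the low 16 bits are the original Win32 error number, which you can then pass to net helpmsg.

```python
# Decode an HRESULT such as 0x8007048F into severity, facility and code.
# decode_hresult is an illustrative helper name, not a Windows or DPM API.

def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its three standard fields."""
    return {
        "failure": bool((hr >> 31) & 1),   # top bit set means failure
        "facility": (hr >> 16) & 0x7FF,    # 11-bit facility field (7 = FACILITY_WIN32)
        "code": hr & 0xFFFF,               # low 16 bits: the underlying error number
    }

parts = decode_hresult(0x8007048F)
print(parts)  # code comes out as 1167 (ERROR_DEVICE_NOT_CONNECTED)
```

Running `net helpmsg 1167` on the server should then print the matching "device is not connected" text.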

Generally, you can resolve this issue by updating the firmware of the drives and library along with the drivers.  Also check SCSI termination and cabling.

Hopefully, updating the firmware and drivers and checking the cabling and termination will resolve your issue.

November 11th, 2013 8:19pm

Mike,

Thanks for the reply!  I am only seeing Event ID 15s in the System Event Log.  The source is "hplto" and the error is "The device, \Device\TapeDrive0, is not ready for access yet".  The event posts when the backup to tape fails in DPM.  I seem to be able to re-run the backup jobs and the jobs write to the tape (originally used for the whole backup even though it is marked "Offsite Ready" - DPM Co-location is enabled).
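
To pull just those events out of the System log, something like the following sketch may help. It only builds a wevtutil command line. The provider name "hplto" and event ID 15 are taken from the messages quoted here, while the helper names and the default count of 20 are my own assumptions.

```python
# Build a wevtutil query for tape-driver events in the System log.
# Helper names are illustrative; wevtutil itself ships with Windows.

def build_event_query(provider: str, event_id: int) -> str:
    """Return an XPath filter matching one provider/event ID pair."""
    return f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"

def wevtutil_command(provider: str, event_id: int, count: int = 20) -> list:
    """Argument list: query the System log, newest first, as plain text."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{build_event_query(provider, event_id)}",
        f"/c:{count}", "/rd:true", "/f:text",
    ]

cmd = wevtutil_command("hplto", 15)
print(" ".join(cmd))
# On the DPM server itself you could run it with:
#   import subprocess; subprocess.run(cmd, check=True)
```

Comparing the event timestamps against the failed DPM tape jobs shows whether every failure lines up with a driver event.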

The autoloader is an HP StorageWorks 1x8 G2 SAS autoloader with an LTO5 drive.

Thoughts on the next steps?

Thanks in advance,

Joe

November 13th, 2013 5:03pm

Hi,

The drive may be on its way out - if the library is under warranty or service contract, a call to HP is in order.

November 13th, 2013 6:11pm

Mike,

A remote possibility...

However, I ran the HP Library and Tape Tools with assistance from HP Support (as the autoloader/drive is still under Maintenance).  No errors were detected in the reports I sent them.

I did notice, however, that the Windows drivers and firmware for the HP SAS controller were out of date.  I updated them today after verifying with HP that doing so would not cause problems with the existing DPM storage volumes.

I'll continue to monitor the operation of the autoloader and contact HP in the event the Event IDs persist.

Thanks,

Joe

November 13th, 2013 9:45pm

Well...we are still having intermittent Event ID 15 errors relating to the hplto.sys driver that reports "The device, \Device\TapeDrive0, is not ready for access yet".

Some days, the backups complete to a single tape while other days, the backup splits across an additional tape or 2.  For example, this past Monday and Tuesday, the backups completed on 1 tape per night (1.75TB).  However, last night's backup spanned across 2 tapes.

I've got all of the correct entries in the Registry as per Mike's blog post and other thread recommendations.  We even have a separate HP Smart Array P212 controller just for the tape drive.

However, I did see an old thread regarding the same issues:

http://social.technet.microsoft.com/Forums/en-US/944562b4-6943-46f2-b4d3-7c33bda647a0/problem-with-tape-and-event-id-15-and-11?forum=dpmtapebackuprecovery

The post from Leon79 indicated that the generic LTO driver seemed to cure all of the experienced issues.

I'd like to try that, if possible, on our DPM 2010 server (Windows Storage Server 2008 R2 Standard).  How can I replace the HP driver with the generic (Microsoft) LTO driver and test it?

Thanks in advance,

Joe

March 20th, 2014 3:55pm

Hi,

To use the Microsoft in-box LTO tape driver:

 1) Open Device Manager.
 2) Locate the tape drive.
 3) Right-click it and open Properties.
 4) Under the DRIVER tab, select UPDATE DRIVER.
   a) Select Install from a list or specific location (Advanced) - next.
   b) Select Don't search. I will choose the driver to install. - next.
   c) Uncheck the Show compatible hardware checkbox.
   d) Highlight LTO under the manufacturer.
   e) Highlight the LTO tape drive under model - then next.
   f) This should install the Microsoft ltotape.sys driver.
 5) Retry the tape backups.
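
To double-check which tape driver ended up installed after the swap, one option is to filter the output of driverquery /fo csv. The sketch below is only an illustration: the sample text is made up, the "Module Name" column header is what I would expect on an English-language system, and the helper name is hypothetical.

```python
import csv
import io

# Driver module names mentioned in this thread: HP's hplto.sys and
# Microsoft's in-box ltotape.sys.
TAPE_DRIVERS = {"hplto", "ltotape"}

def loaded_tape_drivers(driverquery_csv: str) -> list:
    """Return tape-driver module names found in `driverquery /fo csv` output."""
    reader = csv.DictReader(io.StringIO(driverquery_csv))
    return [row["Module Name"] for row in reader
            if row["Module Name"].lower() in TAPE_DRIVERS]

# Hypothetical sample output, for illustration only.
sample = (
    '"Module Name","Display Name","Driver Type","Link Date"\n'
    '"ltotape","LTO Tape Driver","Kernel","7/13/2009 8:19:37 PM"\n'
    '"disk","Disk Driver","Kernel","7/13/2009 7:19:57 PM"\n'
)
print(loaded_tape_drivers(sample))
```

On the server you would feed it the real output, for example by redirecting driverquery /fo csv to a file and reading that file in.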

March 20th, 2014 4:36pm

Mike,

Thanks for the reply!  (the link sent to my email address sent me to the German TechNet Forum page for this question - Ach du Lieber, mein Herr!)

I was in the right place and performing "almost" the right procedures. :-)

We had HP look at the tape drive and autoloader via the reports in the Library and Tape Tools.  They had previously replaced the autoloader but that didn't resolve the intermittent issue.  HP reported that the internal tape drive was 100% operational.  Since the backups will complete if "resumed" or if all of the Protection Group members are re-selected, I assume that I can rule out hardware (autoloader, tape drive and SAS cable).

This is worth a shot...

Thanks again and I'll report back to this thread on the outcome.

Joe

March 20th, 2014 4:47pm

Hello,

Did you manage to get anywhere with this?

We've got the exact same problem, no hardware problems whatsoever.

Thanks

brendan 

May 23rd, 2014 4:05pm

Brendan,

No...we haven't.  Sorry that I haven't kept current with this thread so here goes...

I've involved HP 2nd level Support and the Microsoft DPM Team (opened a Support ticket).  So far, our HP autoloader chassis, the internal tape drive and the SAS cable have been replaced.  I am using a new HP Smart Array P212 controller (Zero memory) dedicated to the autoloader but also tried the P212 controller that serves the OS and DPM storage pool arrays.  Still getting intermittent driver failures (Event ID 15 errors in the System Event log) and backups spanning up to 5 tapes!  So, in my opinion...no hardware issues.  HP even sent a field tech out to install a "loaner" memory module in the dedicated HP P212 controller.  No dice!

The HP Library and Tape Tools (when running a full tape test - 1.5TB) generated an error at about 645GB into the test.  The error reported was:

"Write operation failed with the OS error - 1167 the device is not connected"

I did swap the tape driver to use the "canned" LTO driver but that didn't resolve the issue.

The last communication I received (yesterday) from 2nd level HP Support was that there are some reports that applying the latest Proliant Support Packs and/or performing a clean install using the HP-provided server image would resolve the issue.

I am all about re-doing the server from scratch but will need to preserve the disk recovery points on the DPM Storage Pool and the DPM database.  I'm sure the DPM Team can assist with this.

I just want this issue resolved...it's a major PITA!

Joe

May 23rd, 2014 4:26pm

Hi Joe,

Not a problem, thanks for replying!

Same here, no real resolution but to re-install.

The problem seems to be with the database being corrupted, therefore the only way to solve the issue is a complete re-install.

Ball ache like no other, but we need backups!

Thanks

brendan

  • Proposed as answer by brendan88 Monday, June 09, 2014 2:56 PM
May 23rd, 2014 4:30pm

True that!

May 23rd, 2014 4:31pm

Hi,

Event ID 15 errors relating to the hplto.sys driver that reports "The device, \Device\TapeDrive0, is not ready for access yet".

"The operation failed because of a protection agent failure. (ID 998 Details: The device is not connected (0x8007048F))"

The error code 0X8007048f is returned from Windows and means ERROR_DEVICE_NOT_CONNECTED

None of the errors above reported by Windows or the tape drive are DPM related, so rebuilding DPM will not help resolve this issue.  DPM cannot continue writing to a device that does not exist.

May 23rd, 2014 4:35pm

Mike,

Understood...but neither HP 2nd level "Storage" Support, the HP Engineering Team or the Windows Server Team has any answer other than what I posted.  It is not my opinion...it's theirs...collectively (until I hear otherwise).

It is my opinion that "something" is stepping on the tape device driver and causing the disconnect.  We have yet to find out what that is.

Short of having multiple (i.e. more than 2 so far) tape systems to "swap out" to ensure our 2TB backups to tape complete...I don't know what else to do except refer to the experts. 

Thanks??


May 23rd, 2014 4:44pm

Hi,

If the problem is that prevalent, I would insist that HP support try to duplicate the issue in their lab. They can hook up an analyzer and monitor IO to the tape drive and see why it disappears.   

May 23rd, 2014 4:51pm

Mike,

You may want to consider letting the DPM support staff in on this.

It was them who said that the solution would be a complete re-install due to database corruption.

As I said before, same problems that Joe has experienced, no problems whatsoever with the tape drive, controller etc.

May 27th, 2014 9:16am

Hi Joe,

We're going to be performing the re-install soon, I'll let you know if it resolves our problems or not.

Thanks

brendan 

May 27th, 2014 9:18am

Hi,

You can try to reproduce the problem outside of DPM using some external utilities.  If you have more than one drive in the library, run the test against both drives simultaneously to simulate multiple backup jobs running.  If you get an error before the tape fills you can use net helpmsg errorcode to see what the error was.

Download the DPMerasetape.zip file from the following link and extract to c:\temp folder.


https://onedrive.live.com/?cid=b03306b628ab886f&id=B03306B628AB886F%21524&sc=documents


The utilities are not that user friendly, but here are the basics.

Always Stop DPMLA Service prior to running MCT.EXE Commands.

  NET STOP DPMLA

C:\> mct-x64.exe -p

Opening changer \\.\Changer0

     ********** Changer Parameters **********

         Number of Transport Elements : 1
         Number of Storage Elements : 50
         Number of Cleaner Slots : 0
         Number of of IE Elements : 0
         Number of NumberDataTransferElements : 6
         Number of Doors : 0

         First Slot Number : 0
         First Drive Number : 0
         First Transport Number : 0
         First IEPort number : 0
         First Cleaner Slot Address : 0

         Magazine Size : 0

         Drive Clean Timeout : 600

  Flags set for the changer :
         CHANGER_BAR_CODE_SCANNER_INSTALLED
         CHANGER_POSITION_TO_ELEMENT
         CHANGER_STORAGE_DRIVE
         CHANGER_STORAGE_SLOT
         CHANGER_DRIVE_CLEANING_REQUIRED
         CHANGER_VOLUME_IDENTIFICATION
         CHANGER_VOLUME_SEARCH
         CHANGER_SERIAL_NUMBER_VALID

 Changer can move from Slot to :
                 Slot
                 Drive


 Changer can move from Drive to :
                 Slot
                 Drive

 Changer is Capable of positioning transport to Slot.
 Changer is Capable of positioning transport to Drive.

C:\> mct-x64.exe -d

Opening changer \\.\Changer0
Product Data for Medium Changer device :
  Vendor Id    : STK
  Product Id   : L180
  Revision     : 030
  SerialNumber : 3077520000

For MCT utility we have the  -m [MOVE] command to move media around inside the library.

-m [ElemType-T] Transport# [ElemType-Source] S_lot#/D_rive# [ElemType-Destination] S_lot#/D_rive#

Get / view command syntax for m (move) command for changer 0

C:\>mct-x64 0 -m

Opening changer \\.\Changer0
MoveMedium : mct -m t N s\d N s\d N   [Where s/d means Slot or Drive and N is ZERO based].

 

Some Examples:

mct-x64 -m t 0 s 0 d 0    (Using transport-0, move media from slot-0  to drive-0)
mct-x64 -m t 0 d 0 s 0    (Using transport-0, move media from drive-0 to slot-0)
mct-x64 -m t 0 s 0 s 100  (Using transport-0, move media from slot-0  to slot-100)
mct-x64 -m t 0 d 0 d 1    (Using transport-0, move media from drive-0 to drive-1)
mct-x64 -m t 0 s 0 ie 0   (Using transport-0, move media from slot-0  to IEPort 0)
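
If you end up scripting a lot of moves, a small wrapper can keep the argument order straight. This is only a sketch that builds the command line in the form shown in the examples above. The function name and the validation rules are my own assumptions, and it does not run mct itself.

```python
# Element type letters as used in the mct examples: slot, drive, IE port.
ELEMENT_TYPES = {"s", "d", "ie"}

def mct_move(src_type, src_index, dst_type, dst_index,
             transport=0, exe="mct-x64"):
    """Build: <exe> -m t <transport> <src> <n> <dst> <n> (all zero-based)."""
    for kind in (src_type, dst_type):
        if kind not in ELEMENT_TYPES:
            raise ValueError(f"unknown element type: {kind!r}")
    for index in (transport, src_index, dst_index):
        if index < 0:
            raise ValueError("element numbers are zero-based, not negative")
    return [exe, "-m", "t", str(transport),
            src_type, str(src_index), dst_type, str(dst_index)]

# Slot 0 to drive 0 using transport 0.
print(" ".join(mct_move("s", 0, "d", 0)))  # mct-x64 -m t 0 s 0 d 0
```

Remember to stop the DPMLA service before actually running any of the generated commands.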

 

Once you move a tape into a drive, use mytape commands Loadtape, taperewind, locktape, Disable hardware compression, Set block size to 65536 (64K), writeforspanning.

You need the symbolic name for the tape drive you loaded media into - look in the DPM console by clicking the tape drive and look at the details for \\.\tape########.  use that in the following command.

 

Mytape.exe \\.\Tape2147483638

Status: Getting the handle for \\.\Tape2147483638...Success

\\.\Tape2147483638>TapeConsole_1.0>taperewind

Status: Rewinding Tape ...Success

\\.\Tape2147483638>TapeConsole_1.0>setdriveinfo

Hardware error correction  [y]-Enable / [n] Disable : y
Hardware data compression  [y]-Enable / [n] Disable : N   (BE SURE TO DISABLE)
Data padding  [y]-Enable / [n] Disable : n
Setmark reporting   [y]-Enable / [n] Disable : n
Number of bytes between the end-of-tape warning and the physical end of the tape: 0
Status: Setting Drive Information...Success


\\.\Tape2147483638>TapeConsole_1.0>writeforspanning

Status: Writing onto tape...Failed !!!
Error_ID reported: 1100                 (net helpmsg 1100 = The physical end of the tape has been reached.)
Number of bytes written: 983040     (Ignore bytes written, we'll get physical tape position later)
Giving up
Time taken: 15788ms

\\.\Tape2147483638>TapeConsole_1.0>taperewind

Status: Rewinding Tape ...Success

 

REPEAT \\.\Tape2147483638>TapeConsole_1.0>erasetape s

Short erase / Long Erase [s/l]:Status: Erasing the tape...Success

 

May 27th, 2014 2:34pm

Brendan,

Thanks!  I look forward to hearing if the rebuild fixed your issue.

Over the weekend, I forwarded Mike's comments (and the link to this thread) to the HP and Microsoft teams I am working with on this issue.  I have yet to hear back on what our next steps should be.  I was able, however, to get a full "weekly" backup to tape on Friday night into Saturday morning.  The tape driver generated 5 errors (Event ID 15) and disconnected from the OS.  "Resuming" the jobs on Saturday morning allowed it to complete.  Note: I am currently using the "canned" Microsoft LTO driver and not the specific HP LTO driver.  Both drivers exhibit the same "disconnect" issues.

I believe that a server rebuild will fix the issue (based on what I have seen).  This is not a "DPM rebuild" per se but a rebuild of the OS on that server.  After the server is rebuilt and the tape system passes a full tape test, I can work with Microsoft to get DPM reinstalled and re-attach the DPM database.

Joe

May 27th, 2014 2:34pm

Sounds about right!

We've tried 3 different drivers and they all had the same issue.

I ran sfc /scannow on our DPM server and found no issues, so we're going to dump the whole thing and go for a re-install of DPM, starting with a fresh database - hopefully it will have positive results.

I'll let you know

Thanks

brendan

May 27th, 2014 2:43pm

Thanks, Brendan!

Joe

May 27th, 2014 2:44pm

Mike,

Thanks for the information on using the utility you've listed.  I will run this by the HP and Microsoft teams I am in contact with to see if we need to run this to reproduce the problem.

I hope to hear back from them today at some point.

Joe

May 27th, 2014 2:47pm

Hi Joe,

Just a quick update, we're not going to re-install DPM for now, we have a trial version of Backup Exec 2012R2 that we're going to install.

We want to see if the problem persists on BE.

Again, I'll keep you informed.

Thanks

brendan

May 28th, 2014 12:56pm

Thanks for the update, Brendan!

Late yesterday, HP 2nd level Support asked me to update the Proliant Support Pack to the latest version for Windows Storage Server 2008 R2.  I did that last night and rebooted the server.  There were some online references to disabling the HP Management agents on the server (which we did previously to no avail).  HP wanted me to make sure those agents remained disabled (I had to disable them again) as well as making sure that the registry entries (Storport, BusyRetryCount, etc.) remained.  They did.  I rebooted a 2nd time and manually ran long-term recovery points to tape of all Protection Group members.  The backups to tape have spanned 2 LTO5 tape cartridges so far (they are still running) and the tape system has experienced 6 "disconnects" so far (Event ID 15 errors).

So...I asked the HP and Microsoft Support teams to relay the next steps to rebuild the server, reinstall DPM, re-attach the database and mount/recover the disk-based Recovery Points.

Waiting to hear back...

Joe

May 28th, 2014 2:26pm

Brendan,

That's good news!

I am trying to get HP to send me the server image for Windows Storage Server 2008 R2 as the OS, apparently, is an OEM revision that Microsoft does not support directly.  They will support it via Premier Support if HP contacts Microsoft directly (this was explained to me by Microsoft's DPM Support).

My plan is to reload the OS on the server and work with Microsoft to reconnect the DPM database so that our recovery points are available.  I'll first confirm tape system operation on the server via HP's Library and Tape Tools prior to reloading DPM.  Backup Exec is not currently in the budget but we have an MS Enterprise Agreement (hence, us using DPM).

Thanks for your update!

Joe

June 2nd, 2014 1:47pm

Hi Joe,

Not a problem, I knew it wasn't a hardware problem....but even though I've proved that I guess Microsoft will still argue that it's not their product at fault either! Yawn.

A little victory I suppose!

But all is working well, I would like to suggest an answer of 'Discard DPM Entirely', but I don't think that would get the vote.

Thanks

brendan 

June 2nd, 2014 2:21pm

Yeah, we'll get no love from Microsoft with that...

Joe

June 2nd, 2014 2:29pm

Hi Folks,

Apples and Oranges comparison - many other backup products install proprietary drivers and / or write to tape using proprietary format that may not exhibit the same errors. All of the errors seen in this post are Device errors reported by Windows and DPM is the victim and not the cause of the errors.

Some tape drives have trouble processing multiple buffers, which usually causes other problems - but you can try reducing the BufferQueueSize and see if it helps in your case.

Registry setting: HKLM\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Agent:BufferQueueSize:REG_DWORD:3 
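
For reference, the key:value:type:data shorthand above can be turned into a reg add command. The sketch below only formats the string and changes nothing on its own. The helper names are hypothetical, and you should export the key first if you plan to modify it.

```python
# Turn the 'KEY:Name:TYPE:Data' shorthand into a reg.exe command line.
# parse_setting and reg_add_command are illustrative helper names.

def parse_setting(spec: str):
    """Split from the right so the key path is kept whole."""
    key, name, value_type, data = spec.rsplit(":", 3)
    return key, name, value_type, data

def reg_add_command(spec: str) -> str:
    """Format a `reg add` command; /f overwrites an existing value silently."""
    key, name, value_type, data = parse_setting(spec)
    return f'reg add "{key}" /v {name} /t {value_type} /d {data} /f'

spec = (r"HKLM\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Agent"
        ":BufferQueueSize:REG_DWORD:3")
print(reg_add_command(spec))
```

The printed command can be pasted into an elevated command prompt on the DPM server.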

Has anybody tried running the writeforspanning test to see if that shows any signs of problems ? 

June 2nd, 2014 2:50pm

Hi Mike,

We had already tried the reg fix, as well as a host of different drivers, both HP and Microsoft and they all produced the same problem.

After 3 days the hardware error hasn't occurred. The only thing that has changed is the driver, which is now the Symantec driver for the tape drive. Thus far no problems: same server, same tape drive.

To be honest, I'm quite happy to have moved away from DPM as I do believe it is the cause of the problem, as do the DPM support staff that we were in contact with.

I really do doubt that the multiple drivers, two tape drives, controllers and cables we went through, let alone the server, are the cause of the problem. Highly unlikely.

As I have heard from others that have suffered at the hands of DPM, it's a nightmare, up there with Vista.

I just find it a bit bemusing how so many people suffer the same unresolved problem with DPM - with different tape drives.

I'm glad to be done with it now.

Brendan

June 2nd, 2014 3:31pm

Sorry to be blunt, but we've had enough of it.

If the problem does come back with Backup Exec then I'll eat my shoes.

But after spending months on the phone to Microsoft and scouring the internet for a solution, I'm happy to wash my hands of it.

Brendan

June 2nd, 2014 3:40pm

I'm getting near the end of my rope, too.

...but shoes just do not taste good. :-)

Joe

June 2nd, 2014 3:49pm

Hi Joe,

Just a quick update really, we've had no problems at all since moving to Backup Exec, backed up 2.4TB last night with no issues.

Hope you get the issue resolved, if I were you I'd get rid of DPM.

Thanks

Brendan

June 5th, 2014 11:31am

Well...we've made some headway with regard to our problem.  I have Support tickets open with HP (2nd level) and Microsoft.  I was able to obtain the server image for our HP X1600 G2 server running Windows Storage Server 2008 R2.  I reloaded the OS and, with the Microsoft DPM Team's assistance (thank you, Prosenjit!), was able to restore the backed-up DPM database and get back my Protection Group and Recovery Points.

However, prior to reinstalling DPM, I installed the HP Library and Tape Tools and ran a "full tape" test.  That test failed (as it did before rebuilding the server OS).  So, to confirm everyone's suspicion, there is some continuing issue either with the tape hardware (it was already ALL swapped out) or the server image.  It is NOT DPM! (right, Mike?) :-)

HP has me now installing the latest Service Release and Proliant Support Pack for that server.

DPM will perform a daily backup to tape process this evening.  I will update this thread on what happens.

Joe


June 12th, 2014 8:08pm

Also...when I reattached the DPM database, the C: drive of the DPM server (set as a Protection Member in the Group) reported as "unable to continue protection".  I had assumed that this was because the partition or drive GUID changed when the server was reimaged from the HP Recovery CD.

I removed the DPM server from the list of Protection Group members and deleted the recovery points from disk and tape.  When I try to add the server back as a protected member, I cannot add the C: drive.  I can add the SQL databases and the System State...just not the C: drive.

A dialog box appears stating something like:

"C:\ contains a mount point at C:\xxxx\xxxxx\xxxxx\xxxx\xxx\vol_xxxxxx whose destination volume is \\?\Volumexxxxxxx-xxx-xxx-xxxxx\.  Do you also want to protect the volume  \\?\Volumexxxxxxx-xxx-xxx-xxxxx\ ?"

If I click "No", I must click this about 70 times to return to the Protection Group modification window.

Any idea what causes this and how to get all members of the DPM server back into the Protection Group properly?

Thanks in advance,

Joe

 
June 12th, 2014 8:17pm

Hi,

That is normal - all the mounted volumes used by DPM under C:\Program Files\System Center 2012\DPM\DPM\Volumes are being enumerated and you are asked if you want to include them in the backup. No is the correct response.  Alternately, instead of selecting C: - select just the sub folders and exclude the Volumes folder.

June 12th, 2014 8:32pm

Hi,

Yes, that would bring relief to several customers that reported that problem - so I'm keeping my fingers, legs, and eyeballs crossed. 

June 12th, 2014 8:42pm

Update:

The tape backup finished last night all on one tape (however, it was the only cartridge in the autoloader at the time).

The good news is that NO hplto.sys or Event ID: 15 errors were seen in the System Event Log!

So, there is the possibility that the latest Service Release and Proliant Support Pack for the HP X1600 G2 server resolved the intermittent tape issue we've been experiencing for quite a while.

However, after applying the SR and PSP to the server, DPM generated an error trying to display the Reports under the Reports tab.  The DPM MMC also "crashed" when trying to click on the Recovery tab.  I just got off a remote support call with Prosenjit of the Microsoft DPM Team and we both agree that DPM needs to have the database backed up, DPM and SQL uninstalled, reinstalled and the database restored as the HP Service Release and Proliant Support Pack installs must have broken "something" in DPM to cause those 2 issues.

We're almost there...

I am in the process of doing the uninstall/reinstall thing for DPM and will let the tape backups run for several days before calling this issue resolved.  This might help some that are having the same (or similar) issues.

Joe

June 13th, 2014 5:07pm

DPM TechNet Forum and Mike Jacquet,

Well...I appear to have finally resolved the issue!

I was getting nowhere with HP re: the "breakage" of the DPM Reporting and Recovery tabs.  Therefore, I resorted to wiping the server OS again using the HP Recovery image for Windows Storage Server 2008 R2, reinstalling DPM, reconnecting the backed-up database, running a consistency check on each of the Protection Group members, downloading/installing all Windows critical updates, SP1, all DPM QFEs and rebooting the server when prompted.

I was able to get full backups to tape during the week of 06/16/2014.  Monday's (06/16) backup spanned across 2 tapes but there were no Event ID: 15 errors in the System Event log regarding the hplto.sys driver.  I attributed the spanning to the particular LTO5 cartridge needing a "full erase" via DPM (3 hours!).  I've since erased other cartridges and have had no spanning issues.  We continue to get full backups to tape and on 1 cartridge (we are backing up just over 2TB worth of data).

I was also able to run HP's Library and Tape Tools and have the Drive Performance test run with the full tape option (which writes then reads 3TB - compressed capacity of an LTO5 cartridge).  The test passed with no issues.

Prosenjit Kanjilal (from the Microsoft DPM Team) was extremely helpful with problem identification and providing direction (as were you, Mike!).  Thank you, both, and Microsoft!

In summary, it appears that the HP OEM operating system (Windows Storage Server 2008 R2) and the HP tape drivers were the root cause of the problem which seemed to get steadily worse since November 2013.  DPM 2010 was never the issue but suffered (backups to tape only) due to the problems experienced with the OS and tape drivers.

Hope this helps someone else that is experiencing the same issue...

I am just glad that it is now working properly!!! :-)

Joe

June 24th, 2014 5:10pm

Hi Joe,

That's brilliant news, glad you got it resolved!

Seems I was right about the reinstall then....might mark that as the answer haha

Thanks

brendan 

June 25th, 2014 8:43am

Hi Joe,

Thanks for the update. For other customers experiencing the Event ID 15 device errors due to a problematic OEM operating system, how can they get an updated HP OEM operating system image (Windows Storage Server 2008 R2) and HP tape drivers that work?

<snip>
 I was able to obtain the server image for our HP X1600 G2 server running Windows Storage Server 2008 R2. 
>snip<

June 25th, 2014 4:45pm

Also, we experienced the tape spanning issue again last night (see earlier post) that again wrote 4.5GB to a 2nd tape.  I've marked that tape's barcode to perform a full erase on it before it is used for backup again.

Note: I don't think that the registry settings made it back to the backup server for "Storport" and "BusyRetry".  This may be causing DPM to write to the 2nd tape...

I'll investigate...

Joe

June 25th, 2014 5:24pm

Hi Joe,

That's brilliant news, glad you got it resolved!

Seems I was right about the reinstall then....might mark that as the answer haha

Thanks

brendan 

It figures...you IT guys are all the same.

Wait!  ...I'm one of them, too! :-) Hahahaha!!!

Brendan, I also thought all along that it was something with the OS and/or tape drivers - not hardware or DPM.

Peace out, brother!

Joe

June 25th, 2014 5:30pm

Nope!  The registry settings from Mike's blog post were not on the server (obviously, after the rebuild...)

http://blogs.technet.com/b/dpm/archive/2012/05/14/things-you-can-do-to-help-data-protection-manager-utilize-your-tapes-full-capacity.aspx

I just added "BufferQueueSize", the "Storport" key and the "BusyRetryCount" to the server's registry. I'll reboot the server after the 6:00 PM Recovery Points run and monitor the tape backup (starts at 11:00 PM).

I'll update the thread with the results of tonight's backup to tape.

Joe

June 25th, 2014 5:42pm

Well...

DPM wrote information again to 2 tape cartridges during last night's backup.  Thankfully, there were none of the "tape system disconnects" (Event ID: 15 errors) that we were plagued with prior to the 2nd rebuild of the backup server.  DPM just wrote to 2 tapes.  There was 2.11TB written to one LTO5 cartridge and 12.2GB written to the 2nd LTO5 cartridge.

This was after the registry tweaks from Mike's blog post above were added and the server rebooted.  The cartridges used had been "full erased" earlier yesterday.

Any idea?

I know that we won't be able to get a full 3.0TB to an LTO5 cartridge due to not realizing true 2:1 compression but I would think that I should get more than 2.11TB.

Thanks,

Joe

June 26th, 2014 3:30pm

Hi,

See if the tape drive is reporting I/O error 0x8007045D, which equals "The request could not be performed because of an I/O device error".  To check, run the following commands on the DPM server.

  1. Open an Administrative command prompt.
  2. CD "C:\Program Files\Microsoft DPM\DPM\Temp"
  3. Find /I "0x8007045D" MSDPM*.Errlog >C:\temp\MSDPM0x8007045D.TXT
  4. Notepad C:\temp\MSDPM0x8007045D.TXT
  5. See if there are any entries in the file.  If not, look in the DPMRA logs.
  6. Find /I "0x8007045D" DPMRA*.Errlog >C:\temp\DPMRA0x8007045D.TXT
  7. Notepad C:\temp\DPMRA0x8007045D.TXT

Also search for "-2147023779" which is the decimal equivalent.

If not, then most likely end of tape was detected, so search for 0x8007044C.
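
For anyone checking the numbers: the error codes in this thread are HRESULTs in the 0x8007xxxx range, which wrap a plain Win32 error code in the low 16 bits, and the "decimal equivalent" is the same 32-bit value read as a signed integer. Below is a minimal Python sketch of that conversion. The function names are made up for illustration and are not part of DPM or any Microsoft tool.

```python
def hresult_to_signed(hresult: int) -> int:
    """Return the signed 32-bit decimal form of an unsigned HRESULT."""
    return hresult - 0x1_0000_0000 if hresult & 0x8000_0000 else hresult


def win32_code(hresult: int) -> int:
    """Return the low 16 bits, which carry the Win32 error code for 0x8007xxxx values."""
    return hresult & 0xFFFF


# The three codes mentioned in this thread: I/O device error, end of media,
# and device not connected.
for code in (0x8007045D, 0x8007044C, 0x8007048F):
    print(f"0x{code:08X}  signed={hresult_to_signed(code)}  win32={win32_code(code)}")
```

The first code gives -2147023779, matching the decimal search string above, and 0x8007048F maps to Win32 error 1167, the same "device is not connected" error from the original DPM alert.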

June 26th, 2014 3:51pm

Mike,

Thanks for the reply.  I will do so now.

Joe

June 26th, 2014 3:54pm

Mike,

The searches did not find 0x8007045D, 0x8007044C or "-2147023779" in any of the logs.  The output text file for MSDPM is:


---------- MSDPM0.ERRLOG

---------- MSDPM1.ERRLOG

---------- MSDPM2.ERRLOG

---------- MSDPM3.ERRLOG

---------- MSDPM4.ERRLOG

---------- MSDPM5.ERRLOG

---------- MSDPM6.ERRLOG

---------- MSDPM7.ERRLOG

---------- MSDPM8.ERRLOG

---------- MSDPM9.ERRLOG

---------- MSDPMCURR.ERRLOG

The output text file for DPMRA is:


---------- DPMRA0.ERRLOG

---------- DPMRA1.ERRLOG

---------- DPMRA2.ERRLOG

---------- DPMRA3.ERRLOG

---------- DPMRA4.ERRLOG

---------- DPMRA5.ERRLOG

---------- DPMRA6.ERRLOG

---------- DPMRA7.ERRLOG

---------- DPMRA8.ERRLOG

---------- DPMRA9.ERRLOG

---------- DPMRACURR.ERRLOG

Any other ideas?

Joe

June 26th, 2014 4:02pm

Hi,

Normally when we need to change a tape during a backup, we will see in the DPMRA log either the IO error or the end of media error.

Below is a normal end of tape error - and the NeedMoreMedia = [1]

1158	1A64	06/14	21:32:48.022	29	mtaperformiosubtask.cpp(378)	[00000000037D60D0]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	NORMAL	CMTAPerformIOSubTask::IntermediateResponseSent => startPBA = [8009640]
1158	2DAC	06/14	21:33:00.443	18	bufferedmediawriter.cpp(409)	[00000000038F89B0]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	NORMAL	CBufferedMediaWriter::NonIOMediaOpDone - StartPBA[8009640]
1158	2DAC	06/14	21:33:00.443	18	bufferedmediawriter.cpp(426)	[00000000038F89B0]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	NORMAL	CBufferedMediaWriter::NonIOMediaOpDone - StartFLA[0]
1158	17B0	06/15	00:48:27.724	22	genericthreadpool.cpp(537)	[000000000155AAF0]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	WARNING	Failed: Hr: = [0x8007044c] Thread 6064 received ERROR
1158	17B0	06/15	00:48:27.725	18	mtfsetupdater.cpp(1487)	[00000000038F8B40]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	NORMAL	Closing continuation buffer, SSET FLA - 326197568, SSET len - 2048, VOLB len - 1024, DIRB len - 1024, FILE len - 1024, this: [00000000038F8B40]
1158	17B0	06/15	00:48:27.725	29	mtamedia.cpp(673)	[0000000003973210]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	NORMAL	CMTAMedia::SetEndFLAOnSpanning => Media Map details are: startFLA = [0] starPBA = [8009640] endFLA = [326197631] endPBA = [9223372036854775807]
1158	1AD0	06/15	00:48:27.725	29	mtaperformiosubtask.cpp(489)	[00000000037D60D0]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	NORMAL	CMTAPerformIOSubTask::DoneWithMedia => NeedMoreMedia = [1]
1158	1AD0	06/15	00:48:27.725	29	mtasubtask.cpp(649)	[00000000037D60D0]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	WARNING	Failed: Hr: = [0x00000000] CMTASubTask::UpdateStatus => hr
1158	1AD0	06/15	00:48:27.731	29	mtasubtask.cpp(544)	[00000000037D60D0]	9F1A9239-7ABA-4985-8B36-8281CB5BE3C1	ACTIVITY	CMTASubTask::DeactivateSubTask => Deactivating SubTask

You may want to search the logs just for the NeedMoreMedia and see what error is logged right before it's set to [1].

June 26th, 2014 4:23pm

Mike,

Should I try changing our current registry setting of "fa" (hex) to "4b" (hex) as per this thread?

http://social.technet.microsoft.com/Forums/en-US/e718e8e6-c0d6-496e-9407-ea1ccf51bb98/dpm-2010-not-filling-tapes?forum=dpmtapebackuprecovery

Joe

June 26th, 2014 4:38pm

Mike,

I did find 2 instances of "NeedMoreMedia = [1]" in the MSDPM*.ErrLog logs on the days where DPM wrote to a 2nd tape (06/19, 06/25 and 06/26).

So, from the logs, can we conclude that we are filling an LTO5 cartridge after 2.11TB is written to it?

Is there anything else I can check or adjust to make everything fit on 1 tape?  We do full backups every day here.

Joe

June 26th, 2014 4:54pm

Hi,

DPMRA is the process that writes to the tapes, so the "NeedMoreMedia = [1]" should have been found in the DPMRA*.errlog file.  Right above that, there should be a reason (error code) why more media was required. 
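
If the logs are large, that lookup can be scripted. Here is a rough Python sketch, assuming the log lines look like the excerpts posted in this thread (an `Hr: = [0x...]` entry somewhere before the `NeedMoreMedia = [1]` entry). The function name and the trimmed sample lines are for illustration only, and this is not a DPM tool.

```python
import re

# Matches entries such as "Failed: Hr: = [0x8007044c]".
HR_PATTERN = re.compile(r"Hr: = \[(0x[0-9A-Fa-f]{8})\]")
SPAN_MARKER = "NeedMoreMedia = [1]"


def spanning_reasons(lines):
    """Yield the most recent non-zero Hr code seen before each NeedMoreMedia = [1] line."""
    last_hr = None
    for line in lines:
        match = HR_PATTERN.search(line)
        if match and int(match.group(1), 16) != 0:
            last_hr = match.group(1).lower()
        if SPAN_MARKER in line:
            yield last_hr  # None if no failing Hr was logged beforehand
            last_hr = None


# Two lines trimmed from the sample log posted earlier in this thread.
sample = [
    "WARNING Failed: Hr: = [0x8007044c] Thread 6064 received ERROR",
    "NORMAL CMTAPerformIOSubTask::DoneWithMedia => NeedMoreMedia = [1]",
]
print(list(spanning_reasons(sample)))
```

To run it against a real file, pass the open file object in place of `sample`. The encoding of the .errlog files is not stated in this thread, so that part may need adjusting.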

June 26th, 2014 5:02pm

Here is the output of the DPMRA10.log file with the "NeedMoreMedia = [1]" entry for last night (06/19/2014):

13CC 131C 06/26 12:09:11.810 22 genericthreadpool.cpp(517) [00000000003BF750] ABF1CC92-08D0-4E2D-B8D5-6F92D0E34D11 WARNING Failed: Hr: = [0x8007044c] Thread 4892 received ERROR
13CC 131C 06/26 12:09:11.810 18 mtfsetupdater.cpp(1484) [0000000002CA43E0] ABF1CC92-08D0-4E2D-B8D5-6F92D0E34D11 NORMAL Closing continuation buffer, SSET FLA - 19923648, SSET len - 2048, VOLB len - 1024, DIRB len - 1024, FILE len - 1024, this: [0000000002CA43E0]
13CC 131C 06/26 12:09:11.810 29 mtamedia.cpp(673) [0000000002BD5E50] ABF1CC92-08D0-4E2D-B8D5-6F92D0E34D11 NORMAL CMTAMedia::SetEndFLAOnSpanning => Media Map details are: startFLA = [0] starPBA = [35107864] endFLA = [19923711] endPBA = [9223372036854775807]
13CC 1B80 06/26 12:09:11.810 29 mtaperformiosubtask.cpp(489) [0000000002BBDDB0] ABF1CC92-08D0-4E2D-B8D5-6F92D0E34D11 NORMAL CMTAPerformIOSubTask::DoneWithMedia => NeedMoreMedia = [1]

So, the 0x8007044c code means that the end of tape was detected.  OK, apparently that means we can only back up just over 2TB to an LTO5 cartridge, correct?

If that is the case (and there is no other possible cause), is there a way to "de-select" things for backup so the Protection Group fits on 1 tape, or do I need to start incremental backups to tape (if that can be done in DPM)?

Thanks for the help, Mike!

Joe

June 26th, 2014 5:18pm

Hi,

Hmm, curious why you didn't find the 0x8007044C in the earlier find command?

The 2:1 compression ratio advertised by tape drive manufacturers is an average - your mileage may vary depending on the type of data being backed up.  File types such as .Jpeg, .MP4, .VHD, .zip and other files that are already compressed cannot be compressed further when written to tape, so you would only get native capacity in that case.  So it sounds like for that particular data source you have a combination of compressible and not-so-compressible files.

DPM only supports full backups when doing long-term tape protection.
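
As a back-of-the-envelope check on the numbers in this thread, the effective compression ratio is just the amount written divided by the native capacity. A small Python sketch follows, assuming the LTO5 native capacity of 1.5 TB (half of the 3 TB compressed figure quoted earlier) and taking the 2.11TB reported by DPM at face value. DPM's TB units may not match the drive vendor's exactly, so treat the result as approximate.

```python
LTO5_NATIVE_TB = 1.5  # native LTO5 capacity; 3 TB is the advertised 2:1 figure


def effective_compression(written_tb: float, native_tb: float = LTO5_NATIVE_TB) -> float:
    """Return how many TB of source data were stored per TB of native tape."""
    return written_tb / native_tb


ratio = effective_compression(2.11)
print(f"{ratio:.2f}:1")  # prints 1.41:1
```

So the tape filled at roughly 1.4:1 rather than 2:1, which fits a mix of compressible and already-compressed data.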

June 26th, 2014 5:30pm

"Hmm Curious why you didn't find the 0x8007044C in earlier find command ?"
Don't know...I used the commands that you posted.  No worries, I found the cause and the reason for tape spanning.

"DPM only support full backups when doing long term tape protection"
That is what I figured...looks like I'll have to either eliminate static data from PG members...or buy an LTO6 autoloader! ($$$)

Thanks,

Joe

June 26th, 2014 5:36pm

All,

Well...it looks like the problem has resurfaced. :-(

Last night, Windows generated 2 Event ID: 15 errors in the System Event Log and the DPM backups to tape backed up to 2 LTO5 cartridges.

Note: the most recent spanning issue listed above was the result of us backing up too much data to fit onto an LTO5 cartridge.  Once I adjusted some shares and deleted some static data, the backup jobs to tape all went to 1 cartridge.

Since 06/17, we have received 5 "tape system disconnects" (Event ID: 15 errors) which caused DPM to either fail jobs (which can be "resumed" back to the same cartridge) or write the remaining jobs to another cartridge.  All other backups to tape were successful to a single cartridge per night. :-)

There have been 5 Windows Updates applied to that server since 06/17, so I am wondering if an update is contributing to (or causing) the issue.  I can always try disabling Windows Updates and uninstalling those updates.

I've reached out yet again to HP 2nd level Support in the hopes that they can assist.

The amount of time I've spent on this issue is just...ridiculous.

I will update this thread as I receive more information.

Joe


July 11th, 2014 4:29pm

Update:

HP 2nd Level Support had me generate a Support Ticket in their Library and Tape Tools diagnostic software and a new HPS Report then upload them to their FTP site for review.

They determined that the HP Smart Array P212 controller (dedicated to the tape system) did not have the most current driver version and the Storport driver was also outdated.  This was probably due to the 2nd server OS rebuild, after which the drivers just never got updated.

I updated both of these drivers and was able to get a successful backup.  However, running a "full tape" Device Performance test in HP L&TT still fails with the "Write operation failed with the OS error - 1167 The device is not connected".  This error occurred while all DPM and SQL services were stopped.  Daily backups to tape in DPM continue to fail intermittently and generate Event ID: 15 errors in the server's System Event Log.

I have not heard from anyone at HP on this for the past 2 days.  It is obviously something with their server image, driver and/or the tape system.

I will update this thread once I have more information.

Joe


July 18th, 2014 1:44pm

Hi Joe,

I wanted to check to see if you had any updates on this issue?

-Tenzin

March 19th, 2015 6:28pm

This topic is archived. No further replies will be accepted.
