NTFS compression hangs
I'm using an up-to-date Windows Server 2008 machine for backing up some disk images. The files are about 100GB each, roughly 2TB in total, and they are to be placed on a 1.5TB disk with NTFS compression. The images fit: summing their actual disk usage at the source locations (also NTFS compressed) yields 1.25TB. I'm copying with a file manager which uses the create -> set size -> write approach to copy files.

Copying is surprisingly slow and finally hangs, leaving Windows unable to start a terminal session; it gets stuck at the "Welcome" screen. "Surprisingly slow" means the following: between the very same drives the file manager copies at 90MB/s in the uncompressed case. With NTFS compression on the destination, speed drops to 15MB/s with minimal CPU utilization (about +5% on a quad-core machine), so the system is neither waiting for I/O nor busy compressing. A few hours before hanging, the system enters a state where the file manager reports a slightly higher speed (about 40MB/s) and then its I/O thread is suspended permanently (possibly waiting for the system to return from an I/O operation), while the "System" process writes to the destination drive at approximately 15MB/s (most of this goes to the destination file, a small amount goes to the volume's $LogFile). What might be wrong?
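For clarity, this is roughly what I mean by the create -> set size -> write approach (a simplified Win32 sketch of my own, not TC's actual code; the buffer size and error handling are arbitrary):

// Minimal sketch of the create -> set size -> write copy approach (Win32).
// Simplified: no error recovery, fixed 1 MB buffer; not TC's actual code.
#include <windows.h>

bool CopyCreateSetSizeWrite(const wchar_t* src, const wchar_t* dst)
{
    HANDLE hSrc = CreateFileW(src, GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    if (hSrc == INVALID_HANDLE_VALUE) return false;

    HANDLE hDst = CreateFileW(dst, GENERIC_WRITE, 0, NULL,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hDst == INVALID_HANDLE_VALUE) { CloseHandle(hSrc); return false; }

    // 1) "set size": pre-extend the destination to the source length
    LARGE_INTEGER size;
    GetFileSizeEx(hSrc, &size);
    SetFilePointerEx(hDst, size, NULL, FILE_BEGIN);
    SetEndOfFile(hDst);
    LARGE_INTEGER zero = {0};
    SetFilePointerEx(hDst, zero, NULL, FILE_BEGIN);

    // 2) "write": stream the data in fixed-size chunks
    static BYTE buf[1 << 20];            // 1 MB buffer
    DWORD read = 0, written = 0;
    bool ok = true;
    while (ReadFile(hSrc, buf, sizeof(buf), &read, NULL) && read > 0) {
        if (!WriteFile(hDst, buf, read, &written, NULL) || written != read) {
            ok = false;
            break;
        }
    }

    CloseHandle(hSrc);
    CloseHandle(hDst);
    return ok;
}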
March 19th, 2009 7:22pm

Hi, I am afraid it can be slow if we copy a very large file to a compressed location directly. I would like to offer the following suggestions: Suggestion 1: Uncompress your destination location, compress your files first on the original computer, and then copy them to the destination location. Suggestion 2: As you have a lot of (nearly 20) 100GB files, I suggest copying one of them at a time. What are the results? Tim Quan - MSFT
March 20th, 2009 1:09pm

Thanks for the response!

Suggestion 1: As stated before, the source files (at the origin location) are already compressed; that is how I could compute the total of real disk usage. If I uncompress the destination and then copy the files, they will not be compressed. If I copy one file and then compress it, I waste a lot of time: instead of reading 60GB + writing 60GB, that would require reading 60GB + writing 100GB + reading 100GB + writing 100GB (assuming the overall 60% compression ratio; worse, the last three operations would involve the same drive and thus make any parallelism unfeasible). So if that solved the problem, I would first be disappointed and then anxious to know why. Is there a real difference between creating and filling a compressed file, and compressing an already existing one (I mean a difference that would make the latter operation in any sense less expensive)?

Suggestion 2: I would of course never initiate two operations in parallel that use the same I/O channel. Like most file managers, the one I use (it is quite common in fact) queues files and transfers only one at a time, so the original setup already fulfilled this suggestion.

The point is that I would accept that the NTFS driver is simply incapable of handling some use cases (e.g. large files, nearly full volumes), but how can that have such a fatal impact on a server that it becomes inaccessible (leaving the reset button as the only option)?

FYI: another strange thing I have noticed since is that I got the error "not enough disk space" while reading one of the compressed files (at the original location).
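To make the question concrete: as far as I understand (my assumption, not something I have found documented), a file created inside a compressed folder inherits the compression attribute, and the attribute can also be set explicitly on a new, still-empty file before any data is written, roughly like this, so the data should be compressed as it is written rather than in a second pass:

// Sketch (my assumption of how the attribute behaves, not documentation):
// mark a freshly created, still-empty file as compressed before writing,
// so NTFS compresses the data as it is written instead of in a second pass.
#include <windows.h>
#include <winioctl.h>

bool CreateCompressedFile(const wchar_t* path, HANDLE* out)
{
    HANDLE h = CreateFileW(path, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return false;

    USHORT state = COMPRESSION_FORMAT_DEFAULT;   // default NTFS compression
    DWORD bytes = 0;
    if (!DeviceIoControl(h, FSCTL_SET_COMPRESSION, &state, sizeof(state),
                         NULL, 0, &bytes, NULL)) {
        CloseHandle(h);
        return false;
    }
    *out = h;    // caller writes the data and closes the handle
    return true;
}

If compressing an already written file really is cheaper than writing into a file prepared like this, I would like to understand why.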
March 20th, 2009 1:32pm

May I ask a few questions? Are you talking about the MS File Transfer Manager? Are you copying the files across a network? If so, I can share my experience of working with FTM. When I download MS ISO files I seldom have 5% network usage; usually it is 0.5%. I have two quads (Dell R5400) - 8 CPUs - but it is of no help at all. When I leave the machine unattended, the download usually stops when the screen saver comes up. The download is seldom continuous; it goes in fits and starts. Sometimes I wait 10 minutes for the next few KBs. I attribute it to MS servers. AlexB
March 22nd, 2009 1:59am

Thanks for replying! Yes, of course you may :)

The operations are performed with Total Commander (it usually yields good performance). The network is not involved: source and destination reside on two different SATA RAID controllers (no array, the controllers just provide additional independent channels). The screen saver is off (the machine is seldom accessed via console). But your suggestion makes me wonder whether it might be Terminal Services itself that causes the trouble.

Things happen as follows: 1) I log on with Remote Desktop. 2) Copying is started. 3) I disconnect the session. 4) Later I check the progress, but find that the system has entered the state in which it freezes the poor old TC. Other parts of the system are still responsive, but I cannot tell when TC got frozen. Estimating from the progress indicator suggests it has not been frozen for long (it might even have frozen just as/because I logged on). From this point on I can work on the machine, but 5) no new login attempt will succeed anymore (regardless of whether I use the same or another user, and whether I try Remote Desktop or the console). Local and remote shutdown requests fail to initiate a shutdown (no error message, and no visible effect of shutdown initiation).
March 22nd, 2009 2:32am

Did you write any part of this software yourself, or is it a naked API? I want to give you a piece of my mind on another issue that may perhaps shed some light. I tend to write huge applications which eventually exceed a quarter of a million lines of code, after which I lose track. I just keep adding. It is all for myself. I have this form with 12 pages and a million various supporting functions that at one point employed numerous Dictionaries (generics) to store search data from a large database. This temporary storage was needed in case I wanted to look at a just-found item again after another search. I soon noticed that after about 20 minutes my objects were gone and I was getting error messages that the object was not instantiated. I was in despair and finally came to the conclusion that it was all the GC's fault. I laundered the issue at one of the MSDN forums (XML) and the MSFT moderators claimed in one voice that I was wrong and that this cannot happen. As long as the objects are referenced, they said, the GC has no power over them. I disagreed. They even offered to evaluate my code, which was impossible: part of it had gone to the butcher already and everything was so convoluted I could not extricate anything for a demo anyway. I eventually redid every storage as external XML files and solved the problem completely. I believe it is possible that with a poor design some of the insides of your API might be eaten up. A shell remains, nothing else. AlexB
March 22nd, 2009 5:05am

I want to add that there are numerous and very powerful MS tools that can evaluate the runtime performance of apps, find memory leaks (which might be your case as well), profile everything and give you a complete and sometimes graphical answer. One of them is the CLR Profiler, downloadable from MSDN; there are also kernel debuggers and about a dozen other useful debuggers, all of MS production. I believe you will have to install symbols for those, though. AlexB
March 22nd, 2009 5:09am

One more clarification. I observed that phenomenon (object disappearance) only when I left the form idle for a variable time: from 20 to 40 minutes. Sometimes if I reused it after 35 minutes the objects would still be there; sometimes they would be gone after 20 minutes or so. AlexB
March 22nd, 2009 6:15pm

Thanks for trying, but this is not the case. TC is a stable commercial application. And even IF it were the worst code in the world, a server should not become unavailable just because of running one faulty app. The debuggers should not help, since I clearly see that the app is not running: the worker thread is not receiving CPU cycles (FYI: the app is not .NET based).

Kernel monitoring is a good idea. I might be able to define the problem better if something told me what the different "System" processes really are and what they are currently doing. I would also be eager to know how hashing a large file eats up physical memory (while the application has only a minimal working set). It is quite common on Vista-kernel systems that the summed memory usage of applications is only 5-10% of available physical memory, while at the same time physical memory is reported 98% full and even the simplest processes swap like hell.
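If it helps, this is roughly how I plan to sample the memory numbers next time (a simple sketch of my own using documented Win32 calls; it only reports the overall memory load and the file cache working-set limits, nothing TC-specific):

// Rough sketch for sampling how full physical memory really is while the
// copy runs, and what working-set limits the system file cache has.
#include <windows.h>
#include <stdio.h>

int main()
{
    MEMORYSTATUSEX mem = { sizeof(mem) };
    GlobalMemoryStatusEx(&mem);
    printf("physical memory in use: %lu%%\n", mem.dwMemoryLoad);
    printf("available physical    : %llu MB\n",
           mem.ullAvailPhys / (1024 * 1024));

    SIZE_T cacheMin = 0, cacheMax = 0;
    DWORD flags = 0;
    if (GetSystemFileCacheSize(&cacheMin, &cacheMax, &flags)) {
        printf("file cache working set limits: %llu .. %llu MB (flags %lu)\n",
               (unsigned long long)cacheMin / (1024 * 1024),
               (unsigned long long)cacheMax / (1024 * 1024), flags);
    }
    return 0;
}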
March 22nd, 2009 9:01pm

What do you mean by "stable commercial application"? Stability does not mean anything. Your code is too stable for your taste :) You probably want it to be a bit unstable. I am sure there is a flaw somewhere inside that code. Perhaps it is not designed to handle huge downloads, who knows. Perhaps they never tested it in the supra-gigabyte area. What kind of files are those? Movies? The history of the world? The Library of Congress? AlexB
March 23rd, 2009 5:10am

AlexBB, your comments are not even close to constructive. You seem to ignore both the stated facts and the questions in the preceding comments:

"Stable commercial application": Google its name! You will see millions of users worldwide. But I won't argue about stability, because the operation is performed well when NTFS compression is disabled and fails in the opposite case. NTFS compression is supposed to be transparent, and it's guaranteed that the app does not check for it.

"Your code is too ...": Why do you think I have anything to do with the code?! I can't find any such suggestion. Please don't presume that I'm too stupid to post development-related questions to MSDN; I posted here because I think this problem cannot be connected with application development.

"I am sure there is a flaw somewhere inside of that code.": You might be right about any (or all) applications being buggy, just as I might be right in assuming the same of Windows subsystems; this is not the point. Let's suppose TC is not just buggy, but has not a single sane line of code! The question (which you also seem to have ignored) remains: "how can that have such a fatal impact on a server that it becomes inaccessible"... sorry, but Windows hanging at the "Welcome" screen is no application error.

"... not designed to handle huge downloads ...": This point was already cleared up: "The network is not involved. Source and destination reside on two different SATA RAID controllers (no array, the controllers just provide additional independent channels)."

"Perhaps they never tested it in the supra-gigabyte area.": Perhaps the tons of people who use it as their primary file manager never copy large files. Let me assure you, I do. I'm moving terabytes of data with it every day. No problems unless NTFS compression is involved.

"What kind of files are those? Movies?": That too is already settled (in the very first sentence of this thread). They are disk images. "Movies": do you seriously think anyone has >100GB movie files that NTFS can compress at a 60% compression ratio? (I hope this was just a bad joke.)

So I don't see how any of your comments could be termed an "answer". The suggestion about debuggers I will follow (this weekend I will have the opportunity to try the operation again; I will use Windows Explorer, since I'm fairly sure it will crash the system in exactly the same manner). I could accept that "it can be slow if we copy a very large file to a compressed location", but I would really appreciate some information on the reasons (because I think that invalidates NTFS compression's claimed transparency). Could someone post a link to any official info on that?
March 26th, 2009 6:33pm

Assumption confirmed: copying with Windows Explorer hangs the whole system in exactly the same manner as with other applications (see previous posts).
March 27th, 2009 1:24pm

"Why do you think I have anything to do with the code?! I can't find any such suggestion." I did not imply you wrote the code. It was conversational. You use it, so it is "yours." I am sure you use this pronoun in a similar context numerous times daily.

"This point was already cleared up: 'The network is not involved. Source and destination reside on two different SATA RAID controllers (no array, the controllers just provide additional independent channels).'" This does not matter. It is still a download in a way, at least that is how I see it.

You seem to be offended by my comments. Of course I don't understand the problem; if I did, I would have told you the solution. And you are right, I don't get deep into it and can be redundant; I am not that interested, in a way. I am interested in the sense that should something like this happen to me in the remote future, I may be armed, that's it. I think it is better for you to have at least some input, even such as I can give you here. AlexB
March 27th, 2009 3:25pm

Thanks Alex, I mostly agree with you. I'm not offended, I just don't want already-answered points to waste our time (yours by asking again, and mine by quoting the answer).

Many IT specialists would argue about whether a download and a local file operation are the same thing, but that is irrelevant too. The problem is that any application that tries to write that kind of file not only fails, but crashes the whole system. For now, I accept the opinion that local copying is a download. But then the assumption "Perhaps it is not designed to handle huge downloads, who knows" reads: "maybe it can't handle large files". I only pointed out that this is not the case, because TC and Windows Explorer can both handle large files (BTW: I have experience with files in the TB range).

Sorry that I could not make the problem clear, I will try to put it briefly: if something (regardless of the application) tries to write a large (>100GB) NTFS-compressed file, then after a while Windows Server 2008 makes the thread starve, and finally crashes itself so that any login gets stuck. Someone has to access the server and push the reset button.
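In case anyone wants to reproduce it without my images, here is a minimal repro sketch (my own untested idea): stream a few hundred GB of well-compressible data into a single file whose destination folder already has the compression attribute set. The path and the 200GB target below are placeholders, not something I have actually run.

// Minimal repro sketch (my own test idea, untested): stream a large amount
// of compressible data into one file on an NTFS-compressed destination and
// watch whether the writing thread eventually stalls.
#include <windows.h>
#include <stdio.h>

int wmain(int argc, wchar_t** argv)
{
    const wchar_t* path = (argc > 1) ? argv[1] : L"D:\\compressed\\huge.bin";
    const unsigned long long totalGB = 200;    // target size in GB (placeholder)

    HANDLE h = CreateFileW(path, GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return 1;

    static BYTE buf[1 << 20];                  // 1 MB of zeros: compresses well
    DWORD written = 0;
    for (unsigned long long mb = 0; mb < totalGB * 1024; ++mb) {
        if (!WriteFile(h, buf, sizeof(buf), &written, NULL) || written != sizeof(buf)) {
            printf("write failed at %llu MB (error %lu)\n", mb, GetLastError());
            break;
        }
        if (mb % 1024 == 0) printf("%llu GB written\n", mb / 1024);
    }
    CloseHandle(h);
    return 0;
}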
March 27th, 2009 3:49pm

Take a fresh look. Post this as a "bug" at MS Connect. It may be very difficult to prove, though; they will require source code to test. You can imagine what will be involved. Another thing is to post at TechNet Windows Server 2008. There are good people in there, they are MS developers, they may pick up the problem. AlexB
March 27th, 2009 4:48pm

Thanks, MS Connect is a go. (They have the source code: Windows Explorer fails too, and that is their code.) "Post at TechNet Windows Server 2008": I meant to do that, but I didn't find a topic specific to 2008, which is why I posted in the general Windows Server section. I'll take a closer look, thanks.
March 27th, 2009 4:53pm

OK, this is another throw. Any file transfer involves parity checks, and with huge files that process may take lots of memory. When I download ISO files from MS, even if the size is 3.5GB (typically), the parity check takes 3 minutes after the download is over. With a file as large as yours it may take much longer. Perhaps this is what you are seeing without realizing what is going on? Perhaps if you let it do its job it will finish? Do you have more than one processor? WinServer 2008 must be run on at least four. So if you put your job into a background thread it may free the OS to do other things. It will not do that automatically for you. It will not even be a feature in Windows 7; it is a .NET 4.0 future thing. You have to program parallel processing yourself for now. Edit: then he realized that the OS crashes :) So forget the above. AlexB
March 27th, 2009 4:59pm
