Incremental Crawls Same Amount of Items as Full (Network Steve Forum)

MOSS 2007 Enterprise SP1, 1 WFE w/4GB RAM (x64 VM on Hyper-V), 1 DB server (x64). Indexer runs on the WFE. Total items: ~100,000.

Up until a couple of weeks ago, our indexing and search were working beautifully. Then suddenly (at least it seemed that sudden) Incremental crawls of our MOSS content sources began to thrash the WFE: utilization on all CPU cores goes to ~100%, page faults are high, etc. Filtered documents per second is ~50, and eventually all 100,000 items are indexed through the thrashing.

Previously, Incrementals would not crawl all ~100k items; they would just crawl the changes, as expected. Now all ~100k items are crawled at every Incremental, and clearly not all items have changed. I have removed the crawl schedules for all content sources, reset all crawled content, and manually completed a full crawl successfully. Yet a manual Incremental still attempts to crawl all ~100k. No crawler impact rules have been defined (yet).

Any ideas on how to get Incrementals to crawl only the changes? Troubleshooting tips will be appreciated!

There are a few interesting errors in the crawl logs, like: "The item may be too large or corrupt. You may also verify that you have the latest version of this IFilter." and "The filename or extension is too long." We've always seen these and didn't think much of them (some of our sites are glorified trash heaps for the business, but that's another discussion).

ULS did reveal the following:

Application Server Administration job failed for service instance Microsoft.Office.Server.Search.Administration.SearchAdminSharedWebServiceInstance (2cbe0de5-228d-4c81-ae9c-8ef33e8dacc5). Reason: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. Techinal Support Details: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

An exception occurred while executing the Application Server Administration job. Message: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. Techinal Support Details: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

The Execute method of job definition Microsoft.Office.Server.Administration.ApplicationServerAdministrationServiceJob (ID 0a6523f8-f4b3-4789-a21d-5934735c0e2f) threw an exception. More information is included below. Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

We applied hotfix http://support.microsoft.com/kb/946517, but the above errors still occur.

A cluster is really a GOOD thing.

February 3rd, 2010 11:50pm

Take a snapshot of your machine and try applying this hotfix: http://support.microsoft.com/default.aspx/kb/923028
See if the problem goes away. I've read that the 946517 failure can come back even after you've applied that hotfix, and that this one resolves it.

If not, I would guess (and could easily be very wrong) that your content DB change log has become corrupted, which could indicate that the content DB itself is a bit corrupted! The reason I say this is that the indexer's incremental crawl is driven by the change log, which exists in each content DB. The indexer looks in there for its "things that have changed and should now be crawled" list. It sounds like you need to clear the change log and see what happens. Before you even think about doing that, I'd get your DBA to run some integrity checks on the content DBs.

I could be way off the mark though, so I would try the hotfix route first. If you do want to force-clear the change logs, you have to reattach your content DBs as described here: http://technet.microsoft.com/en-us/library/cc263422.aspx. Not an easy activity at all.

Regards
John Timney
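
To make the change-log hypothesis above concrete, here is a toy sketch in Python. It is only a model of the idea in this reply, not how the MOSS indexer is actually implemented: the `ContentDB` class, the `log_usable` flag and the "fall back to crawling everything" behaviour are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContentDB:
    """Toy stand-in for a content DB: a set of items plus a change log."""
    items: set = field(default_factory=set)
    change_log: list = field(default_factory=list)  # (change_id, item) pairs
    log_usable: bool = True  # hypothetical flag; False models a damaged log

    def modify(self, item):
        """Add or update an item and record the change in the log."""
        self.items.add(item)
        self.change_log.append((len(self.change_log) + 1, item))


def incremental_crawl(db, last_change_id):
    """Return (items_to_crawl, new_last_change_id).

    Assumption: without a usable change log the crawler cannot tell what
    changed, so it ends up visiting every item.
    """
    if not db.log_usable:
        return set(db.items), last_change_id
    changed = {item for cid, item in db.change_log if cid > last_change_id}
    new_id = db.change_log[-1][0] if db.change_log else last_change_id
    return changed, new_id


db = ContentDB()
for i in range(1000):
    db.modify(f"doc{i}")

_, token = incremental_crawl(db, 0)            # first pass picks up everything
db.modify("doc7")                              # one real change afterwards
healthy, token = incremental_crawl(db, token)  # only the changed item

db.log_usable = False                          # simulate an unusable change log
broken, _ = incremental_crawl(db, token)       # every item is visited again

print(len(healthy), len(broken))
```

In this model a healthy log yields a one-item incremental, while an unusable log makes the "incremental" as large as a full crawl, which matches the symptom described in the original post.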

February 4th, 2010 1:37am

Thanks, John. Great points. I saw 923028 but didn't register a match for our symptoms. On second thought, I think you're on to something and I missed it. I'm pretty conservative about applying MOSS/SPS hotfixes (I have the scars to show for it), but I'll snapshot and try it.

I read http://technet.microsoft.com/en-us/library/cc263422.aspx and just about fell off my chair at, "As an administrator, you should always know when and if a change log should be cleared." All TechNet humor aside, is there any way to clear the change log other than detaching the content DB and running stsadm.exe -o addcontentdb -clearchangelog?

The integrity checks on the content DBs report that all's well, BTW. The largest content DB is < 50GB, but growing quickly.

A cluster is really a GOOD thing.

February 4th, 2010 7:49am

Well, 923028 is just a suggestion. The benefit of using virtualization is that you can snapshot, apply, and revert if it fails. Don't forget you might have to revert your DBs as well.

The change log idea is, to be honest, a bit of a punt if you're showing no DB corruption, and you shouldn't do it unless you're out of options. I'm not aware of any way to clear the logs other than going directly into the content DB, which invalidates your support. I think there may be some TechNet humour in that "know if it needs cleared" line, but they are not wrong. It's a conscious choice that has to be made when you import a content DB.

Let me know how you get on :)

Regards
John Timney
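
For reference, the detach and reattach sequence discussed in this thread might look roughly like the sketch below. It is a Python dry-run helper that only assembles and prints the stsadm command lines. The web application URL and database name are hypothetical placeholders, and the exact parameters should be checked against the TechNet article linked above before running anything for real.

```python
import subprocess


def reattach_commands(web_app_url, db_name, db_server=None):
    """Build stsadm argument lists: detach the content DB, then re-add it
    with -clearchangelog. All values passed in are caller-supplied."""
    common = ["-url", web_app_url, "-databasename", db_name]
    if db_server:
        common += ["-databaseserver", db_server]
    detach = ["stsadm.exe", "-o", "deletecontentdb"] + common
    attach = ["stsadm.exe", "-o", "addcontentdb"] + common + ["-clearchangelog"]
    return [detach, attach]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


# Hypothetical names, for illustration only.
cmds = reattach_commands("http://moss.example.local", "WSS_Content_Example")
run(cmds)  # dry run: prints the two command lines without executing them
```

Keeping `dry_run=True` as the default is deliberate: as noted above, this is a last resort, so the commands should be reviewed (and the farm snapshotted and the DBs backed up) before anything is executed.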

February 4th, 2010 1:59pm

This topic is archived. No further replies will be accepted.