BDC Filtering and Incremental Crawl

Is there any way to configure an incremental crawl so that it does not delete previously indexed items when the gatherer does not find them in the current incremental crawl?

The reason is that we use a BDC model with a timestamp field to crawl an external content source: a huge SQL table with millions of records. When we run a normal incremental crawl against this table, it always gets stuck at one million plus records. The table is growing daily, at an average of 11,000 records/day. As an alternative, we configured the BDC finder to return only today's records for crawling, but the incremental crawl then always deletes the records that were indexed yesterday.
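To illustrate the delete behaviour described above, here is a small simulation (plain Python, not SharePoint code) of how a gatherer that diffs each enumeration against its index behaves when the finder is filtered to today's records only. The function name and the add/delete logic are illustrative assumptions, not the actual crawler implementation:

```python
# Illustrative simulation of an incremental crawl's add/delete pass.
# The gatherer treats any indexed item that is missing from the
# current enumeration as deleted at the source.

def incremental_crawl(index, enumerated_ids):
    """Return the new index after one incremental crawl.

    index          -- set of item IDs currently in the search index
    enumerated_ids -- set of IDs the BDC finder returned this crawl
    """
    added = enumerated_ids - index          # new items to index
    deleted = index - enumerated_ids        # items the gatherer could not find
    return (index | added) - deleted


# Yesterday's crawl indexed items 1-3. Today the filtered finder
# returns only today's new records, 4 and 5.
index = {1, 2, 3}
todays_batch = {4, 5}
print(sorted(incremental_crawl(index, todays_batch)))  # [4, 5]
```

Because the filtered finder never re-enumerates yesterday's IDs, the crawl removes items 1-3 from the index even though they still exist in the source table, which matches the symptom described.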

We have also tried a changelog-based crawl by following the steps in an MSDN blog post. Everything seemed fine until we realized that incremental crawls do not populate the crawled properties, nor the managed properties that are mapped to them.
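For reference, a changelog-based BDC crawl relies on method instances stereotyped as `ChangedIdEnumerator` (and optionally `DeletedIdEnumerator`) in the BDC model. The fragment below is only a hedged sketch of that shape for a database model; the entity, method, table, and column names (`GetChangedIds`, `Records`, `LastModified`) are assumptions and not taken from this thread, and the return-type descriptors are elided:

```xml
<!-- Sketch only: names are hypothetical, type descriptors elided. -->
<Method Name="GetChangedIds">
  <Properties>
    <Property Name="RdbCommandText" Type="System.String">
      SELECT Id FROM Records WHERE LastModified &gt;= @LastCrawl
    </Property>
  </Properties>
  <Parameters>
    <Parameter Direction="In" Name="@LastCrawl">
      <TypeDescriptor TypeName="System.DateTime" Name="LastCrawl" />
    </Parameter>
    <Parameter Direction="Return" Name="changedIds">
      <!-- collection type descriptor for the returned IDs goes here -->
    </Parameter>
  </Parameters>
  <MethodInstances>
    <MethodInstance Type="ChangedIdEnumerator"
                    ReturnParameterName="changedIds"
                    Name="GetChangedIdsInstance" />
  </MethodInstances>
</Method>
```

With this in place the crawler asks the source for the IDs changed since the last crawl instead of re-enumerating the whole table, which is why it scales to large tables; the crawled-property gap described above is a separate issue.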

We would appreciate it if anyone could help us or point us in the right direction. Is there any workaround?


  • Edited by mark.tz Wednesday, December 18, 2013 8:10 AM
December 18th, 2013 10:40am

Hi mark.tz,

For this issue, I will involve someone familiar with this topic to look into it further. It may take some time; thanks for your understanding.

January 6th, 2014 3:01am

Hi mark.tz,

"Users can only search on managed properties and not on crawled properties. To make a crawled property available for search queries, you must map the crawled property to a managed property. You can map multiple crawled properties to a single managed property or map a single crawled property to multiple managed properties."

http://technet.microsoft.com/en-us/library/jj219667.aspx

Perhaps you can check the mapping settings for this part.

I am a little confused by this sentence:

"Is there any way we can set the incremental crawl not to delete previous items when the gatherer does not find them in the current incremental crawl?"

As far as I know, an incremental crawl adds and removes index entries based on the change log, but if the set of changed items is too big, or after a certain period of time, it may delete previously indexed information.

Also, an incremental crawl will delete any content it cannot find, even if that content was found previously. You need to be careful with the crawl rules regarding adds and deletes, and you should also check that your pages are working properly, or they will be deleted.


January 6th, 2014 5:56am

This topic is archived. No further replies will be accepted.
