Checking for duplicate files before processing

Hi All,

We have several BizTalk 2006 applications. The one in question polls a directory for a file and when it sees a file it picks it up, converts it from XML to a flat TXT file and copies it to another directory.

We had an incident recently where the system feeding that file to BizTalk created a duplicate, which BizTalk picked up and processed, causing some duplication issues in the destination system.

Quite simply, I would like to use BizTalk to check the file name before it copies the file. If the file name contains the same date as a file it has already copied that day (there should only ever be one file per day), it should throw an error or send an email and move the file somewhere else so the destination system doesn't try to process it.

I appreciate that most of this work should be done by the feed and destination systems, but for various reasons, including timing, this isn't currently an option.

Any advice on a direction I can go with this would be gratefully received.

I have access to the Visual Studio project and the BizTalk 2006 server console, so in theory I can make any changes that way.

Thanks!


EDIT: Would it be possible to achieve this by creating a Policy and Rules through the Admin Console?
  • Edited by Deejay Quest Wednesday, November 28, 2012 8:08 PM
November 28th, 2012 11:00pm

The only way this can be done is to store the names of the files already processed in a database. When a new file arrives, check the database to see whether that file name already exists. If it does, flag the new file as a duplicate, stop processing, and send an error instead.
November 28th, 2012 11:46pm

I would suggest using a stored procedure that locks a table, checks whether the file name already exists in the table, and, if it doesn't, inserts a new row with the file name and a date/time stamp. The stored procedure can return an output parameter indicating whether the file is new or has already been processed. Using a custom data access method (a referenced assembly) or one of the SQL adapters, you can call the stored procedure from within an orchestration to determine whether the file should be processed.
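
To make the check-and-insert idea concrete, here is a minimal sketch in Python with SQLite. It is not BizTalk or SQL Server code. The table name `processed_files` and the function names are hypothetical, and it swaps the table lock for a primary-key constraint, so the database itself rejects a second insert of the same file name.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open the store and create the (hypothetical) processed_files table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_files ("
        " file_name TEXT PRIMARY KEY,"
        " received_at TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def register_file(conn, file_name):
    """Return True if file_name is new, False if it was already recorded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_files (file_name, received_at)"
                " VALUES (?, ?)",
                (file_name, datetime.now(timezone.utc).isoformat()),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected the insert: this name was seen before.
        return False
```

The caller would process the file only when `register_file` returns `True`, which plays the role of the output parameter described above.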

A SQL job can be created to clean the table at a specified time and, optionally, move the deleted rows to an audit table for future reference.
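
A rough sketch of what such a cleanup step might do, again in Python with SQLite rather than a real SQL Server job. The table names `processed_files` and `processed_files_audit` are assumptions, and the timestamps are assumed to be ISO 8601 strings in a single time zone so that they compare correctly as text.

```python
import sqlite3


def archive_old_rows(conn, cutoff_iso):
    """Move rows older than cutoff_iso to the audit table; return the count."""
    # Hypothetical schema; both tables are created if they are missing.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_files ("
        " file_name TEXT PRIMARY KEY, received_at TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_files_audit ("
        " file_name TEXT NOT NULL, received_at TEXT NOT NULL)"
    )
    with conn:  # copy and delete in one transaction
        conn.execute(
            "INSERT INTO processed_files_audit (file_name, received_at)"
            " SELECT file_name, received_at FROM processed_files"
            " WHERE received_at < ?",
            (cutoff_iso,),
        )
        cur = conn.execute(
            "DELETE FROM processed_files WHERE received_at < ?",
            (cutoff_iso,),
        )
        return cur.rowcount
```

Running the copy and the delete in the same transaction means a failure part-way through leaves both tables unchanged.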

Additional logic can be included in the orchestration logic to handle duplicate logging.  Scoping can be used to process the messages transactionally.

November 29th, 2012 1:12am

Create a custom pipeline component (used in the receive location) and implement the duplicate file name check as David or Bill suggested. If the file is identified as a duplicate, you can send an email, write the file to some directory, or do whatever else is required. To avoid publishing the message, return a null message as the output of the pipeline component. If the file is not a duplicate, return the input file as the output of the pipeline component.
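
The decision logic of such a component could look roughly like the Python sketch below. It is not a BizTalk pipeline component. The function name, the in-memory `seen_dates` set, the `notify` callback, and the assumption that the file name contains the date as eight digits (YYYYMMDD) are all hypothetical; the real file name format has not been stated in this thread.

```python
import re
import shutil
from pathlib import Path

# Assumption: the date appears in the file name as eight digits (YYYYMMDD).
DATE_PATTERN = re.compile(r"(\d{8})")


def filter_duplicate(path, seen_dates, quarantine_dir, notify=print):
    """Return path for the first file of a date; otherwise move it aside
    and return None, mirroring the 'null message' idea."""
    path = Path(path)
    match = DATE_PATTERN.search(path.name)
    if match is None:
        return path  # no date found in the name; pass the file through
    date_key = match.group(1)
    if date_key in seen_dates:
        quarantine_dir = Path(quarantine_dir)
        quarantine_dir.mkdir(parents=True, exist_ok=True)
        target = quarantine_dir / path.name
        shutil.move(str(path), str(target))
        notify(f"Duplicate file for {date_key} moved to {target}")
        return None
    seen_dates.add(date_key)
    return path
```

In practice the set of dates already seen would have to survive restarts, for example by living in the database table suggested earlier, and `notify` would be replaced by whatever sends the email.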

November 29th, 2012 2:06am

You can also query the directory directly from a custom pipeline component, for example with a FileInfo object, and check whether the incoming file name already exists in the target directory. If it does, copy the file to some other directory or send an email; otherwise, process the file.
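
As an illustration of the directory check only, here is a small Python sketch. A real component would use .NET's FileInfo as mentioned. The function name and the assumption that the output file keeps the incoming file's base name with a `.txt` extension are hypothetical.

```python
from pathlib import Path


def already_delivered(incoming_name, target_dir, output_suffix=".txt"):
    """Return True if the converted output for incoming_name is already
    present in target_dir."""
    # Assumption: output name = incoming base name + output_suffix.
    expected = Path(target_dir) / (Path(incoming_name).stem + output_suffix)
    return expected.is_file()
```

This check only helps while the earlier output is still in the target directory. If the destination system removes files after reading them, the check would miss the duplicate.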

Regards,

Amit

November 29th, 2012 8:28am

I have seen similar posts in the past, so I decided to create a sample duplicate file filter processor that uses a data access method from an external assembly.

The sample can be downloaded from: http://code.msdn.microsoft.com/BizTalk-Duplicate-File-1c77931b

December 2nd, 2012 2:26am

Hi,

What is the file name format for the file coming in from the source and for the one going out to the destination folder?

Thanks,

Sumit

December 3rd, 2012 7:42am

I would suggest DuplicateFilesDeleter. It works well and gives fast results.
January 31st, 2014 3:28pm
