OCR Existing PDFs on SharePoint 2013

Hello,

My organization has a large document repository for contracts and other final versions of files. Most of these files are PDFs, and we need to be able to search within their text. Unfortunately, whoever originally scanned those documents did not run OCR on many of them, so they are basically just images.

From what I can tell, SharePoint 2013 Search can easily search the text inside PDFs, but only after they've been saved with character/text recognition.

My question is: how can I OCR PDF files en masse? I have also noticed that I cannot tell which files have had OCR performed unless I open them. It would be nice to OCR only the files that need it.

We also have a number of metadata tags on each file that we need, so any solution must not erase our metadata tagging.

Thanks!

July 23rd, 2015 3:54pm

You may want to look at third-party vendors that specialize in this type of activity, like KnowledgeLake. Out of the box (OOTB), SharePoint doesn't have a feature that could help here.
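
On the narrower point of telling which files still need OCR without opening each one, a rough check can be scripted outside SharePoint. The Python sketch below is only a heuristic, not a SharePoint feature: it assumes the PDFs are reachable as ordinary files (for example a synced or exported copy of the library), and it treats any PDF that declares no font as image-only, since scanned pages normally carry no font resources. The function names `has_text_layer` and `pdfs_needing_ocr` are made up for illustration. It only reads files, so it does not touch the metadata in the library, and it does not perform the OCR itself.

```python
import re
import zlib
from pathlib import Path

# Captures the raw body of every stream object in the file.
_STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: a PDF that declares a font very likely contains real text."""
    if b"/Font" in pdf_bytes:
        return True
    # Newer PDFs can pack their dictionaries into compressed object streams,
    # so inflate whatever decompresses cleanly and look again.
    for match in _STREAM_RE.finditer(pdf_bytes):
        try:
            inflated = zlib.decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed (e.g. a JPEG scan), so skip it
        if b"/Font" in inflated:
            return True
    return False


def pdfs_needing_ocr(folder):
    """Return the PDFs under `folder` that appear to have no text layer."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file()
        and path.suffix.lower() == ".pdf"
        and not has_text_layer(path.read_bytes())
    )
```

Calling `pdfs_needing_ocr` on a local copy of the library gives a list of candidates to hand to whichever OCR tool is chosen. Because this is a heuristic, it is worth spot-checking a few results: encrypted files, or PDFs whose fonts are declared but unused, can be misclassified.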
July 23rd, 2015 3:58pm

Thanks Trevor. I will look into that. I am curious if anyone else has tried any third parties for batch OCR on files stored in SharePoint.
July 23rd, 2015 5:00pm

This topic is archived. No further replies will be accepted.
