How can I check whether a PDF file is corrupt?

We have several hundred PDF files submitted to us for display.

We have a database that lets users search and display these files, but occasionally some are corrupt and fail to open with errors such as 'There was an error opening this document. There was a problem reading this document (14).' or 'Expected a dict object.'

Is there a way we can add code to our web page to check these prior to display?


Elaine Langford


2 Answers

If you have Adobe Acrobat XI Pro, you can use the Preflight tool to check a single PDF file, or many files at once if you create an Action with the Action Wizard to run the check in batch.
To check the current document, go to menu View -> Tools -> Print Production -> Preflight and run the "Report PDF syntax issues" profile, which is listed under "PDF analysis".


To create an Action to check PDF files in batch, go to menu View -> Tools -> Action Wizard.



Almir R V Santos   

Your question is about adding a feature to the web page to identify these documents. Unfortunately you cannot use the preflight tool suggested in the previous post. You cannot run Acrobat on a server - for technical and legal reasons. This means that you either need to check these files before you upload them, or come up with a different solution.

I would try to identify what is common to these files (e.g. produced by the same application, from the same source, received in a certain time frame, all ported over from an old server, ...) and then try to find those files on the server, download all of them and check them on a client.
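As a rough first pass over the downloaded files, a script along the following lines could flag the obviously damaged ones before you open anything by hand. This is only a sketch under stated assumptions: the function names are made up for illustration, and the check is a heuristic that looks for the "%PDF-" header near the start of the file and the "startxref" / "%%EOF" markers near the end. It should catch truncated uploads and files that are not PDFs at all, but it will probably miss damage inside the file body, which is the likely cause of errors such as 'Expected a dict object.', so a file that passes is not guaranteed to open.

```python
from pathlib import Path


def looks_like_valid_pdf(data: bytes) -> bool:
    """Heuristic only: checks header and trailer markers, not the object structure."""
    # A PDF should start with a "%PDF-" header; some readers tolerate
    # a little junk in front of it, so search the first 1024 bytes.
    if b"%PDF-" not in data[:1024]:
        return False
    # The end of the file should hold a "startxref" pointer followed by "%%EOF".
    tail = data[-2048:]
    pos = tail.rfind(b"startxref")
    if pos == -1:
        return False
    return b"%%EOF" in tail[pos:]


def find_suspect_pdfs(folder: str) -> list[Path]:
    """Return every *.pdf file under `folder` that fails the heuristic check."""
    suspects = []
    for path in sorted(Path(folder).rglob("*.pdf")):
        if not looks_like_valid_pdf(path.read_bytes()):
            suspects.append(path)
    return suspects
```

Calling find_suspect_pdfs on the folder that holds the downloaded copies returns the paths worth checking first. Note that the "*.pdf" pattern is case-sensitive on some systems, so files named ".PDF" may need a second pass.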

If you are creating these files, you may also want to look into why they are corrupt. Bad software usually produces bad PDF files, and you may have to replace the software that creates these files.

If you need help debugging these files, this is what I do as part of my consulting business, so feel free to get in touch with me.

Karl Heinz Kremer
PDF Acrobatics Without a Net
PDF Software Development, Training and More...
http://www.khkonsulting.com


Karl Heinz Kremer   

