Is there a way to tell if a document has been OCR'd?

Laurie Somers

3 Answers

Voted Best Answer

A preflight, via Acrobat Pro.

By David Austin   

This does not quite test whether the document has been OCR'd, but since the result of OCR is that you have text, you can test for the number of words on a given page. To do that, use getPageNumWords(), which takes a zero-based page number, as in:

// Page numbers are zero-based, so 0 is the first page
if (this.getPageNumWords(0) > 0) {
    app.alert("The first page of the document has searchable words");
} else {
    app.alert("The first page of the document has most likely not been OCR'd");
}

You can run the above code from the Console, or you can create a bookmark or a menu item with it.
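
If you want to check every page rather than only the first, the same idea can be put in a loop. This is only a rough sketch: findPagesWithoutText is a made-up helper name, not part of the Acrobat API, and it assumes the object passed in exposes numPages and getPageNumWords(nPage) the way the Acrobat Doc object does. A page with zero words may simply be blank, so treat the result as a hint rather than proof.

```javascript
// Hypothetical helper (not an Acrobat built-in): returns the 1-based
// numbers of all pages on which no words can be extracted.
// Assumes `doc` provides numPages and getPageNumWords(nPage).
function findPagesWithoutText(doc) {
    var emptyPages = [];
    for (var i = 0; i < doc.numPages; i++) {
        // getPageNumWords expects a zero-based page index
        if (doc.getPageNumWords(i) === 0) {
            emptyPages.push(i + 1); // report page numbers as users see them
        }
    }
    return emptyPages;
}
```

From the Console you would call it as findPagesWithoutText(this) and show the returned list with app.alert(), for example by joining the page numbers with commas.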

Hope this can help.

Max Wyss   

Hi startrek411,

I'm not sure of a way to tell if it has been OCR'd, but there is a way to tell if it hasn't. In Acrobat, if you cannot select any text using the Select tool (the I-beam with slanted arrow icon in the toolbar), or you do not see an I-beam cursor when you click in some text on the PDF, that indicates the PDF is image-only, i.e., it contains no actual text.

Hope this helps,

WindJack Solutions

Dimitri Munkirs   
