How to find out why text is not searchable in a PDF (and make it searchable)

So, what else could be the reason for the PDF not being searchable? And how can I make it text-searchable?

asked Mar 6, 2013 at 9:45 Rabarberski

Interesting. Does that document contain any sensitive data? If not, can you share it?

Commented Mar 6, 2013 at 9:49

@SparKot: I am not sure whether I can share the document, so I would rather not, although I understand this would greatly aid in troubleshooting.

Commented Mar 6, 2013 at 10:02

Have you tried to upload it to Evernote and check if they can make it searchable? AFAIK they have a good OCR engine for that task.

Commented Mar 6, 2013 at 10:17

7 Answers

To make it text searchable, the best way may be to go back to the original source (e.g. a Word document) and use a different process to produce the PDF. Alternatively you could try rendering your current PDF as a bitmap and then using OCR, but this will be tedious and produce poor results.

answered Mar 6, 2013 at 10:24 RedGrittyBrick

Ah, the encoding does indeed seem to be the issue. When I try to copy and paste text, I get garbage, and the Fonts tab in Acrobat says 'encoding: custom' for each listed font.

Commented Mar 6, 2013 at 10:30
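To illustrate why a custom font encoding can break both search and copy-paste, here is a toy Python model. It is not real PDF parsing. It assumes a subset font whose glyph codes are handed out in order of first use, and a viewer that falls back to Latin-1 when it has no code-to-Unicode map. The function names `subset_encode` and `extract` are made up for this illustration.

```python
def subset_encode(text):
    """Assign each distinct character a glyph code in order of first use.

    Returns the page content as raw glyph codes plus the code-to-character
    map that a viewer would need to turn those codes back into text.
    """
    code_for = {}
    codes = []
    for ch in text:
        if ch not in code_for:
            code_for[ch] = len(code_for) + 1  # codes start at 1
        codes.append(code_for[ch])
    to_unicode = {code: ch for ch, code in code_for.items()}
    return bytes(codes), to_unicode


def extract(codes, to_unicode=None):
    """Recover text from glyph codes.

    Without a map, the codes are (wrongly) treated as Latin-1 characters,
    which is the assumed fallback in this toy model.
    """
    if to_unicode is None:
        return codes.decode("latin-1")
    return "".join(to_unicode[c] for c in codes)


page = "searchable text"
codes, cmap = subset_encode(page)

with_map = extract(codes, cmap)   # viewer has the code-to-Unicode map
without_map = extract(codes)      # viewer only sees the custom codes

print("text" in with_map)      # is the search term found with the map?
print("text" in without_map)   # is it found without the map?
print(repr(without_map))       # what a copy-paste would produce
```

The page still renders correctly in both cases, because drawing only needs the glyph codes. Searching and copying need the reverse mapping, which is why a file can look fine and still return garbage.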

I found a way around this problem. I chose Tools -> Edit Document Text, then for each page I pressed Ctrl+A (select all), right-clicked, opened Properties, and changed the font to something else. After I did this, the text was searchable and I could copy it!

answered Apr 29, 2016 at 7:27

I think the Edit Document Text option is only available in the paid version of Acrobat.

Commented May 1, 2016 at 18:57

Probably - the original poster has Acrobat Professional 8. That should have it. This approach (changing the font) may work with other tools.

Commented May 4, 2016 at 3:03

This might be old, but character-encoding issues in compound-path PDFs are still a problem today. I solved it by

Test source

Environment

answered Apr 20 at 11:08

I would love to avoid Illustrator, but I have not found a command-line solution yet.

Commented Apr 20 at 11:11

Go to Edit / Preferences, select 'Search' on the left-hand side of the Preferences screen, then click 'Purge Cache Contents'. Select OK, then close and reopen the document.

answered Jun 1, 2017 at 22:09 hope this helps

After trying a lot of things that didn't work, here's how I actually got this done:

  1. Find yourself a PDF to Word converter or something similar. (I recommend https://www.online-convert.com/ )
  2. Follow all the necessary steps to set up the conversion, BUT before you start it:
  3. Find the button that says something like 'optical character recognition' and click it.
  4. Convert your file and you should be golden.
answered Jun 1, 2018 at 20:39

One possible cause of a non-searchable or partially searchable PDF is a watermark that overlaps the text and prevents the text underneath it from being searchable.

To make such a PDF searchable (with, for example, pdfgrep), you can try the ocrmypdf utility.

This single command is all you need (but there are tons of other options if you want to tweak the process):

$ ocrmypdf in.pdf out.pdf --force-ocr --sidecar 
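For anyone who wants to script this over several files, here is a minimal Python sketch that wraps the same command. It is not part of the original answer. The helper names `build_ocrmypdf_command` and `make_searchable` are hypothetical, and the only flags used are the two from the command above. It assumes ocrmypdf is installed and on the PATH, and does nothing if it is not.

```python
import shutil
import subprocess


def build_ocrmypdf_command(src, dst, force_ocr=True, sidecar=True):
    """Return the argument list for the ocrmypdf call quoted above."""
    cmd = ["ocrmypdf", src, dst]
    if force_ocr:
        cmd.append("--force-ocr")  # rasterize every page and OCR it again
    if sidecar:
        # Also write the recognized text to a separate text file.
        # Kept last so the flag cannot swallow a file name as its value.
        cmd.append("--sidecar")
    return cmd


def make_searchable(src, dst):
    """Run ocrmypdf if it is installed; return True on success."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not found on PATH, nothing was run
    result = subprocess.run(build_ocrmypdf_command(src, dst))
    return result.returncode == 0


# Show the command that would be executed for the file names used above.
print(" ".join(build_ocrmypdf_command("in.pdf", "out.pdf")))
```

Calling `make_searchable("in.pdf", "out.pdf")` runs the conversion. Building the argument list separately makes it easy to inspect or log the exact command first.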
answered Aug 10 at 20:41 Johnny Baloney

I was having the same problem, and in frustration, googled to find an answer. It turns out that for me, the problem was simply that I was using Preview on my iMac to view and search the PDF. In most cases, searching works in Preview. But for a large book downloaded from Google Books, it didn't.

What worked was simply opening the PDF in Adobe Reader. (Duh, what a concept, I know.) Now I can search. This probably won't work for everyone with a Mac, but it might help someone.

answered Jan 2, 2017 at 19:18

"I've tried with Adobe Acrobat Professional 8" OP said. Please read the question carefully.

Commented Jan 2, 2017 at 19:43

Please read the question again carefully. Your answer does not answer the original question.

Commented Jan 29, 2017 at 15:47
