First of all, the viewer works normally for text-based PDFs. The problem appears when I try the same with an image-based PDF that has been through OCR. Instead of the correct text, I get only strings like ‘22222’.
If you open the attached PDF document in a PDF reader, you’ll see that you can search and select the words, but GroupDocs doesn’t recognize those words.
In the web app, if you select the text and copy and paste it somewhere else, you will see strange text. If you search for the text, you won’t find any results.
In the back-end, I tried rendering it to HTML, and the resulting file contains just ‘2’ characters instead of the correct text.
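To make the symptom easier to reproduce, here is a minimal sketch of how a generated HTML page could be checked for it. It does not use the GroupDocs API; it only inspects HTML that has already been rendered. The names `TextExtractor` and `check_rendered_html` are hypothetical helpers I made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text nodes of an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>, whose content is not visible

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def check_rendered_html(html, expected_word):
    """Report whether expected_word is present and how much of the text is '2'."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    # Ignore whitespace so the ratio reflects actual glyphs only.
    visible = [c for c in text if not c.isspace()]
    twos = sum(1 for c in visible if c == "2")
    return {
        "found": expected_word.lower() in text.lower(),
        "ratio_of_twos": twos / len(visible) if visible else 0.0,
    }
```

Passing the contents of one generated page together with a word that is visibly on that page (for example “shrimp”) should give `found` as true for a correctly rendered file. A page showing this problem would instead report `found` as false and a `ratio_of_twos` close to 1.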
html-generated-file.png - HTML file generated by the Viewer with HtmlViewOptions, in the cache folder
pdf-opened-on-gd-viewer.png - No results were found for the word “shrimp”
pdf-open-on-chrome.png - I can find the word “shrimp” when I open the same file in Chrome
search-2222-gd-viewer.png - If I search for “2222”, it finds occurrences, because a lot of “2222” strings were added during HTML creation instead of the correct words.