App to edit scanned PDF?



Florius Frammel
03-06-2019, 07:52 AM
Hi!

I scanned a book that hasn't been available for years. I don't want to buy Adobe Acrobat Pro (too expensive).

Any recommendations for this? I need to separate the scanned pages (two were scanned at once), rotate them, whiten the background, blacken and straighten the characters, reduce the file size, and merge the separate files.
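The "whiten the background, blacken the characters" step is essentially thresholding. Here is a minimal sketch in plain Python, assuming the page is already available as rows of grayscale values (0 = black, 255 = white). The function name `binarize` and the cutoff of 128 are illustrative assumptions, not taken from any particular program.

```python
def binarize(rows, threshold=128):
    """Map every grayscale pixel to pure black (0) or pure white (255).

    `rows` is a list of rows, each a list of ints in 0..255.
    `threshold` is a hypothetical cutoff; real scans may need a different value.
    """
    return [[0 if px < threshold else 255 for px in row] for row in rows]


# Tiny made-up "page": a greyish background with a few dark ink pixels.
page = [
    [230, 225, 40, 228],
    [232, 35, 30, 226],
]
clean = binarize(page)
print(clean)
```

A fixed cutoff only works when the scan is evenly lit; unevenly lit pages would probably need a per-region threshold instead.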

In the past I used three different programs for this work. I hope there is an easier way that takes less effort.

OCR is not necessary, but wouldn't be bad either.

Edit: Changed thread topic due to different MO

Edit 2: Just found out I am not able to change thread topic, but only the post topic :D


Thanks!

Illen A. Cluf
03-06-2019, 01:59 PM
Based on the amount of editing, it might be better in the long run to just OCR the text. There are many free online OCR services that are easy to use, for example onlineocr.net.

Florius Frammel
03-06-2019, 04:06 PM
Thanks Illen,

I guess you're right. I tried onlineocr.net, but they have a limit of 50 pages even for registered users. The file was corrupt after six pages too. Other online OCR services can't handle my text at all.

So I'll change my question:

Does anyone know a good OCR program, either online or offline, that can handle German and has no page or file size limits?

It doesn't have to be free, but I hate the new "monthly/yearly payment" policy of Adobe, Microsoft and others. I see the point with Netflix, Spotify and the like, but not with software.

vigilance
03-06-2019, 05:08 PM
I usually just use the built-in OCR in Adobe Acrobat Pro. But ABBYY FineReader is better.

Edit:

Or you could always just send me the book and I could run OCR on it for you.

Edit again:

What you need to do is export the pages first (I use Photoshop: drag and drop the PDF onto the window, and it will let you choose the pages to extract). Then you need to edit the images in something like Photoshop or GIMP, and then rebuild the PDF from the images.

For separating the pages, you would use macros or actions, as long as the layout is pretty consistent. I could do this programmatically in Visual Basic or something: read in the image, split it at "x", and save two images.
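The "split it at x" idea can be sketched in a few lines. This is only an illustration in plain Python (not Visual Basic), assuming the double-page scan has already been loaded as rows of pixel values; `split_at_x` is a made-up helper name, and reading and saving the image files is left out.

```python
def split_at_x(rows, x):
    """Split a two-page scan into a left and a right page at column x.

    `rows` is a list of pixel rows of equal length; `x` is the column
    where the right-hand page starts.
    """
    if not rows or not 0 < x < len(rows[0]):
        raise ValueError("x must lie strictly inside the image width")
    left = [row[:x] for row in rows]
    right = [row[x:] for row in rows]
    return left, right


# Made-up 2x6 "scan"; splitting in the middle gives two 2x3 pages.
spread = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
]
left, right = split_at_x(spread, len(spread[0]) // 2)
print(left)
print(right)
```

With an imaging library such as Pillow, the same split would presumably be two `Image.crop` calls with the boxes `(0, 0, x, height)` and `(x, 0, width, height)`. Splitting at a fixed x only works if the book sat in the same position for every scan.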

Florius Frammel
03-06-2019, 05:19 PM
Thanks, Greg, for your suggestions and your offer to help!

However, there is some work I have to do myself. That's mostly manually correcting words the software doesn't get right. Sometimes it's because some pages were scanned sloppily (especially at the beginning: shit in -> shit out), and sometimes I have no idea why it couldn't decipher the text correctly.

vigilance
03-06-2019, 05:29 PM
Yeah, you always have to do that. I'm not even sure how to do that within the saved PDF text. I always have to make the corrections when I copy/paste it somewhere.

What's good about ABBYY FineReader is that you can choose to "train" it with different dictionaries. I might actually try doing this with Fraktur at some point.

zoas23
03-06-2019, 06:23 PM
Something like this: https://pdf2doc.com/ ?
I have used similar websites in the past to edit PDFs that I had written myself when I could not find the .doc version.
(Though it may be useless if you have one of those PDFs in which the text is simply photos.)