3

I have a scanned PDF with some slight changes made by hand, plus the source file. I want to produce a PDF that is searchable, based on the text from the source, while the handwritten changes remain as they are.

I am looking for free (and, even better, portable) software that would let me somehow "combine" the images from the scan with the text from the source DOC file, so that it SEEMS like the image is selectable and searchable.

UPD: use case: I have the source DOC file. I printed it, made some notes by hand on the printed sheets, and then scanned them. What I want is a PDF with the scanned images in which the text on the images is selectable and searchable. Like the "OCR" feature of Acrobat, but without doing actual OCR, because I have the original source text, and with free, portable software.

7
  • Why not just make the doc into a pdf (perhaps with the scan as a background)? Commented Mar 28, 2012 at 11:41
  • The text would be doubled that way. Well, I am already thinking about somehow making the letters transparent and inserting the scans as a background... But isn't there a better solution? And I have no idea how to make text transparent in Microsoft Word. Commented Mar 28, 2012 at 11:47
  • I'm having some trouble understanding your question. Can you please clarify it? For example, please reword "some slightly changes made by hand and a source file"... I'm not explaining this properly... Please just make your question easier to understand. Commented Mar 29, 2012 at 16:29
  • I mean: I have the source DOC file. I printed it, made some notes by hand on the printed sheets, and then scanned them. What I want is a PDF with the scanned images in which the text on the images is selectable and searchable. Like the "OCR" feature of Acrobat, but without doing actual OCR, because I have the original source text, and with free, portable software. Commented Mar 31, 2012 at 18:57
  • Take a look at the pdfsandwich command; see the project's website. Commented Apr 5, 2018 at 5:59

4 Answers

2
+50

As described in this answer, you can do it with the free command-line PDF tool pdftk, as follows:

$ pdftk file1.pdf multibackground file2.pdf output combinedfile.pdf

Use the searchable text as the background and the scanned file as the foreground; otherwise the superimposed text will be a mess to look at.

In Acrobat Reader, the text highlight from the "search" command will be visible in front of the image.
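
If you want to script this, here is a minimal sketch in Python that just wraps the same pdftk call. The file names (`scan.pdf`, `text.pdf`, `combined.pdf`) and the helper names are placeholders I made up, and it assumes you have already exported the DOC to a PDF whose pages line up one-to-one with the scanned pages, since `multibackground` pairs page N of the background with page N of the input.

```python
import shutil
import subprocess


def build_overlay_command(scan_pdf, text_pdf, output_pdf):
    """Build the pdftk call from above: scan in front, text PDF behind it."""
    return ["pdftk", scan_pdf, "multibackground", text_pdf, "output", output_pdf]


def overlay(scan_pdf, text_pdf, output_pdf):
    """Run pdftk if it is installed; raise a clear error otherwise."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk was not found on PATH")
    subprocess.run(build_overlay_command(scan_pdf, text_pdf, output_pdf), check=True)


if __name__ == "__main__":
    # Placeholder file names; this only prints the command it would run.
    print(" ".join(build_overlay_command("scan.pdf", "text.pdf", "combined.pdf")))
```

Call `overlay("scan.pdf", "text.pdf", "combined.pdf")` to actually run it. Whether the page sizes and margins of the two PDFs match closely enough for the selection to sit over the scanned text is something you would have to check on your own files.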

1
  • This answer seems to be spot-on. I'll probably mark this as the solution if no other answers appear within 2 days, thank you! Commented Oct 6, 2022 at 20:02
1
  • Make your corrections on paper in a color clearly distinct from the printed text
  • Scan the document as an image file, possibly as .tif to avoid compression artefacts

Option A) - for Word 2010 and later

  • Import your picture into Word
  • Set the color of the printed text in the picture as "transparent":
    Select the picture, and go to Picture Format > Color or Picture Tools > Format > Color.
    Select Set Transparent Color. (from MS Word Help)

  • Scale the image to full page
  • Set the image to appear behind the text (Picture Format -> Wrap Text -> Behind Text)
  • Adjust the position of the image until it matches the text of the document
  • Export your PDF

Option B) - might get better results

Use a free image editor, such as GIMP, to remove the text color. Then import the image into Word.

I can elaborate on this if this answer is helpful.

1
  • This does seem like a viable approach (and I have actually done exactly this on some occasions, although it's tedious for multi-page documents; imagine a 100-page contract signed by all parties on each page), but I'm looking more for tools to actually manipulate this hidden text "layer" in PDFs. Commented Oct 5, 2022 at 17:46
0

eHow Tech posted three methods of converting Word documents to PDF (Portable Document Format). I am sure two of them work fine; I am not sure about Zamzar.

  1. Go to the Zamzar website. Zamzar provides free conversion to and from different formats. This option works well if you don't need to convert Word documents to PDF frequently.

  2. Purchase and install Adobe Acrobat. At the time of publication, Adobe Acrobat Standard was selling for approximately $300 (now only $139). A new "Save as PDF" option is added to Microsoft Word after installing Acrobat. Most libraries, schools, Sony PCs, and work laptops (the ones provided by your company) already have Adobe Acrobat installed.

  3. Microsoft Office add-in: Microsoft Save as PDF or XPS. This add-in allows you to export and save to the PDF and XPS formats in eight 2007 Microsoft Office programs.

2
  • 4. OpenOffice.org reads Word docs and writes PDF Commented Mar 28, 2012 at 12:32
  • 2
    This is not exactly what I wanted. Sure, I can convert a plain DOC into a PDF; that is not a problem at all. But I need to add a "selection layer" or something like that to an existing PDF document (or just the images of this document), so that the scanned document can be selected and searched. It seems that Acrobat creates such a layer when we use its "OCR Recognition" feature, but I don't need OCR, because I already have the original text. And I can't use Acrobat because of some restrictions. I need free, portable software. Commented Mar 29, 2012 at 5:30
0

Change your approach to the problem: use a computer with a stylus, like the Microsoft Surface Pro. You can find cheaper alternatives online.

This way your notes will retain good quality and stay searchable. You'll also save the hassle of printing and scanning.

1
  • The edits are not made by me; I'm getting an already altered piece of paper back from my chief. E.g., he places his signature on the paper. I can't really tell him to use a stylus instead. Commented Oct 5, 2022 at 17:44
