Python Forum
How to compare two PDFs for differences
#1
Hello,

I receive two PDFs daily from two different sources. In theory they should contain exactly the same numbers, but I would like to create an automated report which confirms this and notifies me of any discrepancies.

Unfortunately, certain 'titles' (the row labels) within the reports differ ever so slightly, which makes it difficult to match rows by title.

For example, two rows may have the same balances, but one might be called "duck USD" while the other is just called "duck" (example names).

Bearing in mind I'm relatively new when it comes to Python, what road could I go down in order to create this automation?

To give an example of the style of layout:

Report 1
Dog 50 2,000 5,000
Cat 80 5,000 10,000

Report 2
Dog USD 50 2,000 5,000
Cat EUR 80 5,000 10,000
#2
Can you explain in more detail what you want to do?
#3
This package has a PDF-to-text function:
https://pypi.org/project/pdf/
I would convert both PDFs to text and run a system file compare on the results (until I learn to use the Python functions for that!)
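
For the Python route, the standard library's difflib module can do the same job as a system file compare. Below is a minimal sketch, assuming both PDFs have already been converted to text. The function name diff_text and the file names are made up for illustration, and the sample strings are taken from the layout in post #1.

```python
import difflib


def diff_text(text_a, text_b, name_a="report1.txt", name_b="report2.txt"):
    """Return the unified-diff lines between two blocks of text.

    An empty list means the two texts are identical line for line.
    """
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",  # input lines carry no newline, so add none
        )
    )


# Sample text based on the layout shown in the first post.
report1 = "Dog 50 2,000 5,000\nCat 80 5,000 10,000"
report2 = "Dog USD 50 2,000 5,000\nCat EUR 80 5,000 10,000"

for line in diff_text(report1, report2):
    print(line)
```

Note that a plain text diff like this flags every row in the sample, because the titles differ even though the numbers match.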

A better option is this:
https://pypi.org/project/pdf-diff3/
It displays the differences as a PNG image.
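
Since the titles differ slightly between the two reports, another option is to compare only the numbers once the text has been extracted. The following is a rough sketch, not a tested solution for real reports. It assumes each row is one line of text, that the first word of the title ("Dog", "Cat") is enough to identify the row, and that numbers use commas as thousands separators. The names parse_rows and compare_reports are hypothetical helpers.

```python
import re
from decimal import Decimal

# Matches tokens such as 50, 2,000 or 1,234.56 (assumed number format).
NUMBER = re.compile(r"^\d[\d,]*(\.\d+)?$")


def parse_rows(text):
    """Map the first title word (lowercased) to the row's numbers."""
    rows = {}
    for line in text.splitlines():
        tokens = line.split()
        numbers = [t for t in tokens if NUMBER.match(t)]
        words = [t for t in tokens if not NUMBER.match(t)]
        if not numbers or not words:
            continue  # skip blank lines and lines that are not data rows
        # Assumption: "Dog USD" and "Dog" refer to the same row.
        key = words[0].lower()
        rows[key] = tuple(Decimal(n.replace(",", "")) for n in numbers)
    return rows


def compare_reports(text_a, text_b):
    """Return a list of human-readable problems; empty means all matched."""
    a, b = parse_rows(text_a), parse_rows(text_b)
    problems = []
    for key in sorted(a.keys() | b.keys()):
        if key not in a:
            problems.append(f"{key}: missing from report 1")
        elif key not in b:
            problems.append(f"{key}: missing from report 2")
        elif a[key] != b[key]:
            left = ", ".join(str(v) for v in a[key])
            right = ", ".join(str(v) for v in b[key])
            problems.append(f"{key}: {left} vs {right}")
    return problems


report1 = "Dog 50 2,000 5,000\nCat 80 5,000 10,000"
report2 = "Dog USD 50 2,000 5,000\nCat EUR 80 5,000 10,000"
print(compare_reports(report1, report2) or "No differences found")
```

If titles in the real reports can share a first word, the key would need to be something more specific. A header line such as "Report 1" would also be picked up as a row by this sketch, so headers should be stripped first.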