Skip to content

Conversation

@SavageTiger
Copy link
Contributor

@SavageTiger SavageTiger commented Jan 22, 2021

Description

As a preparation to move away from string manipulation (HtmlDiff) and having to ping pong between DOMDocument (TableDiff) / SimpeXML + extended library (ListDiff), I have ported the ListDiff to also use DOMDocument.

We can now get rid of the third party php-simple-html-dom-parser.

Huge string to dom overhead

We have huge overhead now in both ListDiff and TableDiff where we keep going from an html string to objects and back, we should get rid of this once HtmlDiff is also ported

Performance impact

I have been doing loads of testing, and the performance impact is minimal, it turns out it is slightly faster.

In some special cases it can even be 30% faster, but that is not in really realistic scenario's

Big cleanup

All the old ListDiff code was still in the codebase, but none of it was used since it was replaced by HtmlDiffList so that was a huge easy cleanup

Sven Hagemann added 2 commits January 27, 2021 15:50
A long time ago DiffList was replaced by DiffListLines, but the old classes have always lingered around as dead code. This commit cleans them up.
These methods never got called, so we can safely remove them.
@SavageTiger SavageTiger force-pushed the sven/feature/list_improvements branch 2 times, most recently from e7a62cf to 0d0f608 Compare January 27, 2021 14:57
Sven Hagemann added 2 commits January 27, 2021 18:00
Right now TableDiff already uses DOMDocument but this component still used a third party library that extended the old SimpleXML extension from PHP. I ported to the code to use DOMDocument, making the library absolete, and slightly improving performance. Mostly this is preperation for moving the entire library to DOMDocument
It is not used anymore since ListDiff get ported to DOMDocument, so we can get rid of it
@SavageTiger SavageTiger force-pushed the sven/feature/list_improvements branch from 0d0f608 to c946d79 Compare January 27, 2021 17:00
Since a node with just a space in it is not interpreted as a space by a HTML parsing engine we should not insert empty tags. These seem to result from indent spacing around text being interpreted as regular spaces.
@SavageTiger SavageTiger force-pushed the sven/feature/list_improvements branch from 97c64c0 to 2a5a3ce Compare January 27, 2021 22:05
@SavageTiger SavageTiger changed the title WIP Ported ListDiff to DOMDocumen Jan 27, 2021
@jschroed91
Copy link
Member

Really liking these improvements here @SavageTiger ! I haven't had a chance to review in depth, and don't let my lack of a review be a blocker to you if you're confident in the work.

@SavageTiger
Copy link
Contributor Author

@jschroed91 I know you trust me to do the right thing and believe me, I have been doing loads of testing, and found loads of tiny bugs in the process haha

Just invited you as a reviewer more as a celebratory moment, that some cool improvements have been done again.

@jschroed91
Copy link
Member

@SavageTiger Yes this is AWESOME! It sounds like huge improvements without any noticeable tradeoffs, which is great. So glad that you've found and squashes some pesky bugs, and being able to drop an entire dependency of another library will be a great bonus.

@SavageTiger SavageTiger changed the title Ported ListDiff to DOMDocumen Ported ListDiff to DOMDocument Jan 28, 2021
@SavageTiger SavageTiger merged commit 25bce7b into master Jan 28, 2021
@SavageTiger SavageTiger deleted the sven/feature/list_improvements branch January 28, 2021 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3 participants