
Author Dennis Sweeney
Recipients Dennis Sweeney, Zeturic, ammar2, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-08.11:06:59
Message-id <1602155219.38.0.293310549875.issue41972@roundup.psfhosted.org>
Content
Indeed, this is just a very unlucky case.

    >>> n = len(longer)
    >>> from collections import Counter
    >>> Counter(s[:n])
    Counter({0: 9056995, 255: 6346813})
    >>> s[n-30:n+30].replace(b'\x00', b'.').replace(b'\xff', b'@')
    b'..............................@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'
    >>> Counter(s[n:])
    Counter({255: 18150624})

When checking "base", we're in this situation:

    pattern: @@@@@@@@
     string: .........@@@@@@@@
    Algorithm says: ^ these last characters don't match.
                     ^ this next character is not in the pattern
                     Therefore, skip ahead a bunch:

    pattern:          @@@@@@@@
     string: .........@@@@@@@@

    This is a match!

Whereas when checking "longer", we're in this situation:

    pattern: @@@@@@@@@
     string: .........@@@@@@@@
    Algorithm says:  ^ these last characters don't match.
                      ^ this next character *is* in the pattern.
                      We can't jump forward.

    pattern:  @@@@@@@@
     string: .........@@@@@@@@

    Start comparing at every single alignment...

I'm attaching reproducer.py, which replicates this from scratch without loading data from a file.
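
For readers who want to see the two situations side by side, here is a minimal Python sketch of the skipping rule described in this message. It is a simplified model, not the CPython implementation and not the attached reproducer.py: the function name count_alignments, the use of a plain set for the "is this character in the pattern" test, and the toy inputs (b'.' and b'@' standing in for b'\x00' and b'\xff') are all assumptions made for illustration.

```python
def count_alignments(pattern: bytes, text: bytes):
    """Return (index of first match or -1, number of alignments examined).

    Simplified model of the rule in the message: check the last character
    of the current window; on failure, look at the character just past the
    window. If that character does not occur in the pattern, jump past it;
    otherwise advance by a single position.
    """
    m, n = len(pattern), len(text)
    in_pattern = set(pattern)   # hypothetical stand-in for the membership test
    last = pattern[-1]
    i = 0
    examined = 0
    while i <= n - m:
        examined += 1
        if text[i + m - 1] == last and text[i:i + m] == pattern:
            return i, examined
        # No match at this alignment: decide how far to move.
        if i + m < n and text[i + m] not in in_pattern:
            i += m + 1          # next character cannot be part of any match
        else:
            i += 1              # we can't jump forward
    return -1, examined


k = 1000
text = b'.' * (k + 1) + b'@' * k

# "base"-like case: the pattern has no b'.', so one big jump suffices.
base_like = b'@' * k
print(count_alignments(base_like, text))

# "longer"-like case (toy pattern, an assumption): the pattern contains
# both byte values, so the next character is always in the pattern and
# the search advances one alignment at a time.
longer_like = b'.' + b'@' * k
print(count_alignments(longer_like, text))
```

On these toy inputs the first search examines two alignments, while the second examines one alignment for every position up to the match. Each examined alignment may also involve a partial comparison of the pattern, which is where the slowdown comes from in this model.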