This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Created on 2017-09-17 17:27 by scoder, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
bug1.py vstinner, 2017-09-18 07:00
bug2.py vstinner, 2017-09-18 07:00
test_issue31499.diff serhiy.storchaka, 2017-09-18 09:37
Pull Requests
URL Status Linked Edit
PR 3641 merged vstinner, 2017-09-18 08:59
PR 3645 merged python-dev, 2017-09-18 12:29
Messages (17)
msg302375 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2017-09-17 17:27
I'm seeing crashes in the latest Py3.7 when I run this test (taken from lxml's compatibility test suite): etree = xml.etree.ElementTree def test_feed_parser_error_position(self): ParseError = etree.ParseError parser = XMLParser() try: parser.close() except ParseError: e = sys.exc_info()[1] self.assertNotEqual(None, e.code) self.assertNotEqual(0, e.code) self.assertTrue(isinstance(e.position, tuple)) self.assertTrue(e.position >= (0, 0)) It crashes in expat/xmlparse.c, line 1464: 1459 for (;;) { 1460 BINDING *b = bindings; 1461 if (!b) 1462 break; 1463 bindings = b->nextTagBinding; 1464 FREE(b->uri); // <<<<<<<<<<<<<<< crashes here 1465 FREE(b); 1466 } 1467 } Probably related to the new expat version (issue 31170).
msg302383 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-17 19:18
I can't reproduce a crash. Could you please provide a Python script?
msg302384 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2017-09-17 20:06
Sorry, wrong line number. Was using an installed Py3.7, not a fresh build. However, my crashing installed version is from September 1st, *before* the expat update, which was apparently on September 5th. With a clean debug build, I get a reproducible crash during shutdown, now with the correct line number: Program received signal SIGSEGV, Segmentation fault. PyExpat_XML_ParserFree (parser=0xd01a20) at .../Modules/expat/xmlparse.c:1487 1487 tagList = tagList->parent; (gdb) p taglist No symbol "taglist" in current context. (gdb) list 1482 break; 1483 tagList = freeTagList; 1484 freeTagList = NULL; 1485 } 1486 p = tagList; 1487 tagList = tagList->parent; 1488 FREE(p->buf); 1489 destroyBindings(p->bindings, parser); 1490 FREE(p); 1491 } (gdb) bt #0 PyExpat_XML_ParserFree (parser=0xd01a20) at Modules/expat/xmlparse.c:1487 #1 0x00007fffeec83517 in xmlparser_gc_clear (self=0x7fffeeeb75d8) at Modules/_elementtree.c:3414 #2 xmlparser_dealloc (self=0x7fffeeeb75d8) at Modules/_elementtree.c:3435 #3 0x000000000041fd40 in frame_dealloc (f=0xd017c8) at Objects/frameobject.c:428 #4 0x000000000044f5e5 in tb_dealloc (tb=0x7ffff632d218) at Python/traceback.c:56 #5 0x000000000049c59d in BaseException_clear (self=0x7ffff6567dd8) at Objects/exceptions.c:78 #6 SyntaxError_clear (self=0x7ffff6567dd8) at Objects/exceptions.c:1381 #7 0x0000000000458104 in delete_garbage (old=<optimized out>, collectable=<optimized out>) at Modules/gcmodule.c:759 #8 collect (generation=generation@entry=2, n_collected=n_collected@entry=0x7fffffffd6c8, n_uncollectable=n_uncollectable@entry=0x7fffffffd6d0, nofail=nofail@entry=0) at Modules/gcmodule.c:911 #9 0x0000000000458d2b in collect_with_callback (generation=2) at Modules/gcmodule.c:1020 #10 PyGC_Collect () at Modules/gcmodule.c:1507 #11 0x0000000000439aaf in Py_FinalizeEx () at Python/pylifecycle.c:1002 #12 0x000000000043b3d8 in Py_Exit (sts=sts@entry=0) at Python/pylifecycle.c:1953 #13 0x000000000043f4e8 in handle_system_exit () at Python/pythonrun.c:608 #14 0x000000000043f94d in handle_system_exit () at Python/pythonrun.c:635 #15 PyErr_PrintEx (set_sys_last_vars=set_sys_last_vars@entry=1) at Python/pythonrun.c:618 #16 0x000000000043fc4a in PyErr_Print () at Python/pythonrun.c:514 #17 0x0000000000441475 in PyRun_SimpleFileExFlags (fp=fp@entry=0x9dc050, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffd8e0) at Python/pythonrun.c:407 #18 0x0000000000441633 in PyRun_AnyFileExFlags (fp=fp@entry=0x9dc050, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffd8e0) at Python/pythonrun.c:83 #19 0x000000000045666c in run_file (p_cf=0x7fffffffd8e0, filename=0x9973d0 L"test.py", fp=0x9dc050) at Modules/main.c:341 #20 Py_Main (argc=argc@entry=6, argv=argv@entry=0x996020) at Modules/main.c:895 #21 0x000000000041fa2b in main (argc=6, argv=<optimized out>) at ./Programs/python.c:102 I get it in lxml's test suite runs, will try to reduce it to a test script.
msg302385 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2017-09-17 20:28
Minimal reproducer seems to be this: ---------- import xml.etree.ElementTree as etree def test(): parser = etree.XMLParser() try: parser.close() except etree.ParseError as exc: e = exc # must keep local reference! test() ---------- The master gitrev I tested is 132a7d7cdbc7cb89fa1c1f4e8192241c3d68f549 Since this happens during GC and finalisation, it's quite possible that it's not a bug in ET per se. It might just be a prematurely cleared reference in some tp_clear(), or something like that. (Changing ticket title appropriately.)
msg302411 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 07:00
I confirm the crash using attached bug1.py (first example, completed) and bug2.py (second example).
msg302412 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2017-09-18 07:13
Thanks for confirming, Victor. I hadn't realised that the first update of expat was already back in June. That means it's not ruled out yet as a source of this crash. Bisecting is probably a good idea.
msg302414 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 07:28
The bug is at this line: Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414 static int xmlparser_gc_clear(XMLParserObject *self) { EXPAT(ParserFree)(self->parser); // <--- HERE ... } This function calls XML_ParserFree() twice on the same parser object. The first call is fine and frees the memory. Since we now use Python memory allocators, XML_ParserFree() fills the freed memory with 0xDB byte pattern (when Python is in debug mode). The second XML_ParserFree() call uses freed memory (filled with 0xDB in debug mode). Call 1: a GC collection Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414 (gdb) up #1 0x0000000000446636 in delete_garbage (collectable=0x7fffffffd9a0, old=0x9b8f90 <_PyRuntime+432>) at Modules/gcmodule.c:759 (gdb) up #2 0x0000000000446ade in collect (generation=2, n_collected=0x7fffffffda30, n_uncollectable=0x7fffffffda28, nofail=0) at Modules/gcmodule.c:911 (gdb) cont Continuing. Call 2: xmlparser_dealloc() Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414 (gdb) up #1 0x00007ffff0038cb8 in xmlparser_dealloc (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3435 IMHO it's an obvious bug in Python. The question is more why/how the code didn't crash before? :-)
msg302415 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 07:32
Python 2.7 is not affected because it doesn't implement tp_clear (it doesn't have xmlparser_gc_clear()), only xmlparser_dealloc() calls EXPAT(ParserFree)(self->parser). I'm unable to reproduce the bug in Python 3.5 nor 3.6. bug2.py creates a reference cycle the "except etree.ParseError as exc: e = exc # must keep local reference!" which requires to trigger a garbage collection to clear the "parser" variable.
msg302416 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2017-09-18 07:34
> The question is more why/how the code didn't crash before? :-) Typical case of a Schroedinbug.
msg302426 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 08:45
>> The question is more why/how the code didn't crash before? :-) > > Typical case of a Schroedinbug. I don't believe in the chaos :-) I ran a "git bisect" since January 1st, 2017. Attached bug2.py started to crash since the following commit related to bpo-29049. --- 5a625d0aa6a6d9ec6574ee8344b41d63dcb9897e is the first bad commit commit 5a625d0aa6a6d9ec6574ee8344b41d63dcb9897e Author: INADA Naoki <songofacandy@gmail.com> Date: Sat Dec 24 20:19:08 2016 +0900 Issue #29049: Call _PyObject_GC_TRACK() lazily when calling Python function. Calling function is up to 5% faster. --- This change is also correct. It's more that the change showed a bug which was hidden before. It's just that now the garbage collector breaks the reference cycle differently since frames tracked differently by the GC.
msg302430 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 09:00
Attached PR 3641 fixes bug1.py and bug2.py crashes. Sadly, I failed to write a reliable unit test using bug2.py. The bug requires to trigger the garbage collector in a specific order which depends on how frames are tracked by the GC...
msg302432 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-18 09:20
def test_issue31499(self): def test(): ... test() test.support.gc_collect()
msg302434 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 09:21
Serhiy Storchaka: I tried your pattern, but failed to write a reliable unit test. Can you please write a full patch / test?
msg302441 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 09:49
test_issue31499.diff: Oh great, it works (to reproduce the crash). I modified your test and included it in my PR, I added you as a co-author of my PR ;-)
msg302447 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 12:29
 New changeset e727d41ffcd91b21ce82026ec8c8381d34a16209 by Victor Stinner in branch 'master': bpo-31499, xml.etree: Fix xmlparser_gc_clear() crash (#3641) https://github.com/python/cpython/commit/e727d41ffcd91b21ce82026ec8c8381d34a16209 
msg302449 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 12:48
 New changeset 8afd7ab12d7f8915b549cf04af384b495ec73d22 by Victor Stinner (Miss Islington (bot)) in branch '3.6': [3.6] bpo-31499, xml.etree: Fix xmlparser_gc_clear() crash (GH-3641) (#3645) https://github.com/python/cpython/commit/8afd7ab12d7f8915b549cf04af384b495ec73d22 
msg302451 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 13:25
The bug is now fixed in Python 3.6 and master. Thanks for the bug report and analysis, Stefan Behnel!
History
Date User Action Args
2022-04-11 14:58:52adminsetgithub: 75680
2017-09-18 13:25:23vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg302451

stage: patch review -> resolved
2017-09-18 12:48:26vstinnersetmessages: + msg302449
2017-09-18 12:29:51python-devsetpull_requests: + pull_request3639
2017-09-18 12:29:40vstinnersetmessages: + msg302447
2017-09-18 09:49:01vstinnersetmessages: + msg302441
2017-09-18 09:37:05serhiy.storchakasetfiles: + test_issue31499.diff
2017-09-18 09:21:28vstinnersetmessages: + msg302434
2017-09-18 09:20:38serhiy.storchakasetmessages: + msg302432
2017-09-18 09:00:54vstinnersetmessages: + msg302430
versions: + Python 3.6
2017-09-18 08:59:06vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request3636
2017-09-18 08:45:05vstinnersetmessages: + msg302426
2017-09-18 07:34:00scodersetmessages: + msg302416
2017-09-18 07:32:39vstinnersetmessages: + msg302415
2017-09-18 07:28:28vstinnersetmessages: + msg302414
2017-09-18 07:13:52scodersetmessages: + msg302412
2017-09-18 07:00:17vstinnersetfiles: + bug2.py
2017-09-18 07:00:12vstinnersetfiles: + bug1.py

messages: + msg302411
2017-09-17 20:28:16scodersetmessages: + msg302385
title: ElementTree crash with new expat -> ElementTree crash while cleaning up ParseError
2017-09-17 20:06:12scodersetmessages: + msg302384
2017-09-17 19:18:39serhiy.storchakasetmessages: + msg302383
2017-09-17 18:50:07jklothsetnosy: + jkloth
2017-09-17 17:27:08scodercreate