Skip to content

Replace weakref to the _WeakValueDictionary with strong ref to outlives keys #136345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

sergey-miryanov
Copy link
Contributor

No description provided.

@sergey-miryanov
Copy link
Contributor Author

!buildbot refleak*

@bedevere-bot
Copy link

You don't have write permissions to trigger a build

@sobolevn sobolevn added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Jul 6, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sobolevn for commit c30038d 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F136345%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Jul 6, 2025
@picnixz
Copy link
Member

picnixz commented Jul 6, 2025

Can you add the issue this pr tries to solve? just for triaging purposes

@sergey-miryanov
Copy link
Contributor Author

sergey-miryanov commented Jul 7, 2025

@picnixz Sorry! It was a quick test.

There is no particular issue yet. This is an experiment for #136189. If @nascheme approves this change maybe we go creating an issue or integrate this change into #136189.

@sergey-miryanov sergey-miryanov changed the title Test Replace weakref to the WeakValueDictionary to outlives keys Jul 7, 2025
@sergey-miryanov sergey-miryanov changed the title Replace weakref to the WeakValueDictionary to outlives keys Replace weakref to the _WeakValueDictionary to outlives keys Jul 7, 2025
@sergey-miryanov sergey-miryanov changed the title Replace weakref to the _WeakValueDictionary to outlives keys Replace weakref to the _WeakValueDictionary with strong ref to outlives keys Jul 7, 2025
@neonene
Copy link
Contributor

neonene commented Jul 7, 2025

I don't know what code you are trying. Can you show the leak-reproducible code here
by merging #136189? Then fix the leak.

I'm mainly using these tests: #136189 (comment).

@neonene
Copy link
Contributor

neonene commented Jul 7, 2025

Can you show the leak-reproducible code here by merging #136189?

You can also use 12f0b5c, on which the refleak bots failed.

@sergey-miryanov
Copy link
Contributor Author

sergey-miryanov commented Jul 7, 2025

You can also use 12f0b5c, on which the refleak bots failed.

For 12f0b5c this PR does nothing. Will dig.

@sergey-miryanov
Copy link
Contributor Author

sergey-miryanov commented Jul 7, 2025

Following diff fixes leaks on 12f0b5c for ConfigDictTest.test_multiprocessing_queues

[2025-07- 8 00:04:00] [D:\Sources\_pythonish\cpython\main] [(12f0b5c...) +26 ~1 -0 !] [󰅒 2.863] ➜ git diff diff --git a/Lib/logging/config.py b/Lib/logging/config.py index 3d9aa00fa52..cb27148763d 100644 --- a/Lib/logging/config.py +++ b/Lib/logging/config.py @@ -35,6 +35,7 @@ import struct import threading import traceback +import weakref from socketserver import ThreadingTCPServer, StreamRequestHandler @@ -304,6 +305,11 @@ def valid_ident(s): class ConvertingMixin(object): """For ConvertingXXX's, this mixin class provides common functions""" + @property + def configurator(self): + return self._configurator() + + def convert_with_key(self, key, value, replace=True): result = self.configurator.convert(value) #If the converted value is different, save for next time @@ -312,15 +318,19 @@ def convert_with_key(self, key, value, replace=True): self[key] = result if type(result) in (ConvertingDict, ConvertingList, ConvertingTuple): - result.parent = self + if type(result) in (ConvertingDict, ConvertingList): + result.parent = weakref.ref(self) + else: + result.parent = self result.key = key return result def convert(self, value): result = self.configurator.convert(value) if value is not result: - if type(result) in (ConvertingDict, ConvertingList, - ConvertingTuple): + if type(result) in (ConvertingDict, ConvertingList): + result.parent = weakref.ref(self) + elif type(result) is ConvertingTuple: result.parent = self return result @@ -388,7 +398,7 @@ class BaseConfigurator(object): def __init__(self, config): self.config = ConvertingDict(config) - self.config.configurator = self + self.config._configurator = weakref.ref(self) def resolve(self, s): """ @@ -457,14 +467,14 @@ def convert(self, value): """ if not isinstance(value, ConvertingDict) and isinstance(value, dict): value = ConvertingDict(value) - value.configurator = self + value._configurator = weakref.ref(self) elif not isinstance(value, ConvertingList) and isinstance(value, list): value = ConvertingList(value) - value.configurator = self + value._configurator = weakref.ref(self) elif not isinstance(value, ConvertingTuple) and\ isinstance(value, tuple) and not hasattr(value, '_fields'): value = ConvertingTuple(value) - value.configurator = self + value._configurator = weakref.ref(self) elif isinstance(value, str): # str for py3k m = self.CONVERT_PATTERN.match(value) if m: 

Result

[2025-07- 8 00:02:55] [D:\Sources\_pythonish\cpython\main] [(12f0b5c...) +26 ~1 -0 !] [󰅒 6:31.695] ➜ .\python.bat -m test test_logging -R : Running Debug|x64 interpreter... Using random seed: 391822730 0:00:00 Run 1 test sequentially in a single process 0:00:00 [1/1] test_logging beginning 9 repetitions. Showing number of leaks (. for 0 or less, X for 10 or more) 12345:6789 XX..2 .... 0:06:30 [1/1] test_logging passed in 6 min 30 sec == Tests result: SUCCESS == 1 test OK. Total duration: 6 min 30 sec Total tests: run=272 skipped=13 Total test files: run=1/1 Result: SUCCESS 
@sergey-miryanov
Copy link
Contributor Author

I presume that the cycles weren't handled well. But don't know how to debug this on C-level.

@sergey-miryanov
Copy link
Contributor Author

@neonene on 12f0b5c I got following simpler repro from yours test_call_in_thread_XXX

 def test_call_in_thread_XXX(self): interp = interpreters.create() # leak with this ctx = {} try: assert False except Exception as e: ctx['e'] = e 

But it doesn't leak on main. Along last changes from Neil I'm not sure it is worth investigating further for 12f0b5c.

@sergey-miryanov sergey-miryanov deleted the test-weak-value-dict branch July 8, 2025 20:20
@neonene
Copy link
Contributor

neonene commented Jul 9, 2025

Inspired by your great version:

import weakref, _interpreters wd = weakref.WeakValueDictionary() class TestInterpreterCall(unittest.TestCase): def test_call_in_thread_XXX(self): id = _interpreters.create() wd[id] = type("", (), {}) _interpreters.destroy(id)

Refleak bots usually pass full tests on main, so our just simplified repros cannot leak there. The repros leak only when weakrefs with a callback are cleared later on our PRs. Ideally (if possible), we should try to fix such leaks on C-level to not distinguish what weakref is cleared earlier/later, I think.

@sergey-miryanov
Copy link
Contributor Author

sergey-miryanov commented Jul 9, 2025

Ideally (if possible), we should try to fix such leaks on C-level to not distinguish what weakref is cleared earlier/later, I think.

With this change (applied to last #136189)

[2025-07-10 00:04:59] [D:\Sources\_pythonish\cpython\main] [gh-135552-wr-clear-later ≡ +28 ~2 -0 !] [󰅒 0] ➜ git diff Python/gc.c diff --git a/Python/gc.c b/Python/gc.c index 09c2da64414..08d3bfcadf6 100644 --- a/Python/gc.c +++ b/Python/gc.c @@ -1785,10 +1785,10 @@ gc_collect_region(PyThreadState *tstate, } /* Invoke weakref callbacks as necessary. */ - stats->collected += handle_weakref_callbacks(&unreachable, to); - gc_list_validate_space(to, gcstate->visited_space); - validate_list(to, collecting_clear_unreachable_clear); - validate_list(&unreachable, collecting_set_unreachable_clear); + // stats->collected += handle_weakref_callbacks(&unreachable, to); + // gc_list_validate_space(to, gcstate->visited_space); + // validate_list(to, collecting_clear_unreachable_clear); + // validate_list(&unreachable, collecting_set_unreachable_clear); /* Call tp_finalize on objects which have one. */ finalize_garbage(tstate, &unreachable); @@ -1813,6 +1813,7 @@ gc_collect_region(PyThreadState *tstate, * *not* be cleared so that caches based on the type version are correctly * invalidated (see GH-135552 as a bug caused by this). */ + stats->collected += handle_weakref_callbacks(&final_unreachable, to); clear_weakrefs(&final_unreachable); /* Call tp_clear on objects in the final_unreachable set. This will cause 

Yours test_call_in_thread_XXX, mine test_call_in_thread_XXX, test_gc and test_weakref don't leak.

Testing full tests set now...

@neonene
Copy link
Contributor

neonene commented Jul 9, 2025

With this change (applied to last #136189)

There is no leak in the current PR as-is. handle_weakref_callbacks() needs the following change to clear weakrefs with a callback later like 12f0b5c.

@@ -923,3 +923,3 @@ // finalizers have been called fixes that bug. - if (wr->wr_callback != NULL) { + if (0) { // _PyWeakref_ClearRef clears the weakref but leaves the
@sergey-miryanov
Copy link
Contributor Author

Following change fixes our examples:

➜ git diff Python/gc.c diff --git a/Python/gc.c b/Python/gc.c index 09c2da64414..50ad311d018 100644 --- a/Python/gc.c +++ b/Python/gc.c @@ -921,7 +921,7 @@ handle_weakref_callbacks(PyGC_Head *unreachable, PyGC_Head *old) // can lead to segfaults since the caches can refer to deallocated // objects. Delaying the clear of weakrefs until *after* // finalizers have been called fixes that bug. - if (wr->wr_callback != NULL) { + if (0/*wr->wr_callback != NULL*/) { // _PyWeakref_ClearRef clears the weakref but leaves the // callback pointer intact. Obscure: it also changes *wrlist. _PyObject_ASSERT((PyObject *)wr, wr->wr_object == op); @@ -1813,7 +1813,7 @@ gc_collect_region(PyThreadState *tstate, * *not* be cleared so that caches based on the type version are correctly * invalidated (see GH-135552 as a bug caused by this). */ - clear_weakrefs(&final_unreachable); + clear_weakrefs(&unreachable); /* Call tp_clear on objects in the final_unreachable set. This will cause * the reference cycles to be broken. It may also cause some objects 

But test_gc fails and test_weakref segfaults. But this is a random code shuffle. I need more time to learn how the gc works and which side-effects it has.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
5 participants