gh-146152: Fix memory leak in _json encoder error paths #146164
okiemute04 wants to merge 5 commits into python:main from
3eddfdf to 51c39f9
Only clean up markers on the RecursionError path (PATH B), where objects accumulate during stack unwinding. Other error paths are safe because the markers dict is local and will be freed. Based on review feedback from @raminfp and @serhiy-storchaka.
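As a rough Python analogy of the markers bookkeeping discussed here (not the C code in `_json.c`), the sketch below uses a toy encoder that records each container in `markers` before descending into it. The names `encode`, `stale_entries`, and the `cleanup_on_error` flag are hypothetical and exist only for illustration. With a `markers` dict that outlives the call, skipping the cleanup on an error path leaves stale entries behind; with a dict local to the call, the entries disappear with the dict.

```python
import json


def encode(obj, markers, default, cleanup_on_error=True):
    """Toy recursive encoder (illustrative only, not the _json implementation)."""
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return json.dumps(obj)
    ident = id(obj)
    if ident in markers:
        raise ValueError("Circular reference detected")
    markers[ident] = obj  # remember the object while it is being encoded
    try:
        if isinstance(obj, list):
            parts = [encode(x, markers, default, cleanup_on_error) for x in obj]
            result = "[" + ", ".join(parts) + "]"
        else:
            result = encode(default(obj), markers, default, cleanup_on_error)
    except BaseException:
        if cleanup_on_error:
            # Error path: drop the entry so the dict does not keep obj alive.
            markers.pop(ident, None)
        raise
    del markers[ident]  # success path
    return result


def stale_entries(cleanup_on_error):
    """Return how many markers entries survive a failed encode."""
    def reject(obj):
        raise TypeError(f"{type(obj).__name__} is not serializable")

    markers = {}
    payload = [object()]
    try:
        encode(payload, markers, reject, cleanup_on_error)
    except TypeError:
        pass
    return len(markers)
```

Here `stale_entries(False)` counts the entries for the list and the unserializable object inside it, while `stale_entries(True)` finds the dict empty again.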
51c39f9 to 0597100
serhiy-storchaka left a comment
I am not sure that this works. Please add a test.
- Define default() which creates a new object, adds a weak reference to it to the list, and returns that object.
- Call JSON encoding, and after catching an exception, call test.support.gc_collect(), then test that all weak references in the list return None.
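The two steps above could be sketched roughly as follows. This is a standalone approximation, not the test added in the PR: `Node`, `collect_garbage`, and `check_no_survivors` are hypothetical names, and `collect_garbage` is a local stand-in for `test.support.gc_collect()` so the snippet runs outside the CPython test suite. Since every object returned by `default()` is itself unserializable, encoding recurses until a RecursionError is raised.

```python
import gc
import json
import weakref


class Node:
    """Unserializable object that supports weak references."""


def collect_garbage():
    # Stand-in for test.support.gc_collect(): run several collection passes.
    for _ in range(3):
        gc.collect()


def check_no_survivors():
    """Return (raised, created, alive) after a failed encode."""
    refs = []

    def default(obj):
        # Create a new object, remember a weak reference to it, return it.
        new = Node()
        refs.append(weakref.ref(new))
        return new

    raised = False
    try:
        json.dumps(Node(), default=default)
    except RecursionError:
        raised = True

    collect_garbage()
    alive = sum(ref() is not None for ref in refs)
    return raised, len(refs), alive
```

If the encoder released everything it touched, `alive` should come back as 0 even though `created` is large.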
@serhiy-storchaka Thanks for the review! I've added a test to test_recursion.py that creates objects tracked with weak references, triggers a RecursionError, and verifies all objects are freed after garbage collection.
serhiy-storchaka left a comment
Thank you for your update, @okiemute04.
The test does not fail with the current code. I am not sure it tests what it should test.
Lib/test/test_json/test_recursion.py
Outdated
    with self.assertRaises(RecursionError):
        self.dumps(obj, default=default)

    gc.collect()
Use test.support.gc_collect() -- it calls gc.collect() several times.
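One reason repeated passes can matter: a finalizer that runs during a collection may itself create new cyclic garbage, which a single pass may leave behind. The sketch below is a hypothetical illustration of that situation. `LateGarbage`, `Node`, `collect_repeatedly`, and `late_garbage_report` are made-up names, and `collect_repeatedly` only approximates what `test.support.gc_collect()` does.

```python
import gc
import weakref


class Node:
    """Plain object that supports weak references."""


class LateGarbage:
    """Sits in a reference cycle and creates a fresh cycle when finalized."""

    def __init__(self, late_refs):
        self.me = self  # self-cycle: only the cycle collector can free it
        self.late_refs = late_refs

    def __del__(self):
        a, b = Node(), Node()
        a.other, b.other = b, a  # new garbage cycle, born during collection
        self.late_refs.append(weakref.ref(a))


def collect_repeatedly(passes=3):
    # Stand-in for test.support.gc_collect().
    for _ in range(passes):
        gc.collect()


def late_garbage_report(passes=3):
    """Return (created, alive) for the objects made inside the finalizer."""
    late_refs = []
    LateGarbage(late_refs)  # immediately unreachable, but kept by its cycle
    collect_repeatedly(passes)
    alive = sum(ref() is not None for ref in late_refs)
    return len(late_refs), alive
```

With the default of three passes, the cycle created inside `__del__` is gone by the time the weak reference is checked.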
@serhiy-storchaka Thanks for your response. I will update the test to use test.support.gc_collect() for more thorough garbage collection and test again.
@serhiy-storchaka updated the test to use support.gc_collect(). I did test locally, and it works fine.
    import weakref
Can you put this import before the from imports (and only leave two blank lines after the from imports)?
    self.assertIsNone(ref(),
                      f"Object {i} still alive - memory leak detected!")
Suggested change:
-    self.assertIsNone(ref(),
-                      f"Object {i} still alive - memory leak detected!")
+    self.assertIsNone(ref(), f"object {i} is still alive")
    Py_DECREF(newobj);
    if (rv) {
        _PyErr_FormatNote("when serializing %T object", obj);

    }
    Py_DECREF(newobj);
    Py_XDECREF(ident);

    if (_Py_EnterRecursiveCall(" while encoding a JSON object")) {
        if (ident != NULL) {
            PyDict_DelItem(s->markers, ident);
This may fail, so we need to check the return value.
    @support.skip_emscripten_stack_overflow()
    @support.skip_wasi_stack_overflow()
    def test_memory_leak_on_recursion_error(self):
        """Test that no memory leak occurs when a RecursionError is raised."""
Make it a regular comment, as docstrings are shown in the test output.
@@ -0,0 +1,3 @@
Fix a memory leak in the :mod:`json` module when encoding objects with a
This looks like three separate bugs, but the test is testing all of them at the same time. Please reformulate this so that it conveys the issue more accurately.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase
Fixes #146152
Add PyDict_DelItem calls to remove objects from the markers dict
on all error paths in encoder_listencode_obj. Previously, objects
were only removed on the success path, causing memory leaks when:
_json: stale markers entries on error paths in encoder_listencode_obj #146152