Skip to content

gh-146152: Fix memory leak in _json encoder error paths#146164

Open
okiemute04 wants to merge 5 commits intopython:mainfrom
okiemute04:fix-json-encoder-leak
Open

gh-146152: Fix memory leak in _json encoder error paths#146164
okiemute04 wants to merge 5 commits intopython:mainfrom
okiemute04:fix-json-encoder-leak

Conversation

@okiemute04
Copy link
Contributor

@okiemute04 okiemute04 commented Mar 19, 2026

Fixes #146152

Add PyDict_DelItem calls to remove objects from the markers dict
on all error paths in encoder_listencode_obj. Previously, objects
were only removed on the success path, causing memory leaks when:

  • default() raises an exception
  • RecursionError occurs
  • Nested encoding fails

Only clean up markers on the RecursionError path (PATH B), where
objects accumulate during stack unwinding. Other error paths are
safe because the markers dict is local and will be freed.

Based on review feedback from @raminfp and @serhiy-storchaka.
@okiemute04 okiemute04 force-pushed the fix-json-encoder-leak branch from 51c39f9 to 0597100 Compare March 19, 2026 14:00
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that this works. Please add a test.

  • Define default() which creates a new object, adds a weak reference to it to the list, and returns that object.
  • Call JSON encoding, and after catching an exception, call test.support.gc_collect(), then test that all weak references in the list return None.

@okiemute04
Copy link
Contributor Author

@serhiy-storchaka Thanks for the review! I've added a test to test_recursion.py that creates objects tracked with weak references, triggers a RecursionError, and verifies all objects are freed after garbage collection.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your update, @okiemute04.

The test does not fail with the current code. I am not sure it tests what it should test.

with self.assertRaises(RecursionError):
self.dumps(obj, default=default)

gc.collect()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use test.support.gc_collect() -- it calls gc.collect() several times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serhiy-storchaka thanks for your response, i will update it to use test.support.gc_collect() for more thorough garbage collection and test again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serhiy-storchaka updated the test to use support.gc_collect(). I did test locally, and it works fine.

Comment on lines +3 to 5
import weakref


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put this import before the from imports (and only leave two blank lines after the from imports?)

Comment on lines +146 to +147
self.assertIsNone(ref(),
f"Object {i} still alive - memory leak detected!")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.assertIsNone(ref(),
f"Object {i} still alive - memory leak detected!")
self.assertIsNone(ref(), f"object {i} is still alive")

Py_DECREF(newobj);
if (rv) {
_PyErr_FormatNote("when serializing %T object", obj);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert this

}
Py_DECREF(newobj);
Py_XDECREF(ident);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert blank lines additions


if (_Py_EnterRecursiveCall(" while encoding a JSON object")) {
if (ident != NULL) {
PyDict_DelItem(s->markers, ident);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may fail so we need to check the return type.

@support.skip_emscripten_stack_overflow()
@support.skip_wasi_stack_overflow()
def test_memory_leak_on_recursion_error(self):
"""Test that no memory leak occurs when a RecursionError is raised."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it a regular comment as docstrings are shown

@@ -0,0 +1,3 @@
Fix a memory leak in the :mod:`json` module when encoding objects with a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like three separate bugs but the test is testing all of them at the same time. Please reformulate this so that it convenes the issue more properly.

@bedevere-app
Copy link

bedevere-app bot commented Mar 22, 2026

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

_json: stale markers entries on error paths in encoder_listencode_obj

3 participants