[Repo Assist] fix: eliminate unsafe lock upgrade in SessionConnectionPool.Get #2311
Draft
github-actions[bot] wants to merge 1 commit into main
The previous Get() implementation held an RLock, released it, upgraded to a WLock to update metadata, then downgraded back to RLock. This pattern had two problems:

1. Between the RUnlock and Lock, cleanupIdleConnections (or Delete) could remove the connection and mark metadata.State = ConnectionStateClosed. Get then overwrote that back to ConnectionStateActive, effectively resurrecting a closed connection and returning a stale *mcp.Connection.
2. The four lock operations (RUnlock + Lock + Unlock + RLock) added unnecessary overhead on the critical path.

Fix: acquire a WLock from the start. Get always mutates metadata, so a write lock is correct and simpler. The state is re-checked under the write lock, ensuring only live connections are returned.

Also adds TestConnectionPoolGetConcurrentDelete, which exercises the race window between concurrent Get and Delete goroutines. Running with -race would have caught the previous data race.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🤖 This PR was created by Repo Assist, an automated AI assistant.
Summary
Fixes an unsafe lock upgrade in SessionConnectionPool.Get that could return a resurrected closed connection under concurrent load.
Root Cause
The previous implementation held an RLock, released it, upgraded to a WLock to update metadata, then re-acquired RLock.
In the window between RUnlock and Lock, cleanupIdleConnections (or Delete) could remove the connection and set metadata.State = ConnectionStateClosed.
Then Get would acquire the write lock and overwrite State back to ConnectionStateActive, effectively resurrecting a closed connection and returning a stale *mcp.Connection to the caller.
Fix
Acquire a write lock from the start. Since Get always mutates metadata (LastUsedAt, RequestCount, State), a write lock is correct and simpler. The state check runs under the write lock, so no concurrent mutation can occur between the check and the update.
This also reduces lock overhead from 4 operations (RUnlock + Lock + Unlock + RLock) to 2 (Lock + deferred Unlock).
New Test
TestConnectionPoolGetConcurrentDelete exercises concurrent Get + Delete/Set goroutines. Running with -race would have caught the data race with the previous implementation.
Test Status
Code review notes:
TestConnectionPoolConcurrency (1000 concurrent Gets, verifies RequestCount == 1000) also validates the fix
Warning
The following domains were blocked by the firewall during workflow execution:
proxy.golang.org
releaseassets.githubusercontent.com
To allow these domains, add them to the network.allowed list in your workflow frontmatter.
See Network Configuration for more information.