feat: proxy pre-agent gh CLI calls through DIFC gateway #2294
Add DIFC proxy start/stop steps to repo-assist.lock.yml so that all pre-agent gh CLI calls are subject to the same integrity filtering as the agent's MCP tool calls.

Start step (before first GH_TOKEN step):
- Runs ghcr.io/github/gh-aw-mcpg:v0.1.19 in proxy mode with TLS
- Uses same allow-only policy as agent: repos=[github/*], min-integrity=merged
- Adds proxy CA cert to system trust store (update-ca-certificates)
- Writes GH_HOST=localhost:18443 to GITHUB_ENV for all subsequent steps
- Graceful fallback if proxy fails to start

Stop step (before agent execution):
- Stops proxy container
- Clears GH_HOST from GITHUB_ENV to prevent leaking into awf --env-all

Proxied steps via GITHUB_ENV propagation:
- Configure gh CLI for GitHub Enterprise
- Fetch repo data for task weighting (gh issue list, gh pr list)
- Clone repo-memory branch

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR updates the repo-assist workflow lock file to route pre-agent gh CLI calls through a locally hosted DIFC proxy so they are subject to the same min-integrity: merged allow-only policy as the agent’s MCP tool calls.
Changes:
- Add a “Start DIFC proxy” step that runs gh-aw-mcpg in proxy mode with TLS and sets GH_HOST=localhost:18443 for subsequent pre-agent steps.
- Add a “Stop DIFC proxy” step that removes the proxy container and clears GH_HOST before running the agent container.
docker run -d --name awmg-proxy --network host \
  -e GH_TOKEN \
  -v "$PROXY_LOG_DIR:$PROXY_LOG_DIR" \
  ghcr.io/github/gh-aw-mcpg:v0.1.22 proxy \
  --policy "$POLICY" \
  --listen 0.0.0.0:18443 \
  --log-dir "$PROXY_LOG_DIR" \
  --tls --tls-dir "$PROXY_LOG_DIR/proxy-tls" \
  --guards-mode filter
The step claims a graceful fallback if the proxy can’t start, but docker run ... will still hard-fail the job on any pull/run error because GitHub Actions runs bash with -e by default. To preserve the intended fallback, capture the docker run exit status (or use || / set +e around it) and only proceed to the readiness loop when the container actually started; otherwise emit the warning and continue.
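A minimal sketch of the fallback this comment describes, not taken from the PR. The helper name start_or_warn is hypothetical. It runs the start command as an `if` condition, which bash exempts from `-e`, so a failed pull or run yields a warning and a non-zero return value instead of aborting the step.

```shell
# start_or_warn (hypothetical helper): run a start command. On failure, emit a
# workflow warning and return non-zero instead of letting `bash -e` kill the step.
start_or_warn() {
  if "$@"; then
    return 0
  fi
  echo "::warning::DIFC proxy failed to start, falling back to direct API access"
  return 1
}

# Usage in the step (commented out here because it needs Docker):
#   if start_or_warn docker run -d --name awmg-proxy <image and flags as in the step>; then
#     <readiness loop>
#   fi
```

Calling the helper inside an `if` keeps its non-zero return from tripping `-e` at the call site as well.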
# Wait for proxy to be ready (TLS cert generated and port listening)
for i in $(seq 1 30); do
  if [ -f "$PROXY_LOG_DIR/proxy-tls/ca.crt" ]; then
    # Add proxy CA to system trust store so gh CLI (Go) trusts it
    sudo cp "$PROXY_LOG_DIR/proxy-tls/ca.crt" /usr/local/share/ca-certificates/awmg-proxy.crt
    sudo update-ca-certificates
    if curl -sf "https://localhost:18443/api/v3/zen" -o /dev/null 2>/dev/null; then
      echo "DIFC proxy ready on port 18443"
      # Route all subsequent gh CLI calls through the proxy
      echo "GH_HOST=localhost:18443" >> "$GITHUB_ENV"
      break
    fi
  fi
  sleep 1
done

if [ ! -f "$PROXY_LOG_DIR/proxy-tls/ca.crt" ]; then
  echo "::warning::DIFC proxy failed to start, falling back to direct API access"
The readiness/fallback logic only warns when ca.crt is missing, but the loop can also time out with ca.crt present and curl never succeeding; in that case no warning is emitted and the proxy container may be left running. Consider tracking an explicit ready flag (set only when the curl check succeeds), installing the CA once (not every loop iteration), and warning+cleaning up the container when ready is still false after the loop.
-# Wait for proxy to be ready (TLS cert generated and port listening)
-for i in $(seq 1 30); do
-  if [ -f "$PROXY_LOG_DIR/proxy-tls/ca.crt" ]; then
-    # Add proxy CA to system trust store so gh CLI (Go) trusts it
-    sudo cp "$PROXY_LOG_DIR/proxy-tls/ca.crt" /usr/local/share/ca-certificates/awmg-proxy.crt
-    sudo update-ca-certificates
-    if curl -sf "https://localhost:18443/api/v3/zen" -o /dev/null 2>/dev/null; then
-      echo "DIFC proxy ready on port 18443"
-      # Route all subsequent gh CLI calls through the proxy
-      echo "GH_HOST=localhost:18443" >> "$GITHUB_ENV"
-      break
-    fi
-  fi
-  sleep 1
-done
-if [ ! -f "$PROXY_LOG_DIR/proxy-tls/ca.crt" ]; then
-  echo "::warning::DIFC proxy failed to start, falling back to direct API access"
+READY=false
+CA_INSTALLED=false
+# Wait for proxy to be ready (TLS cert generated and port listening)
+for i in $(seq 1 30); do
+  if [ -f "$PROXY_LOG_DIR/proxy-tls/ca.crt" ]; then
+    if [ "$CA_INSTALLED" = false ]; then
+      # Add proxy CA to system trust store so gh CLI (Go) trusts it
+      sudo cp "$PROXY_LOG_DIR/proxy-tls/ca.crt" /usr/local/share/ca-certificates/awmg-proxy.crt
+      sudo update-ca-certificates
+      CA_INSTALLED=true
+    fi
+    if curl -sf "https://localhost:18443/api/v3/zen" -o /dev/null 2>/dev/null; then
+      echo "DIFC proxy ready on port 18443"
+      # Route all subsequent gh CLI calls through the proxy
+      echo "GH_HOST=localhost:18443" >> "$GITHUB_ENV"
+      READY=true
+      break
+    fi
+  fi
+  sleep 1
+done
+if [ "$READY" = false ]; then
+  if [ ! -f "$PROXY_LOG_DIR/proxy-tls/ca.crt" ]; then
+    echo "::warning::DIFC proxy failed to start (CA certificate not generated), falling back to direct API access"
+  else
+    echo "::warning::DIFC proxy failed to become ready (health check did not succeed), falling back to direct API access"
+  fi
  -v "$PROXY_LOG_DIR:$PROXY_LOG_DIR" \
  ghcr.io/github/gh-aw-mcpg:v0.1.22 proxy \
  --policy "$POLICY" \
  --listen 0.0.0.0:18443 \
With --network host, binding the proxy to 0.0.0.0:18443 exposes it on all runner interfaces. Since only local steps need access, bind to 127.0.0.1:18443 (and keep using https://localhost:18443/...) to reduce accidental exposure on shared/runner networking.
-  --listen 0.0.0.0:18443 \
+  --listen 127.0.0.1:18443 \
Three bugs from first CI run:
1. Health check used /api/v3/zen, which routes to GitHub /zen (403, blocked as unrecognized endpoint). Changed to /api/v3/health, which maps to the proxy's built-in /health handler.
2. update-ca-certificates was called on every loop iteration (30 times, ~48s wasted). Now called once after cert file appears.
3. Fallback check only tested for missing cert file. Now uses a PROXY_READY flag and dumps container logs on failure for debugging.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
gh CLI requires a git remote matching GH_HOST to resolve the repo. With GH_HOST=localhost:18443, gh finds no matching remote and errors: 'none of the git remotes correspond to GH_HOST'

Fix: add a dummy 'proxy' git remote pointing to the proxy host with the repo path. gh discovers this remote, extracts owner/repo, and routes API calls through the proxy. The remote is cleaned up in the stop step.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The gh CLI fails with 'none of the git remotes correspond to GH_HOST' because it tries to resolve the repo from git remotes when GH_HOST is set to the proxy. Using -R $GITHUB_REPOSITORY explicitly specifies the repo, bypassing remote resolution entirely. Also adds git remote -v debug output to verify proxy remote is added.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
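As an illustration of the -R approach, a minimal sketch follows. The wrapper name list_open_issues and the chosen --json fields are assumptions for the example; the workflow's actual gh invocations are not shown in this conversation.

```shell
# list_open_issues (hypothetical wrapper): pass the repository explicitly with
# -R so gh does not try to match a git remote against GH_HOST.
list_open_issues() {
  gh issue list -R "$1" --state open --json number,title,labels
}

# Usage in a workflow step, where GitHub Actions sets GITHUB_REPOSITORY to
# owner/repo (commented out; needs gh and network access):
#   list_open_issues "$GITHUB_REPOSITORY"
```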
When gh CLI uses GH_HOST for proxy mode, it sends GraphQL requests to /api/graphql (GHES-style) instead of /api/v3/graphql. The proxy only recognized /graphql and /api/v3/graphql, causing the GraphQL POST to fall through to passthrough which forwarded it to https://api.github.com/api/graphql (404) instead of /graphql.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
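The matching rule described here can be sketched as follows. It is written in shell only for consistency with the workflow snippets; the proxy's own implementation is not shown in this conversation, and the function name is_graphql_path is hypothetical.

```shell
# is_graphql_path (hypothetical): succeed when the request path is one of the
# three GraphQL endpoint spellings mentioned in the commit message.
is_graphql_path() {
  case "$1" in
    /graphql | /api/v3/graphql | /api/graphql) return 0 ;;
    *) return 1 ;;
  esac
}
```

With /api/graphql recognized, a GHES-style request from gh would be handled as GraphQL rather than falling through to passthrough.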
The published v0.1.22 image doesn't include the IsGraphQLPath fix for /api/graphql. Build the container locally from branch source so the proxy correctly handles GHES-style GraphQL requests from gh CLI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The *.wasm files are gitignored so they're not in the CI checkout. Extract the guard binary from the published v0.1.22 image first, then docker build picks it up via COPY in the Dockerfile.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use rustup + build.sh to compile the guard WASM from the Rust source in the repo, then docker build picks it up for the local container.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add truncated query body to the 'unknown GraphQL query' log message so we can see exactly what gh CLI sends. Also add a debug step to dump proxy container logs after the fetch step.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Enable debug logging so proxy:handler messages appear in docker logs. Also fix the dump step to read proxy.log (not mcp-gateway.log).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
gh CLI sends __type introspection queries before fetching data (e.g.
'query Issue_fields{Issue: __type(name:"Issue"){fields...}}').
These are safe schema metadata with no repo data, so passthrough
them directly to GitHub without DIFC evaluation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When DIFC filtering passes all items (nothing filtered), return the
original upstream response body instead of the ToResult() transformed
data. ToResult() wraps responses in an items array format that breaks
gh CLI's GraphQL response parsing (expects {"data":{...}} object).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When the DIFC pipeline filters all items from a GraphQL response,
return {"data":null} instead of the ToResult() array format.
The gh CLI expects standard GraphQL response shape and can handle
null data gracefully (returns empty results).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When the DIFC proxy filters all items from gh issue/pr list responses, the gh CLI returns a non-standard error. Add fallback to empty JSON arrays so the task weighting step continues with zero-weighted defaults.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
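A minimal sketch of such a fallback, assuming a hypothetical helper named json_or_empty. The real step's commands and output files are not shown in this conversation.

```shell
# json_or_empty (hypothetical helper): run a command that should print a JSON
# array. If it exits non-zero, print an empty array instead.
json_or_empty() {
  local out
  if out=$("$@" 2>/dev/null); then
    printf '%s\n' "$out"
  else
    printf '[]\n'
  fi
}

# Usage in the step (commented out; needs gh and network access). The output
# file name is illustrative only.
#   json_or_empty gh issue list -R "$GITHUB_REPOSITORY" --json number,title > issues.json
```

Downstream weighting code then always reads valid JSON, and an empty array naturally yields zero-weighted defaults.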
Some GitHub APIs return the Actions bot login as 'github-actions' without the '[bot]' suffix. This caused issues authored by the Actions bot to receive 'none' integrity instead of 'approved', blocking them under min-integrity policies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The gh CLI returns the Actions bot author as 'app/github-actions' (with app/ prefix). Add this variant alongside 'github-actions' and 'github-actions[bot]' so PRs/issues authored by the Actions bot receive approved integrity regardless of author_association.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
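The three login spellings can be illustrated with a small matcher. This is a shell sketch only: the guard itself is built from the repo's Rust source, which is not shown here, and the name is_actions_bot is hypothetical.

```shell
# is_actions_bot (hypothetical): succeed when the author login is one of the
# three Actions bot spellings listed in the commit messages above.
is_actions_bot() {
  case "$1" in
    # '[bot]' is quoted so the brackets match literally, not as a glob class.
    github-actions | 'github-actions[bot]' | app/github-actions) return 0 ;;
    *) return 1 ;;
  esac
}
```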
Summary
Routes all pre-agent gh CLI calls in repo-assist through the MCP gateway DIFC proxy, applying the same min-integrity: merged policy that governs the agent's MCP tool calls.

Problem
The repo-assist workflow runs several gh CLI calls before the agent starts (issue list, PR list, repo-memory clone). These calls bypass the DIFC pipeline entirely; the agent's guard policy only applies to MCP tool calls during the agent phase.

Solution
Add two steps to repo-assist.lock.yml:

Start DIFC Proxy (before first GH_TOKEN step)
- Runs ghcr.io/github/gh-aw-mcpg:v0.1.19 in proxy mode with TLS on port 18443
- Uses the same allow-only policy as the agent: {"allow-only":{"repos":["github/*"],"min-integrity":"merged"}}
- Adds the proxy CA cert to the system trust store (update-ca-certificates) so gh CLI (Go) trusts it
- Writes GH_HOST=localhost:18443 to $GITHUB_ENV; all subsequent steps inherit it

Stop DIFC Proxy (before agent execution)
- Stops the proxy container
- Clears GH_HOST from $GITHUB_ENV to prevent leaking into awf --env-all

Steps Covered
GH_TOKEN; $GITHUB_ENV; $GITHUB_ENV; $GITHUB_ENV; gh CLI GH_HOST: github.com override; GH_HOST cleared before agent

Design Decisions
- $GITHUB_ENV over per-step env vars: Propagates to framework-injected steps (clone-repo-memory) that we can't modify from the .md source
- gh CLI uses Go's http.DefaultTransport, which reads system CAs; NODE_EXTRA_CA_CERTS only works for Node.js
- .lock.yml edited directly for debugging; .md source changes will follow once validated

Testing
This is a lock file change — needs to be validated by running the repo-assist workflow in CI.