Add builtin qmd documentation search tool (experimental) #22183
- Add QmdToolConfig type with docs glob patterns field
- Add DefaultQmdVersion (0.0.16) and QmdArtifactName constants
- Add parseQmdTool parser function
- Register qmd as builtin tool in tools_parser, tools_types, mcp_config_validation
- Activation job: install @tobilu/qmd, build index, upload qmd-index artifact
- Agent job: download qmd-index artifact before MCP setup
- MCP renderer: RenderQmdMCP for JSON/TOML (Node.js + npx serve-mcp)
- MCP setup generator: qmd added to standard MCP tools list
- Claude tools: qmd handled as wildcard MCP server (like serena)
- Docker: add node:lts-alpine pre-pull when qmd is configured
- System prompt: qmd_prompt.md injected when qmd tool is active
- Dependabot: track @tobilu/qmd version in npm deps
- Shell injection prevention: single-quote glob patterns in activation steps
- JSON schema: add qmd tool schema definition
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e5f827de-afbd-4b2e-98e0-7a7d87ed547a
@copilot mark as experimental
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1dc55c60-1437-4497-a351-d1b7b8d88bfa
Done in abf31d6. The qmd tool now emits …
Warning: Firewall rules blocked me from connecting to one or more addresses.
@copilot Every documentation collection can be associated with a "checkout" (same syntax as the top-level checkout). The default is the current repo. The checkout is what paths apply to. Make it configurable the same way as checkout.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e9a784be-084b-40e0-9bf4-eb76dc81df6d
Done in e09c697. Each documentation collection can now carry its own checkout:
tools:
  qmd:
    collections:
      - name: current-docs
        docs:
          - docs/**/*.md
      - name: other-docs
        docs:
          - docs/**/*.md
        checkout:
          repository: owner/other-repo
          path: ./other-repo  # optional; defaults to /tmp/gh-aw/qmd-checkout-<name>
The activation job emits an …
@copilot the QMD object will be structured as a set of checkouts and a set of searches in GitHub. Take the existing structure and move those entries under a sub-identifier named checkouts.
The searches will be executed and added to the collection one by one.
@copilot the QMD object may also have a cache key, which means it will be stored in an action cache. This allows you, for example, to declare QMD with only a cache key and omit the indexing, using it as a read-only database.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e854d5ea-603f-40f9-8672-434f0246af32
Done in 8e48ff6. The …
checkouts:
  - name: local-docs
    docs: [docs/**/*.md]
  - name: other-docs
    docs: [docs/**/*.md]
    checkout:
      repository: owner/other-repo
      path: ./other-repo
searches:
  - query: "repo:owner/repo language:Markdown path:docs/"
    min: 1   # fail if fewer results
    max: 30  # default 30
    github-token: ${{ secrets.GITHUB_TOKEN }}
    # OR github-app: { app-id: ..., private-key: ... }
Each search step uses …
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/9b048214-2eee-4a65-b213-30364c2769b7
Done in ef071a0. The …
Read-only mode (cache-key only, no sources; use a pre-built cache as a database):
tools:
  qmd:
    cache-key: "qmd-index-${{ hashFiles('docs/**') }}"
→ activation job: restore from cache → upload as artifact; all indexing steps skipped.
Build + cache mode (sources + cache-key; build once, reuse on subsequent runs):
tools:
  qmd:
    checkouts:
      - name: docs
        docs: [docs/**/*.md]
    cache-key: "qmd-index-${{ hashFiles('docs/**') }}"
→ activation job: restore from cache → on miss: build index + save to cache → upload as artifact; all build steps carry …
Uses …
@copilot review all changes and clean up. Add a new documentation page in reference.
Done in bc6e3a6. Added a separate Job graph: The indexing job:
- Add resolveQmdCacheKey() to return user cache-key or default
  gh-aw-qmd-${{ github.run_id }} for per-run ephemeral caching
- Always emit cache restore/save steps in indexing job (previously only
  when cache-key was explicitly set)
- Add generateQmdIndexCacheRestoreStep() for agent job restore (read-only)
- Add timeout-minutes: 60 to the indexing job
- Build steps are still guarded by cache-hit condition so they are
  skipped when a previous run already populated the cache
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/4b67c8f9-0127-41de-9682-0be1a2dcb7f3
Done in ab98791.
@copilot Emit qmd index cache restore keys to restore the same cache from the same workflow, and update it if needed.
Add resolveQmdRestoreKeys() that strips the trailing ${{ ... }} from
the cache key to produce a prefix restore key:
- "gh-aw-qmd-${{ github.run_id }}" → restore-keys: "gh-aw-qmd-"
- "qmd-index-${{ hashFiles('docs/**') }}" → restore-keys: "qmd-index-"
Both the indexing job (generateQmdCacheRestoreStep) and the agent job
(generateQmdIndexCacheRestoreStep) now emit restore-keys so a workflow
can restore the most recently cached index from a previous run even
when the exact run-scoped key is not found, and update it if needed.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/ea6cdefe-d23e-44fc-aaba-668b7df94c17
Done in 190c8fe. Added …
Both the indexing job (cache restore before build) and the agent job (read-only restore) now emit these restore-keys, so each run falls back to the most recently cached index and saves an updated copy under the new run-scoped key.
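To make the prefix fallback concrete, here is a rough sketch of what an emitted restore step could look like for the default key. The key and restore-keys values come from the commit message above; the step name and path are borrowed from the agent-job snippet quoted later in this review, and the action reference (actions/cache/restore@v4) is an assumption, not the generated output.

```yaml
# Hypothetical generated step; only key and restore-keys are taken from the PR text.
- name: Restore qmd index from cache
  uses: actions/cache/restore@v4
  with:
    path: /tmp/gh-aw/qmd-index/
    # Exact key for this run (default when no cache-key is configured).
    key: gh-aw-qmd-${{ github.run_id }}
    # Prefix fallback: matches the most recent cache whose key starts with this prefix.
    restore-keys: gh-aw-qmd-
```

Because run_id changes on every run, the exact key rarely hits on restore; the prefix is what lets a new run pick up the previous run's index.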
@copilot disable node llama cpp GPU by default, unless the runner has a GPU, and enable it if a "gpu" field is set to true. To disable GPU (default), set env NODE_LLAMA_CPP_GPU to false. Also add a field to configure the runner image of the indexer job. See sources for node llama cpp: export const defaultLlamaCppGpuSupport = parseNodeLlamaCppGpuOption(
- GPU defaults to false: injects NODE_LLAMA_CPP_GPU="false" into the Build qmd index step so node-llama-cpp skips GPU probing on CPU runners
- Set gpu: true in workflow frontmatter to re-enable GPU auto-detection
- runs-on overrides the indexing job's runner image (e.g. "ubuntu-latest-gpu" or "self-hosted") independently of the agent job runner
- JSON schema updated with gpu and runs-on properties for editor completion
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/b2bc7a8c-f3fc-4aad-ab35-84de655a9519
Done in a733a53.
tools:
  qmd:
    checkouts:
      - name: docs
        paths: [docs/**/*.md]
    gpu: true                   # enable GPU on a GPU runner
    runs-on: ubuntu-latest-gpu  # use a GPU runner for the indexing job
@copilot add a step in the agent job to start the qmd MCP server with HTTP transport and probe its health, as it may have to download llama binaries. Store the bash script in actions/setup/sh.
    with:
      name: qmd-index
      path: /tmp/gh-aw/qmd-index/
  - name: Restore qmd index from cache
@copilot artifact already downloaded, cache restore of qmd index not needed in agent job
Done in d8aba88. Removed the generateQmdIndexCacheRestoreStep call from the agent job — the artifact download already provides the index.
indexing:
  needs: activation
  runs-on: ubuntu-slim
Done in d8aba88. The indexing job now defaults to runs-on: ubuntu-latest instead of inheriting the safe-outputs runner; still overridable via qmd.runs-on:.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/92e6bea6-e7ac-4e81-9dc2-4f72aaed1db5
Done in f556ca8. Added …
A new …
…untu-latest
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/ece7b2a4-efef-4275-baf2-ccbee0e146ea
Just as a heads up, I was blocked by some firewall rules while working on your feedback.
Adds native support for tobi/qmd as a builtin experimental tool, enabling vector similarity search over configured documentation files without requiring contents: read in the agent job.
How it works
The index is split across three jobs:
- Activation job: … model and artifact_prefix
- Indexing job (needs activation, timeout-minutes: 60, runs-on: ubuntu-latest): installs @tobilu/qmd and @actions/github, registers collections from checkouts and/or GitHub searches/issue lists, builds the index using the @tobilu/qmd JavaScript SDK via a single actions/github-script step, caches the index, and uploads it as the qmd-index artifact
- Agent job (needs activation and indexing): downloads the pre-built qmd-index artifact, starts the qmd MCP server with HTTP transport (npx @tobilu/qmd serve-mcp --http), probes health (with a generous timeout to allow llama binary downloads), then connects qmd as an MCP server via QMD_CACHE_DIR
This means the agent never needs contents: read; the indexing job (which already has it) does all the file access.
The embedding models used to build and query the index are automatically cached. The indexing job uses actions/cache (restore + post-save) to populate the cache; the agent job uses actions/cache/restore (read-only) to restore models without writing back to the cache.
Cache key and restore keys
The qmd index is always cached. The default cache key is gh-aw-qmd-${{ github.run_id }} (per-run ephemeral), but users can override it with cache-key: for cross-run persistence. Both the indexing job and the agent job emit restore-keys derived by stripping the trailing ${{ ... }} expression from the cache key:
- gh-aw-qmd-${{ github.run_id }} → restore-keys: gh-aw-qmd-
- qmd-index-${{ hashFiles('docs/**') }} → restore-keys: qmd-index-
This allows a workflow run to restore the most recently cached index from a previous run and update it incrementally rather than rebuilding from scratch.
GPU and runner configuration
By default, NODE_LLAMA_CPP_GPU is set to "false" in the indexing step so node-llama-cpp skips GPU probing on CPU-only runners. Set gpu: true in the qmd config to re-enable auto GPU detection. The indexing job defaults to ubuntu-latest, but the runner image can be overridden independently via runs-on:.
qmd MCP server startup in agent job
The agent job starts the qmd MCP server via HTTP transport before the agent runs. A bash script (actions/setup/sh/start_qmd_server.sh) handles server startup and health probing with a generous timeout (up to 10 minutes) to accommodate llama binary downloads on first run. The server port is exposed as GH_AW_QMD_PORT and the MCP client connects over HTTP (http://host.docker.internal:$GH_AW_QMD_PORT).
Note: the agent job does not restore the qmd index from cache; the artifact uploaded by the indexing job is downloaded directly instead.
Usage
Checkouts form (multiple collections with per-collection checkout)
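A minimal sketch of this form, assembled from the field names listed under Changes (name, paths, context, checkout) and the examples posted earlier in this conversation. The collection names, repository, and context string are placeholders, and the merged schema may differ in detail.

```yaml
tools:
  qmd:
    checkouts:
      # Collection indexed from the current repository (the default checkout).
      - name: local-docs
        paths: [docs/**/*.md]
        context: "Documentation for this repository"  # optional hint for the agent
      # Collection indexed from a separate checkout.
      - name: other-docs
        paths: [docs/**/*.md]
        checkout:
          repository: owner/other-repo
          path: ./other-repo
```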
Searches form — code search (GitHub search queries)
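A sketch based on the searches example from the 8e48ff6 reply above. The collection name is hypothetical, and no type field is shown because the PR text does not state the value used for code search.

```yaml
tools:
  qmd:
    searches:
      - name: markdown-docs  # hypothetical collection name
        query: "repo:owner/repo language:Markdown path:docs/"
        min: 1   # fail if fewer results are found
        max: 30  # default 30
        github-token: ${{ secrets.GITHUB_TOKEN }}
```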
Searches form — issue search (GitHub issues as indexed documents)
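A sketch of an issue search entry, assuming the QmdSearchEntry fields listed under Changes. The collection name and the max value are illustrative; only type: issues and the owner/repo form of query are stated in this PR.

```yaml
tools:
  qmd:
    searches:
      - name: issues       # hypothetical collection name
        type: issues
        query: owner/repo  # repository slug
        max: 500           # illustrative upper bound
        github-token: ${{ secrets.GITHUB_TOKEN }}
```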
When type: issues, query is the repository slug (owner/repo); if omitted, ${{ github.repository }} is used. Each open issue is saved as a markdown file and added to the named collection.
Cache key (read-only mode: use a pre-built index without any indexing)
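This mirrors the read-only example from the ef071a0 reply: only cache-key is set, with no checkouts or searches. The key expression is illustrative.

```yaml
tools:
  qmd:
    # No sources declared: the index is restored from the cache and used as-is.
    cache-key: "qmd-index-${{ hashFiles('docs/**') }}"
```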
Build + cache mode (build once, reuse on subsequent runs)
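A sketch following the build + cache example from the ef071a0 reply, updated to the paths field name used in the final QmdDocCollection type. The collection name and key expression are placeholders.

```yaml
tools:
  qmd:
    checkouts:
      - name: docs
        paths: [docs/**/*.md]
    # On a cache miss the index is built and saved under this key; later runs reuse it.
    cache-key: "qmd-index-${{ hashFiles('docs/**') }}"
```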
Combined form
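A sketch combining both source kinds with a cache key, using only fields named in this PR. All names and values are placeholders rather than a configuration taken from the repository.

```yaml
tools:
  qmd:
    checkouts:
      - name: docs
        paths: [docs/**/*.md]
        context: "Project documentation"  # optional hint for the agent
    searches:
      - name: issues
        type: issues
        max: 500
        github-token: ${{ secrets.GITHUB_TOKEN }}
    cache-key: "qmd-index-${{ hashFiles('docs/**') }}"
```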
Each checkout collection can be associated with a checkout (same syntax as the top-level checkout) and an optional context hint for the agent. Each search entry downloads matching files or issues from GitHub and adds them to the index. Both sources can be combined.
Changes
- QmdToolConfig with checkouts []*QmdDocCollection, searches []*QmdSearchEntry, cache-key string, gpu bool, and runs-on string; QmdDocCollection with name, paths, context, and checkout *CheckoutConfig fields; QmdSearchEntry with name, type, query, min, max, github-token, and github-app fields; registered as a builtin across tools_parser, tools_types, mcp_config_validation, mcp_setup_generator, claude_tools, docker
- DefaultQmdVersion = "2.0.1", QmdArtifactName = "qmd-index", DefaultQmdPort = 3002
- pkg/workflow/qmd.go: activation step generator: serializes config as QMD_CONFIG_JSON, installs @tobilu/qmd and @actions/github locally via npm install --prefix, runs a single actions/github-script step calling qmd_index.cjs; cache restore/save steps (with restore-keys) for both the index and the embedding models; resolveQmdCacheKey() always emits a cache key (default: gh-aw-qmd-${{ github.run_id }}); resolveQmdRestoreKeys() derives a prefix restore key for cross-run fallback; the indexing job has timeout-minutes: 60 and defaults to ubuntu-latest; gpu: false (default) injects NODE_LLAMA_CPP_GPU=false to skip GPU probing; runs-on overrides the indexing job's runner image; the agent job downloads the qmd-index artifact (no cache restore); generateQmdStartServerStep() emits a "Start qmd MCP server" step in the agent job
- actions/setup/sh/start_qmd_server.sh: bash script that starts npx @tobilu/qmd serve-mcp --http --port $GH_AW_QMD_PORT in background, probes the health endpoint (up to 10 minutes to allow llama binary downloads on first run), and exits non-zero if the server fails to become healthy
- actions/setup/js/qmd_index.cjs: JavaScript indexer using the @tobilu/qmd SDK: reads config from QMD_CONFIG_JSON, processes checkout collections (glob expansion) and search entries (GitHub code search or issue list via Octokit), calls store.update(), store.embed(), store.close(), and writes a Markdown step summary
- actions/setup/js/qmd_index.test.cjs: 20 vitest unit tests covering all code paths: missing config, missing SDK, checkout collections (${ENV_VAR} expansion, default pattern), issues search (valid repo, explicit repo field, invalid slug, empty slug, min-count), code search (file download, min-count, download errors, malformed full_name), combined checkouts + searches, finally block, and writeSummary
- .github/workflows/ci.yml: new js-qmd-index CI job that runs the qmd_index.test.cjs tests in isolation
- docs/src/content/docs/reference/qmd.md: new reference documentation page (sidebar order 730)
- docs/src/content/docs/reference/tools.md: updated with qmd: entry and link to new reference page
- pkg/cli/workflows/smoke-codex.md: updated with checkouts (docs collection) and searches (500 open issues) to exercise qmd end-to-end