Generated monorepo AGENTS.md files contain heavy duplication and anti-patterns #45

@digitarald

Problem

When running agentrc instructions on a large monorepo (Rust workspace with 24+ crates), the generated per-crate AGENTS.md files contain significant duplication of root-level instructions and several anti-patterns identified by Copilot review.

Findings

1. Anti-pattern: Dual file types (copilot-instructions.md + AGENTS.md)

The generator creates both a root copilot-instructions.md and 24+ per-crate AGENTS.md files. The reference docs recommend using only one type, not both. This causes ambiguity about which file takes precedence and wastes context tokens.

Expected: Either consolidate into a single root AGENTS.md (with per-crate AGENTS.md extending it), or use copilot-instructions.md + .github/instructions/*.instructions.md with applyTo globs.
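A check for this could run before or after generation. The following is a minimal sketch, not agentrc's actual implementation: the function names are hypothetical, and it simply walks the repo looking for both file names.

```python
from pathlib import Path


def find_instruction_files(repo_root: Path) -> dict[str, list[Path]]:
    """Group instruction files under repo_root by file type (hypothetical helper)."""
    found: dict[str, list[Path]] = {"copilot-instructions.md": [], "AGENTS.md": []}
    for path in sorted(repo_root.rglob("*.md")):
        if path.name in found:
            found[path.name].append(path.relative_to(repo_root))
    return found


def uses_dual_file_types(repo_root: Path) -> bool:
    """True when the repo contains both copilot-instructions.md and AGENTS.md files."""
    found = find_instruction_files(repo_root)
    # Each value is a list; an empty list means that file type is absent.
    return all(found.values())
```

A real implementation would probably skip build output such as `target/` and could print the offending paths as a warning instead of returning a boolean.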

2. Anti-pattern: "Kitchen sink" / Duplicating root content in per-crate files

The majority of each crate's "Key Conventions" section repeats what's already in the root instructions:

  • Workspace dependencies / { workspace = true }
  • Error types / thiserror::Error
  • Logging with tracing
  • Testing with tempfile / test_data/
  • Standard build/test commands (cargo build -p <crate>, cargo clippy-fast, cargo fmt)

The same boilerplate is copy-pasted across all 24 crate files, burning context tokens on every interaction.
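One way to surface this kind of copy-paste is to count how many crate files each line appears in. The sketch below is an assumption about how such a check might look, not existing agentrc behavior; `repeated_lines` and the `min_share` threshold are hypothetical.

```python
from collections import Counter


def repeated_lines(files: dict[str, str], min_share: float = 0.8) -> list[str]:
    """Return normalized lines that appear in at least min_share of the files."""
    counts: Counter[str] = Counter()
    for text in files.values():
        # Count each line once per file so in-file repeats do not inflate the share.
        unique = {" ".join(line.split()).lower() for line in text.splitlines()}
        # Ignore blank lines and headings; only body lines are compared.
        counts.update(line for line in unique if line and not line.startswith("#"))
    threshold = min_share * len(files)
    return sorted(line for line, n in counts.items() if n >= threshold)
```

Lines reported here are candidates for moving into the root file. Exact matching after whitespace and case normalization will miss paraphrased duplicates, so this is a lower bound on the real duplication.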

3. Redundant "Monorepo Context" section

Each crate file includes git root, workspace root Cargo.toml path, toolchain, and "all cargo commands from root" — all already covered in root instructions with no crate-specific additions.

4. Title mismatch

Generated AGENTS.md files use # Copilot Instructions: <crate> crate as the heading, implying it's a copilot-instructions.md file. For AGENTS.md in a monorepo hierarchy, a simpler heading like # <crate> crate is sufficient.
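The heading fix is mechanical. As a rough sketch, assuming the generated heading always follows the `# Copilot Instructions: <crate> crate` shape quoted above (the function name is hypothetical):

```python
import re

# Matches only the generated title line; [ \t]* avoids swallowing newlines.
_HEADING = re.compile(r"^# Copilot Instructions:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def simplify_heading(markdown: str) -> str:
    """Rewrite '# Copilot Instructions: X' to '# X', leaving other text untouched."""
    return _HEADING.sub(r"# \1", markdown, count=1)
```

Files that already use the simpler heading pass through unchanged, so the rewrite is safe to apply repeatedly.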

Desired behavior

Per-crate AGENTS.md files should contain only crate-unique context:

  • Crate purpose / summary
  • File structure → responsibility mapping
  • Crate-specific conventions (e.g., special binary resolution, auth patterns, helper APIs)
  • Internal consumers / dependencies (only if non-obvious)

Everything that applies workspace-wide should live exclusively in the root file, not be duplicated.

Example: well-trimmed crate file

For a git crate, the ideal output is ~60% smaller than the current output:

# `git` crate

Git utility abstractions — binary resolution, branching, checkout, config, credentials, diffs, and remotes.

## Structure

src/
├── lib.rs         # git_command(), async_git_command(), git_binary(), git_root()
├── branch.rs      # Current branch detection, handles detached HEAD
├── checkout.rs    # Repository checkout operations
├── config.rs      # git config read/write helpers
├── credential.rs  # Bearer token headers, credential store
├── diff.rs        # Async diff stats and file comparisons
└── remote.rs      # Remote URL management

## Crate-Specific Conventions

- **Git binary resolution**: `git_binary()` checks env var first, then falls back to PATH. Use `git_command()` / `async_git_command()` everywhere — never call `Command::new("git")` directly.
- **Auth via git config**: Credentials are persisted via `git config --local`, not per-command env vars.
- **Debug helpers**: `git_debug_env_vars()` / `git_cmd_with_debug()` for diagnosing auth/network issues.

## Internal Dependencies

Uses `vars` (env var constants) and `fs_utils` (filesystem helpers).

Suggested improvements

  1. Deduplicate at generation time — diff each per-crate section against the root instructions and strip anything already covered
  2. Avoid dual file types — choose one strategy per repo (root AGENTS.md + per-directory, or copilot-instructions.md + .instructions.md globs)
  3. Use appropriate headings — match the heading style to the file type being generated
  4. Add a post-generation dedup pass — or surface a warning when duplication is detected across files
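To illustrate suggestion 1, here is a minimal sketch of deduplicating a per-crate file against the root instructions. It is not how agentrc works today. The function names are hypothetical, and it assumes that duplicated content matches the root line-for-line after whitespace and case normalization.

```python
def _normalize(line: str) -> str:
    """Collapse whitespace and case so trivially different copies compare equal."""
    return " ".join(line.split()).lower()


def strip_root_duplicates(root_md: str, crate_md: str) -> str:
    """Remove crate lines already in the root file, then drop emptied sections."""
    root_lines = {_normalize(l) for l in root_md.splitlines() if l.strip()}

    # Split the crate file into sections, each starting at a heading line.
    sections: list[list[str]] = [[]]
    for line in crate_md.splitlines():
        if line.startswith("#"):
            sections.append([line])
        elif not line.strip() or _normalize(line) not in root_lines:
            sections[-1].append(line)

    kept: list[str] = []
    for section in sections:
        has_heading = bool(section) and section[0].startswith("#")
        body = [l for l in section[1 if has_heading else 0:] if l.strip()]
        # Keep the top-level title even if empty; drop other headings with no body.
        if body or (has_heading and section[0].startswith("# ")):
            kept.extend(section)
    return "\n".join(kept).strip() + "\n"
```

With this approach a section such as "Monorepo Context" disappears entirely once all of its lines are found in the root file. The sketch treats any line starting with `#` as a heading, so comment lines inside fenced code blocks would need extra handling in a real pass.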

Labels: enhancement
