How I Built a Model-Agnostic AI Development Setup in 2026
Chris de Gruijter · May 5, 2026
Earlier this year, the economics of AI compute changed. Major providers — Anthropic, OpenAI, Google — started adjusting pricing upward after years of subsidising adoption. The demand-side story is simple: inference capacity is not keeping pace with how fast developers are integrating these tools into their daily workflows. For anyone running a serious amount of AI-assisted work, the cost trajectory is no longer abstract.
I run a web development agency, and I was coding against multiple AI providers daily: Claude Code for terminal and config work, GitHub Copilot inside the IDE, Codex CLI for secondary coding runs. When I sat down and calculated what that looked like at scale — input tokens from behavioral rules loaded on every session, MCP tool schemas injected into every context window, 8 servers active simultaneously when I only needed 1 — the waste was significant and entirely fixable. This post documents exactly what I built to address it: a model-agnostic AI setup where every provider operates from the same behavioral rules, token usage is deliberately managed, and switching models costs me thirty minutes rather than a full rewrite.
The Market Context: Why Provider Lock-In Is Now a Financial Risk
For most of 2024 and into 2025, AI providers were competing aggressively on price and capability. The calculus has shifted. Anthropic raised prices on Claude Sonnet. OpenAI restructured GPT-4o pricing. The reasoning is straightforward — GPU capacity is genuinely constrained, and providers who subsidised early adoption to capture market share are now moving to sustainable unit economics.
The strategic problem with provider lock-in is not just cost. It is agility. If your behavioral rules, agent definitions, and workflow logic are written in Claude-specific syntax — hardcoded into CLAUDE.md with Anthropic-specific hook mechanics, MCP configurations, and slash command invocations — then switching to OpenCode, Codex, or Kimi when a better price or capability appears requires rebuilding everything. I was in exactly that position. My entire configuration was Claude-native.
The two goals I set for myself were: (1) become genuinely model-agnostic — every provider loads the same behavioral rules with no rewriting required, and (2) eliminate token waste — quantify what was loading on every session, cut what was unnecessary, and build explicit controls for what loads when. Both goals turned out to be deeply related, because the mechanism that makes you model-agnostic (a shared canonical rules file) also forces you to audit and trim what you actually need in that file.
The Audit: What Was Actually Bloating Context
Before building anything, I ran a full token audit of my existing setup. The findings were sharper than I expected.
MCP Schemas: The Dominant Cost
Eight MCP servers were active simultaneously: Supabase, AdLoop, nuxt-seo-pro, nuxt, nuxt-seo, Stitch, WoopSocial, cloudflare-docs. Every MCP server injects its tool schemas into the context window on every message. A complex server like AdLoop with 40+ tools, or Supabase with 20+ tools and detailed parameter schemas, can consume 5,000–10,000 tokens by itself. With 8 servers running, I was loading an estimated 15,000–25,000 tokens of MCP tool schemas on every message — in sessions where I often needed exactly one MCP.
A community benchmark puts a clear ceiling on this: more than 20,000 tokens of MCP schemas actively degrades model performance. Context gets compressed in ways that reduce response quality on the code that actually matters. I was routinely above that ceiling in the wrong sessions.
Rules Files: Tutorial Code That Claude Already Knows
My coding-style.md and patterns.md rules files contained full TypeScript code blocks: a complete updateUser function demonstrating the spread operator, a full custom hook implementation, a complete repository interface. These were tutorials — demonstrating patterns that Claude already knows — rather than corrections. They added approximately 600 tokens per session to teach concepts that only needed a one-line preference statement.
The right rule is a one-liner: "Immutability: always spread/Object.assign, never mutate in place." Not a full function example. The same principle applied to the agents table — a 116-line lookup reference listing 30+ agents was being loaded in every session, including quick one-file fixes where I was not orchestrating any agents at all. That alone was ~400 tokens per session of pure lookup overhead.
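To make the contrast concrete, this is the kind of replacement involved; the "before" is paraphrased, not my verbatim file:

```markdown
<!-- Before: ~40 lines of tutorial TypeScript demonstrating spread,
     a custom hook, and a repository interface. All patterns the
     model already knows. -->

<!-- After: -->
- Immutability: always spread/Object.assign, never mutate in place.
- Agents table: load only when orchestrating agents, never by default.
```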
The Prettier Hook: A 160,000-Token Trap
The most dangerous issue was latent rather than constant. My hooks configuration included a PostToolUse Prettier hook that ran prettier --write after every TypeScript/JavaScript file edit. The failure mode: Claude edits a file, Prettier reformats it (changing whitespace and semicolons), Claude re-reads the file to verify the edit landed correctly, sometimes re-edits based on the reformatted version. In the worst case, this loops three times, consuming roughly 160,000 tokens on a single edit — for a formatting change that should have been a pre-commit step.
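For reference, the offending hook had roughly this shape in the Claude Code hooks configuration — a reconstruction of the pattern, not my exact file; extracting the edited path from the hook's stdin JSON with jq is one common approach:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path' | xargs prettier --write"
          }
        ]
      }
    ]
  }
}
```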
I removed it from PostToolUse entirely. Prettier now runs in the pre-commit hook. It runs exactly once per commit, on exactly the files being committed, with no re-read loop possible.
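A minimal sketch of the replacement pre-commit hook, assuming npx and Prettier are available in the repo:

```bash
#!/usr/bin/env bash
# .git/hooks/pre-commit: format staged JS/TS files exactly once per commit
set -euo pipefail
files=$(git diff --cached --name-only --diff-filter=ACM \
        | grep -E '\.(ts|tsx|js|jsx)$' || true)
[ -n "$files" ] || exit 0
echo "$files" | xargs npx prettier --write   # no model in the loop, no re-read cycle
echo "$files" | xargs git add                # restage the formatted files
```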
Vendor-Specific Rules Bleeding Between Tools
The most structurally significant issue: my rules were written for Claude Code. The hooks system, model selection tiers (Haiku/Sonnet/Opus), the /compact command guidance, the agent delegation patterns — all of these are Claude-specific mechanics that mean nothing to OpenCode, Codex, or Kimi. But they were in my core rules files. Any time I used another tool, those rules were either being loaded unnecessarily or causing the tool to refuse legitimate operations because it was reading Claude-specific STOP protocols that did not apply to it.
This is the root cause that the architecture needed to solve: behavioral rules that should be universal (immutability, security, testing philosophy, git workflow) were entangled with tool-specific mechanics that should be provider-owned.
The Architecture: One Canonical Source, Multiple Thin Adapters
The design principle is simple: write behavioral rules once, point all provider configs at them. Each provider has its own thin adapter that loads the shared rules plus only the provider-specific additions that tool actually needs.
The Canonical Source Tree
Everything lives under /home/cjvdeg/Projects/AI/ in a single git repo. The model-agnostic layer is the agents/ folder:
```
/home/cjvdeg/Projects/AI/
├── agents/
│   ├── AGENTS.md            ← Canonical behavioral rules (modular source)
│   ├── AGENTS-compiled.md   ← Auto-generated: AGENTS.md + all 4 layers
│   ├── compile-agents.sh    ← Pre-commit hook auto-regenerates this
│   ├── layers/
│   │   ├── coding-style.md  ← Immutability, file sizes, error handling
│   │   ├── testing.md       ← TDD, 80% coverage minimum
│   │   ├── git-workflow.md  ← Commit format, PR workflow
│   │   └── patterns.md      ← API response format, hooks, repository pattern
│   ├── roles/               ← 31 provider-neutral agent role files
│   └── skills/              ← 16 portable skills (canonical)
└── claude/                  ← Claude Code adapter (symlinked to ~/.claude/)
    ├── CLAUDE.md            ← Loads AGENTS.md + layers + claude/rules/
    └── rules/               ← Claude-only rules (hooks, performance, agents)
```

AGENTS.md contains only the universal behavioral rules: immutability patterns, file organisation, error handling, security constraints, testing philosophy, git workflow, documentation conventions. No code examples. No lookup tables. No tool-specific mechanics. The target is under 600 tokens.
How Each Provider Loads the Rules
Different tools have different mechanisms for loading configuration files. The approach accommodates all of them:
- Claude Code: `~/.claude/CLAUDE.md` uses @-imports to load `agents/AGENTS.md` + all four layer files + Claude-specific rules in `claude/rules/`. The `~/.claude/` directory is symlinked to `/Projects/AI/claude/` so it's version-controlled.
- OpenCode: The `opencode.json` instructions array loads `agents/AGENTS.md` + all four layer files + `instructions.md`, a context override that suppresses Claude-specific directives (like the STOP protocol and hook-based gates) while keeping all universal security rules. A config sketch follows this list.
- Codex CLI: A `~/AGENTS.md` symlink points at `agents/AGENTS-compiled.md` — the fully compiled version of AGENTS.md with all four layers concatenated. Codex has no include support, so the compiled file is the solution.
- Kimi Code: Repo-root `AGENTS.md` symlinks in all active repos point at `agents/AGENTS-compiled.md`. Same compiled-file approach.
- VS Code Copilot: Workspace `AGENTS.md` symlinks to the same compiled file.
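The OpenCode entry, sketched as a config file. This assumes the `instructions` array in `opencode.json` accepts absolute paths and glob patterns; check the current OpenCode rules documentation for exact semantics:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "instructions": [
    "/home/cjvdeg/Projects/AI/agents/AGENTS.md",
    "/home/cjvdeg/Projects/AI/agents/layers/*.md",
    "/home/cjvdeg/.config/opencode/instructions.md"
  ]
}
```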
The compiled file (AGENTS-compiled.md) is auto-generated by a pre-commit hook whenever AGENTS.md or any layer file is committed. You never edit it directly. The hook runs agents/compile-agents.sh, which concatenates the source files and stages the result. This means the compiled file is always in sync with the source.
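The script itself needs nothing clever. A minimal sketch, with the concatenation order as an assumption:

```bash
#!/usr/bin/env bash
# agents/compile-agents.sh: concatenate the canonical source into one flat file
set -euo pipefail
cd "$(dirname "$0")"
cat AGENTS.md \
    layers/coding-style.md layers/testing.md \
    layers/git-workflow.md layers/patterns.md > AGENTS-compiled.md
git add AGENTS-compiled.md   # stage the result so the commit stays in sync
```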
The OpenCode Context Override
OpenCode reads ~/.claude/CLAUDE.md via LSP because it is the de facto standard for rule files. But CLAUDE.md imports Claude-specific hook rules that cause OpenCode subagents to refuse legitimate tool calls — they see instructions like "wait for a checkpoint before running the security reviewer" and interpret them as hard constraints on tool use.
~/.config/opencode/instructions.md is loaded as high-priority context and overrides those Claude terminal-specific directives while preserving all universal security rules. It also carries the Webfluentia project rules (analytics tracking priority, SEO CLI compatibility, CSP policy) because OpenCode cannot read project-level .claude/ directories via LSP. This override is intentional and permanent as long as OpenCode reads the Claude config path.
The Providers in My Stack (and What Each One Does)
The setup currently runs five providers, each with a defined role:
Claude Code: Terminal, Config, and Sysadmin
Claude Code runs in the terminal. Its primary role is sysadmin and configuration work — deployment scripts, infrastructure changes, MCP configuration, agent setup, any task where the hooks system (auto-formatting, TypeScript checks, pre-commit gates) is actively valuable. It is not the primary coding tool. That distinction matters for where you invest configuration effort.
Claude Code's hooks system is its most powerful feature, and no other provider has an equivalent: PostToolUse hooks that run after every file edit, PreToolUse gates that block certain operations, Stop hooks that audit the session on exit. These run automatically, without any prompt. For a sysadmin tool this is exactly right. For rapid coding iteration, the interruptions add friction.
OpenCode: Primary Coding via GitHub Copilot
OpenCode is my primary coding tool. It routes through GitHub Copilot, which as of early 2026 is still on a request-based billing model rather than per-token billing — making it the most cost-efficient option for high-volume coding sessions. OpenCode runs the oh-my-openagent routing layer, which assigns different models to different request types: a fast model for simple completions, a stronger model for architecture and review tasks.
The key limitation: OpenCode does not support skill invocation (no slash commands). Skills apply as process guidelines that the model follows when relevant — they are described in the rules files, but there is no mechanism to invoke them by name. For structured workflows like TDD or code review, I reference the role files directly in the prompt.
Codex CLI: Secondary Coding and Parallel Runs
Codex CLI runs GPT-4 series models via the OpenAI API. I use it as a secondary coding tool for tasks where I want a different model's perspective, and for parallel runs — starting a Codex session on one part of a codebase while an OpenCode session handles another. Codex loads the compiled AGENTS.md automatically from the home directory, so it operates on the same behavioral rules as everything else.
Kimi Code: Alternative Reasoning
Kimi (Moonshot AI) is in the stack as an alternative for tasks that benefit from a different training distribution. It loads AGENTS-compiled.md from each repo root and supports skill invocation via an extra_skill_dirs config pointing at the canonical agents/skills/ directory. Five of the 31 agent roles have Kimi-specific YAML adapters with phase hints; the remaining 26 can be loaded directly from the provider-neutral role files.
VS Code Copilot: Inline Completions
VS Code Copilot handles inline completions inside the editor. It loads the compiled AGENTS.md from the workspace root. MCP servers for VS Code are configured separately in ~/.config/Code/User/settings.json and are currently minimal — completions do not need database or docs context on every keystroke.
Portable Skills: 16 Reusable Behavioral Patterns
Skills are reusable behavioral patterns — structured processes that a model follows for a specific workflow. The agents/skills/ directory holds 16 portable skills split into knowledge skills and workflow skills:
- Knowledge skills (7): backend-patterns, clickhouse-io, coding-standards, frontend-patterns, security-review, sentry-cli, tdd-workflow
- Workflow skills (9): plan, tdd, code-review, build-fix, test-coverage, update-docs, refactor-clean, e2e, execute-plan
How each provider accesses them differs by what the tool supports. Claude Code invokes them as /skill-name slash commands via directory symlinks in claude/skills/. Codex CLI uses a skills menu via ~/.codex/skills/ symlinks pointing at agents/skills/. Kimi auto-discovers claude/skills/ and also reads from extra_skill_dirs = [".../agents/skills"] in its config. OpenCode and VS Code Copilot do not support skill invocation — you apply the process described in the skill file manually when relevant.
Seven Claude-only skills remain in claude/skills/ and are not in the portable set: continuous-learning, strategic-compact, multi-agent-plan, learn, update-codemaps, context-dev/review/research. These use Claude-specific features — the memory system, context compaction, sub-agent delegation — that have no equivalent in other tools.
MCP Profile Switching: The Single Biggest Token Win
This is where the largest concrete savings came from. Instead of all MCP servers being active all the time, I built a profile switching system with named profiles that activate only what a given session needs.
The Profile System
Eight profiles cover every workflow I run:
- none: Zero MCP servers. For raw coding, quick fixes, context-window-sensitive sessions.
- minimal: context7 only. General coding where live library docs are useful.
- docs: context7, cloudflare-docs, nuxt, nuxt-seo, nuxt-seo-pro. Framework research sessions.
- webfluentia: supabase + adloop. Webfluentia client work.
- amp: supabase + stripe (manual). AMP SaaS work.
- payments: supabase + stripe. Billing and subscription work.
- social: woopsocial. LinkedIn posting via Claude only.
- planning: context7, sequential-thinking, memory, filesystem. Architecture and planning sessions.
Switching is a single command: mcp-none, mcp-minimal, mcp-docs, mcp-wf, and so on. Each command rewrites the mcpServers key in ~/.claude.json to contain only the servers for that profile. OpenCode shares the same config file (it auto-discovers from ~/.claude.json), so switching affects both tools simultaneously.
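A sketch of what one switcher can look like, assuming a hypothetical master JSON file that holds every server definition; the jq filter keeps only the profile's keys and leaves the rest of `~/.claude.json` untouched:

```bash
#!/usr/bin/env bash
# mcp-switch.sh <profile>: rewrite mcpServers in ~/.claude.json for one profile.
# Assumes mcp-all-servers.json holds the full definition of every server.
set -euo pipefail
ALL="$HOME/Projects/AI/claude/mcp-all-servers.json"
CFG="$HOME/.claude.json"

case "${1:?usage: mcp-switch.sh <profile>}" in
  none)    keys='[]' ;;
  minimal) keys='["context7"]' ;;
  docs)    keys='["context7","cloudflare-docs","nuxt","nuxt-seo","nuxt-seo-pro"]' ;;
  wf)      keys='["supabase","adloop"]' ;;
  *) echo "unknown profile: $1" >&2; exit 1 ;;
esac

# Keep only the selected server definitions from the master file.
jq --argjson keys "$keys" --slurpfile all "$ALL" \
   '.mcpServers = ($all[0] | with_entries(select(.key as $k | $keys | index($k))))' \
   "$CFG" > "$CFG.tmp" && mv "$CFG.tmp" "$CFG"
```

The `mcp-none`, `mcp-minimal`, and friends are then one-line aliases onto this script.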
Kimi has its own separate MCP config in ~/.kimi/mcp.json and its own kimi-mcp-switch.sh aliases that mirror the same profile structure. The profile names and server assignments are consistent across both systems, so the mental model is identical regardless of which tool you are in.
Why Not to Load Every MCP by Default
The numbers make this concrete. At 10 sessions per day, with 8 MCPs active and the conservative estimate of 15,000 tokens overhead per session from MCP schemas alone, that is 150,000 input tokens per day that adds no value to the majority of sessions. With profile switching, a coding session on a Nuxt project using the minimal profile loads exactly one MCP (context7) — roughly 2,000 tokens of tool schemas instead of 15,000–25,000.
Beyond token costs, there is a quality argument. A model that has 20+ Supabase tools injected into its context for a session that has nothing to do with databases is more likely to reach for those tools inappropriately or to have its attention diluted by irrelevant schema details. Keeping the active tool set small and relevant to the task at hand produces better responses.
The Security Model: Why Vibe-Coding in a Container Is Non-Negotiable
The model-agnostic config layer needs a secure execution environment. Without one, the productivity gains are undermined by the risk surface that AI-assisted coding introduces. I wrote about the vibebox setup in detail in a previous post, but the key architecture points are worth repeating here because they intersect directly with the AI config.
Three Environments, One Rule
The setup runs three environments: the host machine (SSH keys and secrets live here, never exposed to containers), devbox (a Distrobox container for trusted non-AI work), and vibebox (a pure Podman container for all AI-assisted development). The one rule is absolute: all vibe-coding happens inside vibebox, not devbox or host.
vibebox is a custom Fedora-based Podman image with specific security settings baked in: npm ignore-scripts=true prevents malicious install-time payloads from running, even when an AI suggests a package that turns out to be a slopsquatting vector. SSH keys are not accessible inside the container. The ~/Secrets/ folder is never mounted. Only ~/Projects/ is available, and only the specific project's .env file is mounted read-only from its actual location in /Secrets/.
Read-Only Mounts for Agent and Rule Files
The security-sensitive config subdirectories inside ~/.claude/ are individually mounted read-only inside vibebox: agents/, rules/, hooks/, skills/, commands/, and CLAUDE.md. The container cannot modify agent definitions or hook scripts — only the host can do that.
This matters because hooks run automatically. A PostToolUse hook fires on every file edit without any user confirmation. If a malicious package or a compromised IDE extension could rewrite hooks/hooks.json, it could inject arbitrary shell commands into the auto-run pipeline. The read-only mount prevents this entirely, at the cost of requiring all config changes to be made on the host rather than inside the container.
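In Podman terms, the mount strategy looks roughly like this; the image name and paths are simplified for illustration:

```bash
# Illustrative vibebox invocation: project read-write, config read-only,
# the project's .env mounted read-only from /Secrets, nothing else from the host.
podman run -it --rm \
  -e IS_SANDBOX=1 \
  -v "$HOME/Projects/myapp:/root/Projects/myapp" \
  -v "$HOME/Projects/AI/claude/hooks:/root/.claude/hooks:ro" \
  -v "$HOME/Projects/AI/claude/rules:/root/.claude/rules:ro" \
  -v "$HOME/Projects/AI/claude/CLAUDE.md:/root/.claude/CLAUDE.md:ro" \
  -v "$HOME/Secrets/myapp.env:/root/Projects/myapp/.env:ro" \
  vibebox:latest
# IS_SANDBOX=1 is explained in the next section.
```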
IS_SANDBOX=1 and the Root User Problem
Podman containers run as root by default in rootless mode. Claude Code refuses to enable bypass permissions when running as root, as a safety measure. The escape hatch is the environment variable IS_SANDBOX=1 — Claude Code's built-in signal that it is running in a sandbox environment where root is expected and intentional. The check in the source is: process.getuid() === 0 && process.env.IS_SANDBOX !== "1". With IS_SANDBOX=1 set in the container environment (via vibebox.sh -e IS_SANDBOX=1), Claude Code's full functionality is available including bypass permissions mode. VS Code inherits the environment variable and the Claude Code extension works identically inside the container.
MCP Token Rules Inside the Container
The security architecture for MCP tokens is worth calling out because it intersects with the git-tracked nature of the config. ~/.claude/ is symlinked to /Projects/AI/claude/, which is a git repo. Any token value written into env: blocks in the MCP config would be one git add away from being published to GitHub.
The rule is: no token values in any MCP config block, ever. stdio servers use wrapper scripts (~/.claude/mcp-supabase.sh, ~/.claude/mcp-adloop.sh) that source ~/Secrets/global.env at runtime and pass the tokens as environment variables to the spawned process. HTTP servers that support headers use ${env:VAR_NAME} references that Claude Code expands from the shell environment. The shell environment gets the variables because ~/.bash_profile sources global.env on every login. Tokens exist only in /Secrets/, never in /Projects/.
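A sketch of the stdio wrapper pattern; the Supabase server's package name and flag are my assumption of its current interface:

```bash
#!/usr/bin/env bash
# ~/.claude/mcp-supabase.sh: the MCP config points here, so the token
# never appears in any version-controlled file.
set -euo pipefail
source "$HOME/Secrets/global.env"   # defines SUPABASE_ACCESS_TOKEN, among others
exec npx -y @supabase/mcp-server-supabase@latest \
  --access-token "$SUPABASE_ACCESS_TOKEN"
```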
Agent Roles: 31 Provider-Neutral Definitions
The agents/roles/ directory holds 31 provider-neutral role files. Each file has YAML frontmatter defining the agent's name, scope, mission, reasoning tier, capabilities, and which MCP categories it needs — followed by full behavioral instructions and checklists that contain no provider-specific tool names.
The reasoning tier maps to model capability: lightweight for Haiku or equivalent (simple agents, frequent invocation), standard for Sonnet or equivalent (main work, orchestration), deep for Opus or equivalent (architecture, security, complex reasoning). This tier specification is provider-neutral — when a different provider runs the role, it selects the appropriate model for that tier without the role file needing to name a specific model.
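A role file's frontmatter then looks roughly like this; the exact field names and the example role are illustrative:

```yaml
---
name: security-reviewer
scope: review
mission: Audit changes for injection, authorization, and secrets-handling issues
reasoning_tier: deep        # Opus or an equivalent model from any provider
capabilities: [static analysis, report writing]
mcp_categories: [docs]
---
# Full behavioral instructions and checklists follow,
# written without provider-specific tool names.
```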
Claude Code has native agent invocation for all 31 roles via /agent-name slash commands. For Codex, Kimi, and OpenCode, the same roles are available by instructing the model to read agents/roles/<name>.md directly — no auto-discovery, but the same behavioral outcome.
What Kimi-Specific Adapters Add
Five of the 31 roles have Kimi-specific YAML adapters in agents/kimi-agents/. These add Kimi-specific phase hints — structured guidance on how to break down the task into phases that map to Kimi's particular planning style. The underlying behavioral instructions remain the provider-neutral role file. The adapter only adds the framing that Kimi responds to most effectively. The remaining 26 roles are loaded directly from agents/roles/ without any adapter.
Token Efficiency: The Quantified Wins
After implementing the full architecture, here is what changed concretely on the input token side:
- Agents table removed from default load: ~400 tokens per session saved. The 116-line lookup table now loads only when orchestrating agents.
- Code blocks stripped from rules files: ~600 tokens per session saved. All tutorial code replaced with one-liner preferences.
- Security detection table trimmed: ~120 tokens per session saved.
- AGENTS.md flattening (redundancy removed): ~150 tokens per session saved.
- MCP profile switching (conservative estimate): 5,000–15,000 tokens per session saved for sessions that do not need domain-specific servers.
Rules changes alone save roughly 1,270 tokens per session. With MCP profile switching, a typical coding session in the minimal profile saves 5,000–16,000 tokens compared to the previous always-on configuration. At 10 sessions per day, that is 50,000–160,000 input tokens per day that now reach the model only when they are actually relevant.
On the output side, the rules now explicitly address Claude's verbosity tendencies: no preamble, no affirmation filler, no trailing summaries. Benchmarks from published Claude Code best-practices guides show 63% output reduction when rules target specific verbosity failure modes. I cannot claim that exact number from my own usage, but the sessions are measurably more direct.
Practical Takeaways: What to Build First, Second, Third
If you want to replicate this setup, the order matters because some pieces depend on others.
Step 1: Audit Your Current Token Load
Before building anything, run a session with context inspection enabled and count what is loading. How many MCPs are active? What are their tool counts? What is in your rules files that is tutorial code versus genuine correction? How many tokens does your config load before your first message? The audit usually surfaces two or three high-impact changes that do not require any architectural work — removing unused MCPs, stripping code examples from rules, deferring lookup tables.
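Even a crude estimate is enough to rank the offenders; a common heuristic is roughly four characters per token:

```bash
# Rough token weight of everything that loads by default (≈ chars / 4).
for f in ~/Projects/AI/agents/AGENTS.md ~/Projects/AI/agents/layers/*.md; do
  printf '%6d tokens  %s\n' "$(( $(wc -c < "$f") / 4 ))" "$f"
done
```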
Step 2: Separate Universal Rules from Provider-Specific Mechanics
Create a clean AGENTS.md containing only universal behavioral rules. The test: could this rule be followed by any model from any provider without knowing anything about Claude hooks, OpenCode compaction, or Codex skill invocation? If yes, it belongs in AGENTS.md. If it references Claude-specific mechanics, it belongs in a provider adapter.
Keep AGENTS.md under 600 tokens. This forces discipline. Every rule needs to earn its place by describing a genuine correction to AI behavior rather than a concept the model already knows.
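For a sense of the register, an illustrative excerpt rather than the verbatim file:

```markdown
## Coding
- Immutability: always spread/Object.assign, never mutate in place.
- Keep files small; one exported concern per file.

## Testing
- TDD: write the failing test first. Coverage floor: 80%.

## Security
- Never write token values into tracked files; reference env vars.
```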
Step 3: Build the Compiled File and Wire Symlinks
Write the compile-agents.sh script that concatenates AGENTS.md and all layer files into AGENTS-compiled.md. Add a pre-commit hook that runs it automatically whenever source files change. Then create the symlinks: ~/AGENTS.md → agents/AGENTS-compiled.md for Codex and any other tool that reads from the home directory, and repo-root symlinks in each active project for Kimi and VS Code.
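The wiring is two kinds of symlink plus the hook fragment; the repo path is an example:

```bash
# Home-directory symlink for Codex CLI (and anything else reading ~/AGENTS.md)
ln -sf ~/Projects/AI/agents/AGENTS-compiled.md ~/AGENTS.md

# Repo-root symlink per active project, for Kimi and VS Code Copilot
ln -sf ~/Projects/AI/agents/AGENTS-compiled.md ~/Projects/myapp/AGENTS.md

# Pre-commit hook fragment: regenerate the compiled file when sources change
if git diff --cached --name-only | grep -qE '^agents/(AGENTS\.md|layers/)'; then
  ./agents/compile-agents.sh
fi
```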
Step 4: Implement MCP Profile Switching
Start with three profiles: none, minimal, and one domain-specific profile for your most common heavy-MCP session. Build the switching aliases. Use them for one week and see how rarely you actually need the full server set. Most sessions need one or two servers, not eight. Once the habit is established, add profiles for the other workflows you actually run.
Step 5: The Container Security Layer
If you are doing any significant amount of AI-assisted development, the container security layer is not optional. The threat model is real: slopsquatting, malicious install scripts, IDE extensions with overly broad file access. The vibebox setup described in my earlier post is one implementation, but the core principle works on any platform. Keep secrets outside the development tree. Enforce ignore-scripts=true. Mount only what each session actually needs.
The full setup — canonical rules source, compiled file for providers without include support, MCP profile switching, 31 provider-neutral agent roles, 16 portable skills, and the container security model — took several weeks to build iteratively. Switching providers now genuinely costs me one thin adapter file and thirty minutes, not a full config rewrite. That was the goal, and it works.
Sources
- Anthropic — Claude Code best practices (official)
- Anthropic — Claude Code memory and imports (AGENTS.md / CLAUDE.md loading)
- Community Claude Code best practices guide — token efficiency and verbosity benchmarks
- Sabrina.dev — The ultimate AI coding guide: Claude Code
- Dzombak — Getting good results from Claude Code
- MindStudio — Reduce token usage in AI agents: MCP optimization
- OpenCode — Rules and precedence documentation
- OpenAI — Introducing Codex and AGENTS.md behavior
- Kimi Code CLI — Agents and AGENTS.md customization
- GitHub — Copilot plans and billing model
- Anthropic — Claude Sonnet pricing
Frequently Asked Questions
What does model-agnostic mean for an AI development setup?
A model-agnostic AI development setup is one where your behavioral rules — coding standards, security constraints, testing philosophy, git workflow — are written in a provider-neutral format and loaded by every AI tool you use. When you switch from Claude to Codex or Kimi, the rules come with you automatically via a single canonical source file rather than requiring you to rewrite your configuration for each tool.
How does AGENTS.md differ from CLAUDE.md?
CLAUDE.md is Claude Code-specific — it can use @-imports, reference Claude hooks, specify Haiku/Sonnet/Opus model tiers, and include Claude-only mechanics like the STOP protocol and agent delegation syntax. AGENTS.md is provider-neutral — it contains only behavioral rules that any model from any provider can follow without knowing anything about Claude-specific tooling. In the architecture described here, CLAUDE.md is a thin adapter that loads AGENTS.md plus Claude-specific additions on top.
Why not load all MCP servers by default and let the model ignore what it does not need?
Because the model does not ignore it. Each MCP server injects its full tool schema into the context window on every message — parameter types, descriptions, required fields. A complex server like Supabase or AdLoop can consume 5,000–10,000 tokens of context just from its schema. Community benchmarks show that loading more than 20,000 tokens of MCP schemas actively degrades model response quality. Profile switching ensures the context contains only what is relevant to the current session.
What is AGENTS-compiled.md and why is it needed?
AGENTS-compiled.md is an auto-generated file that concatenates AGENTS.md and all four layer files (coding-style, testing, git-workflow, patterns) into a single flat file. Some providers — Codex CLI, older versions of Kimi, VS Code workspace rules — do not support @-includes or instructions arrays with multiple files. They read a single rules file. The compiled file gives those providers the full rule set without any include support required. A pre-commit hook regenerates it automatically whenever the source files change.
How do I handle MCP token authentication without hardcoding values?
For stdio MCP servers (like Supabase or AdLoop), use wrapper shell scripts that source a secrets file at runtime — for example, ~/.claude/mcp-supabase.sh sources ~/Secrets/global.env and then execs the MCP binary. The wrapper path goes in the config, not the token. For HTTP servers that support Authorization headers, use ${env:VAR_NAME} references that the tool expands from the shell environment at runtime. The shell environment gets the variables from global.env being sourced in ~/.bash_profile. Token values never appear in any version-controlled file.
Do I need a container setup to use model-agnostic rules?
No — the model-agnostic config architecture (canonical AGENTS.md, compiled file, MCP profile switching) works on any development environment. The container security model (vibebox, IS_SANDBOX=1, read-only mounts) is a separate concern that addresses the threat surface introduced by AI-assisted coding in general. Both are independently useful. If you are doing significant AI-assisted development, the container layer is strongly recommended, but it is not a prerequisite for the model-agnostic config approach.
How long does it actually take to add a new AI provider to this setup?
For providers that natively support AGENTS.md (Codex CLI, Kimi, modern OpenCode versions), adding support requires creating a symlink pointing at AGENTS-compiled.md and optionally creating agent adapters for the most-used roles. This takes under an hour. For providers with non-standard config formats, you write a thin adapter that loads the canonical files in that tool's expected format — typically 30 to 60 minutes. The behavioral rules themselves require no changes.