One AI Can’t Really Disagree With Itself. So I Wired Up a Council of 18; Across Claude, Gemini, and Ollama.
Last Updated on May 27, 2026 by Editorial Team
Author(s): Alp Demirel
Originally published on Towards AI.

Repo: github.com/Alpsource/council-of-high-intelligence-gemini
Original upstream by @0xNyk: council-of-high-intelligence (MIT)
“Argue both sides” is theater
You’ve done this. I’ve done this. “Steelman the opposite view. Argue both sides. Play devil’s advocate.” It feels rigorous. The output, almost always, isn’t.
Here’s why. A single LLM has one prior. When you ask one model to argue against itself, the same weights are generating both the position and the rebuttal. The rebuttal isn’t independent thinking — it’s the model’s learned representation of what a rebuttal should look like, which is itself a distribution shaped by training data, RLHF preferences, and the model’s tendency to hedge. You don’t get conflict. You get a model performing conflict, then quietly converging on a centrist conclusion that flatters its own priors.
That’s bad enough on its own. It gets worse when you start trusting it. You walked in unsure. You walked out with a confidently moderated take. The model never pushed on the assumption you should have questioned, because that assumption lives in its weights too.
I wanted a tool that fixes this structurally — not by prompting harder, but by making the disagreement come from genuinely different places. That’s how I ended up porting Council of High Intelligence to Gemini CLI, and then adding something to it.
What the council is
The original framework is by @0xNyk, released as a Claude Code extension. It’s a structured multi-persona deliberation system: 18 sub-agents, each one modeled on a historical or contemporary thinker (Aristotle, Socrates, Ada Lovelace, Feynman, Torvalds, Machiavelli, Sun Tzu, Donella Meadows, Kahneman, Karpathy, Sutskever, Taleb, and six more), plus a coordinator skill that runs them through a strict protocol before producing a verdict.
It is not “ask multiple AI personas and average the results.” That would be theater at a higher resolution. The point of the protocol is that it’s adversarial by design. The coordinator’s job is to make consensus mechanically harder to reach than dissent — the opposite of how most multi-agent systems behave.
You convene it with one command:
/council Should we rewrite the auth service or add an abstraction layer?
And here is what actually happens, end to end:
1. Problem Restate Gate. Before anyone analyzes anything, every member must restate the problem in their own terms. This catches misunderstandings before they compound through three rounds. If Socrates and Torvalds turn out to be solving subtly different problems, you find out now, not at the verdict.
2. Round 1: Blind-first parallel analysis. All members produce their analysis independently, before seeing each other’s output. No anchoring on the loudest voice. This is the one round most “multi-agent” systems skip, and it’s the most important.
3. Round 2: Cross-examination. Members directly challenge each other’s reasoning. Parallel when there are five or more members, sequential when there are four or fewer.
4. Post-Round Enforcement Scan. This is the part that makes the protocol bite. Five hard checks every round:
   - Dissent quota: at least two non-overlapping objections must exist in the round, or it doesn’t pass.
   - Novelty gate: each member must introduce at least one new claim per round. No restating themselves louder.
   - Agreement check: if more than 70% of members agree, the coordinator triggers a mandatory counterfactual round. Consensus is a flag, not a conclusion.
   - Evidence labeling: every claim must be tagged empirical | mechanistic | strategic | ethical | heuristic. You can no longer hide an opinion as a fact.
   - Anti-recursion / hemlock rule: Socrates is not allowed to just keep questioning forever. After enough loops, someone hands him the cup.
5. Round 3: Crystallization. Final positions after the dust settles.
6. Tie-breaking. Two-thirds majority, or a domain expert with a 1.5× weighted vote.
7. Verdict synthesis. Structured output: Consensus, Minority Views, Unresolved Questions, Recommended Next Steps.
The personas are flavor. The thing that does the work is the post-round scan — specifically, the dissent quota and the agreement check. Together, they make a particular failure mode — premature consensus — mechanically expensive. You don’t get a centrist convergence by default. You have to earn one by surviving a counterfactual round.
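To make the post-round scan concrete, here is a minimal Python sketch of four of the five checks. It is not the extension's actual code (the real coordinator is a prompt-driven skill), and every name in it (`Claim`, `Turn`, `scan_round`) is hypothetical. Two simplifications are assumptions: "non-overlapping objections" is approximated by distinct objection text, and "agreement" by the share of members holding the most common position. The anti-recursion rule is left out.

```python
from collections import Counter
from dataclasses import dataclass, field

# Labels and thresholds taken from the protocol description above.
EVIDENCE_LABELS = {"empirical", "mechanistic", "strategic", "ethical", "heuristic"}
AGREEMENT_THRESHOLD = 0.70
MIN_OBJECTIONS = 2


@dataclass
class Claim:
    text: str
    label: str  # should be one of EVIDENCE_LABELS


@dataclass
class Turn:
    member: str
    position: str  # short stance label, e.g. "rewrite"
    claims: list = field(default_factory=list)      # list of Claim
    objections: list = field(default_factory=list)  # objection texts raised


def scan_round(turns, prior_claims):
    """Return (failures, needs_counterfactual) for one round.

    prior_claims maps member name -> set of claim texts from earlier rounds.
    """
    failures = []

    # Dissent quota: crude stand-in, counts distinct objection texts.
    distinct = {o.strip().lower() for t in turns for o in t.objections}
    if len(distinct) < MIN_OBJECTIONS:
        failures.append("dissent_quota")

    # Novelty gate: each member needs at least one claim not seen before.
    for t in turns:
        seen = prior_claims.get(t.member, set())
        if not any(c.text not in seen for c in t.claims):
            failures.append(f"novelty:{t.member}")

    # Evidence labeling: every claim must carry a known label.
    for t in turns:
        if any(c.label not in EVIDENCE_LABELS for c in t.claims):
            failures.append(f"evidence_label:{t.member}")

    # Agreement check: too much consensus triggers a counterfactual round.
    counts = Counter(t.position for t in turns)
    top_share = max(counts.values()) / len(turns) if turns else 0.0
    return failures, top_share > AGREEMENT_THRESHOLD
```

The design point the sketch tries to capture is that agreement is returned as a separate flag rather than as a failure: a round can pass every hard check and still be sent to a counterfactual round.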

Why I ported it to Gemini CLI
Two reasons.
First, the AI tooling ecosystem is converging. Claude Code, Gemini CLI, Codex CLI, Cursor’s agent mode — they’re all settling on the same primitives: skills, sub-agents, TOML commands, MCP servers. The shape of an “AI CLI extension” is rapidly standardizing across providers. A framework that lives on only one platform is leaving most of the surface area on the floor.
Second, a port done honestly is a test of how universal the protocol actually is. If the council depends on Claude-specific quirks, the port will tell you. If the pattern is real, it should translate.
Mostly, it translated cleanly. Here are the substantive moves:
| Claude Code | Gemini CLI |
| --- | --- |
| ~/.claude/agents/ | ${extensionPath}/agents/ |
| Spawn isolated sub-agent processes | Read each persona file at the round start, embody it, and re-read at the next round to preserve isolation |
| detect-providers.sh bash script at runtime | Declarative mcpServers in ~/.gemini/settings.json |
| Frontmatter: model, color, tools, provider_affinity | Frontmatter: name, description only (Gemini's loader rejects unknown keys) |
| Flag: --models [path] | Flag: --mcp-route [path] |
The full translation table lives in docs/architecture.md. I wrote that file before I wrote any blog post about it. Honest porting notes are rare in OSS, and I wanted this one to be one of them.
Three bugs that taught me the platform
These are the actually useful parts of any port write-up. None of these are in either CLI’s docs. I found them by running the thing and watching it fail.
1. Gemini CLI’s agent loader silently rejects unknown frontmatter keys. The original Claude agents had a rich council: metadata block covering domain, triads, polarity pairs, and MCP affinity. The Gemini loader doesn't throw on unknown keys; it just refuses to load the agent. You get a missing-persona symptom that looks like a path bug. Fix: strip the block from all 18 agent files and move the metadata into a standalone file, configs/mcp-provider-slots.yaml, that the coordinator reads explicitly.
2. Declaring mcpServers in the extension manifest starts them unconditionally. The first version of the port registered the claude-code and ollama MCP servers directly in gemini-extension.json. Result: every user who installed the extension got "Disconnected" warnings on every Gemini session, because Gemini tried to spawn an Ollama server on machines that didn't have Ollama installed. Fix: move MCP server registration out of the extension manifest entirely. The extension ships with zero opinions about your providers. If you want multi-provider routing, you opt in by editing your own ~/.gemini/settings.json.
3. ${extensionPath} is substituted in TOML, but not inside SKILL.md body content. This one took an embarrassing amount of debugging. The coordinator skill references agent files like ${extensionPath}/agents/council-socrates.md. In TOML command definitions, Gemini resolves ${extensionPath} to the actual install path. Inside the SKILL.md body, where the LLM reads it as plain text, the variable is not substituted. The LLM reads the literal string. Sometimes the model correctly inferred the path from context. Sometimes it didn't, and you got intermittent file-not-found errors that looked like flakiness. Fix: every TOML command now opens with Extension path: ${extensionPath} so the resolved path lands in the LLM's context before it reads the skill. The skill then refers to ${extensionPath} symbolically, and the model substitutes correctly.
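For the first bug in the list above, the fix amounts to rewriting each agent file so that only the accepted keys survive. The following is a rough sketch of that cleanup, not the script used in the repo. It assumes `---`-delimited frontmatter and treats any unindented `key:` line as a top-level key; the function name and the sample values are hypothetical, and a real YAML parser would be more robust.

```python
# Keys the article says the Gemini loader accepts.
ALLOWED_KEYS = {"name", "description"}


def strip_frontmatter(text, allowed=ALLOWED_KEYS):
    """Drop every top-level frontmatter key (and its nested lines) not in `allowed`."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return text  # no frontmatter block, leave the file alone

    # Find the closing delimiter of the frontmatter block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return text  # unterminated block, do not guess

    kept = []
    keep_current = False
    for line in lines[1:end]:
        is_top_level = bool(line) and not line[0].isspace() and ":" in line
        if is_top_level:
            key = line.split(":", 1)[0].strip()
            keep_current = key in allowed
        # Indented or blank lines inherit the decision made for the key above them.
        if keep_current:
            kept.append(line)

    rebuilt = ["---", *kept, "---", *lines[end + 1:]]
    return "\n".join(rebuilt) + ("\n" if text.endswith("\n") else "")


if __name__ == "__main__":
    sample = (
        "---\n"
        "name: council-socrates\n"
        "description: Questions assumptions\n"  # hypothetical value
        "model: opus\n"                         # hypothetical value
        "council:\n"
        "  domain: epistemology\n"              # hypothetical value
        "---\n"
        "You are Socrates.\n"
    )
    print(strip_frontmatter(sample))
```

Because the loader fails silently, running a check like this over every agent file before installing is cheaper than debugging a missing persona after the fact.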
If you’re building anything on Gemini CLI’s extension model, bug #3 is the one to remember. It’s a quiet footgun, and the symptom doesn’t point at the cause.
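One way to keep this bug from coming back is a small preflight check that fails when a skill body mentions the variable but the command prompt does not open with it. The sketch below is an illustration only: it works on plain strings that you would load from your own TOML and SKILL.md files, the function names are hypothetical, and it does not call or model any Gemini CLI API.

```python
VAR = "${extensionPath}"


def first_nonblank_line(text):
    """Return the first line that contains something other than whitespace."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def command_declares_path(prompt):
    """True if the command prompt opens with a line carrying the variable."""
    return VAR in first_nonblank_line(prompt)


def check(prompt, skill_body):
    """Return a list of problems; an empty list means the pair looks safe."""
    problems = []
    if VAR in skill_body and not command_declares_path(prompt):
        problems.append(
            "skill body references ${extensionPath} but the command prompt "
            "does not open with it, so the model will only see the literal string"
        )
    return problems


if __name__ == "__main__":
    skill = "Read ${extensionPath}/agents/council-socrates.md before Round 1."
    good_prompt = "Extension path: ${extensionPath}\nConvene the council."
    bad_prompt = "Convene the council."
    print(check(good_prompt, skill))
    print(check(bad_prompt, skill))
```

The check is deliberately strict about position: the fix described above depends on the resolved path reaching the model's context before the skill text does.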
The capability the upstream doesn’t have: members on different models
The port preserves the protocol exactly. But Gemini CLI’s MCP support let me add one thing the original couldn’t do: the same council deliberation, with members running on different model providers in the same session.
Set one environment variable, point at a slot file, and the coordinator distributes seats across providers per the routing config:
- Socrates, Ada, Sutskever → Claude (via the @anthropic-ai/claude-code MCP server)
- Feynman, Torvalds, Karpathy → Ollama (local models, privacy-preserving)
- Coordinator → the active Gemini model
The protocol’s polarity pair constraint is enforced at the routing layer. There are thirteen polarity pairs in the framework — Socrates/Feynman, Ada/Machiavelli, Aurelius/Machiavelli, and so on — pairs of personas that are deliberately designed to oppose each other on a particular axis. The router enforces a hard rule: opposing members are never assigned to the same provider. The full mapping is in configs/mcp-provider-slots.yaml.
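The routing rule is a small constraint-satisfaction problem, which the sketch below illustrates with a validator and a backtracking assigner. It is an assumption-laden illustration, not the router in the repo: only the three polarity pairs named above are included, the function names are hypothetical, and the seats it picks for Machiavelli and Aurelius are whatever the search finds, not the project's actual mapping.

```python
# The three polarity pairs named in the text; the framework has more.
POLARITY_PAIRS = [
    ("socrates", "feynman"),
    ("ada", "machiavelli"),
    ("aurelius", "machiavelli"),
]


def violations(routing, pairs=POLARITY_PAIRS):
    """Return every polarity pair whose two members share a provider."""
    return [
        (a, b)
        for a, b in pairs
        if a in routing and b in routing and routing[a] == routing[b]
    ]


def assign(members, providers, pairs=POLARITY_PAIRS, fixed=None):
    """Seat members so no polarity pair shares a provider, or return None."""
    routing = dict(fixed or {})
    opponents = {}
    for a, b in pairs:
        opponents.setdefault(a, set()).add(b)
        opponents.setdefault(b, set()).add(a)
    todo = [m for m in members if m not in routing]

    def solve(i):
        if i == len(todo):
            return True
        member = todo[i]
        for provider in providers:
            # A provider is allowed only if no seated opponent already uses it.
            if all(routing.get(o) != provider for o in opponents.get(member, ())):
                routing[member] = provider
                if solve(i + 1):
                    return True
                del routing[member]
        return False

    # The final check also rejects pre-seated members that already conflict.
    if solve(0) and not violations(routing, pairs):
        return routing
    return None


if __name__ == "__main__":
    # Seats taken from the example split above.
    seated = {
        "socrates": "claude", "ada": "claude", "sutskever": "claude",
        "feynman": "ollama", "torvalds": "ollama", "karpathy": "ollama",
    }
    members = list(seated) + ["machiavelli", "aurelius"]
    print(assign(members, ["claude", "ollama"], fixed=seated))
```

Returning `None` instead of a best-effort routing mirrors the "hard rule" framing: if the constraint cannot be met with the providers available, the caller has to add a provider or drop a member rather than silently seat two opponents together.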
This is the part I want to be careful about claiming. I am not telling you the council produces better decisions than a single model. That’s not measured, and “better” depends entirely on what you’re deciding. What I am telling you is that it produces structurally different reasoning: when Socrates challenges Feynman’s empiricism, those two arguments are no longer being generated by the same weights. Different model families. Different training corpora. Different tokenizers. Different priors. The disagreement is no longer simulated. It is structural.
That distinction — simulated vs structural disagreement — is, I think, the only honest defense of multi-agent systems over a single large model. If your “agents” are all the same weights with different system prompts, you have a single model doing voices. If they’re different models, you have something that cannot collapse to a single posterior. There’s no shared prior to fall back on.
A council where Socrates runs on Claude and Feynman runs on a local Llama is not the same thing as a council where both run on the same Gemini model with different system prompts. The second one is improv. The first one is an argument.
Why I find this interesting beyond the tool
I work on robotics and perception — visual-inertial odometry, multi-sensor fusion, and dynamic scene understanding. Robust decision-making under disagreement is a constant theme in that world too: a camera says one thing, an IMU says another, and the system has to either reconcile them or admit it can’t.
The standard move is to fuse them with a filter that picks the lower-uncertainty signal. That’s efficient and usually correct. It’s also exactly the move that fails in the cases that matter — adversarial conditions, sensor degradation, degenerate motion — because two channels that look independent often share a hidden common source, and the system collapses confidently onto a wrong answer.
I don’t think that’s a coincidence. The failure mode in single-model “argue both sides” prompting and the failure mode in naive sensor fusion have the same shape: premature consensus on signals that should still be arguing. A protocol that enforces a dissent quota, requires evidence labels, and triggers a counterfactual round on suspicious agreement is one instantiation of a more general design pattern — make agreement expensive before you trust it.
I haven’t ported that pattern into a perception stack yet. But this is the cleanest instantiation of it I’ve seen for an LLM context, and the fact that it survives a port across CLIs is some evidence that the pattern is real.
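The sensor-fusion failure described in this section can be made concrete with a small worked example. The sketch below uses inverse-variance weighting as a stand-in for whatever filter a real perception stack would run; the function names and the numbers are illustrative assumptions, not measurements. It compares the variance the fusion rule reports, which assumes independent errors, with the variance the same estimate actually has when the two errors share a common source with correlation `rho`.

```python
import math


def fuse(x1, var1, x2, var2):
    """Inverse-variance weighted fusion of two scalar measurements.

    Returns (estimate, claimed_variance, weights). The claimed variance
    is only correct if the two measurement errors are independent.
    """
    precision = 1.0 / var1 + 1.0 / var2
    w1 = (1.0 / var1) / precision
    w2 = 1.0 - w1
    return w1 * x1 + w2 * x2, 1.0 / precision, (w1, w2)


def actual_variance(var1, var2, weights, rho):
    """Variance of the weighted estimate when the errors have correlation rho."""
    w1, w2 = weights
    cross = 2.0 * w1 * w2 * rho * math.sqrt(var1 * var2)
    return w1 * w1 * var1 + w2 * w2 * var2 + cross


if __name__ == "__main__":
    estimate, claimed, weights = fuse(1.0, 1.0, 3.0, 1.0)
    print("estimate:", estimate, "claimed variance:", claimed)
    for rho in (0.0, 0.5, 0.9):
        print("rho =", rho, "actual variance:", actual_variance(1.0, 1.0, weights, rho))
```

With two unit-variance sensors the rule always reports a variance of 0.5, but at a correlation of 0.9 the true variance of the fused estimate is about 0.95, nearly that of a single sensor. The filter's reported confidence doubles while almost no information was gained, which is the same shape as two personas on one set of weights agreeing with each other.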
Try it
gemini extensions install https://github.com/Alpsource/council-of-high-intelligence-gemini
Then, in Gemini CLI:
/council Should we rewrite the auth service or add an abstraction layer?
Three rounds, enforced dissent, and an actual structured verdict. Free, MIT-licensed, runs on your active Gemini model out of the box. MCP multi-provider routing is opt-in once you decide you want it.
A few useful modes:
/council:quick Should we add Redis caching to the auth flow? # 2 rounds, fast
/council:duo Should we use microservices or a monolith? # sharp dialectic
/council:triad ai What are the limits of current foundation models?
/council --members socrates,feynman,ada Is this abstraction sound?
If you find a bug, open an issue. If you find a fourth one I missed, even better — it’ll be in the next blog post.
Repo: github.com/Alpsource/council-of-high-intelligence-gemini Original upstream: github.com/0xNyk/council-of-high-intelligence by @0xNyk, MIT
I’m a PhD candidate working on visual-inertial odometry, JEPA-based architectures, and dynamic scene understanding. I write about robotics, self-supervised learning, and the occasional developer tool. If you liked this, my last post was I Built an AI Pilot That Plans Like a Robot and Dodges Like a Human.
Published via Towards AI