Proposition
Marketers obsess over how to make things happen, rarely over what actually needs to be done. Do we even need it? What are the make-or-break points? Same mistake with AI: how to speed up, how to delegate, how to automate. As in Pluribus, if AI has the knowledge of the best experts, why use it to draw pictures instead of guiding the work? Can AI write a growth plan?
The task
I'll be creating a full, execution-ready growth plan and iterating until I'm happy with it. This will take several chapters. I'll document progress along the way: challenges, AI performance, what works. Not just to explore AI's limits, but to find the best approach for brain-heavy work.
My AI Toolkit
The toolkit I rely on in everyday work: Claude Opus 4.5 in extended thinking mode, ChatGPT 5.2 Pro, and Gemini 3 Pro. Claude for personality, MCP, and browser integration. ChatGPT for deeper research and web search. On top of that, multiple models help validate each other's output. No agents: most run on the same foundation models anyway. I'll use the models directly and imitate the agentic loop myself, with more control. I may experiment with other tools along the way.
The Product
I decided to stick with something real, though it may look complicated to business leaders from other industries. That said, I believe in you. We'll go with CLion, the JetBrains IDE for C and C++. It's real: findings may actually feed into my work. It's niche: specific challenges, but we avoid the mess of AI agents. It's a good product with non-obvious benefits that require explanation, while still leaving room for personal preference. And it doesn't stand out that much, so we need to convince, not just build awareness. It crosses B2C and B2B, with B2C2B logic as well. In other words, a messy enough example to be actually useful.
Does AI need me?
AI is a probability-based system, so it thrives in environments that give it deterministic feedback. This is why coding agents are the fastest-growing AI applications: there's a clean layer of checks. The code compiles, tests go green, errors land in logs, and debugging is quick.

The agentic loop works when the Observe step gives you a clear signal and the Verify step has ground truth. Code has both. Marketing has neither.
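To make that concrete, here's a minimal sketch of such a loop in Python. It is illustrative only: ask_model and apply_patch are hypothetical callables you would wire to your own model and repository, and the cmake/ctest command is just one plausible ground-truth check.

```python
import subprocess

def verify(command: str) -> tuple[bool, str]:
    """Ground truth for code: build and test, capture the output."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def agentic_loop(ask_model, apply_patch, max_rounds: int = 5) -> bool:
    """Act -> Observe -> Verify, with the compiler and tests as the oracle."""
    feedback = ""
    for _ in range(max_rounds):
        patch = ask_model(feedback)     # Act: the model proposes a change
        apply_patch(patch)              # apply it to the working copy
        ok, feedback = verify(          # Observe + Verify: deterministic signal
            "cmake --build build && ctest --test-dir build"
        )
        if ok:
            return True                 # tests green, loop closed
    return False
```

There is no equivalent command for "is this the right positioning," which is the whole problem.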
Does text compile? Well, it's text. It reads like English, no obvious grammar mistakes. But does it actually run? The fact that it's easy to read tells us nothing about whether it's worth reading — or whether it works toward its original intent.
Growth plans? Even messier. Attribution is a tangle (multi-touch journeys, three-month lags, that competitor launch nobody factored in). Your analytics say one thing, CRM says another. There's no test suite for "is this the right positioning." An agent can complete the loop, update its state, and come out dumber — with no way of knowing.
So I'll be the growth plan compiler: I'll report warnings and errors back to the AI, and together we'll push the plan forward.
The Irreducibility of Judgment
That's the mechanical answer. There's a philosophical one too. I'm treating AI as a strategist reporting to me, not a director. Professional services logic.
Maister's Managing the Professional Services Firm offers a framework: Brains, Grey Hair, Procedure. You could argue it's obsolete — AI collapses the information asymmetry that PSFs were built on. But Maister's deeper insight wasn't about pyramids. It was about trust and the irreducibility of judgment in complex decisions. Someone still needs to own the problem. AI can't do that. So who does?
Starting clean
Before all the management and supervision, let's start with a clean slate: fire a single prompt at the AI and see what comes out. Then dive in and see how far we can go.
Since I'm working with top reasoning models, I don't need to over-specify. On the contrary, they need freedom to figure it out themselves.
For the first prompt:
Here are the results: Opus, ChatGPT.
Evaluating the dry run
Honestly, this already looks impressive. I added our current strategy into the mix and asked both AIs to evaluate all three. They agree: Claude's version provides the best ideation, ChatGPT's is most grounded in market reality, and the original is most actionable for the year ahead. Ratings were close. Scoring the Claude plan / ChatGPT plan / original in that order, Claude rated them 6 / 8 / 7.5 and ChatGPT rated them 7.5 / 8.5 / 8.
My assessments and watchouts
So? This is exactly what I meant by "it compiles." Looks like a plan, reads like a plan. Easy to believe you're looking at one. But I'm not satisfied. Without arguing about the proposed decisions, here's what needs fixing first:
- Structure. I'd give the same feedback to my human team, and AI should push even further. People do the job and share the summary, but they understand where it comes from and can reference it. AI thinking has to be visible. Not just a list (here's the audience, here's the message), but a framework, a story: this leads to that, so we do this, because of that.
- Numbers. Both plans use specific numbers, and ChatGPT does better at keeping references. Still, there's a lot of hallucination in which numbers they use and how they interpret them. Those aren't light warnings; they're the kind of red flags any compiler would scream about. Same for the tools they mention, and even product details. I know the product better than AI, of course, but even against publicly available info on jetbrains.com, AI falls short.
- Plan. Both outcomes look more like strategy and are weak as plans. Actions are loosely tied to objectives and not really actionable, with a lot of "let's publish a blog post" as the loudest 30-day step. That's where I'd push back hard. It's not theory anymore; it should be real and workable in the context of the current market and the existing toolkit. Not just "win with audience A," but how, where to find them, how much it costs, what to say.
While fixing those — making AI thinking visible and validating assumptions — we may also challenge the decisions themselves.
Finding the Frame
The best definition of strategy I've encountered: a simple story about solving a commercial problem. So let's make a story out of it.
Four approaches to prompts:
- Direct prompting — give it your best shot without overkill. Reasoning models expand content themselves.
- Exploration prompts — I've got the most value from width, not depth. Ask for intentionally diverse answers covering different points of view.
- Expertise reference — once you want depth, point to where it lives. Roleplay is the easiest example: e.g. growth topics live in business schools, so let's stage a conversation between Harvard and Stanford professors.
- Meta-prompting — instead of "do this," ask "write me a prompt that does this" or "write me requirements for this prompt." Evaluate it, edit it to your liking, then run it (see the sketch after this list).
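To make the meta-prompting flow concrete, here's a minimal sketch in Python. It's illustrative only: complete() is a hypothetical stand-in for whatever model interface you use, not a real API, and the review step is the human edit in the middle.

```python
def complete(prompt: str) -> str:
    """Hypothetical stand-in: wire this to the model you actually use."""
    raise NotImplementedError

def meta_prompt(task: str) -> str:
    # Stage 1: ask the model to write the prompt, not the deliverable.
    draft_prompt = complete(
        f"Write me a detailed prompt that would make a reasoning model {task}. "
        "Spell out the structure you expect in the answer and the assumptions to state."
    )

    # Human step: read draft_prompt, evaluate it, edit it to your liking.
    reviewed_prompt = draft_prompt

    # Stage 2: run the (edited) prompt to get the actual deliverable.
    return complete(reviewed_prompt)
```

The point is that the prompt itself becomes an artifact you can evaluate and edit before you spend the model's reasoning budget on the deliverable.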
I'll use the same intro with different continuations.
The intro:
The tabs below show how different approaches lead to very different answers. I learn a lot this way; it keeps the picture wide.
Direct prompting
Summary: A five-part framework — Discovery, Diagnosis, Direction, Design, Delivery — that progresses from gathering facts through interpreting them, making strategic choices, building execution plans, and operationalizing with budgets and governance.
My thoughts: I actually feel this is a very solid arc. It works at a high level, and the document also provides nice details and frameworks for filling it in. That said, the structure itself is somewhat overwhelming.
Exploration prompts
Summary: A modular framework covering six strategic approaches (Traditional, Hypothesis-Driven, Product-Led, Sales-Led, Market-Driven, and Financial) that progress from foundational assessment through execution architecture, allowing you to select and blend methodologies based on your business context.
My thoughts: At a high level, this structure looks more like a list than a story to me. Maybe good for a final document, but not so much for the actual planning work. I do like The Six Approaches, and the level of detail is enough to highlight the differences and provoke thinking. There is clear value in there.
Expertise reference
Summary: A growth plan diagnoses where you are (economics, position, constraints), maps strategic options with rigorous prioritization, makes explicit bets with resource allocation, architects execution across functions, and governs progress through metrics and adaptation — all while maintaining the humility that the plan will be wrong and the discipline is in confronting reality.
My thoughts: Love it. It's fun, vivid, and provides a fresh perspective, pitting ideas against each other. The arc is nice (an echo of the direct approach, but formulated differently): Where Are We? Where Could We Go? Where Will We Go? How Will We Get There? How Will We Stay on Track? What Could Go Wrong? How Do We Bring People Along?
Let me quote the practitioner here:
The most common failure mode of growth plans is not strategic error but execution gap. The quality of a plan is ultimately measured not by its sophistication but by its implementation. Build simplicity where you can. Create clarity for the people who must execute. And maintain the humility to adapt when reality teaches you something the plan didn't anticipate.
Meta-prompting
The outcome: Opus
Summary: A modular eight-section framework — spanning current state, strategic context, audience, growth model, initiatives, metrics, risks, and execution — with embedded reasoning logs that capture assumptions, alternatives considered, and decision rationale to enable transparent, traceable, AI-assisted growth planning.
My thoughts: This structure turned out the weakest, though by this point I was tired and didn't push it far enough. It's too granular and reads like a spreadsheet. That said, there are some nice details around metrics, deliverable formats, etc. that I may reuse going forward.
The Frame
Stories are linear, so we don't need to see the whole path now. Just the next step. The first three stages of the direct and expertise reference approaches overlap, and I like them. Discovery (Where are we?), Diagnosis (Where could we go?), Direction (Where will we go?). We'll start there and allow ourselves freedom to adjust later.
What's next?
Next: Discovery. I'll put AI to work getting an unopinionated snapshot of the product and market context. Different models, different research capabilities — and prompts aimed at clean data we can actually verify. The compiler needs something to compile.