A polished product demo screen on the left, a tangled live wire harness on the right, both labelled with the same price tag — the visual metaphor for what gets sold vs. what gets shipped.
What you see on the demo, vs. what gets handed to your customers.

I met a developer at a meetup last month. Smart, friendly, building fast. Three drinks in and pitching me his stack: Claude Code, a Stripe key, a Supabase project, a Vercel account. Four small products shipped to paying clients last quarter. A week each. Weekend-project pricing.

I asked a few questions. How does the app handle a user pasting a malicious script into a form? What happens if the payment provider times out mid-charge? Who can read what, and how is that enforced? What’s the rollback plan when a deploy breaks production at 2am?

He looked at me like I’d asked him for the molecular weight of helium.

This post is for the people writing the cheques on the other side of those conversations — business owners, ops managers, CTOs being told “AI lets us build it in a week, not two months.” That comparison is a misdirection. You’re not buying a cheaper version of the same product. You’re buying a different product entirely, with most of its cost hidden until something goes wrong.

"You're not buying a cheaper product. You're buying a ticking time bomb with the contractor's initials on the build and your business name on the invoice."

The “2 months vs 1 week” pitch only works if the deliverables match

When someone tells you they can build it in a week instead of two months, what they’re really saying is: I can produce the happy-path version in a week.

The happy path is the demo. The screenshot. The part you see during the sales call. AI tools are now genuinely excellent at it. Andrej Karpathy — who coined “vibe coding” — has since dropped the term in favour of agentic engineering, noting that the human’s job is oversight. Whoever steers the agents has to know what to look for.

The 80% AI handles isn’t the part that fails in production. The 20% it can’t reach is. And that 20% stays invisible until real users, real data, and real money hit the system.

Week 1 build: the happy path
Working demo. Login screen. The flow shown in the sales call. The 80% AI is great at.

Hidden cost: the 20% that decides outcomes
Data boundaries, input validation, failure modes, audit trail, observability, maintainability.

What the data says about vibe-coded production code

The numbers are not subtle.

  • 45% of AI-generated code contains a known security vulnerability (Veracode GenAI Report, 2025)
  • 41% of AI-generated backend code ships with over-permissioned access (Industry analysis, 2026)
  • 86% of generated samples failed cross-site scripting defences (Veracode, 2025)
  • 68% of SMBs reported at least one SaaS-related security incident last year (Cloud Security Alliance)

Cloud Security Alliance research also flagged a surge in vibe-coded apps exposing API keys, service-account credentials, and passwords in client-side JavaScript — anyone who opens browser dev tools can read them.
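
To make that concrete, here is a minimal sketch of the kind of check a reviewer might run against a built front-end bundle. It is an illustration under assumptions, not a real scanner: the function name `find_exposed_secrets` and the two patterns are hypothetical, and production tools use far larger rule sets. The one fact it leans on is that Stripe secret keys start with `sk_live_` or `sk_test_`, while publishable keys start with `pk_` and are meant to be public.

```python
import re

# Hypothetical patterns for illustration only; real scanners ship many more rules.
SECRET_PATTERNS = {
    # Stripe secret keys start with sk_live_ or sk_test_ and must stay server-side.
    "stripe_secret_key": re.compile(r"sk_(?:live|test)_[A-Za-z0-9]{10,}"),
    # A literal password assignment such as password: "hunter2".
    "hardcoded_password": re.compile(
        r"password\s*[:=]\s*[\"'][^\"']+[\"']", re.IGNORECASE
    ),
}


def find_exposed_secrets(bundle_text: str) -> list[str]:
    """Return the names of the patterns that match anywhere in the bundle."""
    return [
        name
        for name, pattern in SECRET_PATTERNS.items()
        if pattern.search(bundle_text)
    ]


# Fake keys, assembled at runtime so no real-looking secret sits in this file.
leaky_bundle = 'const stripe = Stripe("sk_live_' + "A" * 24 + '");'
safe_bundle = 'const stripe = Stripe("pk_live_' + "A" * 24 + '");'

print(find_exposed_secrets(leaky_bundle))
print(find_exposed_secrets(safe_bundle))
```

A check like this only catches the obvious cases. The real fix is structural: secret keys are read from server-side configuration and never reach the browser at all.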

What’s missing from the one-week build

What gets skipped isn’t optional polish — it’s the parts that decide whether your business is exposed:

  • Data boundaries. Who can read what, write what, export what — under which conditions, at which endpoints. Whether that’s tenant isolation in a SaaS, role boundaries in an internal tool, or “is this admin route actually behind admin auth,” the demo doesn’t exercise it. The failure shows up the day someone curious goes looking.
  • Input and trust boundaries. Every form, URL parameter, uploaded file, AI prompt, and webhook payload is an attacker’s entry point. Cross-site scripting (XSS — sneaking malicious code through a form so it runs in another user’s browser and steals their session), SQL injection (manipulating the database through unfiltered input), prompt injection (tricking the AI into ignoring its instructions), server-side request forgery (making your server fetch attacker-controlled URLs) — these aren’t theoretical. 86% of AI-generated samples failed XSS defences. 88% were vulnerable to log injection. The model produces code that looks correct but is not.
  • Failure modes and recovery. Payment provider times out mid-charge. Queue backs up. Two users edit the same record. Third-party API returns malformed data. A retry loop costs you $400 in API spend overnight. The demo never hit any of these. Production hits all of them.
  • Audit trail and observability. When something goes wrong six months in — and it will — can anyone reconstruct what happened, to whom, when, and by whose hand? If not, your regulator, your insurer, and your largest customer all have the same opinion on that.
  • Operational hygiene. Where do secrets live? How do dependencies get updated when a new security flaw is publicly disclosed (the industry tracks these as CVEs, short for Common Vulnerabilities and Exposures)? What’s the rollback plan? How does anyone know the system is unhealthy before customers tell you?
  • Maintainability. Can a different engineer open this code in eighteen months and extend it? Or does every change require rebuilding it from scratch because no one — including the original author — can read what the model produced?
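
For the first two items on that list, a small sketch of what "enforced" can look like in practice. It assumes a toy SQLite table of notes with a `tenant_id` column; the table, the helper names, and the sample data are all hypothetical. It shows two habits: every read is scoped to one tenant through a parameterized query, and stored text is escaped when it is rendered.

```python
import html
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with notes from two hypothetical tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (tenant_id TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO notes VALUES (?, ?)",
        [
            ("acme", "Renewal due in March"),
            ("globex", "<script>steal()</script>"),
        ],
    )
    return conn


def notes_for_tenant(conn: sqlite3.Connection, tenant_id: str) -> list[str]:
    """Return only the notes that belong to the given tenant."""
    # The ? placeholder binds tenant_id as data; it is never spliced into the SQL.
    rows = conn.execute(
        "SELECT body FROM notes WHERE tenant_id = ? ORDER BY rowid",
        (tenant_id,),
    ).fetchall()
    return [body for (body,) in rows]


def render_note(body: str) -> str:
    """Escape on output so stored markup is shown as text, not executed."""
    return "<li>" + html.escape(body) + "</li>"


conn = make_demo_db()
print(notes_for_tenant(conn, "acme"))
print(notes_for_tenant(conn, "acme' OR '1'='1"))  # classic injection attempt
print(render_note("<script>steal()</script>"))
```

The injection attempt returns nothing because the whole string is treated as a tenant name that does not exist. In a real system the tenant would come from the authenticated session, not from anything the user typed.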

None of this is on the invoice. All of it is in the product.

A casual developer in a hoodie hands a glossy red gift-wrapped parcel to a businessperson. The buyer's side of the parcel is tearing open, exposed wires and sparks escaping, a thin curl of smoke rising — the developer's side still looks pristine.
The fee is on the developer's invoice. The risk is on yours.

The risk transfer no one mentions in the quote

Here’s the part business owners need to understand. When a developer ships a vibe-coded product and skips the 20%, the cost doesn’t disappear. It gets transferred. To you.

  • Customer data leaks — through a missing access check, an exposed admin endpoint, or a malicious input no one filtered? The developer doesn’t lose customers — you do.
  • A user pastes a script into a feedback form and steals session cookies from anyone who views the admin page? The breach notification has your name on it, not theirs.
  • Stripe key found in client-side JS during a regulator audit? The fine has your business name on it.
  • A dependency ships a critical CVE and no one has a process to patch it? You’ll find out through the incident, not the build.
  • The original developer is unavailable, and the code is impossible for anyone else to extend? Every small change becomes a rebuild quote.
  • An automated retry loop runs unchecked for a weekend and racks up $40,000 in AI API costs before anyone notices? You pay the bill.
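
The runaway retry loop is one of the cheaper items on that list to prevent. The sketch below is one possible shape, with every name and number an assumption made for illustration: `call_with_limits`, the per-call cost of 5 cents, and the 100-cent budget are hypothetical. It caps the number of attempts, backs off between them, and refuses to spend past a fixed budget.

```python
import time


class BudgetExceeded(Exception):
    """Raised when another call would push spend past the configured cap."""


def call_with_limits(operation, max_attempts=4, cost_cents=5,
                     budget_cents=100, base_delay=0.5, sleep=time.sleep):
    """Run operation with bounded retries, exponential backoff and a spend cap.

    Returns (result, spent_cents). Costs are integer cents to avoid float drift.
    """
    spent = 0
    last_error = None
    for attempt in range(max_attempts):
        if spent + cost_cents > budget_cents:
            raise BudgetExceeded(f"stopped at {spent} cents")
        spent += cost_cents  # assume every attempt is billed, even a failed one
        try:
            return operation(), spent
        except TimeoutError as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


def make_flaky(failures):
    """Build a fake upstream call that times out a set number of times."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("upstream timed out")
        return "ok"

    return operation


delays = []
result, spent = call_with_limits(make_flaky(2), sleep=delays.append)
print(result, spent, delays)
```

The `sleep` parameter exists so the example can record delays instead of actually waiting. The design point is that the loop has two independent exits, attempts and money, and neither depends on someone noticing a dashboard over the weekend.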

Reputational damage is the most underestimated cost. You can recover from a fine. You can rebuild a database. Rebuilding customer trust after a public disclosure, particularly in a mid-market segment where buyers research before they sign, costs far more than the price difference between a seven-day build and a sixty-day one.

Vibe coding has a legitimate home — just not this one

This isn’t an argument against AI-driven development. We run on four purpose-built AI systems ourselves. Used well, the tooling is extraordinary.

✓ Vibe coding works here
  • Prototypes for ideas still being validated
  • Internal tools where the only user is you
  • Throwaway scripts for one-off data work
  • Exploring a new framework or pattern
  • Wireframes that will be rebuilt before they ship
✗ Not here
  • SaaS products with paying customers
  • Multi-tenant systems holding real data
  • Anything handling payments or PII
  • Regulated industries — health, finance, gov
  • Software your business name is invoiced for

The common thread in the first list: low blast radius, single-tenant, no real users, no real money. That’s where AI-driven generation shines.

The problem isn’t the tool. It’s using a prototyping tool as a delivery method for software with real users on the other end of it.

What production work for paying clients actually requires

When real customers are on the other end, the bar moves. The 20% is where production engineering lives:

  • Threat modelling before the first line of code — who can do what, what’s the worst input, where does the data flow?
  • Adversarial testing that actively tries to break the system, not just confirm the happy path — injection, auth bypass, race conditions, malformed inputs
  • Parallel expert review — architecture, security, QA, UX, SEO looking at the same change with different lenses (the work we built Janus to do — five specialist agents review every change in parallel before it ships)
  • Maintainable code another engineer can read, modify, and extend in two years
  • Change resilience — automated tests, rollback strategy, dependency hygiene, observability hooks
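
As a rough picture of what adversarial testing means at the smallest scale, here is a sketch built around a hypothetical payment-amount parser. The function `parse_amount_cents`, its accepted format, and the list of hostile inputs are assumptions chosen for illustration. The suite does not check that good input works; it checks that bad input is refused.

```python
import re

# Accept 1 to 6 ASCII digits, optionally followed by exactly two decimals.
# [0-9] is deliberate: \d would also accept non-ASCII digits in Python.
_AMOUNT = re.compile(r"[0-9]{1,6}(?:\.[0-9]{2})?")


def parse_amount_cents(raw: str) -> int:
    """Convert a user-supplied amount such as '12.34' into integer cents."""
    if not isinstance(raw, str) or not _AMOUNT.fullmatch(raw):
        raise ValueError("invalid amount")
    whole, _, frac = raw.partition(".")
    cents = int(whole) * 100 + int(frac or 0)
    if cents == 0:
        raise ValueError("amount must be positive")
    return cents


# Inputs an attacker or a broken client might send.
HOSTILE_INPUTS = [
    "", "0", "-5", "1e9", "NaN", "12.345", " 10", "10\n",
    "<script>alert(1)</script>", "1; DROP TABLE users",
    "\u0663\u0664",  # Arabic-Indic digits
    "9999999",       # one digit over the limit
]


def run_adversarial_suite() -> list[str]:
    """Return every hostile input that was wrongly accepted."""
    accepted = []
    for raw in HOSTILE_INPUTS:
        try:
            parse_amount_cents(raw)
        except ValueError:
            continue
        accepted.append(raw)
    return accepted


print(run_adversarial_suite())
print(parse_amount_cents("12.34"))
```

An empty list from the suite means nothing hostile got through. The useful habit is that the hostile list grows with every incident and every review, and it runs on every change.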

You can use AI to do this work faster. We do. But speed only multiplies the process you already have. Multiply “prompt and ship” and you reach the time bomb sooner, not a better product.

The five questions every buyer should ask

Before you sign the quote
  1. How do you decide who can see, edit, and export what — and how is that enforced and tested?
  2. What's your process for hostile input — injection, malicious uploads, abused endpoints — and what evidence do you have that it works?
  3. Who reviews the generated code, against what checklist, before it reaches a real user?
  4. Where do secrets live, how are dependencies patched, and what's the rollback plan when something breaks at 2am?
  5. If a regulator audits this build — or the original developer is unavailable — in eighteen months, who answers the questions?
If the answers are vague, fast, or vibes-based — walk.

You’re not buying a cheaper product. You’re buying a ticking time bomb with the contractor’s initials on the build and your business name on the invoice.

If you’d like the boring, expensive 20% built in from day one — the part that makes the difference between a demo and a product — that’s the work we do. Have a chat with us.