Why most AI implementations in e-commerce fail.
The failure is rarely in the tech. It's almost always in the framing of the problem, and the expectations built around it.

Over the past eighteen months I've watched dozens of e-commerce founders and marketing teams kick off an AI chatbot, an agent, or an automation pipeline. Roughly three out of ten projects deliver what they promised. The rest stall, get quietly switched off, or, the saddest version, keep running while nobody pays attention to them anymore. The problem is rarely in the model, the tool, or the developer. It's in the framing.
The pattern we keep seeing
One number to set the scale: BCG concluded in 2024 that seventy percent of AI implementations fail to deliver measurable value. Other studies (Gartner, McKinsey) sit in the same range. That's a lot. And it isn't the models; in 2026 those do what they promise. It's how teams deploy them.
Technology as the goal
The phrase "we want an AI chatbot" isn't a project, it's a tool without an outcome. A workable project sounds different. "We want fifty percent fewer Zendesk tickets in six weeks without dropping our NPS." Or: "we want to bring our customer-service cost from fourteen euros per ticket down to three." Or: "we want twenty-four-hour availability for sizing advice."
Those are goals you can tick off. "Building an AI chatbot" isn't. It's a feature, not a result. If the first version goes live before you've defined what good enough looks like, it slowly dies.
We start every engagement with the same question to the founder: what should be reduced over six weeks, and how do we measure it? No blueprint, no Figma, no prompts. That question first. If the answer is vague, the KPIs are vague, and no AI implementation escapes the swamp phase.
No baseline
How do you know your AI chatbot handles fifty percent of your tickets if you don't know how many tickets you have? How do you know your email automation generates three percent extra revenue if the attribution baseline isn't even locked?
Nine out of ten teams we meet don't have these numbers. They have Zendesk or Gorgias data, but it's never aggregated. They have a Klaviyo account, but per-flow revenue attribution is switched off. They judge whether "it's working" on gut feeling.
A two-week baseline before go-live isn't a luxury. It's the only way to be honest about what the AI did. We do it like this: one week before the bot ships, we export tickets per category, average response time, and first-contact resolution rate. That's your baseline measurement. Everything that changes after go-live, you measure against it. Without a baseline there is no honest ROI claim afterwards, no matter how well the bot works.
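To make that export concrete, here is a minimal Python sketch of how the three numbers could be pulled from a list of exported tickets. The `Ticket` fields and the category labels are assumptions for illustration; a real Zendesk or Gorgias export will use different column names, and the sample rows are made up.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Ticket:
    category: str             # hypothetical labels, e.g. "sizing", "returns"
    response_minutes: float   # time until the first reply
    contacts_to_resolve: int  # 1 means resolved on first contact


def baseline(tickets: list[Ticket]) -> dict:
    """Summarise one export window into the three baseline numbers."""
    if not tickets:
        raise ValueError("no tickets in the export window")
    first_contact = sum(1 for t in tickets if t.contacts_to_resolve == 1)
    return {
        "tickets_per_category": dict(Counter(t.category for t in tickets)),
        "avg_response_minutes": mean(t.response_minutes for t in tickets),
        "first_contact_resolution_rate": first_contact / len(tickets),
    }


# Made-up rows, only to show the shape of the result.
sample = [
    Ticket("sizing", 30.0, 1),
    Ticket("sizing", 90.0, 2),
    Ticket("returns", 60.0, 1),
    Ticket("shipping", 20.0, 1),
]
snapshot = baseline(sample)
print(snapshot)
```

Store the snapshot together with the dates of the export window, so the comparison after go-live covers a window of the same length.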
Generic AI on every page at once
A mistake we see again and again: the AI chatbot gets placed on every page, answers every question, and therefore none of them well. The best AI implementations are narrow. One use case, one type of question, one page as a test environment.
An example that stuck with me. A fashion shop we worked with first put the chatbot only on product detail pages, with one job: sizing advice based on the size chart. No returns, no shipping queries, no Shopify order lookups. Just sizing. Result: forty-six percent fewer size-related tickets in four weeks. Only after that did we expand to returns, stock queries, and order tracking. In steps, with a measurement point per expansion.
This goes against every instinct: more features feel like an improvement. In AI implementations it's almost always the opposite. Start with one sharply scoped question, prove it works, and only then expand.
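A measurement point per expansion can be as simple as comparing ticket counts per category against the baseline. The sketch below assumes two windows of equal length; the counts are hypothetical and are not the fashion shop's data.

```python
def percent_reduction(baseline_count: int, current_count: int) -> float:
    """Drop relative to the baseline, in percent (negative means an increase)."""
    if baseline_count <= 0:
        raise ValueError("baseline must be a positive ticket count")
    return (baseline_count - current_count) * 100 / baseline_count


# Hypothetical ticket counts for equal-length windows before and after go-live.
before = {"sizing": 200, "returns": 150}
after = {"sizing": 120, "returns": 147}

report = {cat: round(percent_reduction(before[cat], after[cat]), 1) for cat in before}
print(report)
```

In this made-up case the category the bot handles drops sharply while the untouched category barely moves, which is the kind of contrast that makes a narrow rollout easy to attribute.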
A chatbot trained on nothing
A chatbot without your own data is a Wikipedia search engine with a nicer interface. It can give generic answers, but it misses everything that sets you apart: your own FAQ, your product information, your tone of voice, your return policy, your payment methods, your shipping to Belgium.
What strikes me every time is how little founders invest in this knowledge base before going live. Two hours of input, expecting two years of polish. It doesn't work that way.
Our rule: a minimum of twenty-five frequently asked questions written out by hand, plus the full product catalog, plus a half-page tone document. Below that, it doesn't work. Above that, it actually starts working. A well-trained AI answers like you, not like a generic chatbot.
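The rule above can be written down as a small go-live check. This is only a sketch: the `KnowledgeBase` fields are assumed names, and "full product catalog" is interpreted here as every live product being present in the knowledge base.

```python
from dataclasses import dataclass

MIN_FAQS = 25  # the minimum number of hand-written FAQs from the rule above


@dataclass
class KnowledgeBase:
    faq_count: int
    catalog_products: int   # products present in the knowledge base
    shop_products: int      # products live in the shop
    has_tone_document: bool


def readiness_gaps(kb: KnowledgeBase) -> list[str]:
    """Return what is still missing; an empty list means ready to go live."""
    gaps = []
    if kb.faq_count < MIN_FAQS:
        gaps.append(f"FAQ: {kb.faq_count} of {MIN_FAQS} written")
    if kb.catalog_products < kb.shop_products:
        missing = kb.shop_products - kb.catalog_products
        gaps.append(f"catalog: {missing} products missing")
    if not kb.has_tone_document:
        gaps.append("tone document missing")
    return gaps


# Made-up example of a knowledge base that is not ready yet.
draft = KnowledgeBase(faq_count=12, catalog_products=340,
                      shop_products=400, has_tone_document=False)
gaps = readiness_gaps(draft)
print(gaps)
```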
Nobody owns it after launch
Maybe the quietest killer. The AI ships, the team is relieved, everyone moves on to other projects. Three weeks later a new product appears on the site that isn't in the knowledge base. Five weeks later the return policy changes. Eight weeks later the bot is giving answers that are no longer true.
AI implementations aren't software projects with a closing ceremony. They're more like a garden: without regular attention they go to seed. With us that includes a monthly report combing through conversations for new question categories, escalation patterns, and knowledge-base gaps. That isn't a luxury, it's maintenance. Without it a chatbot loses forty to sixty percent of its effectiveness within three to six months.
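As an illustration of what such a monthly pass could compute, the sketch below counts escalations per category and lists categories that don't exist in the knowledge base yet. The field names `category` and `escalated` are assumptions, the log is made up, and it presumes conversations have already been labelled with a category.

```python
from collections import Counter


def monthly_gaps(conversations: list[dict], known_categories: set[str]) -> dict:
    """Count escalations per category and list categories the knowledge base lacks."""
    escalations = Counter(c["category"] for c in conversations if c["escalated"])
    seen = {c["category"] for c in conversations}
    return {
        "escalations_by_category": dict(escalations),
        "new_categories": sorted(seen - known_categories),
    }


# Made-up conversation log for one month.
log = [
    {"category": "sizing", "escalated": False},
    {"category": "gift cards", "escalated": True},
    {"category": "gift cards", "escalated": True},
    {"category": "returns", "escalated": True},
]
review = monthly_gaps(log, known_categories={"sizing", "returns"})
print(review)
```

A category that shows up under both keys, like the invented "gift cards" here, is the first candidate for a new knowledge-base entry.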
Four questions before you call anyone
If you're considering an AI implementation for your e-commerce business right now, answer these four questions for yourself before you contact a single vendor.
What is the concrete problem I want to solve, measured in a number I can check in six weeks? Do I have a baseline today, or do I need to set one up first? Can I start narrow, on one page, with one type of question? And who's responsible for keeping the knowledge base up to date after launch?
No answer to those four? Do that first. Got answers? You're already further than most teams in the seventy percent BCG wrote about.
Want to walk through this for your own shop? We offer a short thirty-minute strategy call in which we look at your tickets, your product data, and where AI makes economic sense. No pitch deck, no sales talk. Book via the button at the top, or email info@dalvora.nl.