I Gave AI Tools a Real Client Project — Here's Every Mistake They Made

The invoice was due Friday, the client wanted a complete content strategy for their new wellness brand, and I decided this was the perfect time to see if AI could actually handle paying work. Spoiler: I stayed up until 2 a.m. fixing things I thought would take 20 minutes.

The Setup: What I Actually Asked Them to Do

Real project, real stakes. A wellness coaching business needed a brand voice guide, five email sequences, landing page copy, and social media content for their launch. Total scope was about 12 hours of my usual work. I figured AI could cut that in half, minimum.

I used Claude for the strategy work and longer writing. ChatGPT (GPT-4) for brainstorming and variations. Jasper for the email sequences since that's supposedly their thing. Copy.ai for social posts. I gave each tool the same brand brief, target audience details, and competitor examples.

My assumption going in: the AI would nail the quantity and I'd just need to polish for quality. I was wrong in a way that actually embarrassed me.

The Mistakes That Almost Cost Me the Client

First disaster: brand voice consistency. I had Claude write the brand voice guide, which was actually solid. Clear, specific, usable. Then I fed that exact guide to the other tools for their respective tasks. None of them maintained it.

ChatGPT kept defaulting to this peppy, exclamation-point energy that was completely wrong for a calm, grounded wellness brand. Jasper wrote emails that sounded like a different company entirely — aggressive, salesy, "limited time offer" vibes when the brand was explicitly anti-urgency. I had to rewrite about 70% of the email content.

Second problem: invented specifics. Copy.ai generated social posts that referenced "our signature 8-week transformation program" and "the breathing technique thousands have used." The client had a 6-week program. There was no specific breathing technique yet. The AI just made stuff up that sounded plausible, and if I hadn't caught it, the client would have been posting lies.

Third issue, and this one surprised me: structural repetition. Claude kept starting paragraphs the same way. "As you begin your journey..." appeared in three different documents. "The key to..." showed up five times across the landing page copy. I didn't notice until I read everything back-to-back, and then it was painfully obvious.
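That back-to-back read can be partly automated. Here is a minimal sketch, not something used on this project: it assumes each deliverable is available as plain text with paragraphs separated by blank lines. The function name `repeated_openers` and the three-word window are illustrative choices.

```python
from collections import defaultdict


def repeated_openers(docs, n_words=3, min_count=2):
    """Return paragraph openers that occur at least min_count times.

    docs maps a document name to its plain text. An "opener" here is
    simply the first n_words words of a paragraph, lowercased.
    """
    seen = defaultdict(list)
    for name, text in docs.items():
        # Assumption: paragraphs are separated by a blank line.
        for para in text.split("\n\n"):
            words = para.split()
            if len(words) < n_words:
                continue
            opener = " ".join(words[:n_words]).lower()
            seen[opener].append(name)
    # Keep only openers that show up often enough to be worth a look.
    return {o: names for o, names in seen.items() if len(names) >= min_count}


if __name__ == "__main__":
    drafts = {
        "landing": "The key to calm is rest.\n\nAs you begin your journey, breathe.",
        "email": "As you begin your journey today, pause.\n\nThe key to balance is sleep.",
    }
    for opener, names in repeated_openers(drafts).items():
        print(f"{opener!r} opens paragraphs in: {', '.join(names)}")
```

It only catches exact repeats of the first few words, so it is a prompt for a closer read rather than a replacement for one.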

But the mistake that really got me: tone deafness to sensitive topics. The wellness brand works with people recovering from eating disorders. I mentioned this in the brief. ChatGPT still generated content about "shedding those extra pounds" and "getting the body you've always wanted." Not just off-brand — potentially harmful. I caught it, but it shook me.
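Some of these catches can be backed up by a crude pre-read check. The sketch below is hypothetical: the banned-phrase list and the six-week figure come from the examples in this post, while the names `BANNED_PHRASES`, `PROGRAM_WEEKS`, and `flag_draft` are made up for illustration.

```python
import re

# Phrases taken from the examples in this post; a real list would come
# from the brand brief.
BANNED_PHRASES = [
    "shedding those extra pounds",
    "getting the body you've always wanted",
    "limited time offer",
]

# The client's actual program length, per the brief.
PROGRAM_WEEKS = 6


def flag_draft(text):
    """Return a list of human-readable issues found in one draft."""
    issues = []
    # Normalize case and curly apostrophes so simple matching works.
    lowered = text.lower().replace("\u2019", "'")
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            issues.append(f"banned phrase: {phrase}")
    # Flag any "N-week" mention that disagrees with the real program length.
    for match in re.finditer(r"(\d+)-week", lowered):
        if int(match.group(1)) != PROGRAM_WEEKS:
            issues.append(f"program length mismatch: {match.group(0)}")
    return issues


if __name__ == "__main__":
    draft = "Join our signature 8-week transformation program!"
    for issue in flag_draft(draft):
        print(issue)
```

A check like this only finds problems you already know to list. It would flag the 8-week program, but not an invented breathing technique, so it supplements the read-through rather than replacing it.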

The Kicker: What Actually Worked (Against My Expectations)

Here's the thing nobody tells you: AI tools are backwards from what you'd expect. They're better at complex strategy than simple execution.

Claude's brand voice analysis was genuinely impressive. I gave it the client's existing website copy, three competitor examples, and their intake questionnaire responses. It identified patterns I hadn't consciously noticed — like how the client naturally used questions rather than statements, and how they avoided first-person plural ("we") in favor of "you" and "I." That observation alone made the final voice guide stronger than what I would have written solo.

The strategic work saved me about three hours. The execution work — the emails, the posts, the actual writing — cost me four extra hours in revisions. Net loss of an hour, plus significant frustration.

What I do differently now: I use AI for the thinking phase only. Audience analysis, competitive positioning, content structure, angle brainstorming. Then I write the actual words myself. Or I use AI as a first draft that I expect to rewrite entirely, not polish.

The other discovery: AI tools don't flag their own uncertainty. When Jasper didn't know something, it didn't say "I'm not sure about this client's specific offering." It just confidently wrote something wrong. There's no hesitation indicator. Every output comes with the same level of apparent confidence, whether it's accurate or fabricated.

What I'm Still Trying to Figure Out

I submitted the project on time. Client was happy. They have no idea how much of it was AI-assisted or how many mistakes I caught before they saw anything. And honestly, that feels like the real takeaway here.

The labor didn't disappear. It shifted. Instead of writing from scratch, I was editing, fact-checking, and maintaining consistency across tools that don't talk to each other. Different work, not less work.

I keep hearing that prompt engineering is the solution — that if I'd just prompted better, the outputs would have been better. Maybe. But I spent 20 minutes crafting detailed prompts with examples and constraints, and the tools still invented fake programs and ignored the brand voice I explicitly provided. At some point, the prompt refinement becomes its own time sink.

The weird part is I'll definitely use AI on client projects again. Just not the way I did this time. Not as a production tool. More like a thinking partner who occasionally lies to me and needs constant supervision.

Which raises a question I genuinely don't have an answer to: if the supervision takes this much effort, when does the efficiency actually kick in? Or does it only work for people who care less about the output quality than I do?

Heads up: Some links in this post may be affiliate links. I only recommend tools I've personally tested. Opinions are entirely my own.
