Your AI Isn't Lying. Your Data Is a Mess.

Why building one clean dataset, once, fixes the lies and the bill at the same time.


I did the dumbest possible thing with these tools for the better part of a year, and odds are you are doing it right now, so let us skip the part where either of us pretends to be above it.

You have a real question, and the answer is scattered across a pile of files. Forty of them. Fifty. A heap of spreadsheets and documents nobody has organized since the company was founded. So you take the entire mess, drop it on the machine, and start firing questions. What were the top accounts last quarter? Break it out by region. What changed since last year? And every single time you ask, the poor thing goes back to the top of the pile and reads all fifty files again, from the beginning, like it has never seen them before. Because it hasn't. It does not keep notes. It does not remember this morning. Every question, it starts the whole job over from zero.

You are paying for that. Twice. Once in actual money, because these tools generally charge you, or ration you, by how much text they read, and rereading the phone book on every question burns through more of your budget than you think. And once in trust, because anything forced to hold a hundred pounds of unsorted paper in its head at once is going to drop a page. It will misread a number. It will glue together two files that never belonged within a mile of each other. It will hand you a total it more or less invented, and it will do it with the calm, unbothered confidence of a man who has no idea he is wrong. They always sound sure. That is the trap.

And when the answer comes back bad, here is what everybody does. They give it more. More files, more context, more detail, as if the problem were starvation. It is the exact wrong instinct.

Give it less.

Do the boring thing instead, and do it once. Hand the machine the whole ugly pile and give it a single job, the only one that matters: turn this into one clean set. One spreadsheet. Consistent columns. The garbage thrown out, the thing trimmed down to what you actually need. Then save it, like an adult, somewhere you can find it again.
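
If the pile happens to be CSV exports, the job you are handing over looks roughly like the sketch below. It is a minimal illustration, not a recipe: the column names, the alias table, and the function names are all made up for the example, and a real pile will need its own rules for what counts as garbage.

```python
import csv
from pathlib import Path

# Hypothetical mapping from messy header variants to the clean names we want.
COLUMN_ALIASES = {
    "account": "account",
    "account name": "account",
    "customer": "account",
    "region": "region",
    "territory": "region",
    "revenue": "revenue",
    "amount": "revenue",
}
CLEAN_COLUMNS = ["account", "region", "revenue"]  # assumed target layout


def normalize_row(raw_row):
    """Map one messy row onto the clean columns, or return None to discard it."""
    clean = {}
    for header, value in raw_row.items():
        if header is None or value is None:
            continue  # ragged row: extra or missing cells
        name = COLUMN_ALIASES.get(header.strip().lower())
        if name is not None:
            clean[name] = value.strip()
    # Throw out rows that are missing anything we actually need.
    if not all(clean.get(col) for col in CLEAN_COLUMNS):
        return None
    try:
        clean["revenue"] = float(clean["revenue"].replace(",", "").replace("$", ""))
    except ValueError:
        return None  # revenue was not a number
    return clean


def build_clean_set(source_dir, output_path):
    """Read every CSV in source_dir once and write one consolidated file."""
    rows = []
    for path in sorted(Path(source_dir).glob("*.csv")):
        with open(path, newline="", encoding="utf-8") as handle:
            for raw_row in csv.DictReader(handle):
                clean = normalize_row(raw_row)
                if clean is not None:
                    rows.append(clean)
    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=CLEAN_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Write the output somewhere outside the source folder, so the next rebuild does not sweep the old clean file back into the pile.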

Now start over with a clean slate. You do not drag the fifty files back out. You bring the one clean thing you just built, and you ask your real questions against that, and only that.

Watch what happens. The machine is not digging through the pile anymore. It is looking at one tidy thing, all of it, at once. The lies mostly stop, because there is nowhere left to get lost. The bill drops, because fifty became one. And the answers come back fast, and they come back the same way twice, which is the part everyone forgets to want. You paid for the hard reading exactly once. The rest of the world pays it again on every question they ask, all day, and calls it the cost of doing business.
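
To put rough numbers on "fifty became one," here is a back-of-the-envelope sketch. Every figure in it is invented for illustration, and it assumes the bill tracks how much text the machine reads, counted in tokens. It ignores the cost of the answers themselves.

```python
def tokens_read(files_per_question, tokens_per_file, questions):
    """Total input tokens read when every question rereads all of its files."""
    return files_per_question * tokens_per_file * questions


# All numbers below are made up for the example.
PILE_FILES = 50
TOKENS_PER_RAW_FILE = 4_000
CLEAN_SET_TOKENS = 6_000
QUESTIONS = 20

# The dumb way: all fifty files, reread on every question.
reread_everything = tokens_read(PILE_FILES, TOKENS_PER_RAW_FILE, QUESTIONS)

# The boring way: read the pile once to build the clean set,
# then read only the clean set on each question.
one_time_build = tokens_read(PILE_FILES, TOKENS_PER_RAW_FILE, 1)
ask_clean_set = tokens_read(1, CLEAN_SET_TOKENS, QUESTIONS)
clean_workflow = one_time_build + ask_clean_set

print(reread_everything)  # 4000000
print(clean_workflow)     # 320000
```

With these made-up figures the pile costs more than twelve times as much to interrogate, and the gap widens with every extra question.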

A few things, since we are being honest. Check the clean set before you trust it, because the machine cuts corners too, and it is your name on whatever you send upstairs. Tell it exactly how you want the thing laid out instead of leaving it to guess. And when the underlying mess changes, build the clean set again from scratch. Do not patch the old one. Patches rot.
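
Checking the clean set does not have to be elaborate. The sketch below is one way to do it, assuming the clean file is a CSV with the hypothetical columns `account`, `region`, and `revenue`, and that you have one total you verified by hand to compare against. The function name and the checks themselves are illustrative; pick the ones that match what you would be embarrassed to get wrong.

```python
import csv

EXPECTED_COLUMNS = ["account", "region", "revenue"]  # hypothetical layout


def check_clean_set(path, expected_total=None, tolerance=0.01):
    """Return a list of problems found in the clean file. Empty means none found."""
    problems = []
    seen = set()
    total = 0.0
    revenue_index = EXPECTED_COLUMNS.index("revenue")
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # The layout should be exactly what you asked for.
        if reader.fieldnames != EXPECTED_COLUMNS:
            return [f"unexpected columns: {reader.fieldnames}"]
        for number, row in enumerate(reader, start=1):
            values = tuple((row.get(col) or "").strip() for col in EXPECTED_COLUMNS)
            if not all(values):
                problems.append(f"data row {number}: blank cell")
                continue
            if values in seen:
                problems.append(f"data row {number}: duplicate of an earlier row")
            seen.add(values)
            try:
                total += float(values[revenue_index])
            except ValueError:
                problems.append(f"data row {number}: revenue is not a number")
    # Compare against a figure you trust, if you have one.
    if expected_total is not None and abs(total - expected_total) > tolerance:
        problems.append(
            f"total {total:.2f} does not match expected {expected_total:.2f}"
        )
    return problems
```

An empty list is not proof the set is right, only that it passed the checks you thought to write. Spot-check a few rows against the original files as well.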

You were never short on horsepower. The machine has plenty. You were short on the fifteen unglamorous minutes it takes to set the thing up properly before you start demanding miracles from it. Do the boring part once. Then ask away.

Coherive Consulting Group