Why AI Gets Big Numbers Wrong, and How to Fix It

Last time I told you to clean up your pile before you ask the machine anything. This is what you do when the pile is too big to clean, when even the tidy version is more than any model can hold in its head at once.

Start with the thing nobody wants to admit. A language model is a magnificent talker and a hopeless accountant. People keep handing it a spreadsheet with a million rows, asking it to add up a column, and then acting betrayed when the number comes back wrong. Of course it came back wrong.

Here is why. A model has a fixed amount of room to work in. When your data does not fit, it quietly starts grabbing handfuls instead of the whole thing, and it does not tell you it is doing it. And even when the data does fit, it is not reading the way a database reads. It is skimming and paraphrasing as it goes. Ask it for an exact total across a million rows and you will get a number that looks right and is not. It is brilliant at understanding what your data means. It is terrible at counting it. Counting, at scale, is a job for code. Full stop.
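To make "a job for code" concrete, here is a minimal sketch of the counting half, assuming the data is a CSV file with a header row. The function name `sum_column` and the choice to skip and count bad cells rather than stop are my assumptions for illustration, not a prescription. It reads one row at a time, so the size of the file does not matter, and it uses `Decimal` so that cents do not drift the way they can with floats.

```python
import csv
from decimal import Decimal, InvalidOperation


def sum_column(path, column):
    """Stream a CSV row by row; return (exact total, rows counted, rows skipped)."""
    total = Decimal("0")
    counted = 0
    skipped = 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            raw = (row.get(column) or "").strip()
            try:
                value = Decimal(raw)
            except InvalidOperation:
                # Empty cell, or a word typed into a number field.
                skipped += 1
                continue
            if not value.is_finite():
                # Decimal accepts "NaN" and "Infinity"; neither belongs in a total.
                skipped += 1
                continue
            total += value
            counted += 1
    return total, counted, skipped


# Hypothetical usage; the file name and column name are placeholders.
# total, counted, skipped = sum_column("orders.csv", "amount")
```

The skipped count is the part worth looking at. A total that silently ignores bad rows is only a little better than a total the model made up, so the sketch reports how many rows it could not use. A misspelled column name shows up the same way: every row skipped.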

So split the work, because the request hiding inside "analyze this and give me a report" is really two jobs wearing one coat. The first is to understand the data. What is in it, how the pieces connect, where the bodies are buried. That takes judgment, and it needs to see the data. The second is to operate on the data. Filter it, add it up, produce the answer. That takes no judgment at all, only precision. Jam the two together and you get the confident, beautifully written, completely wrong report. Pull them apart and hand each one to something built to do it.

I run it as two workspaces. The first holds the real files, full size, and has exactly one job: write the code that runs against them and produces the answer. The accuracy lives in that code, not in the model squinting at a spreadsheet. The second workspace is a mirror of the first, with one change. The files have the same names and the same shape, but the contents are tiny fakes, a few kilobytes instead of however many gigabytes, built to carry the real structure and, more to the point, the real ugliness. Because they are small, this one can actually read all of it, plus a map of every field. Its job is not to compute. Its job is to think. You tell it what you are trying to learn, it looks at the whole picture, and it hands you the precise instruction to carry over to the one that does the counting.
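One way the mirror could be built, sketched under some loud assumptions: the real files are CSVs, the two workspaces are separate directories, and masking every digit as 9 and every letter as A is fake enough for your purposes. The names `build_mirror`, `mask`, and `max_rows` are mine, invented for the example. It keeps one row per distinct row shape, so three date formats in the real file become three rows in the twin.

```python
import csv
from pathlib import Path


def mask(value):
    """Hide the content but keep the form: digits become 9, letters become A."""
    return "".join(
        "9" if c.isdigit() else "A" if c.isalpha() else c for c in value
    )


def build_mirror(real_dir, fake_dir, max_rows=50):
    """Write a tiny masked twin of every CSV under real_dir, same relative path."""
    real_dir, fake_dir = Path(real_dir), Path(fake_dir)
    written = []
    for src in sorted(real_dir.rglob("*.csv")):
        dst = fake_dir / src.relative_to(real_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        seen = set()  # row shapes already written to the twin
        with open(src, newline="", encoding="utf-8") as fin, \
                open(dst, "w", newline="", encoding="utf-8") as fout:
            reader = csv.reader(fin)
            writer = csv.writer(fout)
            header = next(reader, None)
            if header is not None:
                writer.writerow(header)  # column names are kept verbatim
                for row in reader:
                    masked = tuple(mask(cell) for cell in row)
                    if masked in seen:
                        continue  # this shape is already represented
                    seen.add(masked)
                    writer.writerow(masked)
                    if len(seen) >= max_rows:
                        break
        written.append(dst)
    return written


# Hypothetical usage; both directory names are placeholders.
# build_mirror("workspace_real", "workspace_mirror")
```

Two caveats. The cap means the scan stops once it has collected `max_rows` shapes, so a column with wildly varying lengths can fill the twin before the rare ugliness further down ever appears. And masking keeps lengths and punctuation, which may still be more than your compliance people want to see leave the building, so treat this as a starting point and not as anonymization.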

There is a real gift in here if your data is the kind that lawyers worry about. The open-ended poking around, the part where you are thinking out loud and trying things, happens against the fakes, not against your actual customer records. Be honest about the limit. The real data still lives in the other workspace, and the code still runs against it, so it has not magically disappeared. But the messy, exploratory half of the job no longer requires you to pour regulated information into a chat window, and for anyone in finance or healthcare, that separation alone pays for the whole arrangement.

The entire thing lives or dies on those little fake files. The thinker can only plan for what it can see. Feed it neat, well-behaved sample rows and it will hand you a plan that explodes the instant the real data shows its teeth: the empty cells, the three different date formats, the duplicate keys, the column where somebody typed a word into a number field. So build the fakes to carry the ugly on purpose. Generate them from the real data, not by hand, so they stay honest. A good map of the fields is worth more than the sample rows anyway.
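Here is a rough sketch of what a map of the fields might look like when it is generated from the real data instead of written by hand. Again the assumptions are mine: CSV input, a single key column, and the hypothetical names `field_map` and `shape`. For every column it counts the shapes it finds, where a shape is the value with digits turned into 9 and letters into A. An empty shape is an empty cell, two date shapes are two date formats, and a run of A in a number column is the word somebody typed there.

```python
import csv
from collections import Counter


def shape(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # punctuation and spaces are kept as they are
    return "".join(out)


def field_map(path, key_column):
    """Stream the real file once and describe every column by the shapes in it."""
    shapes = {}       # column name -> Counter of shapes seen in that column
    keys = Counter()  # key value -> number of rows carrying it
    rows = 0
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            rows += 1
            keys[row.get(key_column) or ""] += 1
            for name in reader.fieldnames:
                shapes.setdefault(name, Counter())[shape(row.get(name) or "")] += 1
    return {
        "rows": rows,
        # Number of distinct key values that appear on more than one row.
        "duplicate_keys": sum(1 for n in keys.values() if n > 1),
        "columns": {name: dict(counts) for name, counts in shapes.items()},
    }


# Hypothetical usage; the file name and key column are placeholders.
# summary = field_map("orders.csv", "id")
```

The result holds counts and shapes, not values, which is what lets it travel to the thinking workspace. One honest limitation: the sketch keeps every key in memory to spot duplicates, which is fine for millions of rows and may not be for billions.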

It is the same lesson as last time, just one floor down. When the data fits, you clean it once and ask away. When it does not, you shrink the part that needs a brain until the brain can see it, and you hand the part that needs a calculator to a calculator. Either way the rule is the one your grandmother could have told you. Use the right tool for the job. Do not ask the poet to do your taxes, and do not ask the accountant to write the eulogy. Give the model less to think about, and let it think.
