
Why the factory floor is the hardest place to put AI

Written by Jani Puroranta | Jun 16, 2026

 

Industrial AI isn't behind because it's late. It's behind because it's harder.
Generative AI went from curiosity to boardroom mandate in about two years. Industrial AI has been "almost ready" for a decade. If you run a production operation, you have sat through more pilot reviews than you can count and watched most of them quietly disappear. So have I: I have spent close to a decade working with predictive algorithms for heavy industrial processes, and I have watched plenty of my own work meet that fate.

It is tempting to read that as a technology gap. It isn't. The algorithms have not been the constraint for years. The reason AI conquered the inbox before the plant floor is that the two problems only look similar. Underneath, they are almost opposites.


The appetite is here; the data isn't

Start with the appetite, because it is happening now. In KPMG's 2026 global survey of industrial manufacturers, a whopping 68% of executives expect to be deploying AI at scale within the next 12 months. But the same survey holds the catch: 76% still name unreliable data as a top AI risk. The industry is about to spend heavily on AI while admitting the data underneath it cannot be trusted. 


By now “it’s the data, not the algorithm” has become almost received wisdom. Gecko Robotics’ founder Jake Loosararian says it plainly, the recent crop of failed-pilot studies says it, and most of the big consultancy reports say it too. They are right. But the diagnosis usually stops there, at the symptom. The more useful question is why industrial data is uniquely hard to get right in the first place, because that is what you actually need to fix. The next few reasons are my answer to that question.


Being wrong is physical

Here is the first reason the plant floor is different. Generative AI fails gracefully: a wrong sentence gets deleted. Industrial AI fails physically: a wrong setpoint produces a contaminated batch, a production line that fails hygiene tests, or a line stoppage that eats a whole shift.


And the payback is dangerously asymmetric. A good model might shave 5-15% off losses on a normal run, and do it reliably 99% of the time. But the 1% it gets wrong is not a 5% miss. It can be 1000x the size of a single good run's gain, because one bad call on a high-value tank wipes out the slow accumulation of small wins. The economics of industrial AI are not set by the average case. They are set by the worst case, and a model that does not respect that asymmetry will lose money while looking like it is working.
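A few lines of arithmetic make the asymmetry concrete. The sketch below is illustrative only: it normalizes the gain from one good run to 1 unit, takes the 99% success rate and the 1000x worst case from the figures above at face value, and uses function names made up for this example.

```python
def expected_value_per_run(gain, loss, failure_rate):
    """Average payoff per run: earn `gain` on a good run, pay `loss` on a bad one."""
    return (1 - failure_rate) * gain - failure_rate * loss


def break_even_failure_rate(gain, loss):
    """Failure rate at which the expected value per run is exactly zero."""
    return gain / (gain + loss)


gain = 1.0           # assumed unit: the gain from one good run
loss = 1000 * gain   # the "1000x" bad call, taken as the worst case
failure_rate = 0.01  # wrong 1% of the time

ev = expected_value_per_run(gain, loss, failure_rate)
print(f"Expected value per run: {ev:.2f} units")
print(f"Break-even failure rate: {break_even_failure_rate(gain, loss):.4%}")
```

Under these assumptions the model loses about 9 units per run on average, even though 99 runs out of 100 look like wins. It would have to be wrong less than about once in a thousand runs just to break even.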


Nothing transfers between plants

The second reason is that every plant is its own problem. The English language is the same everywhere, which is why one language model serves hundreds of millions of people. Production is not. The same line in two plants behaves differently because of water chemistry, pipe geometry, equipment age, automation quirks and plant-specific interdependencies. A model trained in one plant often gets noticeably worse in the next. This is the quiet killer behind the "AI doesn't scale in manufacturing" complaint, and it is a fair one.


Truth is expensive and slow

The third reason is one that few who have not lived inside the plant would appreciate: ground truth is expensive and slow. A language model checks itself against the internet for free. To know whether your process model is right, you take a sample, send it to a lab, and wait. You spend the first year of an industrial AI project building not the model but a dataset of hundreds of individual measurements.


And a trap waits even after you have the model. The model has to run on the cheap, indirect signals the process actually offers (a temperature, a flow rate, a conductivity reading), so that is what you build it to use. The lab data is not what the model runs on; it is the ground truth you train and check it against, the reference that tells you whether the model is pulling the real answer out of those poor signals or just fooling you. Without that ground truth you cannot know if the model is doing a good job. The intelligence was never the bottleneck. The missing piece was always a process signal rich enough that a model could read the truth from it in real time.
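The split between what the model runs on and what it is checked against can be pictured with a small sketch. It is a minimal illustration under invented assumptions, not a description of any particular system: the timestamps, the values and the function name are all hypothetical.

```python
from statistics import mean

# Hypothetical live predictions, keyed by timestamp (minutes into the run).
# In practice these would come from a model fed with cheap process signals.
predictions = {0: 4.1, 10: 4.3, 20: 4.8, 30: 5.6, 40: 5.9}

# Hypothetical lab results: sparse, and available for only a few timestamps.
lab_results = {10: 4.2, 30: 5.1}


def validate_against_lab(predictions, lab_results):
    """Return absolute errors at the timestamps where a lab value exists."""
    return {
        t: abs(predictions[t] - truth)
        for t, truth in lab_results.items()
        if t in predictions
    }


errors = validate_against_lab(predictions, lab_results)
print("Lab-checked timestamps:", sorted(errors))
print("Mean absolute error:", round(mean(errors.values()), 3))
print("Worst error:", round(max(errors.values()), 3))
```

Only two of the five predictions can be checked at all, which is the point: the model produces an answer every few minutes, but the evidence that those answers are right arrives one lab sample at a time.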


Industrial AI has not lagged because the science was behind. It has lagged because being wrong is physical, models don't transfer between plants, ground truth is expensive and slow, and live data has been too thin to trust. None of that is solved by a bigger model.


What's changing now

What has changed is that the missing pieces are arriving: better real-time measurement, and a more disciplined way of scoping what you ask AI to do. That is the subject of my next piece. For now, one takeaway for anyone about to approve an industrial AI budget: do not ask a vendor how good their model is. Ask what is feeding it, and what happens the 1% of the time it is wrong. If they lead with the algorithm, be skeptical.

 

Blog series: Getting Industrial AI Right Is Hard by Jani Puroranta

Jani Puroranta is CEO of Collo, a deep tech company building Industrial AI and real-time sensing that enables liquid food and beverage producers to see and eliminate product, water, and energy losses as they happen. He has been working with physics-informed machine learning and industrial processes since 2017, well before the current AI wave.

 

 

Subject of my next piece → What good industrial AI actually looks like?