I spent months building two versions of the same AI agent system. One was an integrated design — a single agent that researches, decides, acts, and adjusts. The other was a pipeline — one component generates signals, another evaluates them, a third executes. The pipeline approach felt cleaner, more "engineered." It was also significantly worse.
The pipeline promise
The idea behind the pipeline was appealing. Separate signal generation from execution. Let one model do the thinking and another do the doing. You get modularity, testability, the ability to swap components. It's how we build most software systems, and it makes intuitive sense for AI too.
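To make the shape concrete, here is a minimal sketch of the decomposition I mean. The component names, fields, and thresholds are hypothetical stand-ins for whatever your stages actually do; the point is that the only thing crossing each boundary is a thin signal object.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    """The only thing that crosses a stage boundary: a thin summary."""
    name: str
    action: str        # "act" or "hold"
    confidence: float  # one number standing in for all the reasoning behind it

def generate_signals(observations: list[dict]) -> list[Signal]:
    # Stage 1: one model turns raw observations into signals (toy logic here).
    return [
        Signal(o["name"], "act" if o["score"] > 0.5 else "hold", o["score"])
        for o in observations
    ]

def evaluate(signals: list[Signal]) -> list[Signal]:
    # Stage 2: a second model filters or re-scores the signals.
    return [s for s in signals if s.confidence > 0.7]

def execute(signals: list[Signal]) -> None:
    # Stage 3: the executor acts on whatever survives. It never sees the
    # reasoning that produced the signal, only the signal itself.
    for s in signals:
        print(f"executing {s.action} on {s.name}")

if __name__ == "__main__":
    execute(evaluate(generate_signals([{"name": "A", "score": 0.9}])))
```

Each stage is easy to test on its own, which is exactly what makes the design so tempting.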
Early results looked promising. The pipeline generated signals in volume, and for about two weeks it significantly outperformed the integrated approach.
The collapse
Then it fell apart. Over the following months, performance degraded steadily until the pipeline's edge had almost entirely evaporated. The same system that looked brilliant in week two was barely functional by month five.
The integrated agent, meanwhile, remained consistent. Not always spectacular, but stable. It adapted to changing conditions in a way the pipeline couldn't.
Why integrated agents work better
After analyzing what went wrong, I think the core issue is context loss. When you separate the thinking from the doing, you strip away the nuance that informed the decision in the first place. The signal generator knows why it made a recommendation — the subtle factors, the uncertainty, the edge cases. The executor only sees the output: a clean signal that says "act" or "don't act."
The integrated agent, by contrast, carries its full reasoning context through the entire process. When conditions change, it can re-evaluate based on its original analysis. It remembers what it was uncertain about. It knows which assumptions were weak. This contextual memory turns out to be critical for adapting over time.
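One way to picture the difference is in what data survives each step. The sketch below uses hypothetical field names and toy logic: the decision keeps its reasoning, uncertainties, and assumptions attached, so a later re-evaluation can check them directly instead of working from a bare "act" or "hold" label.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """What an integrated agent carries forward, in contrast to a bare signal."""
    action: str                                              # "act" or "hold"
    reasoning: str                                           # why the agent chose this
    uncertainties: list[str] = field(default_factory=list)   # what it was unsure about
    assumptions: list[str] = field(default_factory=list)     # what the decision rests on

def reevaluate(decision: Decision, broken_assumptions: set[str]) -> Decision:
    # Because the original assumptions travel with the decision, the agent can
    # ask a precise question when conditions change: is anything this decision
    # rests on no longer true? A downstream executor has nothing to ask it of.
    if any(a in broken_assumptions for a in decision.assumptions):
        return Decision(
            action="hold",
            reasoning=f"re-evaluated: an assumption behind '{decision.reasoning}' no longer holds",
            uncertainties=decision.uncertainties,
            assumptions=decision.assumptions,
        )
    return decision
```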
There's an analogy to human expertise here. A good doctor doesn't separate diagnosis from treatment — their treatment decisions are informed by the nuances of the diagnosis process itself. A good engineer doesn't write a spec and hand it to someone who wasn't in the room for the design discussion. Context matters, and it's hard to serialize.
The lesson for AI system design
I think the instinct to decompose AI systems into clean pipelines comes from traditional software engineering, where it's usually the right approach. But AI agents aren't traditional software components. Their value comes from the richness of their reasoning, and that reasoning doesn't survive compression into a signal.
If you're building a system where an AI agent needs to make decisions and adapt over time, keep the agent integrated. Give it ownership of the full loop — research, decision, action, monitoring, adjustment. The messiness of an integrated system is a feature, not a bug. The context it preserves is what makes it work.
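Structurally, that looks less like a chain of stages and more like one loop that never lets go of its own state. Here is a rough sketch, with hypothetical method names and placeholder internals; the research and decision details are whatever your agent actually does.

```python
class IntegratedAgent:
    """One agent owns the full loop: research, decision, action, monitoring,
    adjustment. The reasoning behind every decision stays in memory instead of
    being compressed into a signal and handed off."""

    def __init__(self) -> None:
        # Full reasoning history: each entry records what was decided and why.
        self.history: list[dict] = []

    def step(self, observation: dict) -> None:
        findings = self.research(observation)
        decision = self.decide(findings)   # reasoning stays attached to the decision
        self.act(decision)
        self.history.append(decision)

    def monitor(self, observation: dict) -> None:
        # Adjustment re-reads past reasoning rather than a bare "act"/"hold".
        for decision in self.history:
            if self.assumptions_broken(decision, observation):
                self.adjust(decision)

    # Placeholder internals so the sketch runs; replace with real model calls.
    def research(self, observation: dict) -> dict:
        return {"observation": observation}

    def decide(self, findings: dict) -> dict:
        return {"action": "hold", "reasoning": "placeholder", "assumptions": []}

    def act(self, decision: dict) -> None:
        print(f"acting: {decision['action']}")

    def assumptions_broken(self, decision: dict, observation: dict) -> bool:
        return any(a in observation.get("invalidated", []) for a in decision["assumptions"])

    def adjust(self, decision: dict) -> None:
        decision["action"] = "hold"
```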
Sometimes the less "engineered" solution is the better one.