The most common reason AI automation projects fail isn't a technology problem — it's a process problem that was there before any AI was introduced. Teams invest in LLM integration, build the tooling, and then discover that the workflow they automated was already broken in ways that a language model can't fix and may amplify.
This is the framework we use before any automation engagement. It's designed to answer one question: is this process actually ready for AI augmentation?
Start with the human-executed version
Before touching any technical layer, document how the process currently works when humans do it. Not the official version in the SOPs — the actual version, including workarounds, exceptions, and the informal knowledge that exists only in the heads of the people doing the work.
Shadow sessions are the most effective method here. Sit with the people executing the process and map what they actually do. The gap between documented and actual processes is almost always significant, and in that gap lives most of the complexity that will cause your automation to fail.
Questions worth answering during this phase:
- What percentage of cases go through the standard path vs. requiring manual judgement?
- Where does the process depend on institutional knowledge that isn't written anywhere?
- What happens when something goes wrong, and how does the team recover?
- How does output quality vary, and what causes the variance?
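The first question above can be answered with a simple tally from shadow-session notes. The following is a minimal sketch, not a prescribed tool: the `ObservedCase` record, the path labels, and the sample observations are all hypothetical and only illustrate the idea of counting what actually happened rather than what the SOP says should happen.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ObservedCase:
    """One case watched during a shadow session (hypothetical structure)."""
    case_id: str
    path: str       # "standard" or "manual_judgement"; labels are illustrative
    note: str = ""  # free-text reason the case left the standard path


def path_shares(cases):
    """Return the fraction of observed cases that followed each path."""
    if not cases:
        return {}
    counts = Counter(case.path for case in cases)
    total = len(cases)
    return {path: count / total for path, count in counts.items()}


# Made-up observations, for illustration only.
observations = [
    ObservedCase("A-1", "standard"),
    ObservedCase("A-2", "manual_judgement", "customer on a legacy contract"),
    ObservedCase("A-3", "standard"),
    ObservedCase("A-4", "manual_judgement", "missing reference, phoned the supplier"),
    ObservedCase("A-5", "standard"),
]

shares = path_shares(observations)
for path, share in shares.items():
    print(f"{path}: {share:.0%}")
```

The notes attached to the non-standard cases are usually more useful than the percentages themselves, because they point at the institutional knowledge and recovery steps that the remaining questions ask about.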
The data question
Most LLM automation requires structured data inputs or the ability to ground the model's responses in reliable data sources. Before any design work, audit the data the process depends on.
The questions that matter most are about quality and availability, not volume. A small, clean dataset with reliable labeling is more useful than a large corpus with inconsistent formatting and missing fields. We've seen teams with millions of records realize during an audit that the field they planned to use as a primary signal is populated correctly for fewer than 40% of records.
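A field-level fill-rate check is one way to surface this kind of problem early. The sketch below assumes records are available as plain dictionaries; the field names, the placeholder list, and the 0.8 threshold are assumptions made for illustration. Note that it only measures whether a field is populated, not whether the value is correct, which still needs a manual sample review.

```python
# Values treated as "not really populated". This list is an assumption;
# extend it with whatever placeholders appear in your own data.
PLACEHOLDERS = {"", "n/a", "na", "unknown", "tbd"}


def is_filled(value):
    """True if the value looks like real data rather than a blank or placeholder."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip().lower() not in PLACEHOLDERS
    return True


def field_fill_rates(records, fields):
    """Fraction of records in which each field holds a usable value."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(is_filled(record.get(field)) for record in records) / total
        for field in fields
    }


def fields_below(rates, threshold):
    """Names of fields whose fill rate is under the threshold, sorted."""
    return sorted(field for field, rate in rates.items() if rate < threshold)


# Made-up sample records, for illustration only.
records = [
    {"ticket_id": 1, "category": "billing"},
    {"ticket_id": 2, "category": ""},
    {"ticket_id": 3, "category": None},
    {"ticket_id": 4, "category": "N/A"},
    {"ticket_id": 5, "category": "shipping"},
]

rates = field_fill_rates(records, ["ticket_id", "category"])
weak_fields = fields_below(rates, threshold=0.8)  # threshold is illustrative
print(rates, weak_fields)
```

Running this against the field you plan to use as a primary signal, before any design work, is cheap compared with discovering the gap after the pipeline is built.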
Classifying processes by automation viability
Not all processes are equally good candidates for LLM automation. The ones that tend to work well share several characteristics: they have relatively bounded input types, output quality can be evaluated at scale, errors are recoverable without serious downstream consequences, and the process doesn't need a response within a second or two.
Processes that are poor candidates include anything where errors are high-stakes or irreversible unless a human reviews the output first, anything where judgement depends on context that can't be captured in the prompt or retrieved from a data source, and anything where the definition of "correct" output varies significantly by stakeholder.
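The characteristics above can be written down as an explicit checklist so that every process is assessed against the same questions. This is a sketch under stated assumptions: the `ProcessProfile` fields are our paraphrase of the criteria, the example processes are invented, and the all-or-nothing rule is a simplification. A real audit weighs these factors rather than treating each as a hard gate.

```python
from dataclasses import dataclass


@dataclass
class ProcessProfile:
    """Yes/no answers for one process (hypothetical structure)."""
    name: str
    bounded_inputs: bool
    output_evaluable_at_scale: bool
    errors_recoverable: bool            # or a human reviews output before it takes effect
    tolerates_latency: bool             # no response needed within a second or two
    context_capturable: bool            # context fits in a prompt or a retrievable source
    stakeholders_agree_on_correct: bool


def blockers(profile):
    """List the criteria this process fails, in a fixed order."""
    checks = {
        "input types are not bounded": profile.bounded_inputs,
        "output quality cannot be evaluated at scale": profile.output_evaluable_at_scale,
        "errors are high-stakes or irreversible": profile.errors_recoverable,
        "needs near-instant responses": profile.tolerates_latency,
        "depends on context that cannot be captured": profile.context_capturable,
        "stakeholders disagree on what correct means": profile.stakeholders_agree_on_correct,
    }
    return [reason for reason, passed in checks.items() if not passed]


def classify(profile):
    """Return ("candidate" | "not ready", list of blockers)."""
    found = blockers(profile)
    return ("candidate" if not found else "not ready", found)


# Invented examples, for illustration only.
invoice_intake = ProcessProfile("invoice intake", True, True, True, True, True, True)
contract_review = ProcessProfile("contract review", True, True, False, True, True, False)

for process in (invoice_intake, contract_review):
    verdict, reasons = classify(process)
    print(process.name, verdict, reasons)
```

The list of failed criteria matters more than the verdict: it tells you what would have to change before the process becomes a reasonable candidate.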
The goal of a process audit isn't to find reasons not to automate. It's to understand what you're actually automating before you build it.
Output from a readiness audit
A useful process audit produces a prioritized map of automation candidates with honest assessments of readiness, a list of blockers that need to be resolved before automation makes sense, and a recommendation for where to start if you decide to proceed. It should also include a recommendation to not proceed if the evidence points that way.
The audits we've found most valuable for clients were the ones where we said "this process isn't ready, here's why, and here's what needs to change." That's not a comfortable conversation, but it's more valuable than six months of development that proves the same point at much higher cost.
A note on scope management
Automation scope tends to expand during development. A project that starts as "automate the intake process" becomes "and then automate the routing" and then "and then automate the response drafting." Each addition is locally reasonable, but the cumulative complexity often exceeds what the underlying process quality can support.
Setting a firm scope boundary before development starts — and having a framework for evaluating scope additions against process readiness criteria — tends to produce better outcomes than iterating without those guardrails.
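One way to make that framework concrete is to route every proposed scope addition through the same small set of questions. The sketch below is illustrative only: the `ScopeRequest` structure, the decision strings, and the example steps are assumptions, and the rule it encodes (no addition without an audit and a clear blocker list) is one possible policy, not the only sensible one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScopeRequest:
    """A proposed addition to the agreed automation scope (hypothetical structure)."""
    step: str
    audited: bool                                  # has this step been through a readiness audit?
    open_blockers: List[str] = field(default_factory=list)


def evaluate_scope_addition(in_scope, request):
    """Decide what to do with a proposed addition, given the agreed scope."""
    if request.step in in_scope:
        return "already in scope"
    if not request.audited:
        return "defer: audit this step first"
    if request.open_blockers:
        return "defer: resolve blockers first"
    return "accept"


# Scope agreed before development started (invented example).
in_scope = {"intake"}

requests = [
    ScopeRequest("intake", audited=True),
    ScopeRequest("routing", audited=False),
    ScopeRequest("response drafting", audited=True,
                 open_blockers=["no agreed definition of a good reply"]),
    ScopeRequest("tagging", audited=True),
]

decisions = {r.step: evaluate_scope_addition(in_scope, r) for r in requests}
for step, decision in decisions.items():
    print(f"{step}: {decision}")
```

Keeping the check this explicit means each "and then automate..." request gets a recorded answer tied to process readiness, rather than being absorbed because it seemed locally reasonable.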