How to Evaluate an AI Consulting Firm
Almost everyone now offers “AI services.” The hard part is telling who can actually ship a system that works and keeps working. Here are the criteria that separate firms that can from firms that can't, and how to check them yourself before you sign anything.
A demo is not a deliverable
The gap between an impressive demo and a system your team relies on is where most AI projects die. Evaluating a firm well means probing for the parts that don't show up in a sales deck: real data, verifiable answers, what happens after launch, and whether the firm will tell you “don't build this” when that's the honest answer.
Use the criteria below as a checklist. For each one, we've noted how we'd answer it — partly as an example of what a straight answer looks like, partly because you can go verify ours.
Six things to check
Can they show working systems?
Anyone can describe what AI could do. Ask to use something they've actually shipped — not a recorded demo, a live system you can interact with.
How we answer: We keep several live demos you can try right now, no sales call required.
Do they run what they build?
A system that's handed over and abandoned tends to rot. Ask who operates, monitors, and maintains it after launch, and what that costs.
How we answer: We run our systems as a managed service by default — hosting, monitoring, and maintenance — with a clean handoff available if you'd rather own it.
Do the answers show their evidence?
For anything high-stakes, an AI answer you can't verify is a liability. Ask how a user checks that an answer is right.
How we answer: Every answer traces to its source — a document passage, a log entry, a past case — with the requirement built into the architecture.
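To make "built into the architecture" concrete, here is a minimal sketch of one way such a requirement could be enforced: the answer type itself refuses to exist without a source. This is an illustration under assumptions, not a description of any particular production system. The names `Source`, `CitedAnswer`, and `render` are hypothetical and chosen only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One piece of evidence behind an answer (hypothetical shape)."""
    kind: str      # e.g. "document", "log", "case"
    locator: str   # hypothetical pointer such as a passage ID or log reference
    excerpt: str   # the exact text the answer relies on


@dataclass(frozen=True)
class CitedAnswer:
    """An answer that cannot be constructed without evidence."""
    text: str
    sources: tuple[Source, ...]

    def __post_init__(self) -> None:
        # Reject answers that give the user nothing to check them against.
        if not self.sources:
            raise ValueError("an answer must cite at least one source")
        for source in self.sources:
            if not source.excerpt.strip():
                raise ValueError("every source needs a non-empty excerpt")


def render(answer: CitedAnswer) -> str:
    """Format the answer followed by a numbered list of its sources."""
    lines = [answer.text, ""]
    for i, source in enumerate(answer.sources, start=1):
        lines.append(f"[{i}] {source.kind} {source.locator}: {source.excerpt}")
    return "\n".join(lines)
```

The design point is that the check lives in the data structure rather than in a prompt or a style guide, so an uncited answer fails loudly instead of reaching a user. When you evaluate a firm, ask where the equivalent check lives in their system.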
Will they tell you not to build?
A firm that recommends a custom build for every problem is selling, not advising. Ask when they'd tell you to buy off-the-shelf or do nothing.
How we answer: Discovery includes a build-versus-buy analysis, and we'll recommend buy or wait when that's the honest call.
Do they scope narrow?
Multi-quarter programs with nothing to show are how budgets disappear. Ask how quickly you'll see a working system and how small the first commitment is.
How we answer: We scope one workflow first, typically a 4–8 week build, so you see something real in weeks and can proceed one reversible step at a time.
Do they leave your team capable?
If only the vendor understands the system, you're locked in. Ask what documentation and training you get, regardless of who operates it.
How we answer: Every engagement includes documentation, runbooks, and training so your team understands what it's using — no black boxes.
Check our work, not our claims
The best way to evaluate a firm is to use something it built. These are live — try to break them.
USGA Rules Expert
A cited document assistant for the Rules of Golf. Ask it something tricky and check the citation.
Compliance Copilot
Policy and procedure review with sources and cross-document findings.
Furuno Diagnostics
A diagnostic copilot over equipment telemetry that shows its evidence for every finding.
Solution Brief Builder
Describe a business problem and watch it shape a scoped solution brief in real time.
Frequently asked questions
How do I choose an AI consulting firm?
Start from working evidence, not promises. Use a system the firm has shipped, ask who operates it after launch, and check whether its answers can be verified. Favor firms that scope narrow, give an honest build-versus-buy view, and leave your team able to understand the result. A firm confident in its work will let you test it before you commit.
What questions should I ask an AI consultant?
Ask to use a live system they built; ask who hosts and maintains it after launch and what that costs; ask how a user verifies an answer is correct; ask when they'd recommend buying off-the-shelf instead of building; ask how quickly you'll see a working result and how small the first commitment is; and ask what documentation and training your team gets.
What are red flags when hiring an AI consultant?
Watch for demos you can't actually try, no clear answer on who operates the system after launch, answers that can't be traced to a source, a recommendation to build custom for every problem, multi-quarter timelines with nothing usable early, and a setup where only the vendor understands the system. Unverifiable statistics in a cold pitch are another warning sign.
Should an AI consultant also operate the system after building it?
Often yes. AI systems need monitoring, model and prompt maintenance, and quality checks to keep working, and that responsibility frequently falls through the cracks after a build-and-handoff. Having the firm run the system as a managed service keeps that responsibility with the people who built it. The alternative, a clean handoff with documentation and training, is reasonable if your team has its own AI operations capability.
How much does AI consulting cost?
It varies with scope, but a sound structure keeps each step small. A fixed-fee discovery (typically 2–3 weeks) tells you whether to proceed before you commit to a build. A fixed-scope first build follows, and managed operations then run as an ongoing monthly cost. Be wary of large open-ended engagements with no working deliverable along the way.
How do I know if an AI vendor's demo is real?
Ask to use it yourself, live, with your own inputs rather than the scripted example. Push it toward edge cases. For anything that makes a claim, check whether it shows a verifiable source. A real system holds up to poking; a staged one tends not to.
Putting us through the checklist?
Ask us anything on this list. We'd rather answer the hard questions now than oversell and disappoint later.
Start a conversation