Compliance Chiefs Face New Challenge: Evaluating AI Agents That Promise to Replace Analysts
Financial institutions rushing to deploy AI agents for anti-money laundering and compliance work face a thorny question: how do you evaluate software that's supposed to think like a human analyst, but operates in ways even its creators can't fully explain?
That's the central challenge Peter Piatetsky, cofounder and CEO of Castellum.AI, addressed in a January 21st interview on the Fintech Business Podcast. As agentic AI—software that can autonomously perform multi-step tasks—becomes "one of the hottest topics in fintech, banking and compliance," according to host Jason Mikula, compliance leaders are grappling with how to assess these systems under existing regulatory frameworks designed for simpler automation.
The conversation highlighted a fundamental tension in financial services' AI adoption. Regulators are signaling openness to AI agents in AML and know-your-customer workflows, but they're also demanding the kind of documentation and governance controls that don't map neatly onto systems that learn and adapt. For CFOs overseeing compliance budgets, this creates a peculiar bind: the technology promises efficiency gains, but the regulatory overhead of proving it works safely may eat those savings.
Piatetsky's firm is among a wave of vendors pitching AI agents specifically for financial crime compliance—a labor-intensive function where banks and fintechs employ armies of analysts to review transactions, investigate alerts, and file suspicious activity reports. The appeal is obvious: if an agent can handle the grunt work of case investigation, compliance teams could theoretically shrink headcount or reallocate staff to higher-value work. But the "if" is doing heavy lifting.
The discussion touched on what Piatetsky called the need for "examiner-ready controls"—documentation that can satisfy bank examiners who show up asking how a decision was made. This is where agentic AI gets weird. Traditional compliance software follows rules: if transaction amount exceeds X and involves country Y, flag it. An agent, by contrast, might review a customer's transaction history, cross-reference public records, assess the plausibility of their stated business purpose, and make a judgment call—all without a human in the loop. Explaining that process to a regulator in a way that doesn't sound like "the AI just decided" is the governance challenge.
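The rules-versus-judgment contrast is easiest to see in miniature. The sketch below shows what traditional rule-based screening looks like; the threshold, country codes, and function name are illustrative assumptions, not any vendor's actual logic:

```python
# Toy rule-based screening: deterministic, and easy to explain to an examiner.
# The threshold and country list are illustrative assumptions only.
HIGH_RISK_COUNTRIES = {"IR", "KP", "SY"}
AMOUNT_THRESHOLD = 10_000

def flag_transaction(amount: float, country: str) -> bool:
    """Flag if the amount exceeds the threshold AND involves a listed country."""
    return amount > AMOUNT_THRESHOLD and country in HIGH_RISK_COUNTRIES

print(flag_transaction(15_000, "IR"))  # True: both conditions met
print(flag_transaction(15_000, "US"))  # False: country not on the list
```

Every output of a system like this traces back to a fixed, inspectable rule, which is exactly the documentation story an agent's open-ended judgment call lacks.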
The conversation also covered model governance, a term that makes compliance officers twitch. Banks already struggle to document and validate traditional machine learning models under frameworks like SR 11-7. Agents introduce new wrinkles: they can invoke other models, chain together reasoning steps, and potentially behave differently in production than in testing. Piatetsky discussed how Castellum.AI approaches this problem, though the podcast format left the technical details somewhat opaque.
For finance leaders, the subtext is clear: agentic AI in compliance isn't a simple "buy and deploy" proposition. It requires rethinking how compliance programs are structured, how performance is measured, and how accountability flows when something goes wrong. If an agent misses a money laundering scheme, who's responsible—the vendor, the bank's compliance officer, or the executive who approved the budget?
The interview suggested that agents will "reshape compliance programs over the coming years," but offered little clarity on the timeline or what that reshaping actually looks like. That ambiguity may be the most honest take available. The technology is moving faster than the regulatory guidance, and faster than most institutions' ability to absorb it. CFOs funding these initiatives are essentially buying options on future efficiency, with the strike price denominated in governance complexity.
What's certain is that compliance leaders can't ignore the trend. As Mikula noted, agentic AI is already the conversation in fintech and banking circles. The question isn't whether to evaluate these tools, but how to do so without either falling for vendor hype or missing a genuine shift in how financial crime work gets done.