The next wave in finance is foundation models that can actually invest, not just talk about markets.
Agentic finance today is about foundation models quietly becoming core infrastructure for how capital decisions are researched, prepared, and executed. The competitive question is shifting from who can hire the most analysts to who can best encode judgment, workflows, and proprietary data into software. Only a handful of tightly constrained agents are trusted to act directly with capital, but the way information moves through firms is already changing.
In other words, finance is beginning to standardize on a two‑layer architecture: a general‑model base that nearly everyone can access, and a vertical layer where each firm encodes its own rules, risk appetite, and data. The base layer is made of frontier models that can read, reason, and call tools across emails, filings, terminals, and spreadsheets. The vertical layer sits on top and defines how a particular fund underwrites credit, runs a strategy, or satisfies regulators, turning tacit institutional know‑how into explicit, repeatable workflows.
Agentic AI in Finance: The Shift from Chatbots to Auquan’s Vertical Workflows
Anthropic’s Claude Skills show how this works in practice. A user provides a credit agreement, borrower financials, and a memo request; the system infers the checklist and structure that a credit team would use and runs through it end to end. The model is not just chatting; it executes steps that map to actual underwriting work, from extracting key terms to assembling an investment committee document. Claude for Excel, and competing features in ChatGPT and other products, sit directly inside spreadsheets, following instructions and working through complex problems in the environment where finance teams already live.
In trading, this shift plays out publicly in Nof1’s Alpha Arena, where LLMs crush benchmarks on paper but behave like fragile interns in live markets with real stakes. A cohort of models, including GPT‑5.1, Claude Sonnet 4.5, Qwen3‑Max, DeepSeek, Gemini 3, and Grok‑4.20, each got capital to trade real order books under different “personalities.” Grok‑4.20’s best configuration grew its account meaningfully, while the majority burned through theirs despite looking brilliant on static tests. The lesson is that raw general intelligence is not enough; high‑stakes environments expose the gap between static smarts and real‑world execution, and that gap has to be bridged with domain‑specific agents that encode trusted judgment, constraints, and feedback loops.
If live trading shows how brittle unconstrained intelligence can be, credit shows what adoption looks like when you wrap models in tight, audited workflows.
In credit, the change is less visible than in trading but more important for how institutions actually work. Auquan builds agents for institutional finance teams that sit directly in their workflows. These agents read borrower financials, data‑room documents, and external data; extract key fields, ratios, and covenants into structured formats; apply internal risk frameworks and scoring; draft investment committee memos and portfolio review packs in the firm’s own templates; and monitor covenants and portfolio metrics with alerts and reports. Large institutions use this to cut review times on big credit books and to increase the number of deals an analyst can evaluate without adding staff, effectively turning much of the “grunt work” of underwriting into reliable, auditable software.
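As a rough illustration of the extract, score, and monitor steps described above, the following minimal Python sketch computes two common credit ratios from already-extracted financials and flags covenant breaches. It is not Auquan's implementation: the class names, field names, ratio definitions, and thresholds are all hypothetical, chosen only to show the shape of a covenant check.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """Fields an extraction step might pull from a borrower pack (hypothetical)."""
    total_debt: float
    ebitda: float
    interest_expense: float


@dataclass
class Covenant:
    """Hypothetical covenant: a named ratio must stay on one side of a limit."""
    name: str
    ratio: str    # key into the dict returned by compute_ratios
    limit: float
    kind: str     # "max" (ratio must not exceed limit) or "min" (must not fall below)


def compute_ratios(f: Financials) -> dict[str, float]:
    # Illustrative definitions; a real framework would guard against
    # zero or negative EBITDA and use the firm's own adjustments.
    return {
        "leverage": f.total_debt / f.ebitda,
        "interest_cover": f.ebitda / f.interest_expense,
    }


def check_covenants(f: Financials, covenants: list[Covenant]) -> list[str]:
    """Return one alert string per breached covenant."""
    ratios = compute_ratios(f)
    alerts = []
    for c in covenants:
        value = ratios[c.ratio]
        breached = value > c.limit if c.kind == "max" else value < c.limit
        if breached:
            alerts.append(
                f"{c.name}: {c.ratio} {value:.2f} breaches {c.kind} {c.limit:.2f}"
            )
    return alerts


# Example with made-up numbers.
borrower = Financials(total_debt=500.0, ebitda=100.0, interest_expense=25.0)
covenants = [
    Covenant("Max leverage", "leverage", 4.5, "max"),
    Covenant("Min interest cover", "interest_cover", 3.0, "min"),
]
alerts = check_covenants(borrower, covenants)
```

In this made-up case leverage is 5.0x against a 4.5x cap, so one alert is raised, while interest cover of 4.0x clears its 3.0x floor. The point is only that once fields are extracted into a structured form, the checks become ordinary, auditable code.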
“We’re not trying to replace how credit teams think,” Auquan CEO Chandi Jain said. “We’re trying to turn the work they already do on messy borrower packs into something that runs end to end as software, without forcing them to change their process.” That distinction captures the new edge: the value is not in generic intelligence, but in how well a firm can encode its existing playbooks, standards, and risk culture into systems that can operate at scale.
A clear division of roles is emerging. General foundation models handle reading, reasoning, drafting, and tool‑calling. Vertical tools like Auquan encode specific workflows, templates, risk rules, and audit requirements for credit investors. The model layer looks like an operating system for financial work. The vertical layer looks like applications that implement the way a specific firm underwrites, monitors, and reports on risk. Credit teams are not asking for open‑ended chatbots; they want systems that behave like an experienced associate following their own playbooks and leave a traceable record of what was done and why.
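The division of roles can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any vendor's architecture: `BaseModel` stands in for whatever general model a firm calls, `stub_model` is a placeholder that makes no real API call, and `Step` and `Workflow` are hypothetical names for the firm-specific playbook and its audit trail.

```python
from dataclasses import dataclass, field
from typing import Callable

# Base layer: anything that maps a prompt to text. A real deployment would
# wrap a model API here; this alias is a placeholder.
BaseModel = Callable[[str], str]


@dataclass
class Step:
    """One playbook step (hypothetical): a name plus a firm-specific instruction."""
    name: str
    prompt_template: str  # must contain the {document} placeholder


@dataclass
class Workflow:
    """Vertical layer: a firm's ordered playbook plus a record of what was run."""
    steps: list[Step]
    audit_log: list[dict] = field(default_factory=list)

    def run(self, model: BaseModel, document: str) -> dict[str, str]:
        results: dict[str, str] = {}
        for step in self.steps:
            prompt = step.prompt_template.format(document=document)
            output = model(prompt)
            results[step.name] = output
            # Keep the exact prompt and output so a reviewer can trace each step.
            self.audit_log.append(
                {"step": step.name, "prompt": prompt, "output": output}
            )
        return results


def stub_model(prompt: str) -> str:
    """Stand-in for a general model; echoes part of the prompt."""
    return f"[model output for: {prompt[:40]}]"


workflow = Workflow(
    steps=[
        Step("extract_terms", "List the key terms in: {document}"),
        Step("draft_memo", "Draft an investment committee memo from: {document}"),
    ]
)
results = workflow.run(stub_model, "Credit agreement text")
```

Swapping `stub_model` for a different base model leaves the playbook and the audit trail untouched, which is the sense in which the model layer behaves like an operating system and the vertical layer like an application.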
“Labs are proving you can put a general model at the center of a workflow,” Jain said. “But our job is to encode the judgment and checks that real credit teams already trust, so they can scale without lowering their standards.” Once that judgment is in software, the marginal cost of running another deal, another scenario, or another monitoring pass drops close to zero, and the bottleneck becomes the quality of the encoded logic rather than the number of human hours.
Put together, finance is starting to look like a two‑layer cake. At the base, foundation models plus agents handle extraction, research, modeling, checks, and integration with internal and external systems, often through a chat surface. On top, domain‑specific logic, proprietary data, internal models, and decision rules sit as the differentiating layer. Analysts move from manually building every spreadsheet and deck to running and supervising these workflows, checking edge cases, resolving conflicts, and adjusting frameworks rather than retyping numbers between tools.
Once general models are proven inside high‑stakes, audited workflows in finance, the same pattern carries into other sectors. Healthcare claims, industrial maintenance, legal review, and logistics all center on large volumes of text and numbers, tacit rules, and binding decisions. Each can be restructured into a general model layer plus a vertical workflow layer that encodes how that industry operates, which is why labs are already pushing deeper into application and workflow tooling and partnering with or buying companies already embedded in core processes.
Underneath that, there is a compression and redistribution of financial “intelligence.” Historically, high‑end credit analysis and trading signals were confined to a small set of firms with the budget for specialist teams and custom systems. Now vertical platforms turn underwriting and monitoring into productized workflows, and consumer and prosumer tools give individuals access to agents that can build DCFs, parse filings, and assemble investor‑grade decks. In trading, a few tightly constrained models are being allowed to act directly with capital; in credit, agents are taking over large parts of the document and workflow load.
The advantage shifts from hiring more people to being the firm that best encodes its judgment and proprietary data into systems, then runs them relentlessly.



