AI in Treasury & Finance: Keep the Speed, Keep Control

Q: Is it safe to use ChatGPT or Copilot for accounting or treasury work?

It depends on what goes in and on what relies on the output. Drafting a memo is low risk. Pasting bank statements, forecasts, customer data, or payment details into a public AI tool, or relying on AI-produced figures without reconciling them to source, is a control gap, not a technology question. Set written rules for what data may enter which tools, and validate any output that feeds a decision.

Q: Does our finance team need an AI policy?

If anyone on the team uses AI — including features inside accounting, ERP, or banking software — then yes, a short one. A useful finance AI policy fits on a page: which tools are approved, what data may never leave the company, which outputs need review before use, and who owns exceptions. Generic company-wide AI policies rarely address finance data specifically.

Q: The AI features in our accounting software come from a major vendor. Aren't they already controlled?

Vendor scale is not the same as control evidence. The practical questions are what data the feature processes and retains, whether it can act or only recommend, whether its actions and overrides are logged, and whether outputs trace back to source records. Those answers vary by vendor, by module, and by release — and features are often switched on by default.

Q: Can we rely on AI for cash forecasting or FX exposure analysis?

AI can accelerate the drafting and data-shaping work, but benchmarks of realistic enterprise finance workflows show that even frontier systems fail the majority of end-to-end tasks. Treat AI output as a draft from a capable new hire: useful and fast, but reconciled to source before anything (a hedge, a payment, a board number) relies on it.

Q: AI use is already widespread in our finance team. What should we do first?

Start with visibility, not prohibition. Give staff a safe channel to disclose the AI shortcuts they already use — amnesty first, rules second. Then map what touches material work (cash, payments, reconciliations, FX, reporting), set validation and logging expectations for those workflows, and name an owner. Banning AI without visibility just pushes use further out of sight.

What is actually happening

AI is entering finance faster than the process around it.

None of this is a scandal. It is what adoption looks like when useful tools arrive before workflow rules do. But finance is where the numbers travel: into payments, hedges, covenants, board packs, and filings.

Informal use is already here

Staff use ChatGPT, Copilot, and browser AI to summarize reports, clean exports, draft variance explanations, and reconcile faster. It is invisible by default, and banning it just moves it further out of sight.

Vendor AI arrives switched on

Accounting, ERP, and banking platforms now ship AI features inside licences you already hold. Few teams have reviewed what those features touch, retain, or are permitted to do.

No trail by default

Once an AI-assisted figure is pasted into the workbook, it looks identical to a human one. If nobody flags it, nobody reviews it — and nobody can answer for it later.

Confidence is not accuracy

The failure mode that matters in finance is not a crash. It is a fluent, plausible, wrong number that nobody thought to reconcile back to source.
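
A minimal sketch of what "flag it and reconcile it" can look like in practice. Everything here is an illustrative assumption rather than a prescribed design: the Figure record, its field names, the GL-1200 reference, and the 0.01 tolerance are all hypothetical. The idea is only that an AI-assisted figure carries a flag and a source reference, and is not relied on until it agrees with the source record and a named person has reviewed it.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Figure:
    """A number entering the workbook, with its provenance (hypothetical schema)."""
    label: str
    value: Decimal
    ai_assisted: bool        # set when the figure is pasted in, not afterwards
    source_ref: str          # e.g. a ledger account or statement ID
    reviewed_by: str = ""    # empty until a named person signs off


def reconcile(figure: Figure, source_value: Decimal,
              tolerance: Decimal = Decimal("0.01")) -> bool:
    """True when the figure agrees with the source record within tolerance."""
    return abs(figure.value - source_value) <= tolerance


def may_be_relied_on(figure: Figure, source_value: Decimal) -> bool:
    """Every figure must reconcile; AI-assisted ones also need a reviewer."""
    if not reconcile(figure, source_value):
        return False
    if figure.ai_assisted and not figure.reviewed_by:
        return False
    return True


# Illustrative data: a fluent, plausible figure with two digits transposed.
draft = Figure("EUR exposure", Decimal("1250000.00"),
               ai_assisted=True, source_ref="GL-1200")
ledger_value = Decimal("1205000.00")  # value read from the source record

print(reconcile(draft, ledger_value))         # False
print(may_be_relied_on(draft, ledger_value))  # False
```

The flag matters as much as the check: without it, the figure looks like any other cell and the reconciliation never gets triggered.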

What the evidence says

Useful is not the same as reliable.

AI genuinely accelerates finance work: summarizing, classifying, drafting, comparing files, spotting anomalies. That value is real, and teams that refuse it outright will simply fall behind teams that use it well.

But when researchers test AI systems against realistic, end-to-end enterprise finance workflows — messy spreadsheets, emails, cross-file retrieval, validation, reporting — the results argue for supervision, not blind trust.

In the Finch benchmark of real-world enterprise finance and accounting workflows, the strongest model tested passed only 38.4% of workflows, despite spending an average of 16.8 minutes per task. The tasks have the texture of month-end work: multi-file, multi-step, and judgment-laden.

That is a warning about unattended reliance, not a verdict on usefulness. Model capability will keep improving. What does not change: an AI system cannot be the accountable party for a payment, a hedge, or a filed number.

Source: Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows (arXiv:2512.13168). Figures reviewed July 2026.

Your result

Where your team stands

How this is scored — fixed logic, no AI, documented

Every answer carries a fixed risk weight of 0–3 points, set in advance and identical for everyone. Question 1 is not scored; it sets the materiality of your result (how much of your AI use touches cash, payments, reconciliations, FX, or reporting).

Each dimension’s control score = 100 × (1 − points ÷ maximum points for that dimension).
Overall score = weighted average: data exposure 25%, validation 25%, audit trail 20%, visibility 15%, policy & ownership 15%.
Grades: A ≥ 85 (Controlled) · B ≥ 70 (Managed) · C ≥ 55 (Informal) · D ≥ 40 (Exposed) · E < 40 (Unmanaged).
“Don’t know” answers score at or near maximum risk, deliberately: in finance, not knowing is a finding.
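
The scoring rules above can be sketched in a few lines of Python. The formula, weights, and grade thresholds are taken from the description; the per-dimension point totals and maximums in the example are hypothetical, since the individual questions and their weights are not listed here. The sketch also leaves out Question 1 (materiality) and the handling of "Don't know" answers.

```python
from typing import Dict

# Dimension weights from the scoring description (they sum to 1.0).
WEIGHTS: Dict[str, float] = {
    "data_exposure": 0.25,
    "validation": 0.25,
    "audit_trail": 0.20,
    "visibility": 0.15,
    "policy_ownership": 0.15,
}

# Grade thresholds from the description, checked from highest to lowest.
GRADES = [(85, "A"), (70, "B"), (55, "C"), (40, "D")]


def dimension_score(points: int, max_points: int) -> float:
    """Control score for one dimension: 100 * (1 - points / max_points)."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if not 0 <= points <= max_points:
        raise ValueError("points must be between 0 and max_points")
    return 100.0 * (1.0 - points / max_points)


def overall_score(dimension_scores: Dict[str, float]) -> float:
    """Weighted average of the five dimension scores."""
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


def grade(score: float) -> str:
    """Map an overall score to its letter grade; anything below 40 is E."""
    for threshold, letter in GRADES:
        if score >= threshold:
            return letter
    return "E"


# Hypothetical (risk points, maximum points) per dimension.
example_points = {
    "data_exposure": (3, 6),      # 50
    "validation": (3, 12),        # 75
    "audit_trail": (6, 6),        # 0
    "visibility": (0, 3),         # 100
    "policy_ownership": (3, 6),   # 50
}
scores = {name: dimension_score(p, m) for name, (p, m) in example_points.items()}
total = overall_score(scores)

print(round(total, 2), grade(total))  # 53.75 D
```

In this example the team has full visibility but no audit trail, and the 20% weight on audit trail is enough to pull the result into grade D (Exposed).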

We publish the logic because that is the standard we think any tool touching finance decisions should meet. If a vendor cannot show you how its AI scores, ranks, matches, or decides, raise it with them; the questions below are a starting point.

Vendor due diligence

Ten questions your software vendor should be able to answer.

Before relying on AI features inside accounting, ERP, banking, or treasury systems, ask for evidence — not the brochure. A vendor with real controls will answer these quickly. Hesitation is also an answer.

What data do the AI features use, and where is it processed?
Are prompts, files, and outputs retained — and are they used to train anything?
Can the AI act — post, match, code, pay — or only recommend?
Is every AI action and recommendation logged, and can we export those logs?
Can an AI output be traced back to the source records it drew on?
What permissions does the AI inherit — can it surface data a user could not otherwise see?
Are human overrides of AI suggestions visible in the audit trail?
What has the feature been tested against, and what are its known failure modes?
When it is confidently wrong, whose problem is that — contractually?
Can we switch specific AI features off — per module, per user, per entity?

Where Bastion fits

We review the workflow, not the brochure.

Bastion is independent: not a software vendor, not a reseller, and not paid to recommend anyone’s AI. Our work is finance-workflow judgment — the same discipline we apply to FX, payments, and treasury process.

We use AI ourselves — under controls

Bastion uses AI every week in its own research and publication workflow: drafting under fixed validation rules, independent review, and human sign-off before anything reaches a client. We built these controls for our own work before recommending them to anyone else.

Where AI meets treasury and FX

The highest-stakes AI use in finance is where numbers drive decisions: cash forecasts, FX exposure summaries, payment workflows, reconciliations, reporting. That is exactly the workflow territory Bastion already reviews in its treasury and FX advisory work.

Practical, not prohibitive

The goal is not to ban AI. It is a short set of working rules: where AI is allowed, where it is advisory-only, where it is prohibited, what gets validated, what gets logged, and who owns it — so the team keeps the speed and the company keeps control.
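
One way to keep such rules short and checkable is to write them down as data. The sketch below is an assumption-laden illustration, not a recommended policy: the workflow names, owners, and the choice to treat unlisted workflows as prohibited are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ALLOWED = "allowed"
    ADVISORY_ONLY = "advisory-only"
    PROHIBITED = "prohibited"


@dataclass(frozen=True)
class Rule:
    mode: Mode       # where AI is allowed, advisory-only, or prohibited
    validate: bool   # output must be reconciled to source before use
    log: bool        # AI use must be recorded
    owner: str       # who owns the rule and its exceptions


# Hypothetical one-page policy: workflow names and owners are examples only.
POLICY = {
    "memo_drafting": Rule(Mode.ALLOWED, validate=False, log=False,
                          owner="Finance Manager"),
    "cash_forecast": Rule(Mode.ADVISORY_ONLY, validate=True, log=True,
                          owner="Treasurer"),
    "payment_release": Rule(Mode.PROHIBITED, validate=True, log=True,
                            owner="Controller"),
}

# Assumed default: a workflow nobody has classified gets the strictest rule.
DEFAULT_RULE = Rule(Mode.PROHIBITED, validate=True, log=True, owner="unassigned")


def rule_for(workflow: str) -> Rule:
    """Look up the rule for a workflow, falling back to the strict default."""
    return POLICY.get(workflow, DEFAULT_RULE)


print(rule_for("cash_forecast").mode.value)     # advisory-only
print(rule_for("fx_hedge_execution").mode.value)  # prohibited
```

A table like this is small enough to review in one meeting, and an "unassigned" owner is an immediate prompt to name one.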

Common questions

Asked by finance teams, answered plainly.

Does our finance team need an AI policy?