Deloitte Repays AU$97,587 Final Tranche After AI Errors in AU$440k Welfare Review
Global consulting firm Deloitte has agreed to repay the final instalment of a 440,000 Australian dollar contract to the Department of Employment and Workplace Relations after errors in a welfare compliance review were linked to the use of generative artificial intelligence tools. The incident highlights emerging challenges in integrating AI into sensitive government work, where unchecked outputs risk eroding accuracy and accountability in public policy processes.
Deloitte used Microsoft's Azure OpenAI system, including the GPT-4o model, to assist in drafting sections and cross-referencing business rules with IT code. While the firm did not directly attribute the flaws to AI, the errors bore the hallmarks of generative-model hallucinations, prompting a revised submission and a partial refund.
Origins of the Contract and Report
The project arose from ongoing reforms to Australia's welfare payment system, building on the 2023 Robodebt Royal Commission, which exposed systemic flaws in automated debt recovery that affected hundreds of thousands of people and triggered substantial compensation. In late 2024, the department commissioned Deloitte for an assurance review of its Targeted Compliance Framework, an IT-based mechanism designed to automate penalties for potential overpayments while applying lessons from past issues.
Spanning late 2024 to June 2025, the work produced a 237-page document examining the alignment between software operations and legal and ethical standards. Titled "Targeted Compliance Framework Assurance Review Final Report", it aimed to safeguard against Robodebt-style harms. The contract, valued at 440,000 Australian dollars or about 290,000 US dollars, underscored the government's dependence on specialist consultants for tech audits amid fiscal constraints.
The engagement mirrored an industry trend in which firms like Deloitte are accelerating AI adoption to streamline delivery. Deloitte has committed 3 billion US dollars globally to generative technologies by 2030, including partnerships with entities such as Anthropic.
Discovery of AI-Linked Errors
Issues surfaced in late August when Chris Rudge, a University of Sydney researcher focused on health and welfare law, examined the report, which was published on the department's website in July, with formal publication noted in mid-August. Rudge identified a reference to a nonexistent book purportedly by his colleague Professor Lisa Burton Crawford, whose expertise lies in constitutional law, not the cited topic.
A deeper review uncovered about a dozen references to nonexistent works, including those attributed to Crawford, plus fabricated items linked to Swedish professor Björn Regnell. Additional flaws included a misquoted excerpt from a Federal Court judgment and a misspelling of the judge's name, given as "Justice Davis" in place of Justice Jennifer Davies.
Rudge shared his findings with the Australian Financial Review, which published an exposé around August 30, amplifying scrutiny. Such hallucinations, common in tools like GPT-4o, arise when models invent plausible details to bridge data gaps, posing risks in analytical tasks that blend language and technical elements.
Deloitte's Swift but Limited Recourse
Deloitte responded promptly, submitting a revised report by September 26, which the department uploaded to its site. Edits removed erroneous citations, corrected the judgment summary, and improved clarity without changing core conclusions.
A new preface disclosed the use of Azure OpenAI to help cross-reference business rules and code for traceability. "The matter has been resolved with the client," a Deloitte spokesperson noted, affirming that the substantive content held firm.
In early October, the department stated Deloitte would repay the final instalment, later disclosed as 97,587 Australian dollars after processing. Greens Senator Barbara Pocock, chairing a Senate committee on corporate accountability, deemed it inadequate and urged a full refund. "Deloitte misused AI and used it very inappropriately: misquoted a judge, used references that are nonexistent," she said on ABC radio. "I mean, the kinds of things that a first-year university student would be in deep trouble for."
The department affirmed that the review's substance stands, with key recommendations on enhancing human oversight in automation and conducting regular bias audits remaining unchanged.
Ripples Through Consulting and Regulation
This case represents an early instance of AI-related refunds in Australian public sector consulting, challenging trust in the major firms that secure such deals. Deloitte, posting 70.5 billion US dollars in global revenue for fiscal year 2025 and employing over 457,000 staff, has emphasised AI integration to boost efficiency.
Yet broader surveys reveal caution: a 2025 McKinsey Global Survey indicates value-driven AI trends but persistent concerns over inaccuracies and transparency. For the department, the event may slow compliance enhancements, potentially leaving welfare users exposed to imperfect systems for longer.
It also spurs demands for AI disclosure mandates in procurement, akin to international responses like US court penalties for unverified AI in legal filings.
Looking Ahead: Governance Over Hype
As AI proliferates, analysts anticipate a pivot toward structured safeguards. Surveys show varying adoption: a 2025 Pacific AI report notes 59 percent of organisations have dedicated AI governance roles, though comprehensive risk practices lag, with fewer than 20 percent fully implementing tools like model cards, according to a Data Exchange study.
In Australia, this could drive tender reforms, such as verification requirements and nondisclosure penalties, extending the 2023 AI ethics framework from the Department of Industry, Science and Resources. Observers see the mishap as a procedural lapse rather than a technology failure: models like GPT-4o shine in fluency but require robust human validation, particularly in regulated arenas.
Trends suggest hybrid approaches, with AI handling routine duties and experts ensuring integrity, possibly lifting costs by 10 to 15 percent for the added checks. The department is pursuing ongoing implementation of the advice, mindful of technology's dual edge in public service reform.
