A hallucination is a sensory perception (such as a visual image or a sound) that occurs in the absence of an actual external stimulus and usually arises from a neurological or psychiatric disturbance (such as schizophrenia, Parkinson’s disease, or narcolepsy) or in response to drugs (such as LSD or phencyclidine).
In the context of Artificial Intelligence (AI), an AI hallucination is a plausible but false or misleading response generated by an AI algorithm. It is a phenomenon in which an AI system, particularly a language model, produces output that appears accurate and convincing but is in fact incorrect, invented by the model, and not grounded in real data or factual evidence.
Hallucinations arise when AI systems, especially generative models, produce confident but misleading outputs because of limitations or poisoning of their source or training data, a lack of grounding in real-world context, or flaws in probabilistic reasoning. Left unchecked, AI hallucination errors can have real-world consequences, ranging from legal sanctions to faulty funds transfers and reputational damage.
AI hallucination presents a growing challenge for organizations leveraging AI in high-stakes domains such as healthcare, legal and judicial systems, financial services, travel, and defense. The risk is especially acute in the financial services industry, where AI-based systems are entrusted with banking and financial transactions, including funds transfers, trade execution, deposit management, customer servicing, and inputs to regulatory decisions. In this space, hallucinations can result in large-scale mismanagement of funds, customer complaints, regulatory and reputational damage, and other severe consequences.
A. About This Article
This article offers Financial Technology Frontiers (FTF)’s analysis of AI hallucination from a control implementer’s perspective and addresses the risk by leveraging the NIST AI Risk Management Framework (NIST AI RMF). It maps applicable controls from the NIST AI RMF Core onto the practical tasks, documentation requirements, and governance roles required to manage AI hallucination. It also provides a ready-to-use “Responsible, Accountable, Consulted and Informed (RACI)” chart that financial institutions can adopt and tailor to their requirements.
This article is written in an accessible format to raise awareness of the perils of AI. FTF believes that financial service providers, fintech entities, consulting firms, and technology companies can benefit from this article and reflect on their respective approaches to AI risk management. Practitioners from other industries are welcome to adapt it to suit their context.
B. Real-World Impact of AI Hallucination – Case Study
A June 2023 legal case in the United States is a well-known illustration of AI hallucination. Lawyers representing the plaintiff in a routine personal injury lawsuit against the airline Avianca used ChatGPT to research and draft legal arguments. The court filing included made-up cases generated by ChatGPT. While the citations looked plausible and professionally drawn up, the cases did not exist; they were products of the AI’s pattern generation, not actual jurisprudence. These fabricated citations led to sanctions: the court fined the lawyers and their firm and ordered them to notify their client and the judges who had been falsely identified as authors of the invented opinions.
Impact of the Incident:
- This incident made headlines, drawing attention to the dangers of accepting AI-generated content in high-stakes professional settings, such as legal practice, without proper scrutiny and review.
- The case prompted urgent warnings across the legal and judicial communities, with law firms and bar associations quickly issuing guidance on the responsible use of AI tools in legal practice.
- Legal professionals worldwide were forced to implement stricter processes for verifying AI-generated research and for explicitly disclosing the use of AI in court filings.
- This incident is now cited in governance and compliance guidelines as a cautionary tale for uncontrolled AI adoption in the legal and regulatory professions.
C. Addressing AI Hallucination using NIST AI RMF
This incident illustrates the need for essential controls to:
- Scrutinize and validate the outputs of AI systems.
- Establish and follow clear documentation of the “Do’s and Don’ts” and limitations of AI systems.
- Ensure continuous monitoring and explicit human oversight of AI systems.
These controls are well described in the NIST AI RMF. The Framework highlights the mandatory, continuous need to critically review the outputs of AI systems rather than trust them blindly, and the Framework’s Core provides robust controls to identify, assess, and address hallucination risks. These include:
- MAP 2.2: Document the AI system’s knowledge limits and define the necessary human oversight and escalation paths.
- MEASURE 2.4 and 2.5: Continuously monitor, validate, and benchmark AI outputs, focusing on the cases and contexts that are prone to error.
- MEASURE 2.9: Operationalize explainability and contextual annotation of outputs, supported by traceability and critical review.
- MANAGE 2.4 and 4.3: Institute mechanisms to disengage or deactivate AI systems that do not meet usage objectives, and document and communicate incident response, user reporting, and remediation procedures.
When operationalized, these controls make AI hallucinations both visible and manageable and help minimize the associated risks.
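To make this concrete, the sketch below shows one way such controls might be operationalized in code: a minimal output-validation gate that checks AI-generated claims against a trusted source and escalates anything it cannot verify to a human reviewer. This is a simplified illustration, not part of the NIST AI RMF itself; the `VERIFIED_REFERENCES` set, the `ReviewItem` record, and the escalation step are hypothetical stand-ins for an institution’s own systems of record, audit logs, and review workflows.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical trusted source of record. In practice this would be a query
# against an authoritative database (e.g., a citation index or a core-banking
# ledger), not an in-memory set.
VERIFIED_REFERENCES = {"Smith v. Jones, 123 F.3d 456 (2001)"}


@dataclass
class ReviewItem:
    """One AI-generated claim and its validation result (MEASURE 2.4 / 2.5)."""
    claim: str
    verified: bool
    # Timestamp kept for traceability and later audit (MEASURE 2.9, MANAGE 4.3).
    checked_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def validate_ai_output(claims: list[str]) -> list[ReviewItem]:
    """Check every AI-generated claim against the trusted source."""
    return [ReviewItem(claim=c, verified=c in VERIFIED_REFERENCES) for c in claims]


def escalate_unverified(items: list[ReviewItem]) -> list[ReviewItem]:
    """Route unverified claims to explicit human review (MAP 2.2 escalation path)."""
    flagged = [item for item in items if not item.verified]
    for item in flagged:
        # In a real deployment this would open a ticket or block the workflow.
        print(f"[ESCALATE] Human review required for: {item.claim}")
    return flagged


if __name__ == "__main__":
    draft_citations = [
        "Smith v. Jones, 123 F.3d 456 (2001)",         # present in the trusted source
        "Varghese v. China Southern Airlines (2019)",   # plausible-looking but unverifiable
    ]
    escalate_unverified(validate_ai_output(draft_citations))
```

The key design choice in this sketch is that the gate fails closed: an output that cannot be traced to an authoritative source is never accepted silently; it is logged and routed to a human reviewer, which is precisely the discipline missing in the case study above.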
D. Actors for Managing AI Hallucinations
Several roles across a financial institution carry key responsibilities for managing AI hallucination in the organization’s AI systems. These roles are summarized below, along with an indicative RACI designation; a machine-readable sketch of the assignments follows the list.
Roles that are predominantly Accountable:
- Chief Information Officer (CIO) (Accountable): Ultimately ensures risk–aligned technology governance and integrity for AI systems.
- Chief Risk Officer (CRO) (Accountable): Directs enterprise risk priorities and ensures execution of mitigation mechanisms for managing AI risks.
- Chief Audit Officer (CAO) (Accountable): Independently audits and verifies the effectiveness of AI risk controls implemented by the 1st and 2nd Lines of Defense.
- Business Line Owner (Accountable): Owns process integrity in AI–enabled workflows for business impact and user safety.
Roles that are predominantly Responsible:
- Chief Technology Officer (CTO) (Responsible): Leads the implementation and technical operation of AI system controls.
- Chief Information Security Officer (CISO) (Responsible): Manages security risks and incident response for all AI–related outputs.
- Head of Compliance (Responsible): Develops, enforces, and oversees compliance policies and incident management processes for AI systems usage in the organization.
- AI / ML Product Owner (Responsible): Translates risk and business requirements into AI product controls and features.
- AI System Engineers & Developers (Responsible): Build, monitor, and enhance technical AI risk and hallucination mitigation controls.
- End-User Operations Leaders (Responsible): Oversee front-line monitoring, staff training, and incident escalation for AI operations.
Roles that are predominantly Consulted or Informed:
- Chief Privacy Officer (CPO) (Consulted): Advises on privacy and data protection requirements in AI operations.
- Chief Legal Officer (CLO) (Consulted): Advises on legal, regulatory and other federal / provincial / state requirements in AI operations.
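As a starting point for the RACI chart mentioned in Section A, the indicative assignments above can also be captured in a simple machine-readable form that risk or GRC tooling can consume. The structure below is only a sketch: the role names and designations are taken from this section, while the mapping itself is an assumption; institutions would extend it with concrete activities (for example, output validation, incident response, model sign-off) and adjust assignments to their own operating model.

```python
# Indicative RACI designations from Section D, expressed as a simple
# role -> predominant designation mapping. Illustrative only; tailor the
# roles, designations, and activity breakdown to your institution.
RACI_FOR_AI_HALLUCINATION = {
    "Chief Information Officer (CIO)": "Accountable",
    "Chief Risk Officer (CRO)": "Accountable",
    "Chief Audit Officer (CAO)": "Accountable",
    "Business Line Owner": "Accountable",
    "Chief Technology Officer (CTO)": "Responsible",
    "Chief Information Security Officer (CISO)": "Responsible",
    "Head of Compliance": "Responsible",
    "AI / ML Product Owner": "Responsible",
    "AI System Engineers & Developers": "Responsible",
    "End-User Operations Leaders": "Responsible",
    "Chief Privacy Officer (CPO)": "Consulted",
    "Chief Legal Officer (CLO)": "Consulted",
}
```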
E. About Financial Technology Frontiers
Financial Technology Frontiers (FTF) is a global media–led fintech platform dedicated to building and nurturing innovation ecosystems. We bring together thought leaders, financial institutions, fintech disruptors, and technology pioneers to drive meaningful change in the financial services industry.
Authored By: Narasimham Nittala.
All sources are hyperlinked.