By Anahita Bilimoria, Decision Lab Innovation Practice Lead
An essential framework for Responsible AI Deployment
Together, the principles of AI TRiSM (Trust, Risk, and Security Management) add transparency, understandability, and reliability to our AI systems.
Continuing from our previous blog on AI TRiSM, Building Trust in the Age of Artificial Intelligence, where we took a holistic view of the three core pillars, this post dives deeper into the first of them. Welcome to part two of our AI TRiSM series: Trust in AI systems.
The Foundation of Adoption: Building Trust in AI
AI has integrated into our daily lives in unprecedented ways, from asking Gemini or ChatGPT to summarise reports to using tools like Google’s NotebookLM for learning. However, the reliability of the answers these systems give often remains uncertain. While we confidently ask a language model for code, broad trust in AI systems is still a major concern.
Trust: An assured reliance on the character, ability, strength, or truth of someone or something
Merriam-Webster Dictionary
In the context of AI, a more fitting definition may be: the attitude that an AI agent will help achieve an individual’s goals in a situation characterised by uncertainty and vulnerability. Clearly, building trust goes beyond a simple belief in a system’s capabilities; AI chatbots can sound remarkably confident even when they are wrong. Rather, we must establish calibrated trust: confidence that an AI system will behave reliably, ethically, and securely, delivering intended outcomes while its limitations are understood. For the human in the system, this confidence operates on two crucial dimensions:
- Cognitive Trust: based on evidence, competence, and reliability. Confirmed by performance metrics such as accuracy, loss, and F1 score, ensuring the system works as expected. Answers the question: can I trust it?
- Emotional Trust: based on comfort, security, and ethical alignment. The confidence that the system aligns with moral or societal values and will not discriminate against users. Answers the question: do I want to trust it?
The goal of AI TRiSM is to foster a balanced level of trust across these dimensions, steering clear of algorithm aversion (distrusting a competent system) and automation complacency (over-relying on a flawed system). Trust in AI is not a single feature but the culmination of several measurable and governable qualities. The AI TRiSM framework tackles these factors by providing actionable strategies:
1. Explainability and Transparency (XAI)
The defining challenge for trust is the ‘Black Box’ nature of modern, complex AI models. If a system’s decision cannot be understood or audited, it fundamentally cannot be trusted.
Explainable AI (XAI) addresses this by providing insight into how and why a model reached a specific output. This is essential not just for a user’s peace of mind, but for auditing, compliance, and legal accountability.
How it can be achieved:
- Employing Inherently Interpretable Models: Using simpler models (like linear regression or decision trees) when complexity isn’t strictly necessary, or inherently white-box approaches such as Neurosymbolic AI (NeSy) and causal ML, which expose what the model has learned as explicit knowledge graphs and causal structures.
- Providing Decision Tracing: Logging all input features and intermediate steps that lead to an outcome, allowing stakeholders to trace a decision back to its source.
- Applying Post-Hoc Explanations: Using methods like SHAP (SHapley Additive exPlanations) or LIME (Local Interpretable Model-agnostic Explanations) on complex models to generate human-readable justifications for individual predictions, as in the sketch below.
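To make this concrete, below is a minimal, hypothetical sketch of a post-hoc SHAP explanation in Python. The synthetic data and random-forest model are purely illustrative, and the `shap` and scikit-learn packages are assumed to be installed:

```python
# Illustrative post-hoc explanation with SHAP; data and model are synthetic.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                                # four synthetic features
y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=500)  # known relationship

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer attributes each individual prediction to the input features.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])                   # shape: (5 samples, 4 features)

for i, contributions in enumerate(shap_values):
    print(f"sample {i}: feature contributions = {np.round(contributions, 3)}")
```

Each contribution shows how strongly a feature pushed that individual prediction above or below the model’s average output, which is the kind of human-readable justification described above.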
2. Fairness and Bias Mitigation
AI models are trained on data, and that data reflects historical and societal biases. Consequently, models are prone to inheriting these biases, leading to discriminatory or unfair outcomes. This directly breaks emotional trust. Building trust requires active and continuous steps to ensure fairness.
How it can be achieved:
- Pre-Processing Bias Mitigation: Conducting thorough Exploratory Data Analysis (EDA) to identify and correct data imbalances before model training (e.g., re-sampling minority classes).
- In-Processing Bias Mitigation: Implementing constraints during the training process that penalise the model for differential performance across demographic groups.
- Post-Deployment Fairness Monitoring: Defining and tracking multiple fairness metrics (such as equal opportunity or demographic parity) after deployment to ensure equitable outcomes across protected groups, going beyond simple overall accuracy; a small sketch of such monitoring follows this list.
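As a concrete illustration, the sketch below computes two common fairness metrics with plain NumPy. The predictions, labels, and protected-group assignments are invented for the example; in practice they would come from your monitoring pipeline:

```python
# Illustrative fairness monitoring for binary predictions and a binary protected attribute.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])  # ground-truth outcomes (synthetic)
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])  # model predictions (synthetic)
group  = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # protected attribute (synthetic)

def demographic_parity_diff(y_pred, group):
    """Absolute difference in positive-prediction rates between the two groups."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    """Absolute difference in true-positive rates between the two groups."""
    tpr_0 = y_pred[(group == 0) & (y_true == 1)].mean()
    tpr_1 = y_pred[(group == 1) & (y_true == 1)].mean()
    return abs(tpr_0 - tpr_1)

print("demographic parity difference:", demographic_parity_diff(y_pred, group))
print("equal opportunity difference: ", equal_opportunity_diff(y_true, y_pred, group))
```

Values near zero indicate similar treatment of the two groups; what counts as an acceptable gap is a policy decision rather than something the code can settle.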
3. Reliability, Robustness, and Safety
A trusted system must be dependable. Reliability is its ability to perform consistently and accurately under normal operating conditions. Robustness ensures the model’s accuracy is maintained even when facing slight variations or unexpected inputs. The final layer is Safety, which protects against catastrophic failure.
How it can be achieved:
- Continuous Model Operations (ModelOps): Implementing automated systems to monitor model performance in real time, catching model drift (performance degrading over time as data or conditions change) after deployment; the drift-check sketch after this list shows one simple approach.
- Stress Testing and Adversarial Training: Rigorously testing the model with malicious inputs and unexpected data shifts to improve robustness against adversarial attacks.
- Human-in-the-Loop Controls: Equipping the system with safeguards like “kill switches” and defined pathways for human intervention, so that an autonomous system can be overridden or stopped when it faces an unsafe or ambiguous situation.
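One simple, illustrative way to catch input drift is to compare the live distribution of a feature against the distribution seen at training time. The sketch below uses a two-sample Kolmogorov-Smirnov test from SciPy; the synthetic data, the single-feature scope, and the 0.01 alert threshold are assumptions made for the example:

```python
# Illustrative drift check comparing a live feature sample against its training-time reference.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)  # feature values seen at training time
live = rng.normal(loc=0.4, scale=1.0, size=1_000)       # recent production values (shifted)

statistic, p_value = ks_2samp(reference, live)

# A small p-value suggests the live distribution has drifted away from the reference;
# the 0.01 threshold here is an illustrative choice, not a universal rule.
if p_value < 0.01:
    print(f"Drift alert: KS statistic = {statistic:.3f}, p-value = {p_value:.3g}")
else:
    print("No significant drift detected.")
```

In a production ModelOps setup this kind of check would typically run per feature on a schedule and feed an alerting system, alongside direct monitoring of prediction quality wherever ground truth becomes available.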
4. Privacy and Data Protection
In the age of vast data collection, a user’s willingness to use an AI solution hinges on the assurance that their sensitive information will be protected. Trust is lost if data is compromised, misused, or leaked. Adhering to regulations like GDPR and CCPA is a baseline.
How it can be achieved:
- Secure-by-Design Principles: Building and maintaining AI solutions that incorporate techniques such as anonymisation and data minimisation (collecting only the minimum necessary data) from the outset; a minimal sketch follows this list.
- Privacy-Enhancing Technologies (PETs): Utilising techniques such as federated learning (training models on decentralised data without centralising it), often alongside cryptographic methods, to protect sensitive information during training and inference.
- Access Control and Security Audits: Implementing strict access controls and regular security audits for data pipelines and model APIs to ensure compliance and prevent unauthorised data access.
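As a small illustration of data minimisation and pseudonymisation at the point of ingestion, the sketch below strips a record down to the features a model actually needs and replaces the direct identifier with a salted one-way hash. The field names, salt handling, and `user_key` join key are all hypothetical:

```python
# Illustrative data minimisation and pseudonymisation before data enters a training pipeline.
import hashlib

RAW_RECORD = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "postcode": "SW1A 1AA",
    "age": 34,
    "purchase_total": 129.50,
}

FEATURES_NEEDED = {"age", "purchase_total"}  # only the fields the model actually uses
SALT = "rotate-me-regularly"                 # illustrative; manage via a secrets store in practice

def pseudonymise(value: str) -> str:
    """Replace a direct identifier with a salted one-way hash."""
    return hashlib.sha256((SALT + value).encode()).hexdigest()[:16]

def minimise(record: dict) -> dict:
    """Keep only required features plus a pseudonymous join key."""
    cleaned = {key: value for key, value in record.items() if key in FEATURES_NEEDED}
    cleaned["user_key"] = pseudonymise(record["email"])
    return cleaned

print(minimise(RAW_RECORD))
```

Note that salted hashing is pseudonymisation rather than full anonymisation: re-identification risk still needs to be assessed, and the salt should be managed like any other secret.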
The Value of Proactive Trust Management
The journey of AI adoption is paved with the potential for misuse and technical failure. A single, high-profile failure, such as a biased recommendation, a security breach, or a dangerous hallucination, can instantly erode years of trust-building effort.
By embracing the Trust component of AI TRiSM, organisations can move from reactive damage control to proactive trust management. They can operationalise these ethical and performance requirements, embedding them into the entire solution lifecycle.
Investing in these principles is an investment in the long-term viability of AI, ensuring that as systems become more autonomous and integrated into our lives, they remain aligned with our values, transparent in their operations, and secure with our data. This is how we build systems that don’t just perform a task reliably, but adapt and improve in a volatile world. This concept is particularly critical in complex, high-stakes environments like supply chain management, where disruption is a constant threat.
For a deeper exploration of how these principles are applied to create systems that gain from disorder, see our recent white paper: Beyond Resilience: Engineering the Anti-Fragile Pharma Supply Chain of 2030.
Author: Anahita Bilimoria, Decision Lab Innovation Practice Lead.
To keep up with this AI TRiSM series from Decision Lab, follow us on LinkedIn!

