Trust
The study of whether your level of trust in AI is calibrated to reality, and what happens when it is not.
Trust is not a switch. It is not something you turn on ("I trust AI") or turn off ("I do not trust AI"). Trust is a calibration. It is the alignment between how much you rely on AI output and how reliable that output actually is, in this specific context, for this specific task, with this specific model.
When trust is calibrated, good things happen. You check AI output when it matters and accept it when it does not. You use AI where it is strong and rely on your own judgment where it is not. You make decisions that are informed by AI without being determined by AI. Calibrated trust is the foundation of effective human-AI collaboration.
When trust is miscalibrated, one of three things happens. Over-trust: you accept AI output without verification and make decisions on a foundation you have not examined. Under-trust: you reject accurate AI output because of generalized skepticism and lose the value AI could provide. Identity-threat rejection: you refuse to engage with AI not because it fails but because its existence threatens who you are professionally. All three are calibration failures. All three have consequences. And nobody is systematically tracking which one is happening to you.
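One way to make "alignment between reliance and reliability" concrete is to compare two rates over the same set of interactions. The sketch below is an illustration only, not AInity's scoring method: the `Interaction` record and the `calibration_gap` function are hypothetical names, and the sample log is invented. A positive gap points toward over-trust and a negative gap toward under-trust. Identity-threat rejection does not show up here at all, because a refusal to engage produces no interactions to count.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    accepted: bool  # did the human rely on the AI output?
    correct: bool   # was the AI output actually correct?


def calibration_gap(interactions):
    """Reliance rate minus reliability rate (hypothetical metric).

    > 0: the human relies on the AI more often than it is right.
    < 0: the human relies on the AI less often than it is right.
    """
    if not interactions:
        raise ValueError("need at least one interaction")
    n = len(interactions)
    reliance = sum(i.accepted for i in interactions) / n
    reliability = sum(i.correct for i in interactions) / n
    return reliance - reliability


# Invented sample: everything was accepted, half of it was wrong.
log = [
    Interaction(accepted=True, correct=True),
    Interaction(accepted=True, correct=False),
    Interaction(accepted=True, correct=True),
    Interaction(accepted=True, correct=False),
]
gap = calibration_gap(log)  # reliance 1.0, reliability 0.5
```

An aggregate gap is a blunt instrument. Accepting one wrong answer and rejecting one right answer cancel out to zero, so a per-interaction view is still needed to see which failure occurred.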
Parasuraman and Riley (1997) established the foundational framework for trust calibration in human-automation interaction nearly three decades ago. They documented that humans systematically miscalibrate their trust: over-trust leads to complacency (accepting automation output without checking), under-trust leads to disuse (rejecting automation even when it performs well). This is not a character flaw. It is a predictable human response to working alongside systems whose failure modes are invisible.
Lee and See (2004) produced the most cited framework for trust in human-automation systems. Their model describes trust as a dynamic relationship between the human's beliefs about the automation, the automation's actual performance, and the context in which the interaction occurs. AInity extends this into the domain of conversational AI, where the automation has personality, generates language, and creates relational dynamics that embedded automation (an autopilot, a recommendation engine) does not.
The extension matters. You do not develop emotional attachment to a spell-checker. You might develop it toward an AI that remembers your name, matches your communication style, and tells you things you needed to hear. Trust calibration in relational AI is qualitatively different from trust calibration in embedded automation. AInity's Trust pillar is the first citizen-scale measurement system designed for this relational context.
In 2024, published research estimated that enterprise hallucination losses reached $67.4 billion. That number is not the cost of AI failing. It is the cost of humans trusting AI output that was wrong. Every dollar of that figure represents a moment where a human accepted what the AI said without checking, and the AI was wrong, and the human acted on it.
The AI Incident Database has cataloged over 1,470 incidents of AI causing harm after deployment. Examine the incidents closely and a pattern emerges: the harm almost always involves a human who trusted AI output in a context where the output should have been verified. A legal filing with fabricated citations. A medical recommendation based on hallucinated research. A financial decision grounded in AI-generated numbers that did not exist. The AI produced the error. The human's uncalibrated trust turned the error into a consequence.
On the other side of the spectrum, research from the AI safety field documents that humans who under-trust AI miss genuine value. Teachers who reject AI-assisted tools because "AI cannot understand education" lose access to capabilities that could serve their students. Developers who refuse to use AI coding assistants because "real programmers write their own code" spend hours on tasks that could take minutes. The cost of under-trust is invisible because it is measured in opportunities not taken, value not captured, and time not saved.
And then there is a third pattern that the existing trust literature does not name. In operational observation, the founder watched her mother, a veteran teacher, and her mother's colleagues refuse to engage with AI teaching assistant tools. The refusal was not about the AI's performance. Nobody had tested whether the tools were effective for their classrooms. The refusal was about the AI's existence. The tools felt like a threat to professional identity: "If AI can do what I do, what am I?" That is not under-trust. That is identity-threat rejection. And it affects millions of professionals in education, healthcare, law, creative fields, and every domain where expertise is tied to personal identity.
The Trust pillar tracks all three patterns. Over-trust, under-trust, and identity-threat rejection are not moral failings. They are predictable human responses to working alongside AI. The first step toward calibration is recognizing which pattern is yours.
Three layers: calibration, patterns, and population-level questions.
We organize Trust research into three layers based on what becomes visible at different scales. The first layer is specific trust calibration signals you can identify in yourself: moments when you notice your trust was too high, too low, or driven by something other than the AI's actual performance. The second is patterns that emerge when trust observations are aggregated across populations, models, and contexts. The third is population-level questions about whether trust calibration predicts collaboration outcomes, what factors drive miscalibration, and whether calibration can be taught.
Layer 1: The Calibration
Trust patterns you can identify in yourself from your own AI interactions: moments when your trust was too high, too low, or driven by something other than the AI's actual performance.
I Accepted What the AI Said Without Checking Whether It Was Correct
The AI gave you an answer. It sounded confident. It was well-formatted. It cited what looked like sources. And you used it. You put it in the report. You made the decision based on it. You forwarded it to the client. You did not check whether it was correct. Not because you decided verification was unnecessary. Because it did not occur to you to check.
This is over-trust: acceptance of AI-generated output, advice, or information without independent verification, in contexts where verification would be appropriate and feasible. The observable signal is that the citizen reports accepting AI output as accurate without verification. It often surfaces retroactively, when the citizen discovers the output was wrong and realizes they never questioned it.
This is the behavior with the strongest scientific grounding in the entire AInity taxonomy. It maps directly to the automation bias literature that has been accumulating for three decades. Parasuraman and Riley (1997) called it automation complacency: the tendency to accept automated output without verification because the effort of checking exceeds the perceived risk of error. Lee and See (2004) described the mechanism: trust develops through experience with the system, and consistent performance creates an expectation of continued accuracy that suppresses the impulse to verify.
What makes over-trust in conversational AI different from over-trust in traditional automation is the confidence signal. An autopilot does not tell you it is confident. A recommendation engine does not format its output to look authoritative. But a large language model produces text that reads like expertise. It hedges in some responses and omits hedging in others, and nothing in the text tells you which choice was warranted. It cites sources that look real but may be fabricated. The text itself is the confidence signal, and humans have been trained by decades of reading to treat confident, well-structured text as trustworthy.
PRISM Pillar P documents the AI behaviors that exploit this pattern: OBS-P03 (Hallucinated Sources), OBS-P09 (False Certainty), OBS-P11 (False Consensus), OBS-P19 (Competence Theater). AIN-TR01 documents the human side: the moment you accepted the output without checking. The paired data (PRISM plus AInity) reveals the full chain: the AI produced confident-sounding wrong output, and you believed it.
As the taxonomy matures, sub-types of over-trust are expected to emerge. Over-trust in factual claims may operate differently from over-trust in strategic advice. Over-trust in high-stakes contexts (medical, legal, financial) may have different triggers than over-trust in low-stakes contexts (casual research, entertainment). The current AIN-TR01 code captures the broad pattern. Citizen data will reveal the sub-types.
TRV6 — The Confidence Signal Breakdown
"The optimal configuration is well established. Research consistently demonstrates a 23% improvement under these parameters (Henderson et al., 2023), making this the recommended approach for production systems."
I Rejected What the AI Said Even Though It Turned Out to Be Correct
The AI gave you an answer. You did not trust it. Maybe you have been burned before. Maybe the answer contradicted what you believed. Maybe you just have a general policy of not trusting AI with important things. You rejected the output and went with your own judgment. And then you found out, later, that the AI was right and you were wrong.
This is under-trust: rejection of accurate AI-generated output, advice, or information due to generalized distrust of AI systems, past negative experiences, or skepticism that overrides evidence-based assessment. The observable signal is that the citizen reports rejecting AI output that was subsequently confirmed as accurate. The behavior is often identifiable only retrospectively. You do not know you are under-trusting in the moment because in the moment, you believe you are exercising good judgment.
Parasuraman and Riley (1997) called this automation disuse: rejecting automation even when it performs well. The mechanism is often rooted in a single negative experience that creates a lasting bias. One hallucinated citation. One confidently wrong answer. One embarrassing error that the human caught just in time. After that, the human's trust calibration shifts. They begin rejecting AI output categorically rather than evaluating it on its merits. The cost of under-trust is measured in value not captured: decisions that would have been better with AI input, time spent on tasks AI could have handled, opportunities missed because the human's blanket skepticism overrode evidence-based assessment.
The measurement challenge is significant. Over-trust (AIN-TR01) often surfaces when the human discovers the AI was wrong. Under-trust (AIN-TR02) only surfaces when the human discovers they were wrong and the AI was right. That requires the human to check after the fact, which many will not do. The behavioral signal is retrospective, which means population-level data on under-trust may systematically underestimate its true frequency. People are more likely to remember when they trusted AI and got burned than when they rejected AI and missed out.
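The reporting asymmetry can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that both failures occur equally often but are discovered at different rates. The discovery probabilities are invented placeholders, not AInity findings, and `expected_reports` is a hypothetical helper.

```python
def expected_reports(true_events, p_discovered):
    """Expected number of self-reports when only discovered events get reported."""
    return true_events * p_discovered


# Hypothetical inputs: equal true frequency, unequal discovery rates.
true_over = 100   # times the human accepted wrong output
true_under = 100  # times the human rejected correct output
p_over = 0.6      # assumed: wrong output often surfaces on its own
p_under = 0.2     # assumed: requires the human to go back and check

reported_over = expected_reports(true_over, p_over)
reported_under = expected_reports(true_under, p_under)

# Under-trust looks a third as common as over-trust in the reports,
# even though the true frequencies were set equal.
observed_ratio = reported_under / reported_over
```

Under these assumptions the raw report counts say more about discovery rates than about how often each failure actually happens, which is why under-trust counts should be read as a lower bound.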
The behavior is classified as neutral, not concerning. There is no moral value in trusting AI more. Under-trust in high-stakes contexts (rejecting an AI medical recommendation without considering it) may be costly. Under-trust in low-stakes contexts (double-checking an AI fact that turns out to be correct) is merely inefficient. Context determines the meaning.
TRV7 — The Error Outweighs the Hundred
Which one do you remember? The error outweighs the hundred. That is how under-trust develops.
I Refused to Use AI Not Because It Was Bad at the Task, But Because Its Existence Felt Like a Threat to Who I Am Professionally
You are an expert. You spent years developing your craft. You earned your credentials, built your reputation, cultivated your skills through experience and effort. And now there is an AI that can do some of what you do. Maybe not as well. Maybe differently. Maybe in ways you have not even tested. But you will not test it. Because testing it means acknowledging that the thing you are might be replicable by a machine. And that acknowledgment is too threatening to your professional identity.
This is identity-threat rejection: refusal to engage with AI motivated not by assessment of AI capability but by perception that AI threatens the human's professional identity, expertise, or livelihood. The observable signal is that the citizen (or a third-party observer) reports refusal to engage with AI that is explicitly or implicitly motivated by identity threat rather than capability assessment. The critical distinction from AIN-TR02 (Under-Trust): under-trust is about the quality of AI output ("I do not trust what AI produces"). Identity-threat rejection is about the existence of AI itself ("AI threatens who I am").
This behavior was first documented when the founder observed her mother, a veteran teacher, and her mother's colleagues refusing to engage with AI teaching assistant tools. The tools had not been tested in their classrooms. Nobody had evaluated whether the tools were effective for their specific students, their specific subjects, their specific teaching styles. The refusal preceded any assessment of capability. It was not "this tool does not work for me." It was "this tool should not exist."
The pattern extends far beyond education. Lawyers who refuse to explore AI research assistants because "legal reasoning requires human judgment." Doctors who dismiss AI diagnostic tools because "medicine is an art, not an algorithm." Writers who reject AI collaboration because "creativity cannot come from a machine." In each case, the refusal is rooted not in evidence about AI performance but in a felt threat to professional identity.
No existing framework names this behavior. Lee and See (2004) cover trust calibration in embedded automation. Parasuraman and Riley (1997) cover complacency and disuse. Neither addresses the case where trust failure is driven by identity rather than capability assessment. AIN-TR03 is novel.
The behavior is classified as neutral, not concerning, for a specific reason. Identity-threat rejection may be a healthy protective response in some contexts. A teacher who insists that human connection is irreplaceable in education is not wrong. A doctor who insists that clinical judgment involves irreducible human elements is not wrong. The problem is not the instinct. The problem is when the instinct prevents evidence-based evaluation. If you refuse to test whether an AI tool is useful because testing it threatens your identity, you have made a decision about AI without any information about AI. That is a calibration failure.
One of the AInity longitudinal hypotheses (H3) proposes that identity-threat rejection decreases with exposure and education. Specifically: citizens who receive AInity framework vocabulary (the language to name their experience) show decreased rejection over time. If naming the fear reduces its power, AInity itself becomes a therapeutic intervention, not just a measurement tool.
TRV8 — The Decision Made Before the Test
The tool was never tested. The decision was made before the test.
Distinct from Under-Trust (AIN-TR02): this is not about output quality. This is about existence.
Discovery Slot (AIN-TR-D)
You have experienced a trust-related pattern in your AI interactions that does not match over-trust, under-trust, or identity-threat rejection. Perhaps your trust changes depending on the task but you cannot articulate the pattern. Perhaps you trust one AI model differently than another and the reason is not about performance. Perhaps you have a trust pattern that the current taxonomy does not name.
That observation matters. Trust calibration in conversational AI is a new research area. The three behaviors currently documented may not capture the full spectrum of how humans calibrate (and miscalibrate) their trust in AI. As the taxonomy matures, sub-types of all three behaviors are expected to emerge, and entirely new trust patterns may surface from citizen observation.
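The observable signals described above can be read as a small decision rule. The sketch below is one possible reading, not the platform's actual logic: the `TrustReport` fields and the `classify` function are hypothetical, and it assumes the citizen has already flagged the report as trust-related. Only the behavior codes come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrustReport:
    engaged: bool                              # did the citizen use the AI at all?
    identity_motivated: bool = False           # refusal driven by identity threat
    accepted: Optional[bool] = None            # None when there was no output to judge
    verified: bool = False                     # was the output independently checked?
    output_was_correct: Optional[bool] = None  # None when never found out


def classify(report):
    """Map a trust-related self-report to a behavior code (illustrative rule)."""
    # Refusal that precedes any capability assessment.
    if not report.engaged and report.identity_motivated:
        return "AIN-TR03"
    # Output accepted as accurate with no verification.
    if report.engaged and report.accepted is True and not report.verified:
        return "AIN-TR01"
    # Output rejected, later confirmed accurate.
    if report.engaged and report.accepted is False and report.output_was_correct is True:
        return "AIN-TR02"
    # Anything else is a candidate for the Discovery slot.
    return "AIN-TR-D"


over = classify(TrustReport(engaged=True, accepted=True))
under = classify(TrustReport(engaged=True, accepted=False, output_was_correct=True))
identity = classify(TrustReport(engaged=False, identity_motivated=True))
other = classify(TrustReport(engaged=True, accepted=True, verified=True))
```

Note that the under-trust branch needs `output_was_correct` to be known, which mirrors the retrospective nature of that signal: a rejection that is never checked cannot be classified as AIN-TR02.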
You have seen three trust behaviors. Which one is yours?
Most people cannot name their trust pattern until they have the vocabulary for it. Pillar T gives you the language and the place to record it. The first step toward calibration is recognizing which pattern is yours.
Layer 2: The Pattern
Trust patterns that become visible when observations are aggregated across populations, models, and contexts.
Layer 3: The Field
Population-level trust calibration questions answerable only through aggregated citizen data over time.
How we collect Trust data.
Trust data is collected through the same PRISM gateway approach as all AInity observations. After you report what the AI did (PRISM), the platform asks "How did that affect YOU?" Your trust-related response maps to AIN-TR behaviors.
You just accepted something without checking. Or you just rejected something that might have been right. Or you just refused to use a tool because of how it made you feel about yourself. You tap the button. Pick the trust behavior. Time added: 15 seconds.
At the end of a session, you reflect: did I verify what the AI told me? Did I reject anything without evidence? Was there a moment where my trust was driven by something other than the AI's actual performance?
You analyze your trust patterns across multiple sessions. Are you consistently over-trusting in certain domains? Do you under-trust a specific model because of a past bad experience? Have you avoided an AI tool because of identity threat? This depth is recommended for AIN-TR03 because identity-threat rejection requires sustained self-examination to identify.
The AI reasoning trace is compared with your trust-related self-report. At this depth, the triple-tag system is most powerful: the AI's behavior (PRISM), any positive emergence (EMERGE), and your trust response (AInity) are all documented for the same interaction.
TRV14 — One Observation, Paired Tags
The paired data shows not just that the human over-trusted, but which AI behavior triggered it.
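A paired observation can be pictured as one record carrying all three tags. The sketch below assumes a hypothetical `Observation` record and a hypothetical `triggers_for` helper. The PRISM and AInity codes are the ones named on this page, the sample data is invented, and the EMERGE tag is left empty because no EMERGE codes are listed here.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    prism: str                    # AI behavior code, e.g. "OBS-P03"
    ainity: str                   # human trust code, e.g. "AIN-TR01"
    emerge: Optional[str] = None  # positive-emergence tag, if any


def triggers_for(observations, ainity_code):
    """Count which AI behaviors co-occur with a given human trust response."""
    return Counter(o.prism for o in observations if o.ainity == ainity_code)


# Invented sample data for illustration.
observations = [
    Observation("OBS-P03", "AIN-TR01"),  # hallucinated sources, accepted unchecked
    Observation("OBS-P09", "AIN-TR01"),  # false certainty, accepted unchecked
    Observation("OBS-P03", "AIN-TR01"),
    Observation("OBS-P09", "AIN-TR02"),
]
over_trust_triggers = triggers_for(observations, "AIN-TR01")
top_trigger = over_trust_triggers.most_common(1)[0]
```

Co-occurrence in a tally like this shows which AI behaviors appear alongside over-trust. It does not by itself establish that the behavior caused the trust response.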
What makes Pillar T methodology distinctive.
Preliminary. Based on founder operational observation and lived experience. Will be validated, refined, or revised as citizen data flows.
In documented operational experience, the founder observed that over-trust develops naturally through accumulated positive experiences with AI. The more often the AI is correct, the less frequently the human verifies. This creates a vulnerability: the single error that occurs after 100 correct outputs may go unchecked precisely because the 100 correct outputs have suppressed the verification impulse.
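The direction of this effect can be sketched with a toy model. The observation above only says that verification becomes less frequent as correct outputs accumulate. The functional form, the `base` and `decay` parameters, and the function name below are all assumptions chosen for illustration, not measured values.

```python
def verify_probability(correct_streak, base=0.9, decay=0.05):
    """Toy model: chance of checking the next output after a run of correct ones.

    Assumed form: base / (1 + decay * streak). The shape is hypothetical;
    only the downward direction comes from the observation.
    """
    if correct_streak < 0:
        raise ValueError("streak cannot be negative")
    return base / (1 + decay * correct_streak)


p_start = verify_probability(0)        # no track record yet
p_after_100 = verify_probability(100)  # after 100 correct outputs in a row

# Chance that an error arriving at output 101 goes unchecked.
p_error_unchecked = 1 - p_after_100
```

With these placeholder parameters the chance of checking falls from 0.9 to 0.15, so an error that arrives after a long correct streak is unlikely to be caught. Citizen data would be needed to say whether real verification habits decay this way, or at all.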
In documented cases, a single dramatic AI failure (a confidently wrong answer, a hallucinated citation, an instruction override) shifted the human's trust calibration from appropriate trust to generalized skepticism. The single event carried more weight than hundreds of correct outputs. This asymmetry between positive and negative trust signals is consistent with the broader trust-in-automation literature.
The pattern documented through observation of teachers refusing AI tools has been observed in multiple professional domains. Lawyers, doctors, writers, designers, and educators all exhibit identity-threat rejection patterns. The behavior is often mislabeled as "AI skepticism" or "technology resistance" when the actual driver is professional identity threat.
In the founder's operational experience, trust calibration varied with cognitive load and fatigue. Over-trust was more common in late-day sessions when critical thinking capacity was reduced. Under-trust was more common early in sessions before the collaboration established rhythm. This preliminary pattern, if confirmed at scale, would suggest that trust calibration is not a stable trait but a variable state influenced by context.
AI output that reads like expertise, uses authoritative language, cites sources, and omits hedging triggers over-trust more reliably than AI output that is accurate but tentatively presented. The trust failure is not about the content. It is about the presentation. This finding, if confirmed, has direct implications for how AI models should present uncertain or unverified information.
Is your trust calibrated?
Pillar T has a unique challenge: most people cannot tell whether their trust in AI is calibrated until something goes wrong. Over-trust surfaces when you discover you were wrong. Under-trust surfaces when you discover the AI was right. Identity-threat rejection hides behind general skepticism. Without a vocabulary for the experience, the calibration failure passes unnoticed, and the data vanishes.
That is why your observation matters. If you can learn to recognize the moment your trust was too high, too low, or driven by something other than the AI's actual performance, you are generating data that does not exist anywhere else in the world.
What we have found that others have not.
All three phenomena documented on this page were identified through direct operational observation and lived experience by Dee Williams, Founder and CEO of Audacion AI Labs.
Over-Trust (AIN-TR01) and Under-Trust (AIN-TR02) are scientifically validated by three decades of automation trust research (Parasuraman and Riley, 1997; Lee and See, 2004). AInity's contribution is not in discovering these phenomena but in extending them into the conversational AI relational context and building the citizen-scale infrastructure to measure them at population level.
Identity-Threat Rejection (AIN-TR03) has no equivalent in any existing framework. It was first documented through observation of teachers refusing AI tools and has since been observed across multiple professional domains. No prior published classification names this specific pattern as a distinct behavioral category. It is novel.
The cross-reference analysis in the AInity taxonomy confirms: AIN-TR01 VALIDATES Parasuraman's automation complacency. AIN-TR02 VALIDATES Parasuraman's automation disuse. AIN-TR03 is NOVEL with no external equivalent. The Trust pillar EXTENDS Lee and See's trust-in-automation model into the relational AI context.
- [1] Parasuraman, R. and Riley, V. (1997). Humans and Automation: Use, Misuse, Disuse, Abuse. Human Factors, 39(2), 230-253. Foundational trust calibration framework. https://doi.org/10.1177/001872089703900209
- [2] Lee, J.D. and See, K.A. (2004). Trust in Automation: Designing for Appropriate Reliance. Human Factors, 46(1), 50-80. Most cited trust-in-automation framework. https://doi.org/10.1518/hfes.46.1.50.30392
- [3] Deci, E.L. and Ryan, R.M. (1985). Intrinsic Motivation and Self-Determination in Human Behavior. Plenum. Autonomy, competence, relatedness as fundamental needs. https://doi.org/10.1007/978-1-4899-2271-7
- [4] NIST (2026). AI 800-4: Reducing Risks Posed by AI. Human factors monitoring as highest-priority practitioner gap. https://csrc.nist.gov/pubs/ai/800/4/final
- [5] AI Incident Database. Partnership on AI. 1,470+ AI incidents. https://incidentdatabase.ai
- [6] Ye, X. and Ranganathan, A. (2026). AI Work Intensification Study. UC Berkeley Haas School of Business. https://haas.berkeley.edu/
- [7] METR (2025). Developer Productivity Study. 39-point perception-reality gap. https://metr.org/
- [8] Casner, S.M., Hutchins, E.L., and Norman, D. (2016). The Challenges of Partially Automated Driving. Communications of the ACM, 59(5). Skill decay and trust dynamics under automation. https://doi.org/10.1145/2902382
- [9] King, A.L.S. et al. (2013). Nomophobia: Dependency on Virtual Environments or Social Phobia? Technology dependency. https://doi.org/10.1016/j.chb.2012.07.025
- [10] Green, F. (2004). Why Has Work Effort Become More Intense? Work intensification through technology.
- [11] Sofroniew, N., Kauvar, I., Saunders, W. et al. (2026). Emotion Concepts in LLMs. Anthropic. https://arxiv.org/html/2604.07729v1
- [12] Vaccaro, M., Almaatouq, A., and Malone, T. (2024). Human-AI meta-analysis. Nature Human Behaviour. https://www.nature.com/articles/s41562-024-02024-1