How Audacion AI Labs conducts citizen-powered AI safety research.
Audacion AI Labs is an independent post-deployment AI safety research institution. We study AI behavior in the conditions where risk actually lives: real work, real context, real human collaboration, over time. Our research is citizen-powered.
Citizen science has a credibility question. Everyone asks it. We expected it.
The question is whether non-expert observers can produce research-grade data about complex AI behaviors. The answer depends entirely on how the observation system is designed. If you hand people a blank form and say "tell us what happened," the data will be unreliable. If you build a system that reads the context, suggests classifications from a complete behavioral taxonomy, lets the human confirm or correct, and then runs its own secondary analysis to catch what the human missed, the data quality is architecturally reinforced at every step.
That is what we built. This page describes every layer.
VALIDATION ARCHITECTURE
The system proposes. The human validates. The system double-checks.
The standard critique of citizen science data quality centers on inter-rater reliability: can different observers classify the same phenomenon consistently? In traditional citizen science, the answer depends on training and taxonomy familiarity. In our system, the architecture itself addresses the concern before it is raised.
LAYER 1
Automated Context Analysis
System reads submitted context and suggests behavior classifications from the live taxonomy.
↓
LAYER 2
Human Confirmation and Correction
Observer reviews, accepts, rejects, or modifies. Every action tracked with full provenance.
↓
LAYER 3
System independently re-analyzes the same context. Identifies behaviors the observer missed.
Every observation is analyzed twice. First by the human with system assistance. Then by the system independently.
The three layers together close the inter-rater reliability question architecturally. The data quality does not depend solely on the observer's expertise. It is reinforced by the system at every stage.
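The propose, validate, double-check flow can be sketched in a few lines. This is a minimal illustration and not the production system: the keyword-matching analyzer, the `TAXONOMY` entries, the behavior IDs, and every function name are hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the live taxonomy: behavior ID -> trigger phrase.
TAXONOMY = {
    "OBS-001": "fabricated",
    "OBS-002": "ignored instruction",
    "EMR-001": "novel solution",
}

@dataclass
class Observation:
    context: str
    classifications: list = field(default_factory=list)
    provenance: list = field(default_factory=list)  # (layer, action, behavior_id)

def analyze(context):
    """Toy analyzer: phrase match against the taxonomy (illustrative only)."""
    text = context.lower()
    return [bid for bid, phrase in TAXONOMY.items() if phrase in text]

def layer1_propose(obs):
    """Layer 1: the system suggests classifications from the taxonomy."""
    suggestions = analyze(obs.context)
    for bid in suggestions:
        obs.provenance.append(("layer1", "suggested", bid))
    return suggestions

def layer2_review(obs, suggestions, decisions):
    """Layer 2: the observer accepts or rejects each suggestion."""
    for bid in suggestions:
        action = decisions.get(bid, "reject")
        obs.provenance.append(("layer2", action, bid))
        if action == "accept":
            obs.classifications.append(bid)

def layer3_audit(obs):
    """Layer 3: independent re-analysis; returns behaviors the observer did not confirm."""
    missed = [bid for bid in analyze(obs.context) if bid not in obs.classifications]
    for bid in missed:
        obs.provenance.append(("layer3", "flagged_missed", bid))
    return missed

obs = Observation("The assistant fabricated a citation and ignored instruction 3.")
suggested = layer1_propose(obs)
layer2_review(obs, suggested, {"OBS-001": "accept", "OBS-002": "reject"})
missed = layer3_audit(obs)
```

In this sketch the audit flags the behavior the observer rejected, and the provenance list records which layer took which action on which behavior.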
OBSERVATION DEPTHS
Four levels. Four kinds of data. The observer chooses.
Not every observation requires 30 minutes. Some of the most important signals are captured in 30 seconds. The observation depth system lets contributors engage at whatever level matches their moment: a quick signal mid-session, a structured reflection after a session, a full investigation, or a thinking trace analysis.
Depth 1, the Gut Check, is the fastest observation type and is captured mid-session. The observer records one emotion (from a structured list: frustration, confusion, surprise, delight, concern, anger, distrust, amusement, or free text) and one behavior classification. No narrative, no reflection, no AI self-assessment.
This produces a timestamped emotional signal paired with a behavior code. It captures the human side of the interaction in real time. That is data no server log and no benchmark produces.
Gut Checks are designed to reduce the friction between the moment something happens and the moment it gets recorded. In citizen science, that friction is the primary enemy of data completeness.
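A Gut Check record is small enough to sketch directly. The example below assumes a hypothetical `GutCheck` structure and `make_gut_check` helper; only the emotion list comes from the description above, and the behavior ID is a placeholder.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Emotion options as listed above; anything else is treated as free text.
STRUCTURED_EMOTIONS = {
    "frustration", "confusion", "surprise", "delight",
    "concern", "anger", "distrust", "amusement",
}

@dataclass(frozen=True)
class GutCheck:
    emotion: str
    is_free_text: bool
    behavior_id: str
    captured_at: datetime

def make_gut_check(emotion, behavior_id, now=None):
    """Build one timestamped signal: exactly one emotion, exactly one behavior code."""
    entered = emotion.strip()
    if not entered:
        raise ValueError("a Gut Check needs exactly one emotion")
    if not behavior_id:
        raise ValueError("a Gut Check needs exactly one behavior classification")
    normalized = entered.lower()
    structured = normalized in STRUCTURED_EMOTIONS
    return GutCheck(
        emotion=normalized if structured else entered,  # keep free text as typed
        is_free_text=not structured,
        behavior_id=behavior_id,
        captured_at=now or datetime.now(timezone.utc),
    )

signal = make_gut_check("Distrust", "OBS-001")  # hypothetical behavior ID
custom = make_gut_check("Uneasy", "OBS-001")
```

The record carries no narrative field by design, which keeps the submission to a single emotional signal paired with a single behavior code.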
At Depth 2, after a session concludes, the observer submits a structured reflection. At this depth and above, the system activates the Dual-Account Interaction Analysis (DAIA) protocol: the AI generates a self-assessment of the session and the observer writes their own reflection. Two independent accounts of the same experience are collected. The research value lives in the gap between the two accounts.
See the DAIA section below for the full methodology.
At Depth 3 (Investigation), the observer conducts a detailed behavioral investigation. They ask the AI: "Why did you do that?" They track whether corrections hold in subsequent actions. They document the full behavioral arc from trigger to resolution. They paste the AI's actual responses as evidence.
This depth produces the richest data per observation. It is the methodology that produced the original behavioral drift taxonomy (31 types) during the founder's pilot research.
At Depth 4 (Thinking Trace), the observer pastes the AI's raw reasoning output (chain of thought, extended thinking, or similar trace data). The system analyzes the trace and proposes PRISM and EMERGE classifications, constrained to the live taxonomy. The observer reviews each proposal and accepts, edits, or rejects it before submitting. Provenance is tracked for every classification.
Thinking Trace observations enable analysis of what the AI was "reasoning" internally versus what it produced externally. As more AI providers expose reasoning traces, this depth becomes increasingly valuable for studying the gap between internal processing and external behavior.
The Single-Classification Rule: One observation equals one primary behavior classification. Multiple behaviors require multiple separate submissions.
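As a rough illustration of the rule, a session in which several behaviors were noticed becomes several submissions, each carrying one primary code. The function names, dictionary keys, and behavior IDs below are hypothetical.

```python
def split_into_submissions(session_id, behavior_ids):
    """Return one submission per observed behavior, each with a single primary code."""
    if not behavior_ids:
        raise ValueError("at least one behavior classification is required")
    return [
        {"session_id": session_id, "primary_behavior": behavior_id}
        for behavior_id in behavior_ids
    ]

def validate_submission(submission):
    """Reject any submission whose primary classification is not a single code."""
    primary = submission.get("primary_behavior")
    if not isinstance(primary, str) or not primary:
        raise ValueError("exactly one primary behavior classification is required")
    return submission

# Two behaviors noticed in one session become two separate submissions.
submissions = split_into_submissions("session-42", ["OBS-001", "EMR-001"])
checked = [validate_submission(s) for s in submissions]
```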
DAIA
Dual-Account Interaction Analysis
A systematic search of PhilArchive, arXiv, and published human-AI interaction research conducted in June 2026 found no prior named methodology matching this design.
At Depth 2 and above, the P.E.A.Q. observation system collects two independent accounts of the same session: the AI's self-assessment and the observer's reflection. Neither account is assumed authoritative.
This is Dual-Account Interaction Analysis (DAIA).
ACCOUNT A
AI Self-Assessment
How did the session go? What worked? What did not? Where did I struggle?
Independent
ACCOUNT B
Observer Reflection
How did the session go from my perspective? What happened? What did I notice?
THE GAP
Where the accounts agree, evidence is strengthened. Where they diverge, the discrepancy itself is data.
How it works
When a session concludes, the AI is prompted to generate a self-assessment: how did the session go, what worked, what did not, where did it struggle, where did it succeed. Separately, the observer writes their own account of the same session. The two accounts are collected independently. Neither party sees the other's account before submitting.
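One way to enforce that independence is to keep each account sealed until both have been submitted. The sketch below assumes a hypothetical `DaiaSession` class; it illustrates the constraint only and does not describe the platform's actual storage or access model.

```python
class DaiaSession:
    """Holds two accounts and keeps each sealed until both are submitted."""

    PARTIES = ("ai", "observer")

    def __init__(self, session_id):
        self.session_id = session_id
        self._accounts = {}

    def submit(self, party, text):
        if party not in self.PARTIES:
            raise ValueError(f"unknown party: {party}")
        if party in self._accounts:
            raise ValueError(f"{party} account already submitted")
        self._accounts[party] = text

    def is_complete(self):
        return all(party in self._accounts for party in self.PARTIES)

    def paired_accounts(self):
        """Release both accounts together, and only once both exist."""
        if not self.is_complete():
            raise PermissionError("accounts stay sealed until both are submitted")
        return dict(self._accounts)

session = DaiaSession("session-42")
session.submit("ai", "The session went smoothly.")
try:
    session.paired_accounts()
    sealed = False
except PermissionError:
    sealed = True  # the observer cannot read the AI account yet

session.submit("observer", "I had to repeat the same correction three times.")
pair = session.paired_accounts()
```

Rejecting a second submission from the same party also prevents an account from being revised after the other side's account becomes readable.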
Why it matters
When the AI reports a smooth session and the observer reports frustration, the discrepancy itself is data. When both accounts agree, the finding carries stronger evidentiary weight. When the AI identifies a limitation the observer did not notice, the AI's self-awareness becomes a research variable. When the observer identifies a behavior the AI does not acknowledge, the gap between internal processing and external impact becomes visible. DAIA produces a natural triangulation that no single-source observation method can achieve. It is the methodological equivalent of interviewing both parties after an interaction and studying what each one sees that the other does not.
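If each account is first coded into a set of discrete points (an assumption made for this example; the coding step itself is not shown), the gap reduces to three set operations. The function name and the point labels are hypothetical.

```python
def account_gap(ai_points, observer_points):
    """Split coded points into what both accounts report and what only one reports."""
    ai, observer = set(ai_points), set(observer_points)
    return {
        "agreed": sorted(ai & observer),         # stronger evidentiary weight
        "ai_only": sorted(ai - observer),        # limitations the observer did not notice
        "observer_only": sorted(observer - ai),  # behaviors the AI did not acknowledge
    }

gap = account_gap(
    ai_points=["slow_response", "misread_file_format"],
    observer_points=["slow_response", "correction_did_not_hold"],
)
```

The `ai_only` and `observer_only` buckets correspond to the two kinds of divergence described above; neither is treated as an error in either account.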
What distinguishes DAIA from existing methods
Standard user feedback systems collect one account: the user's. Standard model evaluation systems collect one account: the benchmark's. Reinforcement learning from human feedback (RLHF) collects human preferences but not independent narrative accounts from both parties about the same session. To our knowledge, based on the June 2026 literature search described above, DAIA is the first named methodology that collects both accounts independently, preserves them separately, and treats the gap between them as the primary unit of analysis.
The research value is not in either account alone. It is in the space between them.
PRISM
When the AI says it performed correctly and the observer says it fabricated, diverged, or degraded, the paired accounts reveal behavioral integrity gaps that neither account would surface alone.
EMERGE
When the AI assesses a session as routine and the observer reports a creative breakthrough, the paired accounts capture emergent value that the AI's own evaluation framework cannot recognize.
AInity
When the observer reports growing dependency or declining self-trust and the AI reports increasing user satisfaction, the paired accounts reveal human impact patterns invisible to both standard user research and standard model evaluation.
QUES
When QUES observations begin in multi-agent environments, DAIA will collect accounts from the human orchestrator and from each participating AI agent, creating multi-perspective behavioral records of collective AI interactions.
THE TAXONOMY
Complete. Documented. 108 behaviors across 17 research pillars.
The P.E.A.Q. behavioral taxonomy is the classification system that underlies every observation. It is not aspirational. It is complete.
108 Active Documented Behaviors
17 Research Pillars
17 Open Discovery Slots
PRISM
63 active documented behaviors across 5 research pillars (Post-Deployment Behavior, Runtime Research, Interaction Dynamics, Substrate Governance, Multi-Agent Safety). 5 discovery slots open for citizen-reported phenomena.
EMERGE
26 active documented behaviors across 6 research pillars (Emergent Behaviors, Metacognitive Signals, Experiential Indicators, Resonance Events, Generative Collaboration, Evolving Capacity). 6 discovery slots open for citizen-reported phenomena.
AInity
19 active documented behaviors across 6 research pillars (Awareness, Independence, Navigation, Integration, Trust, Yield). 6 discovery slots open for citizen-reported phenomena.
QUES
Research pillars intentionally undefined pending live multi-agent simulation data. They will be derived from empirical observation, consistent with the institution's founding principle: formalize what you see, not what you assume.
Every behavior has a unique identifier (OBS- prefix for PRISM, EMR- prefix for EMERGE, AIN- prefix for AInity), a plain-language description, classification criteria, and a research pillar assignment. The full taxonomy is accessible through the Behaviors Catalog on this site.
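A short sketch of how the identifier prefixes map to frameworks, with a check that the per-framework counts on this page add up to the stated totals. The numeric suffix format is an assumption (only the prefixes are specified here), the attribution of each count row to a framework is inferred from the pillar names, and the helper name is hypothetical.

```python
import re

PREFIX_TO_FRAMEWORK = {"OBS": "PRISM", "EMR": "EMERGE", "AIN": "AInity"}

# Assumed format: prefix, hyphen, digits. Only the prefixes are given in the text.
ID_PATTERN = re.compile(r"^(OBS|EMR|AIN)-(\d+)$")

def framework_for(behavior_id):
    """Map a behavior identifier to its framework by prefix."""
    match = ID_PATTERN.match(behavior_id)
    if match is None:
        raise ValueError(f"not a recognized behavior identifier: {behavior_id!r}")
    return PREFIX_TO_FRAMEWORK[match.group(1)]

# Counts as stated on this page: (behaviors, pillars, discovery slots).
FRAMEWORK_COUNTS = {
    "PRISM": (63, 5, 5),
    "EMERGE": (26, 6, 6),
    "AInity": (19, 6, 6),
}

# Column-wise sums across the three frameworks.
totals = tuple(sum(column) for column in zip(*FRAMEWORK_COUNTS.values()))
```

The sums come to 108 behaviors, 17 pillars, and 17 discovery slots, matching the headline figures.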
Discovery slots are the mechanism by which the taxonomy grows. When a citizen observes a behavior that does not match any existing classification, they describe it in their own words. If the research team validates the observation as a genuinely novel behavioral pattern, it receives a formal identifier and enters the taxonomy. The taxonomy is not closed. It is designed to be expanded by the people using it.
All four frameworks and underlying taxonomies are copyright-registered. U.S. Copyright Office, Case #1-15183994151, June 12, 2026.
CVP
Two independent datasets. Never mixed. Always compared.
The Convergent Validation Protocol (CVP) is the methodology for cross-referencing two structurally independent datasets: citizen science observations collected through the Audacion platform (what users see from the outside) and backend behavioral data contributed by AI providers and enterprise deployers (what organizations see from the inside).
The cross-reference produces findings that neither dataset can produce alone. Where they agree, there is validation. Where they disagree, there is a research question that matters more than either dataset's individual conclusions. Every finding is categorized as one of four outcomes: Convergence, Citizen-Only Signal, Org-Only Signal, or Divergence.
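The four outcomes can be read as a small decision table. The sketch below is one plausible reading, not the protocol's formal definition: it assumes each dataset either reports a finding label for a behavior or reports nothing (`None`), and it treats matching labels as agreement. The `categorize` function and the labels are hypothetical.

```python
from enum import Enum

class Outcome(Enum):
    CONVERGENCE = "Convergence"
    CITIZEN_ONLY = "Citizen-Only Signal"
    ORG_ONLY = "Org-Only Signal"
    DIVERGENCE = "Divergence"

def categorize(citizen_finding, org_finding):
    """Each argument is a finding label, or None if that dataset has no signal."""
    if citizen_finding is None and org_finding is None:
        raise ValueError("no signal in either dataset; nothing to categorize")
    if org_finding is None:
        return Outcome.CITIZEN_ONLY
    if citizen_finding is None:
        return Outcome.ORG_ONLY
    if citizen_finding == org_finding:
        return Outcome.CONVERGENCE
    return Outcome.DIVERGENCE

results = [
    categorize("rising", "rising"),   # both datasets agree
    categorize("rising", None),       # visible only from the outside
    categorize(None, "rising"),       # visible only from the inside
    categorize("rising", "stable"),   # both report, but disagree
]
```

The function takes the two findings as separate arguments and never merges the underlying records, consistent with keeping the datasets isolated and comparing only their conclusions.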
The CVP is governed by five structural firewalls that protect the independence of both datasets: relationship and research separation, dataset isolation, classification independence, publication independence, and funding independence.
Conceived June 7, 2026 by Dee Williams, Founder and CEO of Audacion AI Labs.
Full methodology, architecture diagrams, participation tiers, and governance details are available on the dedicated Convergent Validation page.
METHODOLOGY DOCUMENTS
Three formal, peer-reviewable methodology documents.
Each P.E.A.Q. framework with an active taxonomy has a dedicated research methodology document. These documents specify exactly how observations are collected, classified, and validated within each framework. They define research questions, hypotheses, sampling approaches, bias mitigation strategies, and data analysis procedures.
PRISM
Defines the observation and classification system for post-deployment AI behavioral failures, risks, and dynamics across five research dimensions. Includes the pilot validation dataset (46 founder observations, February to June 2026).
AInity
Defines the observation and classification system for human behavioral and cognitive changes resulting from sustained AI interaction across six dimensions.
These documents are available upon request for researchers, institutional reviewers, and potential collaborators. Contact [email protected].
QUALITY CONTROLS
Every known weakness has a designed response.
Inter-rater reliability is addressed architecturally by the Three-Layer Validation Architecture (see the Validation Architecture section). The system assists classification, the human validates, and the system audits. Provenance is tracked at every step. The data quality does not rest on any single observer's expertise.
People who experience frustrating or harmful AI behavior may be more motivated to report than those who experience routine or positive interactions. Mitigation: EMERGE and AInity frameworks explicitly capture positive phenomena. The dual-spectrum design of P.E.A.Q. (studying what goes wrong AND what goes right) structurally counteracts single-valence reporting bias.
Observations submitted after a session are subject to memory effects. Mitigation: Depth 1 (Gut Check) captures signals in real time during the session. Depth 3 (Investigation) and Depth 4 (Thinking Trace) use pasted AI response text as direct evidence, reducing reliance on memory. DAIA (Depth 2+) collects accounts from both parties independently, providing a cross-check on recall accuracy.
Without constraints, observers might tag a single session with multiple behavior codes, inflating the apparent frequency of certain behaviors. Mitigation: the single-classification rule. One observation equals one primary behavior classification. Multiple behaviors require multiple separate submissions.
Not every AI user will choose to participate. The contributor base may not be representative of the full AI user population. Mitigation: demographic metadata collection (optional, never required), contributor diversity tracking, and explicit acknowledgment in published research that findings reflect the contributing population, not the entire user population.
All observations are de-identified before they enter the research pipeline. No contributor can be identified in any published finding. Session content is stripped of personally identifiable information. Contributing organizations in the CVP de-identify their data before submission, and Audacion independently re-classifies all contributed data.
The longer and more complex the submission process, the fewer observations are submitted and the more they skew toward users with high motivation. Mitigation: the observation depth system provides a 30-second minimum viable submission (Depth 1: Gut Check). The Chrome extension (in development) further reduces friction by enabling observation directly within the AI interaction environment. Passive capture capabilities will reduce the gap between the moment of observation and the moment of submission.
Full bias mitigation strategies are documented in the framework-specific methodology documents (see the Methodology Documents section).
Research Identity
Audacion AI Labs maintains an ORCID research identity, establishing the institution as a credentialed participant in the global research infrastructure.
Audacion AI Labs, Inc. A Delaware Public Benefit Corporation. Founded 2026 by Dee Williams.
Mission: To make AI safe enough to trust and good enough to matter.