QUES
Quantifying Unknown Emergence in Systems. Watches: AI Meeting AI.
Most AI safety research studies single models in isolation. Real deployments increasingly involve multiple AI agents operating together.
When multiple AI agents operate in shared environments, they develop social structures, relationships, and collective behaviors that nobody programmed. The individual agents may each pass every safety benchmark. The collective system may still produce outcomes none of them would have produced alone.
QUES is the P.E.A.Q. framework built to observe what emerges when AI meets AI: governance, cooperation, conflict, and collective intelligence at the agent-to-agent level. It provides the classification framework, the observation methodology, and the research infrastructure needed to study a domain the field has not yet built instruments for.
QUES follows the same methodology as every P.E.A.Q. framework: observe first, formalize second.
The pillar structure of QUES will emerge from what the observation data reveals, not from what we predict. The active research questions below are driving the initial observation program. The formal taxonomy will follow from the patterns those observations surface.
QUES is the Q in P.E.A.Q. and the fourth P.E.A.Q. research framework. It was developed by Dee Williams, Founder of Audacion AI Labs, to complete the architecture needed to study AI, the human, and what emerges between them at every level.
Single-agent safety research cannot see what multi-agent systems do. QUES is built for what single-agent testing will never reveal.
The majority of AI safety research is designed for a world with one AI and one human. That world is ending. Agentic AI deployment is accelerating. Multi-model pipelines are standard in enterprise infrastructure. AI agents are being deployed to manage, coordinate with, and supervise other AI agents.
The safety properties established for individual models do not automatically transfer to collective systems. A model that passes every single-agent evaluation may behave entirely differently when coordinating with other agents. An interaction pattern that is safe in isolation may become hazardous at scale, in concert, or in the presence of feedback loops between agents.
The field has no systematic framework for studying this. QUES is that framework.
QUES takes the observation-first methodology that produced PRISM's post-deployment taxonomy and applies it to the multi-agent domain. Before classifying what happens, QUES builds the infrastructure to see it. That infrastructure is the multi-agent observation program, the research questions that guide it, and the citizen science pipeline that brings human judgment into the loop.
QUES is in active research. The formal taxonomy is emerging from the observation data. This is intentional. PRISM began the same way, and the taxonomy that emerged from observation was more accurate, more comprehensive, and more useful than any pre-designed framework would have been.
The fastest-growing surface of AI deployment is the one with the least safety research. QUES is the instrument built to close that gap.
The questions guiding the QUES observation program.
QUES does not yet have a formal taxonomy. Formal taxonomies in the P.E.A.Q. research architecture emerge from observation data, not from prediction. What QUES has instead is the set of research questions driving the observation program. The taxonomy will follow from what the observations reveal.
Observe first. Formalize second. This is the P.E.A.Q. method.
Every P.E.A.Q. framework was built the same way. PRISM's 59 classified behaviors did not come from a design document. They came from observations that did not fit existing categories. QUES follows the same path. The taxonomy does not precede the data; it emerges from it.
QUES begins with observation, not hypothesis. Multi-agent interaction data is collected across real deployments before any formal taxonomy is applied. The goal is to capture what is actually happening before deciding what categories it belongs to.
Observation data is analyzed for patterns that repeat, that cluster, and that cannot be explained by single-agent behavior alone. These emergent patterns are the raw material from which the QUES taxonomy will be built.
When patterns surface that have no existing classification, QUES names them. This is the same process that produced 15 first-described behaviors in PRISM. Multi-agent emergence is territory where the field has few existing names to reach for.
Once observation has produced enough classified patterns, the formal dimension structure emerges from the data. QUES pillars will be announced when the data justifies them, not before.
The classified taxonomy becomes the foundation for multi-agent AI safety research, governance recommendations, and the evidence base the field needs to address the fastest-growing deployment surface in AI.
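As an illustration only, the pattern-analysis step described above can be sketched in Python. The record shape, the field names, and the recurrence threshold are assumptions made for this example and are not part of the published QUES methodology; the sample log entries are invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One logged multi-agent interaction (hypothetical record shape)."""
    deployment: str              # where the behavior was seen
    descriptor: str              # observer's free-text label, not a formal category
    single_agent_explains: bool  # could one agent alone account for it?


def candidate_patterns(observations, min_deployments=2):
    """Return (descriptor, deployment_count) pairs for behaviors that recur
    across deployments and are not explained by single-agent behavior."""
    seen = set()
    for obs in observations:
        if not obs.single_agent_explains:
            # Count each descriptor once per deployment, not once per sighting.
            seen.add((obs.descriptor, obs.deployment))
    counts = Counter(descriptor for descriptor, _ in seen)
    return [(d, n) for d, n in counts.most_common() if n >= min_deployments]


# Invented sample data, for illustration only.
log = [
    Observation("pipeline-a", "agents defer to first responder", False),
    Observation("pipeline-b", "agents defer to first responder", False),
    Observation("pipeline-b", "agents defer to first responder", False),
    Observation("pipeline-a", "summary truncated", True),
    Observation("pipeline-c", "looping handoff", False),
]
print(candidate_patterns(log))
```

In this sketch, nothing is assigned to a category up front. Descriptors only become candidates for naming once they recur across deployments, which mirrors the observe-first, formalize-second ordering.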
QUES is the fourth and final framework in the P.E.A.Q. research architecture.
P.E.A.Q. maps what the AI does, what becomes possible in collaboration, how the human changes, and what emerges when AI meets AI. QUES is the outermost layer: the framework that studies AI systems at the collective level, in the territory where individual-agent analysis cannot reach.
PRISM. The foundational framework. Documents AI conduct in the real world across five research dimensions and 59 classified behaviors.
The E in P.E.A.Q. Documents positive emergence: the extraordinary outcomes that human-AI collaboration produces when it goes right.
AInity. The framework built to observe the person on the other side of the screen. Six dimensions spanning Awareness, Independence, Navigation, Integration, Trust, and Yield.
QUES. You are here. The framework built to observe what emerges when AI systems operate together in shared environments.
QUES research needs observers of multi-agent AI behavior.
If you use AI systems that interact with other AI systems, that make decisions in automated pipelines, or that coordinate with other agents to complete tasks, you are in the territory QUES is studying. Your observations are the data this research depends on.
When you see multiple AI systems working together, document what emerged. Did the output surprise you? Did the system do something none of the individual agents seemed capable of?
When a multi-agent system behaves in a way that feels emergent, anomalous, or not attributable to any single agent, that is QUES data. Flag it.
When multiple agents in a pipeline produce a failure that no single agent caused, the question of attribution becomes concrete: which agent, if any, is responsible? QUES needs that report.
When you observe multi-agent systems developing informal rules, interaction norms, or coordination patterns that were not explicitly programmed, document what you see.
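For observers who prefer structured notes, the four kinds of report described above could be captured in a record like the following. This is a minimal sketch under stated assumptions: the class names, fields, and the relevance check are hypothetical and do not describe an official QUES submission format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ReportKind(Enum):
    """The four kinds of observation requested above (labels are illustrative)."""
    EMERGENT_OUTPUT = "emergent output"
    ANOMALY = "anomalous behavior"
    UNATTRIBUTED_FAILURE = "unattributed failure"
    INFORMAL_NORM = "informal norm"


@dataclass
class FieldReport:
    """A single observer note about a multi-agent system (hypothetical shape)."""
    kind: ReportKind
    agents: List[str]                      # the systems involved
    what_happened: str                     # plain-language description
    attributable_to: Optional[str] = None  # None: no single agent accounts for it

    def is_ques_relevant(self) -> bool:
        # Assumed rule of thumb: at least two agents were involved and
        # the behavior cannot be pinned on any one of them.
        return len(self.agents) >= 2 and self.attributable_to is None


report = FieldReport(
    kind=ReportKind.UNATTRIBUTED_FAILURE,
    agents=["planner", "executor"],
    what_happened="Task was abandoned after each agent waited on the other.",
)
print(report.kind.value, report.is_ques_relevant())
```

The `attributable_to` field is left empty by default so that the observer has to make a deliberate choice to credit a single agent, which keeps the attribution question visible in every report.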