DR. WELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration

Nourzad, Narjes; Yang, Hanqing

Narjes Nourzad*† (SC), Hanqing Yang* (CM), Shiyu Chen (CM), Carlee Joe-Wong (CM)

SC: University of Southern California; CM: Carnegie Mellon University

NeurIPS 2025 Workshop • Bridging Language, Agent, and World Models for Reasoning and Planning (LAW)

* Indicates equal contribution.
† Work done during an internship at Carnegie Mellon University.

DR.WELL — At a Glance

DR.WELL is a decentralized neurosymbolic framework that enables embodied agents to cooperate under partial observability and limited communication through symbolic plans mediated by a shared world model.

The world model serves as a collective memory linking commitments, plans, and outcomes across episodes, so agents adapt without sharing step-by-step trajectories.

DR.WELL — Overview

What’s new

✓ Two-phase Negotiation Protocol

Agents enter a shared communication room, propose candidate tasks with rationales, and converge on a consensus allocation under the environment's rules. Communication is limited to these structured rounds; there is no “free chat”.

Proposal stage

Idle agents propose candidate tasks (e.g., which block to work on) with brief reasoning about feasibility and coordination needs (positioning, workload, team size).

Commitment stage

Agents review proposals and commit to a joint allocation that satisfies consensus and quorum (only tasks with sufficient participants proceed).
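The two stages can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the DR.WELL implementation: in DR.WELL the agents themselves review proposals and reach consensus, whereas here the commitment step is reduced to a deterministic tally that enforces only the quorum rule. The names `Proposal`, `commit`, the `quorum` mapping, and the block identifiers are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    agent: str      # identifier of the proposing (idle) agent
    task: str       # candidate task, e.g. which block to work on
    rationale: str  # brief reasoning shared during the proposal stage


def commit(proposals, quorum):
    """Group proposals by task and keep only tasks that reach quorum.

    `quorum` maps task -> minimum number of participants. This mapping is
    an assumed stand-in for the environment rules.
    Returns (allocation, unassigned_agents).
    """
    by_task = defaultdict(list)
    for p in proposals:
        by_task[p.task].append(p.agent)

    allocation, unassigned = {}, []
    for task, agents in by_task.items():
        if len(agents) >= quorum.get(task, 1):
            allocation[task] = sorted(agents)   # task proceeds
        else:
            unassigned.extend(agents)           # not enough participants
    return allocation, sorted(unassigned)


# Hypothetical proposal round with three idle agents.
proposals = [
    Proposal("a1", "block_heavy", "closest to it; needs two pushers"),
    Proposal("a2", "block_heavy", "can join a1"),
    Proposal("a3", "block_wide", "nobody else is nearby"),
]
allocation, unassigned = commit(
    proposals, quorum={"block_heavy": 2, "block_wide": 2}
)
```

In this example only `block_heavy` reaches its quorum, so `a1` and `a2` are allocated to it while `a3` stays unassigned and would propose again in a later round.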

Once committed, each agent generates its own symbolic plan via its LLM embodiment and executes it independently, without any communication, until the plan is finished. The process is asynchronous: agents resynchronize only when idle, then return to execution.
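One way to picture this asynchronous cycle is as a small per-agent state machine. The sketch below is an illustration only, assuming three phases and one symbolic action per tick; the `Phase` and `Agent` names and the action strings are hypothetical and do not come from the DR.WELL code.

```python
from enum import Enum, auto


class Phase(Enum):
    IDLE = auto()         # no active plan; may enter the communication room
    NEGOTIATING = auto()  # inside the room; the only phase with communication
    EXECUTING = auto()    # running a committed plan without communication


class Agent:
    def __init__(self, name):
        self.name = name
        self.phase = Phase.IDLE
        self.plan = []  # remaining symbolic actions

    def may_communicate(self):
        return self.phase is Phase.NEGOTIATING

    def enter_room(self):
        if self.phase is not Phase.IDLE:
            raise RuntimeError("only idle agents can enter the room")
        self.phase = Phase.NEGOTIATING

    def start_plan(self, plan):
        """Called after commitment; the agent leaves the room to execute."""
        if self.phase is not Phase.NEGOTIATING:
            raise RuntimeError("a plan starts only after negotiation")
        self.plan = list(plan)
        self.phase = Phase.EXECUTING if self.plan else Phase.IDLE

    def step(self):
        """Run one symbolic action; become idle when the plan is finished."""
        if self.phase is not Phase.EXECUTING:
            return None
        action = self.plan.pop(0)
        if not self.plan:
            self.phase = Phase.IDLE
        return action


a1, a2 = Agent("a1"), Agent("a2")
for agent in (a1, a2):
    agent.enter_room()
a1.start_plan(["align(block_1)", "push(block_1)"])
a2.start_plan(["align(block_2)", "rendezvous(block_2)", "push(block_2)"])

for _ in range(2):  # two ticks: a1 finishes, a2 is still executing
    a1.step()
    a2.step()
```

After two ticks `a1` is idle and free to re-enter the room, while `a2` keeps executing and cannot communicate until its own plan ends.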

✓ Cooperation through Dynamic Symbolic World Model

A shared symbolic memory captures evolving commitments, plans, and outcomes across episodes, enabling adaptive coordination and plan reuse.
It continuously organizes experience into layers, from concrete interactions to abstract task structures, so that each episode adds context to the next.

As agents act and negotiate, the WM links their symbolic plans with observed outcomes, gradually forming a structured understanding of teamwork.
Over time, this layered memory lets agents recall how past strategies succeeded or failed, adapt their plans to new conditions, and coordinate without constant communication.
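As a rough illustration of moving from a concrete layer to a more abstract one, the sketch below lifts concrete plans into argument-free templates and tallies how each template fared. The representation (action strings, the `<obj>` placeholder, the `episodes` records) is assumed for the example and is not taken from DR.WELL.

```python
import re


def abstract_action(action):
    """Replace concrete arguments with a placeholder.

    Example: 'push(block_3)' becomes 'push(<obj>)'.
    """
    return re.sub(r"\(.*?\)", "(<obj>)", action)


def abstract_plan(plan):
    """Lift a concrete plan to a hashable template."""
    return tuple(abstract_action(a) for a in plan)


# Concrete layer: hypothetical plans and their observed outcomes.
episodes = [
    {"plan": ["align(block_1)", "push(block_1)"], "success": True},
    {"plan": ["align(block_4)", "push(block_4)"], "success": True},
    {"plan": ["push(block_2)"], "success": False},
]

# Abstract layer: template -> (successes, attempts).
stats = {}
for ep in episodes:
    key = abstract_plan(ep["plan"])
    wins, tries = stats.get(key, (0, 0))
    stats[key] = (wins + int(ep["success"]), tries + 1)
```

Here two plans on different blocks collapse into the same align-then-push template, so evidence gathered on one block can inform planning for another.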

During negotiation, the WM serves as a guidebook, capturing how past tasks unfolded, what team compositions succeeded, and which strategies proved most effective.
During planning, it functions as a library, offering reusable templates and examples that help agents adapt proven strategies to new contexts.
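The two roles can be sketched as two read-only views over the same records. This is a minimal sketch under assumptions: the `Record` fields, the `WorldModel` class, and both query methods are hypothetical names chosen for illustration, and the real world model is richer than a flat list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    episode: int
    task: str
    team: tuple    # agents that committed to the task
    plan: tuple    # symbolic plan that was executed
    success: bool  # observed outcome


class WorldModel:
    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)

    def team_size_success(self, task):
        """Guidebook view: success rate per team size for a task."""
        tally = defaultdict(lambda: [0, 0])  # size -> [wins, tries]
        for r in self.records:
            if r.task == task:
                tally[len(r.team)][0] += int(r.success)
                tally[len(r.team)][1] += 1
        return {size: wins / tries for size, (wins, tries) in tally.items()}

    def successful_plans(self, task):
        """Library view: distinct plans that worked, most recent first."""
        seen, plans = set(), []
        ordered = sorted(self.records, key=lambda r: r.episode, reverse=True)
        for r in ordered:
            if r.task == task and r.success and r.plan not in seen:
                seen.add(r.plan)
                plans.append(r.plan)
        return plans


wm = WorldModel()
wm.add(Record(1, "block_heavy", ("a1",), ("push",), False))
wm.add(Record(2, "block_heavy", ("a1", "a2"), ("align", "push"), True))
wm.add(Record(3, "block_heavy", ("a1", "a2"), ("align", "push"), True))

rates = wm.team_size_success("block_heavy")
templates = wm.successful_plans("block_heavy")
```

During negotiation an agent could consult `rates` to argue that the heavy block needs two participants; during planning it could start from `templates` instead of planning from scratch.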

The result is an evolving, symbolic model of cooperation, one that grows richer with experience, turning episodic interactions into collective understanding.

DR.WELL — Experiments

Setup

We compare a zero-communication baseline against DR.WELL in the Cooperative Push Block environment. Both start with the same set of symbolic actions.

Baseline

Agents act independently, with no communication, zero-shot plans, and no shared state.

vs

DR.WELL

Agents use the structured two-phase negotiation protocol and a shared symbolic world model, but no "free talk".

Results

In repeated runs, DR.WELL completes nearly all block-push tasks while maintaining a consistent division of roles and reducing redundant effort. By Episode 5, coordination stabilizes, showing how symbolic reasoning and structured memory turn episodic learning into smooth collaboration.

✓ Takeaways

Symbolic reasoning → adaptation: agents learn reusable strategies rather than memorized trajectories.
Decentralized yet aligned: coordination emerges through structured memory, not central control.
Interpretable collaboration: the world model reveals evolving roles and tasks.

DR.WELL — Intended Audience

Who should use this

Researchers exploring embodied multi-agent planning, LLM-assisted cooperation, or neurosymbolic reasoning, especially when interpretable, decentralized coordination is needed under limited communication.

Cite DR.WELL

If you find this framework useful in your research, please cite:

@inproceedings{nourzad2025drwell,
  title     = {DR. WELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration},
  author    = {Nourzad, Narjes and Yang, Hanqing and Chen, Shiyu and Joe-Wong, Carlee},
  booktitle = {Workshop on Bridging Language, Agent, and World Models for Reasoning and Planning},
  year      = {2025}
}