DR. WELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration

University of Southern California, Carnegie Mellon University
NeurIPS 2025 Workshop • Bridging Language, Agent, and World Models for Reasoning and Planning (LAW)

* Indicates equal contribution.
Work done during an internship at Carnegie Mellon University.
DR.WELL — At a Glance

DR.WELL is a decentralized neurosymbolic framework that enables embodied agents to cooperate under partial observability and limited communication through symbolic plans mediated by a shared world model.

The world model serves as a collective memory linking commitments, plans, and outcomes across episodes, so agents adapt without sharing step-by-step trajectories.

DR.WELL — Overview

What’s new

Proposal stage
Idle agents propose candidate tasks (e.g., which block to work on) with brief reasoning about feasibility and coordination needs (positioning, workload, team size).
Commitment stage
Agents review proposals and commit to a joint allocation that satisfies consensus and quorum (only tasks with sufficient participants proceed).
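
The two stages above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Proposal` type, the `commit` function, the task names, and the per-task quorum table are hypothetical, and each agent's choice is hard-coded here, whereas in DR.WELL it would come from the agent's LLM.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    agent: str   # proposing agent (hypothetical identifier)
    task: str    # candidate task, e.g. a block to push
    reason: str  # brief feasibility / coordination note


def commit(proposals, choices, quorum):
    """Group agents by the proposed task they chose and keep only the
    tasks whose team size meets the quorum; other agents stay idle."""
    proposed = {p.task for p in proposals}
    teams = defaultdict(list)
    for agent, task in sorted(choices.items()):
        if task in proposed:  # ignore choices nobody proposed
            teams[task].append(agent)
    return {
        task: members
        for task, members in teams.items()
        if len(members) >= quorum.get(task, 1)
    }


# Proposal stage: idle agents suggest tasks with short justifications.
proposals = [
    Proposal("a1", "block_heavy", "needs two pushers; I am closest"),
    Proposal("a3", "block_light", "one agent is enough"),
    Proposal("a4", "block_medium", "needs two pushers"),
]

# Commitment stage: each agent picks one proposed task (assumed choices).
choices = {
    "a1": "block_heavy",
    "a2": "block_heavy",
    "a3": "block_light",
    "a4": "block_medium",
}
quorum = {"block_heavy": 2, "block_light": 1, "block_medium": 2}

allocation = commit(proposals, choices, quorum)
print(allocation)
```

In this sketch `block_medium` attracts only one agent, so it fails its quorum of two and is dropped from the allocation; agent `a4` would remain idle and take part in the next negotiation round.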

Once committed, each agent generates its own symbolic plan via its LLM embodiment, and executes independently without any communication until a plan is finished. The process is asynchronous: agents resynchronize when idle, then return to execution.
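
The asynchronous cycle described above can be illustrated with a small simulation. It is a sketch only: the `Agent` class, the `run` loop, the discrete ticks, and the action names (`align`, `push`) are assumptions made for illustration, and the symbolic plans are hard-coded rather than generated by an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    plan: list = field(default_factory=list)  # remaining symbolic actions

    @property
    def idle(self):
        return not self.plan

    def step(self):
        """Execute one symbolic action; no messages are exchanged here."""
        if self.plan:
            self.plan.pop(0)


def run(agents, ticks):
    """Advance all agents tick by tick and record when each one finishes
    its plan, i.e. when it would become idle and resynchronize."""
    sync_log = []
    for t in range(1, ticks + 1):
        finished = []
        for agent in agents:
            was_busy = not agent.idle
            agent.step()
            if was_busy and agent.idle:
                finished.append(agent.name)
        if finished:
            sync_log.append((t, finished))
    return sync_log


agents = [
    Agent("a1", ["align", "push", "push"]),
    Agent("a2", ["align", "push"]),
]
sync_log = run(agents, ticks=4)
print(sync_log)
```

Because the plans differ in length, the two agents become idle at different ticks, which is the sense in which resynchronization is asynchronous: an agent returns to negotiation when its own plan ends, not on a global schedule.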
[Figures: Plan Synchronization Diagram; Symbolic Execution Example]

As agents act and negotiate, the world model (WM) links their symbolic plans with observed outcomes, gradually forming a structured understanding of teamwork.
Over time, this layered memory lets agents recall how past strategies succeeded or failed, adapt their plans to new conditions, and coordinate without constant communication.

  • During negotiation, the WM serves as a guidebook, capturing how past tasks unfolded, what team compositions succeeded, and which strategies proved most effective.
  • During planning, it functions as a library, offering reusable templates and examples that help agents adapt proven strategies to new contexts.
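
The two roles listed above can be pictured as two queries over one shared log. The following is a hedged sketch rather than the DR.WELL data model: the `Record` fields, the `WorldModel` class, both method names, and the example tasks and plans are assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    episode: int
    task: str
    team_size: int
    plan: tuple    # symbolic plan as a tuple of action names
    success: bool


class WorldModel:
    """Shared symbolic memory: stores plans and outcomes, not trajectories."""

    def __init__(self):
        self._records = []

    def log(self, record):
        self._records.append(record)

    def success_rate_by_team_size(self, task):
        """Guidebook view: how often each team size succeeded on a task."""
        wins, totals = defaultdict(int), defaultdict(int)
        for r in self._records:
            if r.task == task:
                totals[r.team_size] += 1
                wins[r.team_size] += int(r.success)
        return {size: wins[size] / totals[size] for size in totals}

    def plan_templates(self, task):
        """Library view: distinct successful plans, most recent first."""
        seen, templates = set(), []
        for r in reversed(self._records):
            if r.task == task and r.success and r.plan not in seen:
                seen.add(r.plan)
                templates.append(r.plan)
        return templates


wm = WorldModel()
wm.log(Record(1, "block_heavy", 1, ("align", "push"), False))
wm.log(Record(2, "block_heavy", 2, ("align", "rendezvous", "push"), True))
wm.log(Record(3, "block_heavy", 2, ("align", "rendezvous", "push"), True))

rates = wm.success_rate_by_team_size("block_heavy")
templates = wm.plan_templates("block_heavy")
print(rates, templates)
```

During negotiation an agent could consult `rates` to argue that the heavy block needs two participants; during planning it could start from an entry in `templates` instead of planning from scratch.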

The result is an evolving, symbolic model of cooperation, one that grows richer with experience, turning episodic interactions into collective understanding.

DR.WELL — Experiments

Setup

We compare a zero-communication baseline against DR.WELL in the Cooperative Push Block environment. Both conditions start with the same set of symbolic actions.

Baseline

Agents act independently: no communication, zero-shot plans, and no shared state.

vs
DR.WELL

Agents use a structured two-phase negotiation protocol and a shared symbolic world model, but no "free talk".


Results

[Figures: Baseline Block Completion Heatmap; DR.WELL Block Completion Heatmap]

In repeated runs, DR.WELL completes nearly all block-push tasks while maintaining a consistent division of roles and reducing redundant effort. By Episode 5, coordination stabilizes, showing how symbolic reasoning and structured memory turn episodic learning into smooth collaboration.

[Figure: Timing Results for DR.WELL]
  • Symbolic reasoning → adaptation: agents learn reusable strategies rather than memorized trajectories.
  • Decentralized yet aligned: coordination emerges through structured memory, not central control.
  • Interpretable collaboration: the world model reveals evolving roles and tasks.
DR.WELL — Intended Audience

Who should use this

Researchers exploring embodied multi-agent planning, LLM-assisted cooperation, or neurosymbolic reasoning, especially when interpretable, decentralized coordination is needed under limited communication.

Cite DR.WELL

If you find this framework useful in your research, please cite:

@inproceedings{nourzad2025drwell,
  title     = {DR. WELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration},
  author    = {Nourzad, Narjes and Yang, Hanqing and Chen, Shiyu and Joe-Wong, Carlee},
  booktitle = {Workshop on Bridging Language, Agent, and World Models for Reasoning and Planning},
  year      = {2025}
}