Author(s): Jacob Sheikh
Mentor(s): Ozlem Uzuner, Information Systems and Technology
Abstract
In this work, we introduce Multi-Expert Debate (MED): an LLM framework for analysis. Analysis is an open-ended problem; given the same facts, different people draw different conclusions based on their background, personality, beliefs, and so on. In MED, LLM agents are each initialized with their own personas. All agents are given the same problem and the same knowledge, and, after reaching their own individual solutions, they debate with the other agents until the ensemble produces a single, refined idea. We also present SumRAG, a summary-based retrieval method to augment LLM generation. We believe this work will establish a valuable baseline against which other approaches to reasoning can be measured.
Audio Transcript
Hello. Today, I want to talk about explicit and implicit reasoning in language models. I want to guide this discussion with the question: How can you construct representations of the world in such a way that some agent can navigate those representations to solve problems? In other words, how can you construct an artificial general intelligence?
The first step in answering this question—and what we addressed in our work—was understanding how to navigate representations of knowledge, which is essentially how to reason. There are two approaches to reasoning in language models today: explicit reasoning and implicit reasoning. In our work, we focused on explicit reasoning.
Language models like ChatGPT have shown the ability to improve their responses through reasoning. Shown here is one example technique called chain of thought. On the left, the language model does not reason through its answer and simply outputs “11,” which is incorrect. On the right, the model verbalizes its thought process—explicit reasoning—and arrives at a more accurate answer. Explicit reasoning, therefore, is the process of articulating thoughts step by step to improve the final output. It has proven to be very effective.
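As a concrete illustration, here is a minimal sketch of direct prompting versus chain-of-thought prompting, assuming the OpenAI Python client; the model name, question, and prompt wording are illustrative and not the exact setup from our experiments.

```python
# Minimal sketch: direct prompting vs. chain-of-thought prompting.
# Assumes the OpenAI Python client; model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()
question = (
    "A store has 23 apples. It sells 20 of them and then buys 6 more. "
    "How many apples does the store have now?"
)

# Direct prompt: the model answers without verbalizing intermediate steps.
direct = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": question + " Answer with a number only."}],
)

# Chain-of-thought prompt: the model is asked to articulate its steps first,
# which tends to improve accuracy on multi-step problems.
cot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": question + " Let's think step by step."}],
)

print("Direct:", direct.choices[0].message.content)
print("Chain of thought:", cot.choices[0].message.content)
```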
The goal of our work was to synthesize explicit reasoning with other emerging techniques in language models, including retrieval-augmented generation (RAG)—querying a database—and multi-agent systems, where multiple LLMs interact. We aimed to combine all three into a unified framework to create the best of what’s currently available in LLM-based explicit reasoning.
Our work aims to produce a framework called Multi-Expert Debate (MED). In MED, we initialize multiple agents (LLMs), each with its own opinions and persona, and give all of them access to the same information via the same RAG setup. These agents debate and defend their decisions until they converge on a single, agreed-upon output.
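The sketch below shows one way such a debate loop could be structured, built on a simple ask(persona, prompt) helper wrapping the OpenAI Python client; the persona texts, round count, and convergence check are illustrative assumptions, not the exact MED implementation.

```python
# Minimal sketch of a multi-expert debate loop. The persona texts, number of
# rounds, and convergence check are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def ask(persona: str, prompt: str) -> str:
    """One LLM call in the voice of a given persona (model name is illustrative)."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"You are {persona}."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

PERSONAS = [
    "a cautious internist who weighs common diagnoses first",
    "an aggressive diagnostician who considers rare conditions",
    "an evidence-focused reviewer who demands supporting citations",
]

def debate(case: str, retrieved_context: str, max_rounds: int = 3) -> str:
    # Round 0: each agent forms an independent answer from the same evidence.
    answers = [
        ask(p, f"Context:\n{retrieved_context}\n\nCase:\n{case}\nGive your diagnosis and reasoning.")
        for p in PERSONAS
    ]

    for _ in range(max_rounds):
        # Each agent sees the others' answers and may revise or defend its own.
        revised = []
        for i, p in enumerate(PERSONAS):
            others = "\n\n".join(a for j, a in enumerate(answers) if j != i)
            revised.append(
                ask(p, f"Your answer:\n{answers[i]}\n\nOther experts said:\n{others}\n"
                       "Revise your diagnosis if convinced; otherwise defend it.")
            )
        answers = revised

        # Stop early if the agents have converged on the same conclusion.
        if len(set(a.strip().lower() for a in answers)) == 1:
            break

    # A final moderator call distills the debate into one refined answer.
    return ask("a neutral moderator", "Summarize the consensus diagnosis:\n" + "\n\n".join(answers))
```

The closing moderator call is one simple way to force the ensemble down to a single output even if the agents never fully agree; other aggregation rules, such as majority voting, would fit the same loop.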
This work was done in the context of medical care—specifically, decision support systems to assist clinicians in diagnosis. To support this, we implemented a summarization-based RAG pipeline using a dataset that includes foundational medical knowledge, case studies, and procedural guidance.
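The sketch below shows one way a summarization-based retrieval step could work: each source document is summarized, the summaries are embedded, and queries are matched against the summaries while the full documents are returned. It assumes sentence-transformers for embeddings; the summarize() placeholder stands in for an LLM summarization call, and the model name is illustrative rather than the pipeline used in this work.

```python
# Minimal sketch of summary-based retrieval. Assumes sentence-transformers;
# summarize() is a placeholder for an LLM summarization call.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def summarize(doc: str) -> str:
    # Placeholder: in practice this would be an LLM call producing a short summary.
    return doc[:500]

def build_index(documents: list[str]):
    # Embed the summaries, not the raw documents.
    summaries = [summarize(d) for d in documents]
    vectors = embedder.encode(summaries, normalize_embeddings=True)
    return documents, vectors

def retrieve(query: str, documents: list[str], vectors, k: int = 3) -> list[str]:
    # Match the query against summary embeddings, return the full documents.
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = vectors @ q  # cosine similarity, since embeddings are normalized
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]
```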
While the system is still under development, we aim to compare its performance with models that use implicit reasoning. In implicit reasoning, the model reasons internally without verbalizing steps. For example, given the question “Find the capital of the state containing Dallas,” the model might internally reason: “Dallas is in Texas, the capital of Texas is Austin,” and output “Austin” without showing its steps. This form of reasoning has been observed but is not always reliable.
The broader objective of our research is to explore implicit reasoning further. For now, we are building a strong explicit reasoning framework as a baseline for future comparison. We’ve also found interesting connections with neuroscience, particularly regarding disentangled representations, which play a key role in how reasoning may be structured.
We are hopeful our work will provide a valuable foundation for evaluating and developing implicit reasoning approaches in the future.