May 10, 2025
Partnerships
Quora’s mission is to share and grow the world’s knowledge. This has been a driving force behind the creation of Poe, a powerful and versatile chat platform designed to bring the world’s leading AI models together in a single, accessible interface. Poe allows users to engage in seamless, natural conversations with the latest state-of-the-art large language models (LLMs) from top providers.
Today, we’re thrilled to announce the launch of Mercury Coder Small, our new diffusion-based large language model (dLLM), on Poe.com. Designed for speed, efficiency, and versatility, Mercury Coder Small represents a major leap forward in AI model performance, reducing response times by 5–10X.
The animation below provides a side-by-side comparison that highlights the performance advantage of Mercury Coder Small over conventional auto-regressive LLMs. Traditional LLMs generate tokens sequentially, and those per-token delays add up in real-time applications. Because our diffusion-based approach refines many tokens in parallel, Mercury Coder Small delivers output far more rapidly, enabling smoother, more interactive user experiences.

This launch is a critical step toward unlocking a new generation of real-time AI applications, particularly in domains where latency and responsiveness are paramount. We believe Mercury Coder Small will power smarter, faster coding assistants, enable voice-driven interfaces, and serve as a key building block for agentic AI systems that need to reason, react, and respond with minimal delay.
Poe users can start interacting with Mercury Coder Small immediately through the chat interface to explore its capabilities in natural language understanding and code generation. For developers and builders looking to integrate this high-speed dLLM into their own tools or platforms, the model is also accessible via the Poe API, making it easy to experiment, prototype, and scale new AI-driven experiences.
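For developers, here is a minimal sketch of what a programmatic request might look like using the fastapi_poe client library. The bot name "Mercury-Coder-Small" is an assumption for illustration; consult the Poe API documentation for the exact identifier and authentication details.

```python
# Minimal sketch: streaming a response from Mercury Coder Small via the Poe API.
# Assumes the fastapi_poe client library (pip install fastapi-poe) and an API key
# obtained from poe.com/api_key; the bot name "Mercury-Coder-Small" is an assumption.
import asyncio
import fastapi_poe as fp

async def main():
    message = fp.ProtocolMessage(
        role="user",
        content="Write a Python function that reverses a linked list.",
    )
    # get_bot_response streams partial responses as the model generates them,
    # so the speed advantage of the dLLM is visible in how quickly text arrives.
    async for partial in fp.get_bot_response(
        messages=[message],
        bot_name="Mercury-Coder-Small",
        api_key="YOUR_POE_API_KEY",
    ):
        print(partial.text, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
```

The same request shape works for any bot on Poe, which makes it straightforward to swap Mercury Coder Small into an existing prototype and compare latency directly.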

One of the key enhancements Poe provides when working with Mercury Coder Small is its support for multi-modal input, specifically the ability to incorporate both images and documents into the interaction. This functionality significantly broadens the range of use cases and allows for more context-rich prompts that go beyond plain text.
Poe users can engage with this feature either through the chat interface or via the Poe API. By uploading an image or a document, such as a PDF, screenshot, diagram, or technical paper, users can seamlessly include these files as part of their conversations with the model. Once uploaded, Poe processes these inputs and provides the relevant context to Mercury Coder Small, enabling it to generate more informed, accurate, and relevant responses.
This multi-modal capability unlocks powerful new possibilities: developers can ask questions about code snippets embedded in screenshots, students can get summaries of lengthy articles or research papers, and professionals can extract insights from visual data and documents—all in real time, and all powered by the fast, responsive nature of Mercury Coder Small. As a result, Poe becomes not just a conversational interface, but a highly capable assistant for a wide range of practical and advanced tasks involving diverse input formats.
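For API users, the sketch below illustrates how a document upload might be attached to a conversation. The upload_file helper and the attachments parameter are assumptions based on the fastapi_poe client, and the file name is hypothetical; check the Poe API documentation for the exact names and signatures.

```python
# Hypothetical sketch: attaching a local document to a Mercury Coder Small
# conversation through the Poe API. The upload_file helper and the attachments
# field are assumptions based on the fastapi_poe client; the file path and bot
# name are placeholders for illustration.
import asyncio
import fastapi_poe as fp

async def main():
    # Upload a local PDF so Poe can pass its contents to the model as context.
    with open("design_spec.pdf", "rb") as f:
        attachment = await fp.upload_file(file=f, api_key="YOUR_POE_API_KEY")

    message = fp.ProtocolMessage(
        role="user",
        content="Summarize the key requirements in the attached spec.",
        attachments=[attachment],
    )
    async for partial in fp.get_bot_response(
        messages=[message],
        bot_name="Mercury-Coder-Small",
        api_key="YOUR_POE_API_KEY",
    ):
        print(partial.text, end="", flush=True)

asyncio.run(main())
```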