Muse Spark (Meta)
A Tier · 8.8/10
Meta's first model from its Superintelligence Lab -- natively multimodal with Contemplating mode for multi-agent reasoning
Benchmark Scores
Benchmarks for Muse Spark
| Benchmark | Description | Score |
|---|---|---|
| MMLU | Knowledge across 57 subjects | 89% |
| GPQA Diamond | Graduate-level science questions | 86% |
| HumanEval | Python code generation | 91% |
| Humanity's Last Exam | Frontier difficulty questions | 58% |
| HealthBench Hard | Medical-reasoning evaluation | 42.8% |
Last updated: 2026-04-19
The Good and the Bad
What we like
- Completely free to use via the Meta AI app and Meta.ai -- no subscription required for the full model
- Natively multimodal: handles text, images, and audio in a single architecture, not bolted-on adapters
- Contemplating mode orchestrates multiple agents reasoning in parallel -- competitive with Gemini Deep Think and GPT-5.4 Pro
- Runs on 10x less compute than comparable frontier models, according to Meta's benchmarks
- 260K token context window -- competitive with most frontier models
- Scores 52 on the Artificial Analysis Intelligence Index and 58% on Humanity's Last Exam in Contemplating mode
- HealthBench Hard 42.8 -- a real differentiator. Far ahead of Claude Opus 4.6 (14.8) and Gemini 3.1 Pro (20.6), and narrowly ahead of GPT-5.4 (40.1). Meta trained Muse Spark with 1,000+ licensed physicians in the loop, which shows up in medical-reasoning evaluations
- Distribution advantage: will reach billions of users across Facebook, Instagram, and WhatsApp
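To put the HealthBench Hard comparison above in perspective, here is a minimal sketch that computes Muse Spark's point lead over each rival. The scores are the ones quoted in this review; the `leads_over` helper and the dictionary layout are our own illustration, not anything published by Meta or the benchmark authors.

```python
# HealthBench Hard scores as quoted in this review (not independently verified).
healthbench_hard = {
    "Muse Spark": 42.8,
    "GPT-5.4": 40.1,
    "Gemini 3.1 Pro": 20.6,
    "Claude Opus 4.6": 14.8,
}


def leads_over(scores, model):
    """Return the point lead of `model` over every other entry, largest first."""
    base = scores[model]
    gaps = {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != model
    }
    # Sort by gap so the widest margin is listed first.
    return dict(sorted(gaps.items(), key=lambda item: item[1], reverse=True))


margins = leads_over(healthbench_hard, "Muse Spark")
for name, gap in margins.items():
    print(f"Muse Spark leads {name} by {gap} points")
```

The margins are wide against Claude Opus 4.6 and Gemini 3.1 Pro but only 2.7 points against GPT-5.4, so the medical-reasoning lead is real without being uniform.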
What could be better
- No public API yet -- developers can't integrate it into their own products
- Locked into Meta's ecosystem -- you need Meta accounts and their apps to access it
- Early reviews say it's 'competitive but not leading' -- outside HealthBench Hard, it doesn't clearly beat GPT-5.4 or Claude on benchmarks
- Privacy concerns given Meta's data practices across its social platforms
- No fine-tuning, no custom instructions, no equivalent of Custom GPTs or Claude Projects
Pricing
Free (Meta AI app)
- Full Muse Spark access
- Text, image, audio input
- Available on Meta.ai, Facebook, Instagram, WhatsApp
- Contemplating mode
API
- Private preview for select partners
- Paid API access coming later in 2026
- No public API yet
Known Issues
- No public API available as of April 2026 -- only a private preview for select enterprise partners. Paid API access is planned, but no date has been announced. Source: Meta AI blog, TechCrunch · 2026-04
- Contemplating mode adds significant latency (30+ seconds) as multiple agents reason in parallel before responding. Source: DataCamp review, Reddit r/artificial · 2026-04
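Why would parallel agents still add latency? A rough sketch, assuming the response waits for every agent to finish: the total wait is then set by the slowest agent rather than the sum of all of them. Meta has not published how Contemplating mode is implemented, so the `agent` and `contemplate` functions, the agent names, and the scaled-down delays below are all hypothetical stand-ins.

```python
import asyncio
import time


async def agent(name: str, think_seconds: float) -> str:
    """Hypothetical agent: the sleep stands in for reasoning time."""
    await asyncio.sleep(think_seconds)
    return f"{name}: draft answer"


async def contemplate(agent_delays: dict[str, float]) -> tuple[list[str], float]:
    """Run all agents concurrently and wait for every draft before responding."""
    start = time.perf_counter()
    drafts = await asyncio.gather(
        *(agent(name, delay) for name, delay in agent_delays.items())
    )
    elapsed = time.perf_counter() - start
    return list(drafts), elapsed


# Illustrative delays in seconds, scaled down from the 30+ seconds reported above.
delays = {"agent_a": 0.1, "agent_b": 0.2, "agent_c": 0.3}
drafts, elapsed = asyncio.run(contemplate(delays))
print(f"{len(drafts)} drafts in {elapsed:.2f}s; slowest agent took {max(delays.values()):.2f}s")
```

Under this assumption, adding more agents does not multiply the wait, but one slow agent holds up the whole reply, which fits the long pauses reviewers describe.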
Best for
Anyone who wants frontier-level AI for free. If you already use Meta's apps (Facebook, Instagram, WhatsApp), Muse Spark is the most accessible high-quality LLM at zero cost.
Not for
Developers who need API access for production apps (not available yet). Also not ideal for enterprise users who need data privacy guarantees given Meta's data handling practices.
Our Verdict
Muse Spark is Meta's strongest statement yet that frontier AI should be free. The model is genuinely competitive with GPT-5.4 and Claude on benchmarks, and Contemplating mode's multi-agent reasoning is a real differentiator. The catch? No API, no customization, and you're locked into Meta's ecosystem. For casual users who just want a great free chatbot, this is arguably the best deal in AI right now. Developers and enterprise users will have to wait.
Sources
- Meta AI official blog (accessed 2026-04-13)
- TechCrunch coverage (accessed 2026-04-13)
- Artificial Analysis benchmarks (accessed 2026-04-13)
- DataCamp review (accessed 2026-04-13)
Alternatives to Muse Spark (Meta)
Claude (Anthropic)
Anthropic's flagship LLM -- Opus 4.7 (launched April 16, 2026) with 1M-token context, high-res vision, new xhigh reasoning level, and the most natural conversational style. Note: 2026-04-04 policy excluded third-party agent harnesses (OpenClaw etc.) from Pro/Max flat-rate, and 2026-04-16 Enterprise pricing dropped bundled tokens
Claude Mythos Preview
Anthropic's most capable model -- a gated research preview via Project Glasswing, cybersecurity-specialized. 73% success on expert CTF tasks, 32-step autonomous network attacks. Not generally available.
Gemini (Google)
Google's LLM with deep Google Workspace integration, 2M token context window, and native code execution
Grok
xAI's irreverent chatbot with a direct line to X/Twitter -- real-time data meets unfiltered personality. Grok 4.3 production launched 2026-05-02 with Custom Voices cloning + Imagine Agent Mode + ~40% API price cut to $1.25/$2.50 per 1M tokens
GPT-Rosalind (OpenAI)
OpenAI's first domain-specific model -- life sciences, drug discovery, translational medicine. Launched 2026-04-16 as a Trusted Access research preview. Launch partners: Amgen, Moderna, Allen Institute, Thermo Fisher. Paired with a Life Sciences Codex plugin (50+ scientific tool integrations)
GPT-5.4-Cyber (OpenAI)
OpenAI's defensive-cybersecurity variant of GPT-5.4, launched 2026-04-16. Lowered refusal boundary for security-research tasks and native binary reverse-engineering. Access gated via Trusted Access for Cyber (TAC) program -- thousands of verified defenders, hundreds of teams, no public pricing
Hunyuan 3 (Tencent Hy3)
Tencent's Hy3 Preview launched 2026-04-23 -- 295B total / 21B active MoE, 256K context, open-sourced on HuggingFace under tencent/Hy3-preview. Cheapest frontier-class API at ~1.2 RMB per million input tokens. Integrated into Yuanbao, WeChat, QQ
MiMo (Xiaomi)
Xiaomi's MiMo-V2.5 family launched 2026-04-22 -- Pro (1T total / 42B active MoE, 1M context, native vision+audio reasoning), Multimodal base, TTS (3 sub-models: base, VoiceDesign, VoiceClone), and ASR (open-source, English + Chinese + major dialects). Full voice pipeline for the agent era. Extra-charge 1M-context tier removed at launch