Grok

B Tier · 7.5/10

xAI's irreverent chatbot with a direct line to X/Twitter -- real-time data meets unfiltered personality. Grok 4.3 launched to production on 2026-05-02 with Custom Voices voice cloning, Imagine Agent Mode, and an API price cut (~40% on input, ~60% on output) to $1.25/$2.50 per 1M tokens.

Last updated: 2026-05-14 · Free tier available

Score Breakdown

Category	Score
Ease of Use	7.0
Output Quality	7.5
Value	7.5
Features	8.0
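The overall 7.5/10 is consistent with an unweighted mean of the four sub-scores. The review does not say how the overall score is derived, so the averaging rule in this sketch is an assumption, and the variable names are illustrative.

```python
from statistics import mean

# Sub-scores as listed in the Score Breakdown.
sub_scores = {
    "Ease of Use": 7.0,
    "Output Quality": 7.5,
    "Value": 7.5,
    "Features": 8.0,
}

# Assumption: the overall score is a plain, unweighted average.
overall = mean(list(sub_scores.values()))
print(f"Overall: {overall:.1f}/10")
```

If the site weights categories differently, the result would shift, but with these four values any weighting lands between 7.0 and 8.0.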

Benchmark Scores

Benchmarks for Grok 4.20

Chatbot Arena Elo (human preference rating): 1420

Benchmark	Description	Score
MMLU	Knowledge across 57 subjects	88.5%
GPQA Diamond	Graduate-level science questions	85%
HumanEval	Python code generation	90%
Humanity's Last Exam	Frontier difficulty questions	50.7%

Last updated: 2026-04-13

Personality & Tone

The irreverent contrarian

Tone: Casual, jokey, and willing to swear. Grok takes strong positions without hedging, leans into an edgy 'based' persona, and cracks jokes far more often than Claude, ChatGPT, or Gemini.

Quirks: Engages with topics other chatbots refuse, pulls live context from X so it reflects whatever is trending that hour, and will freely mock things -- including itself. In SuperGrok's multi-agent mode it can sound like several personalities arguing with each other.

The Good and the Bad

What we like

+Real-time access to X/Twitter data is genuinely useful for tracking breaking news and trending topics
+Grok 3 benchmarks are competitive with GPT-4o and Claude 3.5 -- this is not a vanity project anymore
+The personality is refreshing if you're tired of overly cautious AI assistants -- it'll actually joke around
+DeepSearch mode does solid multi-step research, pulling from web and X data simultaneously

What could be better

−The snarky personality gets old fast when you're trying to get serious work done
−Tied to the X ecosystem -- you need an X account, and the real-time data skews toward X's user base
−SuperGrok at $30/mo is steep when Claude Pro and ChatGPT Plus are $20 with arguably better core models
−Image generation and analysis capabilities lag behind what you get from ChatGPT or Gemini

Pricing

Free

✓~10 prompts per 2 hours
✓Basic Grok access
✓Requires X account

X Premium

$8/month

✓Higher query limits
✓Grok 4.20 access
✓Bundled X social features

X Premium+

$40/month

✓Higher Grok 4.20 access
✓Ad-free X
✓Priority responses

SuperGrok

$30/month

✓Full Grok 4.20 (4-agent multi-agent system)
✓DeepSearch mode
✓Highest rate limits
✓Think mode
✓$300/yr option (~16.7% off)
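The annual discount follows from the two SuperGrok prices listed above. A minimal sketch of the arithmetic, using a hypothetical helper name:

```python
def annual_discount_pct(monthly_price: float, annual_price: float) -> float:
    """Percent saved by paying annually instead of 12 monthly payments."""
    full_year = monthly_price * 12
    return (full_year - annual_price) / full_year * 100


# SuperGrok: $30/month versus the $300/yr option.
print(f"{annual_discount_pct(30, 300):.1f}% off")
```

Twelve months at $30 comes to $360, so the $300 annual plan saves $60, or about one sixth of the monthly-billed total.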

SuperGrok Heavy

$300/month

✓Grok 4 Heavy model
✓Highest priority
✓Multi-agent at scale
✓Note: Grok 4.3 beta-gating ended 2026-05-02

API (Grok 4.3)

$1.25 / $2.50 per 1M tokens (input / output)

✓Production launch 2026-05-02 (~40% input / ~60% output price cut vs 4.20)
✓1M context window
✓Reasoning tokens billed at output rate
✓Native video input + PDF/PPT/spreadsheet output
✓Custom Voices voice cloning free on console (80+ presets, 28 languages)
✓Imagine Agent Mode (creative workflow agent, beta)
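To see what the API rates above mean per request, here is a rough cost estimator. It is a sketch under the pricing quoted in this review, not an xAI tool: `estimate_cost` and the token counts are hypothetical, and it assumes reasoning tokens are simply added to output tokens for billing.

```python
# Rates quoted in this review for the Grok 4.3 API (USD per 1M tokens).
INPUT_RATE_PER_M = 1.25
OUTPUT_RATE_PER_M = 2.50


def estimate_cost(input_tokens: int, output_tokens: int, reasoning_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper, not an xAI API).

    Reasoning tokens are charged at the output rate, per the pricing notes above.
    """
    billed_output = output_tokens + reasoning_tokens
    cost = (input_tokens * INPUT_RATE_PER_M + billed_output * OUTPUT_RATE_PER_M) / 1_000_000
    return round(cost, 6)


# Example: 200K tokens in, 5K visible tokens out, 20K reasoning tokens.
print(estimate_cost(200_000, 5_000, 20_000))
```

Because reasoning is on by default, the reasoning-token count can dominate the output side of the bill even when the visible answer is short.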

Known Issues

PRODUCT (2026-05-14): xAI launched **Grok Build CLI** in early beta -- an agentic, terminal-native CLI for coding, app development, and workflow automation. It spawns up to **8 concurrent agents** in parallel and is powered by Grok 4.3 beta with a 16-agent Heavy architecture and a **2M token context window**. Vendor-primary launch posts are at x.ai/news/grok-build-cli and x.ai/cli, alongside Musk's public invitation to wider beta testers on X. **Access gate**: launched first to the SuperGrok Heavy tier ($299/mo, intro offer $99/mo for 6 months) -- not yet available to standard Premium / SuperGrok subscribers. This positions Grok as a direct competitor to Claude Code, Codex CLI, and Cursor CLI for terminal-first agentic coding workflows. The 8-agent parallelism plus 2M context is the differentiating feature -- the longest context window of any production coding CLI as of this date. Source: xAI news (x.ai/news/grok-build-cli), xAI product page (x.ai/cli), Musk on X · 2026-05-14
xAI joined SpaceX on 2026-02-02 -- SpaceX acquired xAI. Procurement, billing, and compliance workflows now route through SpaceX's vendor pipeline. For regulated industries (healthcare, finance, US government) this may require re-qualifying xAI as a vendor even if Grok itself was previously approved. Source: xAI announcement (x.ai/news/xai-joins-spacex), SpaceX updates · 2026-02
Grok Speech (STT + TTS) APIs launched 2026-04-17 as separate products from the chatbot -- see /tools/grok-voice on this site. They are built on the same stack Grok Voice uses and are not included in Premium/SuperGrok consumer tiers; they are billed separately at $0.10/hr for batch STT and $4.20 per 1M characters for TTS. Source: xAI Grok STT/TTS announcement · 2026-04
Real-time X data can surface misinformation from viral posts without adequate fact-checking. Source: Reddit r/artificial · 2026-02
Free tier rate limits are aggressive -- many users report hitting caps within a few queries. Source: X/Twitter user reports · 2026-03
Grok 4.20's 4-agent system (Grok, Harper, Benjamin, Lucas) can take 30+ seconds on complex queries as the agents debate internally. Grok 4.20 Beta 2 (landed ~2026-04-07) improved instruction-following, reduced hallucinations, and improved LaTeX and image search, which partially addresses the slowness and reliability complaints from early 4.20 feedback. Source: Reddit r/grok, IBTimes · 2026-04
PRODUCTION LAUNCH (2026-05-02): Grok 4.3 went broadly available beyond the SuperGrok Heavy beta. New consumer + API features: **Custom Voices voice cloning suite** (clone a voice from ~1 minute of speech in <2 minutes, two-stage passphrase + speaker-embedding consent gate, 80+ preset voices, 28 languages, free on console); **Imagine Agent Mode** (creative production workflow agent, beta); native video input + reasoning-by-default; native PDF / PowerPoint / spreadsheet output. **API pricing: $1.25 input / $2.50 output per 1M tokens** -- ~40% input cut + ~60% output cut vs Grok 4.20. 1M context window. Reasoning tokens billed at output rate. xAI's pattern is to ship silently via the grok.com model selector + console UI rather than a vendor blog post, so vendor-primary verification comes from grok.com itself plus 4+ tier-1 press sources (VentureBeat, Winbuzzer, The Decoder, Phemex). Source: VentureBeat (venturebeat.com/technology/xai-launches-grok-4-3-at-an-aggressively-low-price-and-a-new-fast-powerful-voice-cloning-suite), Winbuzzer 2026-05-03, The Decoder, grok.com console · 2026-05-02
Grok 4.3 Beta dropped 2026-04-17 as a SuperGrok Heavy exclusive ($300/mo tier). Elon Musk clarified on 2026-04-18 that the live checkpoint is ~0.5T params and that the full 1T version was ~5 days from finishing training. Beta gating ended 2026-05-02 with the broader rollout (see the production launch entry above). Source: PiunikaWeb, BuildFastWithAI, xAI release notes, Musk posts on X (2026-04-18) · 2026-04
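Since the Grok Speech APIs noted above are billed separately from the consumer tiers, a quick estimate helps when budgeting. This sketch uses only the two rates quoted in this review; the function names and workload figures are hypothetical, and it assumes linear billing with no minimums or volume discounts.

```python
# Rates quoted in this review for the Grok Speech APIs (USD).
STT_BATCH_RATE_PER_HOUR = 0.10      # batch speech-to-text, per audio hour
TTS_RATE_PER_M_CHARS = 4.20         # text-to-speech, per 1M characters


def stt_batch_cost(audio_hours: float) -> float:
    """Estimated cost of transcribing `audio_hours` of audio in batch mode."""
    return round(audio_hours * STT_BATCH_RATE_PER_HOUR, 2)


def tts_cost(characters: int) -> float:
    """Estimated cost of synthesizing `characters` characters of text."""
    return round(characters / 1_000_000 * TTS_RATE_PER_M_CHARS, 2)


# Example workload: 50 hours of recordings and 250K characters of narration.
print(stt_batch_cost(50), tts_cost(250_000))
```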

Best for

People who live on X/Twitter and want an AI that can tap into that data in real-time. Also good for users who find mainstream chatbots too sanitized and want something with more personality.

Not for

Enterprise users who need reliable, consistent outputs. Also not the best pick if you don't use X -- the real-time data advantage disappears and you're left with a solid-but-not-best-in-class LLM.

Our Verdict

Grok has come a long way from being dismissed as Elon's pet project. The Grok 3 models are legitimately competitive, and the real-time X integration is a differentiator no other chatbot can match. But the value proposition gets muddier when you strip away the X angle -- at $30/mo for SuperGrok, you're paying a premium for personality and Twitter data. If those matter to you, Grok is great. If not, Claude or ChatGPT give you more for less.

Sources

VentureBeat: xAI launches Grok 4.3 with voice cloning (2026-05-02) (accessed 2026-05-05)
Winbuzzer: xAI Grok 4.3 + Custom Voices (2026-05-03) (accessed 2026-05-05)
xAI official site (accessed 2026-04-17)
xAI Grok 4.20 announcement (accessed 2026-04-17)
IBTimes: Grok 4.20 Beta 2 April 2026 (accessed 2026-04-17)
BuildFastWithAI: Grok 4.3 Beta 2026-04-17 (accessed 2026-04-17)
Artificial Analysis: Grok 4.20 (accessed 2026-04-17)
Reddit r/grok, r/artificial (accessed 2026-04-17)

Alternatives to Grok

Claude (Anthropic)

Anthropic's flagship LLM -- Opus 4.7 (launched April 16, 2026) with 1M-token context, high-res vision, new xhigh reasoning level, and the most natural conversational style. Note: 2026-04-04 policy excluded third-party agent harnesses (OpenClaw etc.) from Pro/Max flat-rate, and 2026-04-16 Enterprise pricing dropped bundled tokens

8.5/10

Free tier · From $0

Best writing quality of any LLM -- Opus ...
1M token context window for enterprise A...

Updated 2026-05-14

Claude Mythos Preview

Anthropic's most capable model -- a gated research preview via Project Glasswing, cybersecurity-specialized. 73% success on expert CTF tasks, 32-step autonomous network attacks. Not generally available.

6.5/10

Invite only

The most capable Anthropic model availab...
73% success rate on expert-level Capture...

Updated 2026-04-20

Gemini (Google)

Google's LLM with deep Google Workspace integration, 2M token context window, and native code execution

8.3/10

Free tier · From $0

2 million token context window is the la...
Best Google Workspace integration (Gmail...

Updated 2026-05-13

Muse Spark (Meta)

Meta's first model from its Superintelligence Lab -- natively multimodal with Contemplating mode for multi-agent reasoning

8.8/10

Free tier · From $0

Completely free to use via Meta AI app a...
Natively multimodal: handles text, image...

Updated 2026-04-19

GPT-Rosalind (OpenAI)

OpenAI's first domain-specific model -- life sciences, drug discovery, translational medicine. Launched 2026-04-16 as a Trusted Access research preview. Launch partners: Amgen, Moderna, Allen Institute, Thermo Fisher. Paired with a Life Sciences Codex plugin (50+ scientific tool integrations)

6.8/10

Invite only

OpenAI's first named vertical/domain-spe...
Launch partners Amgen, Moderna, Allen In...

Updated 2026-04-17

GPT-5.4-Cyber (OpenAI)

OpenAI's defensive-cybersecurity variant of GPT-5.4, launched 2026-04-16. Lowered refusal boundary for security-research tasks and native binary reverse-engineering. Access gated via Trusted Access for Cyber (TAC) program -- thousands of verified defenders, hundreds of teams, no public pricing

7.2/10

Pricing not publicly disclosed

Directly competes with Claude Mythos Pre...
Lowered refusal boundary on defensive-se...

Updated 2026-04-19

Hunyuan 3 (Tencent Hy3)

Tencent's Hy3 Preview launched 2026-04-23 -- 295B total / 21B active MoE, 256K context, open-sourced on HuggingFace under tencent/Hy3-preview. Cheapest frontier-class API at ~1.2 RMB per million input tokens. Integrated into Yuanbao, WeChat, QQ

8.1/10

Free tier · From $0

Open weights from a top-3 Chinese tech c...
Pricing is aggressive. ~1.2 RMB per mill...

Updated 2026-04-25

MiMo (Xiaomi)

Xiaomi's MiMo-V2.5 family launched 2026-04-22 -- Pro (1T total / 42B active MoE, 1M context, native vision+audio reasoning), Multimodal base, TTS (3 sub-models: base, VoiceDesign, VoiceClone), and ASR (open-source, English + Chinese + major dialects). Full voice pipeline for the agent era. Extra-charge 1M-context tier removed at launch

8.3/10

Free tier · From $0

Full voice pipeline shipped together: a ...
Native multimodal in MiMo-V2.5-Pro is th...

Updated 2026-04-25