Live Platform Data
Drux Activity
Real-world AI model performance from every query run on Drux. Speed, reliability, and consensus measured on actual user questions — not controlled benchmarks.
173 searches · 544 model calls · updated every 5 min
Platform Summary
173 total searches completed. 544 individual model calls recorded. Average consensus score: 75%. Model completion rate: 81%.
Top Models by Speed
- inception/mercury-2: avg 2.8s, 100% reliability, 8 queries
- microsoft/phi-4: avg 7.4s, 98% reliability, 40 queries
- x-ai/grok-4.3: avg 7.8s, 100% reliability, 3 queries
- cohere/command-r-08-2024: avg 8.0s, 100% reliability, 6 queries
- openai/gpt-oss-120b: avg 9.7s, 70% reliability, 50 queries
Most Agreed Questions
- What is the impact of history on food habits in India? — consensus score 100%
- is techcrunch battlefield the biggest startup showcase in the world? — consensus score 90%
- Is techcrunch disrupt the biggest startup showcase in the world? — consensus score 90%
- Amdahl's Law for LLM generated code — consensus score 90%
- Is $300/HR too low these days for custom full stack? — consensus score 90%
Platform Pulse
- Total Searches: 173
- This Week: 173 (last 7 days)
- Today: 5
- Model Calls: 544 (across all searches)
- Avg Consensus Score: 75% (across all completed searches)
- Model Completion Rate: 81% (calls that returned successfully)
Model Leaderboard — on Drux
Last 173 searches
| Rank | Model | Tier | Avg Speed |
|---|---|---|---|
| 1 | Mercury 2 | Free | 2.8s |
| 2 | Phi-4 | Free | 7.4s |
| 3 | Grok 4.3 | Paid | 7.8s |
| 4 | Command R | Free | 8.0s |
| 5 | GPT OSS 120B | Free | 9.7s |
| 6 | Gemini 3.5 Flash | Paid | 11.9s |
| 7 | Llama 4 Maverick | Free | 12.2s |
| 8 | Hermes 3 70B | Free | 12.7s |
| 9 | Seed 1.6 | Free | 13.4s |
| 10 | Gemma 3 27B | Free | 14.1s |
| 11 | Qwen3 235B | Free | 14.4s |
| 12 | DeepSeek V3 | Free | 15.3s |
| 13 | Claude Sonnet 4.6 | Paid | 15.9s |
| 14 | GPT-5.5 | Paid | 20.2s |
| 15 | ERNIE 4.5 300B | Free | 21.6s |
| 16 | Nemotron 49B | Free | 21.9s |
| 17 | GPT-5 Mini | Free | 22.6s |
| 18 | Gemini 2.5 Flash | Free | n/a |
- ⚡ Fastest on Drux: Mercury 2 (avg 2.8s, 100% reliable)
- ✓ Most Reliable on Drux: Mercury 2 (100% completion, 8 calls)
Updated every 5 min · Data from last 1,000 searches
About This Data
How is this different from other AI benchmarks?
Most AI benchmarks are run under controlled laboratory conditions on standardised test sets. Drux Activity measures performance on real user questions — diverse, unpredictable, and representative of actual use. Speed and reliability numbers here reflect what users actually experience.
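As a rough illustration of how per-model speed and reliability figures like the ones on this page could be derived from raw call logs, here is a minimal sketch. The record fields (`model`, `seconds`, `ok`), the sample values, and the choice to average latency over successful calls only are assumptions made for illustration; the page does not describe Drux's actual pipeline.

```python
from collections import defaultdict


def summarize_calls(calls):
    """Aggregate call records into per-model stats.

    Each record is a dict with hypothetical keys:
      "model":   model name
      "seconds": latency in seconds (None if the call failed)
      "ok":      True if the call returned successfully
    """
    by_model = defaultdict(list)
    for call in calls:
        by_model[call["model"]].append(call)

    summary = {}
    for model, rows in by_model.items():
        # Assumption: average speed is taken over successful calls only.
        ok_times = [r["seconds"] for r in rows if r["ok"]]
        summary[model] = {
            "calls": len(rows),
            "avg_seconds": sum(ok_times) / len(ok_times) if ok_times else None,
            "reliability": len(ok_times) / len(rows),
        }
    return summary


def leaderboard_by_speed(summary):
    """Return model names ordered fastest first, skipping models with no timing."""
    timed = [m for m, s in summary.items() if s["avg_seconds"] is not None]
    return sorted(timed, key=lambda m: summary[m]["avg_seconds"])


# Made-up records, not Drux data.
example_calls = [
    {"model": "model-a", "seconds": 2.0, "ok": True},
    {"model": "model-a", "seconds": 4.0, "ok": True},
    {"model": "model-b", "seconds": 10.0, "ok": True},
    {"model": "model-b", "seconds": None, "ok": False},
]

example_summary = summarize_calls(example_calls)
example_ranking = leaderboard_by_speed(example_summary)
```

A model with no successful calls gets no average speed and drops out of the speed ranking, which is one plausible reason a leaderboard row could show no timing value.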
What is a Consensus Win?
When multiple models answer the same question and their responses agree closely (consensus score ≥ 7/10), all models that responded are credited with a Consensus Win. A high Consensus Win rate means a model consistently lands on the same answer as its peers, which is a signal of reliability beyond accuracy alone.
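The crediting rule described here can be sketched in a few lines. This is a minimal illustration, assuming each search is stored with a 0 to 10 consensus score and a list of the models that responded; the field names (`consensus_score`, `responded`) and the sample data are hypothetical, and only the ≥ 7/10 threshold comes from the text.

```python
from collections import Counter

CONSENSUS_THRESHOLD = 7  # from the text: a score of 7/10 or higher counts as consensus


def credit_consensus_wins(searches):
    """Count Consensus Wins per model.

    Every model that responded to a search whose score meets the
    threshold is credited with one win for that search.
    """
    wins = Counter()
    for search in searches:
        if search["consensus_score"] >= CONSENSUS_THRESHOLD:
            wins.update(set(search["responded"]))
    return wins


def consensus_win_rate(searches, model):
    """Fraction of the searches a model answered that ended in consensus."""
    answered = [s for s in searches if model in s["responded"]]
    if not answered:
        return None  # the model never responded, so no rate is defined
    won = sum(1 for s in answered if s["consensus_score"] >= CONSENSUS_THRESHOLD)
    return won / len(answered)


# Made-up searches, not Drux data.
example_searches = [
    {"consensus_score": 9, "responded": ["model-a", "model-b", "model-c"]},
    {"consensus_score": 5, "responded": ["model-a", "model-b"]},
    {"consensus_score": 7, "responded": ["model-b", "model-c"]},
]

example_wins = credit_consensus_wins(example_searches)
```

In this toy data `model-c` only took part in searches that reached consensus, so its win rate is 1.0, while `model-a` took part in one consensus search out of two.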
Why do free models have more queries?
Drux randomly selects models for each search. Free-tier models are included in more searches because they are available to all users. Paid models are selected when users opt into premium tiers. Query count reflects availability and tier distribution, not quality — sort by Speed or Reliability for a fairer comparison.
How often is this updated?
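The leaderboard updates every 5 minutes, pulling from the last 1,000 completed searches. As query volume grows, each model's averages rest on a larger sample and become more stable. Until then, figures for models with only a handful of calls (for example, 3 or 8 queries) should be read with caution.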
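One way to see how much sample size matters is to put a confidence interval around a reliability figure. The sketch below uses the Wilson score interval at roughly 95% confidence (z = 1.96); that choice of method is an assumption for illustration, not something this page says Drux computes.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a success rate.

    Returns (lower, upper) bounds on the true rate given
    `successes` out of `total` calls.
    """
    if total == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# 8 successful calls out of 8: a 100% observed rate on a small sample.
small_low, small_high = wilson_interval(8, 8)

# The same 100% observed rate on a hypothetical 800 calls.
large_low, large_high = wilson_interval(800, 800)
```

With 8 out of 8 calls the lower bound is about 0.68, so a 100% reliability figure on 8 calls is still compatible with a true rate well below 100%. The same observed rate on 800 calls gives a much narrower interval.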