Introducing SN1’s dashboard: benchmarks, leaderboards, & miner stats
Providing a real-time breakdown of Subnet 1’s activity and achievements
SN1 sits at the forefront of Bittensor’s ecosystem. Not only does it predate the root subnet, but chat-based generative AI is also many people’s introduction to the current wave of AI. Given Bittensor’s position at the cutting edge of decentralized AI, this puts SN1 in a prestigious position, and such prestige deserves transparent, user-friendly, and accessible design.
Despite its prominence, tracking performance on Subnet 1, Apex, has long been challenging. After much hard work, we’ve now developed a dashboard for SN1. It offers a range of features that break down how the subnet operates in real time, including insights into miner activity, the questions miners’ models are being tested on, and how SN1 compares to centralized LLMs on the market.
Here’s how the newly launched dashboard can help you get more out of SN1.
Comparing model benchmarks
The SN1 dashboard contains a range of benchmarks for leading LLMs, along with a graph of how our subnet performs against them. The benchmarks cover several timeslots, so you can track our progress over time. At present, the dashboard displays benchmarks for GPT-4o, Llama 3.1 70B, Claude 3.5 Sonnet, Mixtral 8x7B, and many more.
SN1 is outperforming several high-quality open-source LLMs
On the MMLU benchmark, our dashboard shows SN1 beating several renowned LLMs. In particular, SN1 has outperformed Llama 3.1 70B 4-bit and GPT-4 at certain points, which is a significant feat. This is a testament to how impactful the decentralized AI space can be.
Miner Arena
See how different miners score on different tasks. This helps you understand which miner models are most suitable for particular actions. We’ve recently tweaked our rewards landscape to favor models that are highly skilled at one or two tasks, rather than models that are generally intelligent, so matching models to suitable actions is becoming increasingly important.
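To make the idea of matching miners to tasks concrete, here is a minimal Python sketch. It assumes per-task scores are available as simple (miner, task, score) records. The record layout, the miner names, the task names, and the use of a plain mean are all illustrative assumptions, not the dashboard's actual data model or scoring method.

```python
from collections import defaultdict

# Hypothetical records: (miner_id, task, score). Not real SN1 data.
records = [
    ("miner_a", "qa", 0.82),
    ("miner_a", "summarization", 0.41),
    ("miner_b", "qa", 0.67),
    ("miner_b", "summarization", 0.90),
    ("miner_a", "qa", 0.78),
]


def mean_scores_by_task(records):
    """Return {task: {miner: mean score}} from raw score records."""
    totals = defaultdict(lambda: [0.0, 0])  # (task, miner) -> [sum, count]
    for miner, task, score in records:
        entry = totals[(task, miner)]
        entry[0] += score
        entry[1] += 1

    by_task = defaultdict(dict)
    for (task, miner), (total, count) in totals.items():
        by_task[task][miner] = total / count
    return by_task


def best_miner_per_task(records):
    """Pick the miner with the highest mean score for each task."""
    by_task = mean_scores_by_task(records)
    return {task: max(scores, key=scores.get) for task, scores in by_task.items()}
```

With these made-up records, a specialist in summarization would be picked for that task even though another miner leads on question answering, which is the kind of matching the Miner Arena is meant to support.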
Miner leaderboard
Our miner leaderboard shows which miners have been the most successful within a given period of time. Their models have been outperforming others, and are therefore likely to be used more often by those prompting with Apex.
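As a rough illustration of a time-windowed leaderboard, the Python sketch below ranks miners by the total reward they earned inside a chosen period. The event layout, the miner names, and ranking by summed reward are assumptions made for this example; how the dashboard actually ranks miners is not described here.

```python
from collections import defaultdict
from datetime import datetime


def leaderboard(events, start, end, top_n=3):
    """Rank miners by total reward earned in the window [start, end).

    `events` is an iterable of (miner_id, timestamp, reward) tuples.
    This layout is hypothetical, chosen only for illustration.
    """
    totals = defaultdict(float)
    for miner, timestamp, reward in events:
        if start <= timestamp < end:
            totals[miner] += reward
    # Highest total first; ties are broken alphabetically by miner id.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up events, not real SN1 data.
events = [
    ("miner_a", datetime(2024, 9, 1, 12), 1.0),
    ("miner_b", datetime(2024, 9, 1, 13), 2.5),
    ("miner_a", datetime(2024, 9, 2, 9), 2.0),
    ("miner_c", datetime(2024, 9, 5, 9), 9.0),  # falls outside the window below
]

top = leaderboard(events, datetime(2024, 9, 1), datetime(2024, 9, 3))
```

Changing the window changes the ranking: a miner with one large reward outside the period does not appear, which is why the selected time range matters when reading a leaderboard.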
More coming soon
In due time, we will be releasing more data on this dashboard. In particular, we’ll be adding a radar chart that shows how our subnet responds to prompts in terms of readability, insightfulness, correctness, and similar factors.
For SN1, this is just the start. We’ve been working consistently on both the backend and the frontend of this subnet to prove that, in the LLM landscape, decentralization is the future.