TAOLOR: Building a subnet-native AI agent with distributed RAG
From monolith to swarm: how we rebuilt Michael Taolor for the Bittensor ecosystem
When we first set out to build TAOLOR, our vision was clear: create an AI agent that could drive TAO adoption by serving as the ultimate Bittensor copilot and companion guide for navigating dTAO. Michael Taolor - the ultimate TAO maxi - needed to be fluent in all 129 subnets (including Root), keeping up with the ecosystem in a way that is desperately needed but not otherwise possible for any one person. What began as a straightforward RAG implementation evolved into a distributed architecture that fundamentally rethinks how AI agents interact with decentralized knowledge.
The early days: v2 and the Postgres vector store
Our initial v2 designs took the conventional path. We experimented with OpenAI Assistants and integrated with Eliza v1 and v2, backing everything with a simple Postgres vector store. It worked, but something was off. We were building on Bittensor, yet relying heavily on centralized infrastructure. The irony wasn’t lost on us.
The strategic pivot: going subnet-native
We made a fundamental decision: leverage subnet capabilities as much as possible. This wasn’t just about decentralization for its own sake - it was about tapping into the unique strengths of the TAO ecosystem. That meant:
Macrocosmos Gravity (Subnet 13) - OnDemand API for real-time queries and DataSets API for bulk data gathering
Chutes.ai (Rayon Labs, Subnet 64) - LLM models for subnet-native inference and response generation
Squad.io (Rayon Labs) - Loop agents with smolagent tools for multi-step reasoning
On-chain Data (Node) - Direct blockchain data access for subnet state and transactions
Taostats - Additional APIs for network metrics, analytics, and insights
From day one, we knew our data layer needed to be comprehensive. Macrocosmos Gravity on Subnet 13 provides the foundation with its dual-mode APIs. The OnDemand API has been running on an always-on server for months, while the DataSets mode systematically gathers and indexes subnet-specific data. We complement this with direct node access for onchain data and Taostats APIs for network-wide analytics.
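The split between the two Gravity modes can be sketched as a simple routing decision: bulk collection for indexing goes through DataSets, while live lookups hit the always-on OnDemand server. The class and function names below are our illustrative assumptions, not the actual Gravity client API:

```python
from dataclasses import dataclass
from enum import Enum

class GravityMode(Enum):
    ON_DEMAND = "on_demand"   # real-time queries against the always-on server
    DATASETS = "datasets"     # bulk gathering for systematic indexing

@dataclass
class DataRequest:
    subnet_id: int
    query: str
    bulk: bool = False        # True for bulk documentation/code collection

def pick_gravity_mode(req: DataRequest) -> GravityMode:
    """Route a request to the appropriate Gravity API mode:
    bulk collection goes to DataSets, live lookups to OnDemand."""
    return GravityMode.DATASETS if req.bulk else GravityMode.ON_DEMAND
```

In practice the same decision also determines caching behavior: DataSets results feed the vector stores, while OnDemand results are served fresh.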
The 129 knowledge domains problem
Here’s where things got interesting.
Michael Taolor wasn’t just meant to be knowledgeable about one or two areas - he needs comprehensive understanding across all 128 subnets plus the root. That’s 129 distinct knowledge domains, each with its own code repositories, documentation, and rapidly evolving state.
Our first approach seemed logical: build one massive vector store containing everything. We threw all 128 subnets into a single knowledge base and... it failed spectacularly.
The problems were immediate:
Code couldn’t be properly indexed alongside natural language docs
Query performance degraded with scale
Context retrieval became increasingly imprecise
The critical question emerged: how do you orchestrate data fetching via Gravity’s two modes for 129 separate knowledge domains?
The realization: we had the pieces, but not the architecture
Taking stock, we realized we had everything we needed:
Data gathering capabilities through Macrocosmos Gravity (Subnet 13)
Agent LLM loop via Squad.io
LLM inference through Chutes.ai (Subnet 64)
Robust APIs with both OnDemand and DataSets modes
What we were missing was the right RAG architecture. Our monolithic approach was fighting against the distributed nature of the Bittensor ecosystem itself.
The solution: container swarms and micro-RAG
The breakthrough came when we stopped thinking about RAG as a single service and started thinking about it as a swarm. We rebuilt TAOLOR with one container per knowledge domain: 129 micro-RAG instances, each optimized for its specific domain.
What each container provides:
Each subnet container is a fully-featured micro-RAG system with default capabilities:
1. Vector store
Optimized for code and documentation specific to that subnet
Semantic search and context retrieval
Subnet-specific indexing strategies
2. Multi-source data layer:
Macrocosmos Gravity (Subnet 13):
OnDemand API: Real-time Gravity queries for live data fetching
DataSets API: Bulk data gathering and systematic collection
Onchain Data (Node): Direct blockchain access for subnet state, transactions, and validator info
Taostats APIs: Network metrics, analytics, subnet rankings, and historical data
3. Chutes.ai integration (Subnet 64):
LLM models for inference and response generation
Subnet-native computation
Query processing and understanding
4. Squad.io agents:
Loop agents with smolagent tools
Multi-step reasoning capabilities
Complex task execution
5. Dataset gatherer:
Automated collection from GitHub repos
Documentation site scraping
Subnet-specific source monitoring
Onchain data polling and caching
Taostats metric aggregation
6. State manager:
Sleep/wake orchestration
Time tracking for data freshness
Auto-update triggers
Resource hibernation when idle
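The state manager's sleep/wake logic boils down to tracking elapsed time per data source and deciding, on wake, which sources are stale enough to re-fetch. A minimal sketch, assuming per-source freshness windows (the class name and window values are illustrative, not our production settings):

```python
import time

class SubnetContainerState:
    """Sketch of a per-container state manager: tracks last refresh time
    per data source and reports which sources need updating on wake."""

    # assumed freshness windows in seconds, per data source
    FRESHNESS = {
        "gravity_datasets": 6 * 3600,  # bulk docs/code change slowly
        "onchain": 60,                 # chain state changes every block
        "taostats": 300,               # metrics refresh every few minutes
    }

    def __init__(self):
        self.last_refresh = {src: 0.0 for src in self.FRESHNESS}
        self.asleep = True

    def wake(self, now=None):
        """Activate the container; return the list of stale data sources."""
        now = time.time() if now is None else now
        self.asleep = False
        return [src for src, window in self.FRESHNESS.items()
                if now - self.last_refresh[src] > window]

    def mark_refreshed(self, src, now=None):
        self.last_refresh[src] = time.time() if now is None else now

    def sleep(self):
        """Hibernate, conserving resources while keeping cached data."""
        self.asleep = True
```

The key property is that a container never refreshes on a timer while idle; staleness is only evaluated at wake-up, which is what lets unused containers hibernate cheaply.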
Intelligent resource management
The real elegance is in how these containers behave and leverage multiple data sources:
Wake-up protocol: When accessed, containers measure elapsed time and update data sources as needed:
Gravity DataSets for bulk documentation and code updates
Onchain node queries for latest subnet state
Taostats API calls for current metrics and rankings
Sleep mode: Unused containers hibernate, conserving resources while maintaining cached data
On-demand activation: Containers spring to life precisely when queried:
Gravity OnDemand for real-time subnet queries
Direct node access for live blockchain data
Taostats for instant network analytics
This creates a living, breathing knowledge infrastructure that scales with usage patterns and leverages the best data source for each query type.
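Choosing "the best data source for each query type" can be as simple as a keyword heuristic in front of the three backends. The mapping below is a toy version - the keywords and return labels are purely illustrative, not our actual router:

```python
def pick_source(query: str) -> str:
    """Hypothetical routing heuristic mirroring the activation rules above:
    analytics-style questions go to Taostats, chain-state questions go to
    the node, everything else goes to Gravity OnDemand."""
    q = query.lower()
    if any(k in q for k in ("price", "rank", "emission", "metrics")):
        return "taostats"           # instant network analytics
    if any(k in q for k in ("block", "transaction", "validator", "stake")):
        return "node"               # live blockchain data
    return "gravity_ondemand"       # real-time subnet content queries
```

A production router would likely classify with an LLM rather than keywords, but the shape of the decision is the same.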
The hybrid architecture: swarm and signal
We don’t rely solely on on-demand containers. Our architecture also includes an always-on signal engine, powered by the Macrocosmos Gravity (Subnet 13) OnDemand API. This engine acts as:
A coordinator for container wake-up
A fast-path for common queries using real-time Gravity data
A health monitor for the container swarm
The combination gives us both responsiveness (via the always-on signal engine with OnDemand mode) and deep, specialized knowledge (via the on-demand subnet containers using both Gravity modes).
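The coordinator role reduces to a cache-then-wake flow: answer common queries from the fast path, and wake the relevant subnet container only for misses. A minimal sketch under assumed names (this is not the actual signal engine implementation):

```python
class SignalEngine:
    """Sketch of the always-on coordinator: serves repeated queries from a
    hot cache and delegates misses to the on-demand container swarm."""

    def __init__(self, wake_container):
        # wake_container: callable (subnet_id, query) -> answer,
        # standing in for waking a micro-RAG container and querying it
        self.cache = {}
        self.wake_container = wake_container

    def handle(self, subnet_id: int, query: str) -> str:
        key = (subnet_id, query)
        if key in self.cache:
            return self.cache[key]                       # fast path
        answer = self.wake_container(subnet_id, query)   # deep path
        self.cache[key] = answer
        return answer
```

In the real system the fast path is backed by live OnDemand data rather than a plain dict, and cache entries expire with data freshness, but the two-tier split is the point.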
The subnet-native stack
Every component of TAOLOR is built on TAO subnet infrastructure:
Data layer:
Subnet 13 - Macrocosmos Gravity:
OnDemand API: Always-on server for real-time queries
DataSets API: Bulk data collection and indexing
Foundation of our data layer
Onchain Data (Node):
Direct blockchain access
Subnet state and validator information
Transaction data and network events
Real-time chain monitoring
Taostats:
Network-wide metrics and analytics
Subnet rankings and performance data
Historical trends and insights
API endpoints for programmatic access
Compute layer:
Subnet 64 - Chutes.ai (Rayon Labs):
LLM models for all inference
Subnet-native computation
Response generation and query understanding
Squad.io (Rayon Labs):
Loop agents for complex workflows
Smolagent tools integration
Multi-step reasoning engine
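The loop-agent pattern itself - plan a step, call a tool, feed the result back, repeat - can be illustrated without the Squad.io or smolagents APIs. Everything below is a generic stand-in for that pattern, not their actual interfaces:

```python
def loop_agent(task, tools, plan, max_steps=5):
    """Generic multi-step reasoning loop: `plan` inspects the task and the
    history of (tool, arg, result) steps so far, and returns either a
    ("final", answer) tuple or the next (tool_name, arg) to execute."""
    history = []
    for _ in range(max_steps):
        step = plan(task, history)
        if step[0] == "final":
            return step[1]
        tool_name, arg = step
        result = tools[tool_name](arg)       # execute one tool call
        history.append((tool_name, arg, result))
    return None  # step budget exhausted without a final answer
```

In Squad.io's case the `plan` role is played by an LLM served through Chutes.ai, and the tools are smolagent tools, but the control flow is the same bounded loop.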
Why it all matters
TAOLOR’s architecture isn’t just about solving our specific problem - it’s a template for building subnet-native AI agents. As the Bittensor ecosystem grows, we need AI systems that embrace distribution rather than fight it. By aligning our architecture with the underlying network topology, we’ve built something that gets better as the protocol itself evolves.
Michael Taolor can now answer detailed questions about any subnet, compare implementations across domains, and guide developers through the entire Bittensor landscape - all because we stopped trying to centralize knowledge and started orchestrating it.
The future of AI agents on Bittensor is distributed, subnet-native, and built on the unique capabilities of each subnet. TAOLOR is just the beginning.