Data Universe: How to collect real-time data with Gravity
Fresh social media data scraping with no technical knowledge, powered by SN13 - here's how to get started.
Intelligence is only as good as the data on which it’s based. Since its inception, Subnet 13, Data Universe, has fulfilled its mission to provide Bittensor with the most powerful distributed data scraping service possible - so that models built on Bittensor can be properly equipped with the latest data, on demand.
Already, Data Universe has established the largest open-source repository of social media datasets in the world: 40bn rows and counting. Yet much of the potential of this data is still untapped.
Last week, we widened access by launching our landing page for Gravity. And right now, it’s free to use. We’re giving everybody 3,000 credits up front to prove the product’s potential and boost performance across Bittensor. Here’s how to get started.
Data Universe, Gravity, and Nebula
Data Universe, Gravity, and Nebula are all powered by Subnet 13 - so how exactly do they fit together?
Data Universe is our name for Subnet 13’s data capabilities.
Use Gravity to scrape data from X and Reddit.
Use Nebula to visualize it according to size and sentiment.
And use Mission Commander as your agentic assistant, to navigate your way through the process.
To begin, head over to our landing page and click ‘Get started’. From there, you’ll be able to access Gravity and set a task to collect X and Reddit data in four simple steps - no technical knowledge required.
Launching your task
Once you enter Gravity, you can start your data collection task. You can name it yourself, and then pick hashtags on X and communities on Reddit.
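Conceptually, a collection task is just a name plus the labels you want scraped: hashtags on X and communities on Reddit. As a purely illustrative sketch (the field names below are assumptions, not Gravity’s actual schema):

```python
# Purely illustrative: what a Gravity collection task boils down to.
# Field names here are assumptions, not Gravity's actual schema.
task = {
    "name": "bittensor-sentiment",            # your chosen task name
    "x_hashtags": ["#bittensor", "#tao"],     # hashtags to scrape on X
    "reddit_communities": ["r/bittensor_"],   # subreddits to scrape
}

# Each hashtag or community becomes its own entry in the task library.
labels = task["x_hashtags"] + task["reddit_communities"]
print(len(labels))  # number of entries created: 3
```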
Or, you can chat with our AI agent, Mission Commander, and have it pick a name and topics for you. Mission Commander is run by SN1, Apex - it helps automate the scraping process in case you’re unsure what to choose.
Click ‘Launch data collection’, and then the process will begin. Here’s how to do that using Mission Commander:
Go to the task library
After this, you’ll be automatically directed to the task library, where you can see each of your topics listed. These will all start as ‘pending’ - but once miners have taken on your tasks, their status will change to ‘running’.
You can then collect the data by clicking ‘Build dataset’. You can do this for each individual task, or you can combine the datasets by clicking ‘Build all’ at the top. It’s best to wait for all requests to be running if you want a unified dataset, so there are no unexpected gaps in your collection.
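The ‘Build all’ advice amounts to a simple readiness check: only build the unified dataset once every topic has left the ‘pending’ state. A hypothetical sketch of that check (the helper and status names are illustrative, not part of Gravity):

```python
# Hypothetical helper, not part of Gravity: build a unified dataset
# only once every topic has moved from "pending" to "running".
def ready_to_build_all(statuses: dict) -> bool:
    """statuses maps topic -> state ('pending' or 'running')."""
    return all(state == "running" for state in statuses.values())

statuses = {"#bittensor": "running", "r/bittensor_": "pending"}
print(ready_to_build_all(statuses))  # False: one topic is still pending
```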
The datasets will include all information that’s already been scraped, but will update over time as fresh data comes in through the miners. Here’s how your data is collected in real time:
Check your datasets
Once your data is ready, you can open it by clicking ‘View dataset’, which will present it as a table. You can then download it in Parquet or CSV format - to see how, skip to 2.24 on the full tutorial below:
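Because the downloads are standard Parquet or CSV files, any ordinary data tooling can read them. For example, with pandas (the file names and column names below are hypothetical, for illustration only):

```python
import io
import pandas as pd

# Downloaded Gravity datasets are plain Parquet or CSV files. The
# columns below are hypothetical, used only to illustrate loading.
csv_text = """platform,label,content
x,#bittensor,"Example post text"
reddit,r/bittensor_,"Example comment text"
"""

# pd.read_csv accepts a path (e.g. "my_dataset.csv") or a file-like
# object; here we read from an in-memory string to stay self-contained.
df = pd.read_csv(io.StringIO(csv_text))

# For the Parquet download, use pd.read_parquet("my_dataset.parquet").
print(df.shape)  # (2, 3): two rows, three columns
```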
The data is now yours to use in any way you see fit - whether for market research, training LLMs, or your own general curiosity. Already, Gravity is providing some of Bittensor’s best subnets with real-time data on demand - including SN44, Score, and SN64, Chutes, with more to come.
Fresh, accessible, scalable
It takes about two minutes to get started with Gravity. Just choose the data labels you want, and sit back while SN13’s miners fulfil your task. We’re excited to be sharing this service, but it’s just the start - we’re building new features that will put even more power in your hands.
Tell us what you think of Gravity - join our Telegram and Discord communities to let us know.