Visualizing the IPFS network using Powergate, Altair, and pandas
I set about visualizing the IPFS/Filecoin network underlying a Textile Powergate node.
The visualizations here can be used to get a better idea of the scale of the network. They are made to be easy to export and reuse, and can be replicated in any interactive Python notebook environment.
You should be able to rerun the notebook yourself with no setup, or recreate this project in a Jupyter or Google Colab notebook. Images of the visualizations should also be in the source repo.
Sadly, I suffered a serious accident on Jul 12th and never got around to implementing anything more than the initial POC map of the network. I had lip surgery related to it on Aug 5, so recovery is a long process.
Notes on the video:
Tech
Python & JavaScript
A Node.js app uses the hosted Textile Powergate to fetch peering information.
This info is then written as JSON to GitHub, and to IPFS via Pinata and Powergate.
In a Python notebook environment, the data is then ETL’d in and the visualizations are created.
The visualizations use a Python library called Altair.
Basic data collection flow:
A Node.js CLI application is run on a loop.
It talks to the local or remote Powergate instance via the textile/powergate-client JS adapter.
It calls the peers API on Powergate and writes the result to a JSON file.
This data is then stored for the visualizations. I just used GitHub to keep it easily accessible while prototyping.
In the linked Databricks notebook, some Python code then ETLs the data from GitHub or any other remote source.
Finally, Altair is used to generate the visualizations from the cleaned and merged data frames.
We generate a map projection using a base Vega dataset, then plot the peers on it by latitude and longitude.
The data is also aggregated by country.
Since Altair generates a Vega-Lite spec, the visualizations can be interactive and can be exported as HTML5 or pictures.
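The ETL step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It assumes each snapshot is a JSON object with a `peersList` array whose entries carry `addrInfo.id` and a `location` with `country`, `latitude`, and `longitude`; that shape, the sample values, and the helper names `peers_to_frame` and `aggregate_by_country` are all assumptions made for the example. Placing each country's marker at the mean of its peers' coordinates is also just an illustrative choice.

```python
import json

import pandas as pd

# Hypothetical snapshot, shaped like the assumed output of the peers API.
SAMPLE_SNAPSHOT = json.loads("""
{
  "peersList": [
    {"addrInfo": {"id": "peerA", "addrsList": []},
     "location": {"country": "United States", "latitude": 37.77, "longitude": -122.42}},
    {"addrInfo": {"id": "peerB", "addrsList": []},
     "location": {"country": "United States", "latitude": 40.71, "longitude": -74.01}},
    {"addrInfo": {"id": "peerC", "addrsList": []},
     "location": {"country": "Germany", "latitude": 52.52, "longitude": 13.40}},
    {"addrInfo": {"id": "peerA", "addrsList": []},
     "location": {"country": "United States", "latitude": 37.77, "longitude": -122.42}},
    {"addrInfo": {"id": "peerD", "addrsList": []}}
  ]
}
""")


def peers_to_frame(snapshot: dict) -> pd.DataFrame:
    """Flatten one snapshot into one row per peer that has coordinates."""
    df = pd.json_normalize(snapshot["peersList"])
    df = df.rename(columns={
        "addrInfo.id": "peer_id",
        "location.country": "country",
        "location.latitude": "latitude",
        "location.longitude": "longitude",
    })
    df = df[["peer_id", "country", "latitude", "longitude"]]
    # The same peer can show up more than once; keep a single row for it.
    df = df.drop_duplicates(subset="peer_id")
    # Peers without a resolved location cannot be placed on a map.
    return df.dropna(subset=["latitude", "longitude"]).reset_index(drop=True)


def aggregate_by_country(peers: pd.DataFrame) -> pd.DataFrame:
    """Count peers per country, positioned at the mean of their coordinates."""
    return peers.groupby("country", as_index=False).agg(
        peers=("peer_id", "count"),
        latitude=("latitude", "mean"),
        longitude=("longitude", "mean"),
    )


peers = peers_to_frame(SAMPLE_SNAPSHOT)
by_country = aggregate_by_country(peers)
```

With several snapshots, the per-snapshot frames could be combined with `pd.concat` before deduplicating.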
There are 4 visualizations that I managed to complete:
An equirectangular map showing the distribution of nodes by latitude and longitude, with the nodes aggregated (binned) by country
The same map without the data aggregation
A pie chart of countries, using the visualizer native to Databricks
An orthographically projected map (a globe) with the nodes plotted on it
Notes:
Since these projections are based on queries to only a few nodes, they may not be a full map of the network; they show only the peers those nodes were connected to during the experiment.
I tried using pygate, but I couldn’t get it to work at the time.
Choice of Databricks:
We use it at work
Think of it as an interactive Python notebook
The code in it should work in any interactive Python notebook environment, like Google Colab or Jupyter
I used Textile Powergate: first localnet, then a connection to mainnet from my machine, and eventually the hosted Powergate instance provided by Textile.
I used both the CLI and the JS client, hence the messy ETL code
Some things I wanted to do but didn’t get a chance to:
Visualizing
Deals on the network
Network peers over time: some peers may stay on the network, while many are just transient
The network via other gateways to compare
Ethereum users top board <- needs a data source
3Box users <- needs the 3Box API
Comparing reliability and speed of the different providers; direct writes seem to fail or take a long time
Uploading the dataset to an IPFS hosting solution
Hosting the visualizations as a website on a decentralized store
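I never built the "network peers over time" view from the list above, but one way it could be prototyped is sketched here. It assumes the collection loop keeps one set of peer IDs per run; the snapshot labels, peer IDs, helper names, and the 0.8 cutoff are all made up for illustration.

```python
from collections import Counter

# Hypothetical snapshots: collection run -> peer IDs seen in that run.
snapshots = {
    "t0": {"peerA", "peerB", "peerC"},
    "t1": {"peerA", "peerC"},
    "t2": {"peerA", "peerD"},
}


def peer_presence(snapshots: dict) -> dict:
    """Return, for each peer, the fraction of snapshots it appears in."""
    counts = Counter(peer for seen in snapshots.values() for peer in seen)
    total = len(snapshots)
    return {peer: n / total for peer, n in counts.items()}


def split_stable_transient(presence: dict, threshold: float = 0.8):
    """Split peers into stable (presence >= threshold) and transient lists."""
    stable = sorted(p for p, frac in presence.items() if frac >= threshold)
    transient = sorted(p for p, frac in presence.items() if frac < threshold)
    return stable, transient


presence = peer_presence(snapshots)
stable, transient = split_stable_transient(presence)
```

The presence fractions could then be joined onto the peer data frame and encoded as color or opacity on the map.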