Playback Network is a decentralized data marketplace for training a large action model
AI is amazing, except it can't actually do anything without a human babysitting it: ChatGPT doesn't have hands, so everyone still does loads of manual work that should be automated!
A Large Action Model (LAM) is a new kind of foundational artificial intelligence model that can understand and execute complex tasks by translating human intentions into action.
We are giving ChatGPT hands so it can take actions on your devices.
LAMs are a new kind of foundational AI model, BUT... there is very little training data. Only around 2,000 hours of recording data are available to train these models, which is absurd.
Because of this, all existing LAMs rely on In-Context Learning, a prompting style where you give an LLM a set of input:output examples. This is severely limited by context window sizes and is far inferior to training an actual model, which is what we are unlocking with Playback.
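To make the limitation concrete, here is a minimal sketch of what an in-context-learning prompt for action prediction looks like, assuming the OpenAI Node SDK. The demonstration format and task schema are illustrative placeholders, not Playback's actual data format.

```typescript
// Minimal ICL sketch: every demonstration is pasted into the prompt,
// so the number of examples is capped by the model's context window.
import OpenAI from "openai";

// Each demonstration pairs an observed screen state with the action taken.
// (Illustrative format only.)
const demonstrations = [
  { input: "Screen: Gmail inbox open", output: 'click(selector="Compose")' },
  { input: "Screen: empty draft open", output: 'type(field="To", text="alice@example.com")' },
  // ...every extra demonstration eats into the context window,
  // which is why ICL caps out long before a trained LAM would.
];

const messages = [
  {
    role: "system" as const,
    content: "You are an action model. Given a screen state, output the next UI action.",
  },
  ...demonstrations.flatMap((d) => [
    { role: "user" as const, content: d.input },
    { role: "assistant" as const, content: d.output },
  ]),
  { role: "user" as const, content: "Screen: draft with recipient filled in" },
];

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment
const completion = await client.chat.completions.create({ model: "gpt-4o", messages });
console.log(completion.choices[0].message.content); // e.g. 'click(selector="Send")'
```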
Fundamentally, we are solving this problem: we are creating a decentralized data marketplace for screen recordings of people completing various tasks.
Our key contributions include:
Decentralizing the data used to train LAMs would democratize them and enable researchers to improve the technology at a much faster pace. Moreover, we have designed mechanisms that align incentives between contributors and users of the data, encouraging the creation of a massive LAM dataset and letting contributors share in the economic upside generated by the models trained on their data.
Our focus for HackFS is solving the data problem, but we also intend to train a decentralized LAM and build a solution that lets the LAM automate complex tasks by taking actions on your device.
Using your computer should feel like Minority Report. It's 2024!
Our submission homed in on the supply side of the decentralized market. We have a frontend tool that records the user's screen and converts the recording to frames. These frames are then run through a model that redacts sensitive data from the images (e.g. email addresses, account details, etc.).
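Here is a minimal sketch of that capture-and-redact flow in the browser. The redactFrame() function stands in for the redaction model described above, and /api/frames is a hypothetical backend route; both are placeholders, not our actual API.

```typescript
// Capture screen frames, redact them locally, and upload only the
// redacted versions to the backend.

async function redactFrame(frame: Blob): Promise<Blob> {
  // Hypothetical stand-in for the redaction model (blurs emails,
  // account details, etc.). Pass-through stub shown here.
  return frame;
}

async function captureRedactUpload(frameCount = 10, intervalMs = 1000): Promise<void> {
  // Ask the user to share their screen.
  const stream = await navigator.mediaDevices.getDisplayMedia({ video: true });
  const video = document.createElement("video");
  video.srcObject = stream;
  await video.play();

  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d")!;

  for (let i = 0; i < frameCount; i++) {
    // Grab the current frame as a PNG blob.
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    const frame: Blob = await new Promise((resolve) =>
      canvas.toBlob((b) => resolve(b!), "image/png")
    );

    // Redact locally, so only the scrubbed frame ever reaches the backend.
    const redacted = await redactFrame(frame);

    const form = new FormData();
    form.append("frame", redacted, `frame-${i}.png`);
    await fetch("/api/frames", { method: "POST", body: form }); // hypothetical backend route

    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  stream.getTracks().forEach((t) => t.stop());
}
```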
The frontend sends the redacted frames to the backend, where they are converted into segmented data using a SoM model running on CoopHive. The segmented image data is uploaded to S3, and a Lambda function takes the image URLs and the user's wallet address and sends them to our custom OpenAiChatGptVision contract on Galadriel. We include a specific system message and prompt that instructs the GPT running on teeML to value the data for us.

Once the data has been valued by GPT on Galadriel, our Galadriel contract emits the valuation in an event. An EC2 instance listens for these events, extracts the data, and saves the user's wallet address, the segmented image data, and the valuation to Lighthouse on Filecoin. It then calls a Lambda that creates a signed message from the user's wallet address and the valuation, which prevents manipulation of the data.

The signed message is sent to the frontend, which constructs a transaction containing it and submits it to our SignedMinter contract on Filecoin. The contract verifies the signature and, if it is valid, mints the specified amount of $BACK tokens to the user's wallet.
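To make the second half of that flow concrete, here is a minimal sketch of the valuation-event listener and the signed-mint path, assuming ethers v6. The event name and arguments, the RPC URLs, the environment variables, and the SignedMinter's mint(to, amount, signature) function are illustrative assumptions, not our actual ABIs.

```typescript
import { ethers } from "ethers";

// 1. EC2 listener: watch the Galadriel contract for valuation events.
const galadriel = new ethers.JsonRpcProvider("https://devnet.galadriel.com"); // assumed RPC URL
const valuationAbi = [
  "event ValuationEmitted(address user, string imageCid, uint256 valuation)", // assumed event
];
const valuationContract = new ethers.Contract(
  process.env.VALUATION_CONTRACT_ADDRESS!,
  valuationAbi,
  galadriel
);

valuationContract.on("ValuationEmitted", async (user: string, imageCid: string, valuation: bigint) => {
  // ...persist (user, imageCid, valuation) to Lighthouse on Filecoin here...

  // 2. Signing Lambda: a trusted key signs (user, valuation) so the
  // frontend cannot tamper with the amount it asks the minter for.
  const signer = new ethers.Wallet(process.env.MINT_SIGNER_KEY!);
  const digest = ethers.solidityPackedKeccak256(["address", "uint256"], [user, valuation]);
  const signature = await signer.signMessage(ethers.getBytes(digest));

  // ...return { user, valuation, signature } to the frontend...
});

// 3. Frontend: submit the signed valuation to the SignedMinter on Filecoin,
// which recovers the signer, checks it is trusted, and mints $BACK.
async function claimBack(user: string, valuation: bigint, signature: string) {
  const filecoin = new ethers.JsonRpcProvider(process.env.FILECOIN_RPC_URL!);
  const wallet = new ethers.Wallet(process.env.USER_KEY!, filecoin);
  const minterAbi = ["function mint(address to, uint256 amount, bytes signature)"]; // assumed ABI
  const minter = new ethers.Contract(process.env.SIGNED_MINTER_ADDRESS!, minterAbi, wallet);
  const tx = await minter.mint(user, valuation, signature);
  await tx.wait();
}
```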
Take $BACK your data.
The Playback Network tech stack is split into 3 components:
The $BACK token's `mint` function can only be called by the SignedMinter contract.
We use `sam` to get things shipped to Bacalhau & CoopHive.
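As a quick illustration of the mint restriction, the sketch below calls `mint` directly from a regular wallet and expects the call to revert, since only the SignedMinter contract is authorised to mint. It assumes ethers v6 and an illustrative $BACK ABI; the environment variables are placeholders.

```typescript
import { ethers } from "ethers";

const provider = new ethers.JsonRpcProvider(process.env.FILECOIN_RPC_URL!);
const wallet = new ethers.Wallet(process.env.USER_KEY!, provider);

const backAbi = ["function mint(address to, uint256 amount)"]; // assumed ABI
const back = new ethers.Contract(process.env.BACK_TOKEN_ADDRESS!, backAbi, wallet);

// Calling mint from an ordinary wallet should revert: only the
// SignedMinter contract may mint $BACK.
try {
  await back.mint(wallet.address, ethers.parseUnits("100", 18));
} catch (err) {
  console.log("mint reverted as expected:", (err as Error).message);
}
```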