Flow RL packages and sells data generated on Flow to be used in reinforcement learning.
Hey, I am not writing this with AI, by the way. I value your time, and this is short and sweet :) - Flow RL dev.
THIS PROJECT USES FLOW, IS A CONSUMER APP AND PROTOCOL, and USES FLOW ACTIONS. It lives in environments -> community and is called RL Marketplace. The public RL gym framework this was built on is called Atropos; that framework makes up the rest of the repo.
In an AI future, blockchains need to radically rethink the data generated by their protocols, since all of that data is useful for AI training. By creating a competition around generating the best data, Flow can outcompete on this front by incentivizing exceptional onchain and agentic behavior. The key is a general framework for generating rich reward signals that models can train on. I think blockchains need to adapt sooner rather than later, and figuring out this AI data play could position Flow really well.
The MVP successfully went from onchain activity -> valuable reward signals, the kind AI labs like OpenAI and Anthropic spend on the order of $100M a year to acquire!
Flow integration - Contracts were successfully deployed on Flow testnet.
FlowActions - in WordHunt.cdc I used a Flow Actions Sink to collect the participation fee, separating the fee logic from the game logic. Sean from Flow helped with this.
Atropos initiates the run and provides the contract with unique game boards and the right prompt. The contract supplies the answers; solving the board and calculating the scores that feed the reward function happen off-chain.
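To make that division of labor concrete, here is a minimal sketch of one episode's round trip. Every name in it is a hypothetical stand-in for illustration, not the repo's actual code:

```python
# Minimal sketch of one episode's round trip, with the on-chain calls
# stubbed out. All names are hypothetical stand-ins, not the repo's code.

def submit_board_onchain(board: list[list[str]], prompt: str) -> str:
    """Stub: in the real flow, this sends a transaction to WordHunt.cdc."""
    return "tx-0"

def read_agent_answers(tx_id: str) -> list[str]:
    """Stub: in the real flow, this reads the agent's submitted words."""
    return ["TIDE", "EDIT"]

def run_episode(board: list[list[str]], prompt: str) -> list[str]:
    # 1. Atropos initiates: post a unique board and the prompt on-chain.
    tx_id = submit_board_onchain(board, prompt)
    # 2. The contract supplies the answers from the agent entity.
    answers = read_agent_answers(tx_id)
    # 3. Solving the board and scoring happen off-chain (see the trie and
    #    reward sketches below).
    return answers
```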
I implemented the logic to score the Word Hunt board myself (I love the game and know it well). I use a trie data structure to check for valid words.
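Here is a minimal sketch of that trie-based validity check (the idea, not the repo's exact implementation):

```python
# Minimal trie for dictionary lookups: a sketch of the idea, not the
# repo's exact implementation.

class TrieNode:
    def __init__(self):
        self.children: dict[str, "TrieNode"] = {}
        self.is_word = False

class Trie:
    def __init__(self, words: list[str]):
        self.root = TrieNode()
        for w in words:
            self.insert(w)

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word: str) -> bool:
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

# Usage: validate an agent's submitted words against a dictionary.
dictionary = Trie(["TIDE", "EDIT", "DIET"])
print(dictionary.contains("TIDE"))   # True
print(dictionary.contains("TIDES"))  # False
```

A trie also makes prefix pruning cheap, which is what keeps the full board solve tractable.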
Atropos makes it easy to train a model once you have the reward signals, so this project went from onchain activity directly to perfectly expressive reward signals.
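As an illustration, the per-board scoring can collapse into a scalar reward like this. The point values are my assumption of a Word Hunt-style scale, not necessarily what the repo uses, and Trie refers to the sketch above:

```python
# Sketch: turn an agent's validated words into a scalar reward.
# The point scale is an assumed Word Hunt-style scale, not the repo's.

POINTS = {3: 100, 4: 400, 5: 800, 6: 1400}  # assumed per-length points

def reward(answers: list[str], dictionary: Trie) -> float:
    total = 0
    for word in set(answers):  # no credit for duplicate submissions
        if len(word) >= 3 and dictionary.contains(word):
            total += POINTS.get(len(word), 1800)  # 7+ letters: assumed value
    return float(total)
```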
I added the onchain integration with Atropos myself; it wasn't originally meant to be used by agentic systems.
I had some trouble using LLMs to help me code because they kept getting confused between the old and new versions of Cadence. I started with some really simple Cadence contracts and tests, figured out how to set up the emulator, and eventually moved to testnet. The faucet was really smooth (faucets elsewhere are often super buggy).
I had to create a separate "oracle" in Python to manage the offchain and onchain parts of the specific task I chose for the demo. This was basically another server running on my localhost that I overrode Atropos to communicate with instead of an LLM API directly. That localhost server took requests for AI input and directed them to the agent entity with a wallet waiting inside the contract.
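Roughly, that shim can look like the sketch below. I'm assuming an OpenAI-style chat-completions endpoint shape, and the forwarding helper is a stub; neither is the repo's exact code:

```python
# Sketch of the localhost "oracle" that Atropos talks to in place of an
# LLM API. Endpoint shape and helper names are assumptions, not the
# repo's exact code.

from flask import Flask, request, jsonify

app = Flask(__name__)

def forward_to_onchain_agent(prompt: str) -> str:
    """Stub: route the request to the wallet-holding agent entity
    inside the contract and return its answer."""
    return "TIDE EDIT DIET"

@app.route("/v1/chat/completions", methods=["POST"])
def completions():
    body = request.get_json()
    prompt = body["messages"][-1]["content"]
    answer = forward_to_onchain_agent(prompt)
    # Return just enough of an LLM-API-shaped response for the caller.
    return jsonify({
        "choices": [{"message": {"role": "assistant", "content": answer}}]
    })

if __name__ == "__main__":
    app.run(port=8000)  # point Atropos at http://localhost:8000
```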
I had some trouble with flow test (syntax errors), and I wasn't able to test with multiple agents, so that would be my next step. The vision is auto-indexing onchain data, using LLMs to create reward functions, and setting up a really cool automated RL pipeline on Flow until you have a model that is a beast at doing everything in the Flow ecosystem. Happy to talk more about that. My background is more ML.