An on-chain reinforcement learning agent performing multi-hop arbitrage on the NEAR DEX Ref Finance.
This project combines an off-chain component with a deployed smart contract: the former triggers the latter and feeds it information. The on-chain component is a reinforcement learning agent that performs circular arbitrage across the Ref Finance pools NEAR-USDC, NEAR-USDT, and USDT-USDC-USDT.e-USDC.e.
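As a rough illustration of what "circular arbitrage" means here, the sketch below computes the expected output of a NEAR → USDC → USDT → NEAR round trip under a constant-product assumption. The reserves, fee values, and helper names are illustrative assumptions, not the actual Ref Finance pool parameters; in particular, the USDT-USDC-USDT.e-USDC.e pool is a stable pool whose real pricing curve differs from constant product.

```rust
/// Constant-product swap output (x * y = k) with a flat fee in basis points.
/// Simplified model for illustration; Ref Finance stable pools use different math.
fn swap_out(amount_in: u128, reserve_in: u128, reserve_out: u128, fee_bps: u128) -> u128 {
    let amount_in_with_fee = amount_in * (10_000 - fee_bps);
    reserve_out * amount_in_with_fee / (reserve_in * 10_000 + amount_in_with_fee)
}

fn main() {
    // Hypothetical reserves in smallest units, purely for illustration.
    let (near_usdc_near, near_usdc_usdc) = (1_000_000u128, 3_000_000u128);
    let (near_usdt_near, near_usdt_usdt) = (1_000_000u128, 3_050_000u128);
    let (stable_usdt, stable_usdc) = (5_000_000u128, 5_000_000u128);

    let amount_in = 10_000u128; // NEAR put into the cycle
    // NEAR -> USDC -> USDT -> NEAR across the three pools.
    let usdc = swap_out(amount_in, near_usdc_near, near_usdc_usdc, 30);
    let usdt = swap_out(usdc, stable_usdc, stable_usdt, 5);
    let near_back = swap_out(usdt, near_usdt_usdt, near_usdt_near, 30);

    // Theoretical gain of the cycle before slippage and settlement delay.
    let gain = near_back as i128 - amount_in as i128;
    println!("expected gain: {gain}");
}
```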
The off-chain component queries the contract state of the above-mentioned Ref Finance liquidity pools, calculates potential gains, and feeds that information to the contract as transaction input. The contract is a reinforcement learning agent that uses the price information from the transaction input to explore trades with a theoretical chance of being profitable. It then learns the statistical distribution of the returns generated by executing trades within a given range of potential gain. Because the gain is calculated before the transaction lands, and settlement may even take multiple blocks due to sharding, the agent effectively learns to approximate slippage.
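The following is a minimal sketch of how such per-bucket return statistics could be maintained on-chain. The bucket edges, the epsilon-greedy execution rule, and all names (`Agent`, `should_trade`, `observe`) are assumptions for illustration, not the contract's actual logic or its near-sdk interface.

```rust
/// Running statistics over realized returns, bucketed by the theoretical gain
/// computed off-chain. Sketch only: bucket edges and decision rule are assumed.
struct Agent {
    counts: [u64; 4],
    mean_return: [f64; 4],
    epsilon: f64, // exploration rate
}

impl Agent {
    /// Map a theoretical gain (in basis points) to a bucket index.
    fn bucket(theoretical_gain_bps: i64) -> usize {
        match theoretical_gain_bps {
            i64::MIN..=0 => 0,
            1..=10 => 1,
            11..=50 => 2,
            _ => 3,
        }
    }

    /// Explore with probability epsilon, otherwise only execute trades whose
    /// bucket has a positive learned mean return.
    fn should_trade(&self, theoretical_gain_bps: i64, rand01: f64) -> bool {
        let b = Self::bucket(theoretical_gain_bps);
        rand01 < self.epsilon || self.mean_return[b] > 0.0
    }

    /// After the swap settles (possibly several blocks later), fold the realized
    /// return into the bucket's running mean; the gap between theoretical and
    /// realized return is the slippage the agent learns to expect.
    fn observe(&mut self, theoretical_gain_bps: i64, realized_return_bps: f64) {
        let b = Self::bucket(theoretical_gain_bps);
        self.counts[b] += 1;
        let n = self.counts[b] as f64;
        self.mean_return[b] += (realized_return_bps - self.mean_return[b]) / n;
    }
}

fn main() {
    // High exploration rate early on; the random draw stands in for whatever
    // entropy source the contract would actually use.
    let mut agent = Agent { counts: [0; 4], mean_return: [0.0; 4], epsilon: 0.5 };
    if agent.should_trade(30, 0.42) {
        // ... execute the cycle, then record what actually came back.
        agent.observe(30, 12.0);
    }
}
```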