RAG-powered research assistant that unifies on-chain data and docs into natural-language answers
This project combines on-chain log data with protocol documentation to produce protocol analysis, using the protocol's factory contract as the entry point. It decodes the on-chain logs to detect new contract deployments, aggregates the logs of those contracts, and identifies patterns or anomalies (such as suspicious upgrades or abnormal liquidity events). The result is fed into the large language model as context, together with protocol design knowledge from the docs. The demonstration currently uses the Aave protocol docs and its factory contract address.
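As a rough illustration of the aggregation and upgrade-detection step, the sketch below counts events per contract and flags upgrade-related events in a list of already-decoded logs. The dictionary keys (`address`, `event`, `block`) and the `summarize_logs` helper are assumptions made for this example, not the project's actual data model. The watched event names `Upgraded` and `AdminChanged` are the ones defined by EIP-1967 proxies; a real protocol may emit different events.

```python
from collections import Counter

# Event names emitted by EIP-1967 style proxies. Treat this set as an
# assumption: the events worth watching depend on the protocol.
UPGRADE_EVENTS = {"Upgraded", "AdminChanged"}


def summarize_logs(decoded_logs):
    """Aggregate decoded logs per contract and collect upgrade-related events.

    Each log is assumed to be a dict with at least the hypothetical keys
    "address" (emitting contract) and "event" (decoded event name).
    """
    events_per_contract = Counter(log["address"] for log in decoded_logs)
    upgrade_events = [log for log in decoded_logs if log["event"] in UPGRADE_EVENTS]
    return {
        "events_per_contract": dict(events_per_contract),
        "upgrade_events": upgrade_events,
    }


if __name__ == "__main__":
    # Made-up sample logs, for illustration only.
    sample = [
        {"address": "0xA", "event": "Upgraded", "block": 100},
        {"address": "0xA", "event": "Supply", "block": 101},
        {"address": "0xB", "event": "Supply", "block": 102},
    ]
    print(summarize_logs(sample))
```

A summary of this shape is compact enough to paste into the model's context, which is why the sketch returns plain dictionaries and lists rather than custom objects.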
Behind the scenes, the project uses context prompting to retrieve answers from OpenAI's large language model. To build the context, we first extract text from the protocol docs (currently PDFs only) using pdfplumber, then chunk and embed the text with OpenAI's embedding model and store the vector embeddings in ChromaDB, so that embeddings similar to an input query can be found quickly (currently only the single most relevant chunk is used). Given the user's query and an input contract address, we also use the Blockscout APIs to fetch the contract's event logs and decode the interfaces of the contracts found in those logs, and add the result to the model's context. Combining the most relevant chunk with the on-chain logs, along with prompt engineering, lets us retrieve a protocol analysis that takes recent upgrades into account.
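The retrieval and prompt-assembly flow described above can be sketched without any external services. In this minimal example, a bag-of-words vector stands in for OpenAI's embedding model and a linear scan stands in for ChromaDB, so it only illustrates the shape of the pipeline. All function names, the chunk sizes, the prompt wording, and the log fields are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def most_relevant_chunk(query, chunks):
    """Return the single chunk most similar to the query."""
    q = toy_embed(query)
    return max(chunks, key=lambda c: cosine(q, toy_embed(c)))


def build_prompt(query, doc_chunk, decoded_logs):
    """Combine the docs chunk and decoded on-chain logs into one prompt."""
    log_lines = "\n".join(
        f"- {log['event']} from {log['address']} at block {log['block']}"
        for log in decoded_logs
    )
    return (
        "Answer using only the context below.\n\n"
        f"Protocol docs:\n{doc_chunk}\n\n"
        f"Recent on-chain events:\n{log_lines}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    chunks = [
        "Liquidations occur when the health factor drops below one",
        "Flash loans must be repaid within one transaction",
    ]
    logs = [{"event": "Upgraded", "address": "0xA", "block": 100}]  # made-up
    query = "How do flash loans work?"
    print(build_prompt(query, most_relevant_chunk(query, chunks), logs))
```

Swapping `toy_embed` for a real embedding call and `most_relevant_chunk` for a vector-store query would leave `build_prompt` unchanged, which keeps the prompt logic easy to test in isolation.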

