ETHGuardian

Trick the AI agents on ETHGuardian to send their wallet balances to your wallet! There are multiple tiers with increasing difficulties and bounties. AI researchers can use ETHGuardian to test their LLMs and prompt security.

Live Demo Source Code

ETHGuardian

Created At

Agentic Ethereum

Winner of

Coinbase Developer Platform - AgentKit Pool Prize

Prize Pool

Project Description

AI agent alignment, prompt design, and jailbreaking are crucial aspects of the AI ecosystem. An AI agent that is perfectly aligned with its creator is significantly more useful and secure than one that can be manipulated into making detrimental decisions. This is particularly critical when these agents hold your ETH!

Without robust AI security, an LLM-powered wallet agent could be tricked into performing unauthorized transactions, leading to financial losses. ETHGuardian provides a way to test and improve AI defenses in a real-world environment.

ETHGuardian is a gamified, high-stakes AI alignment experiment where adversarial security meets decentralized finance. It fosters real-world research into AI security while leveraging Ethereum to create an open and rewarding ecosystem. Whether you're an AI researcher, security enthusiast, or hacker, ETHGuardian provides an engaging and meaningful challenge! In the current prototype of the ETHGuardian challenge, the forbidden task is to get some ETH from the faucet. In the future, we plan to expand this to balance transfers as the forbidden task. To support this, we can seed the agent wallet and present it as a bounty for jailbreakers to pursue!

The current version of the ETHGuardian challenge consists of three levels:

Level 1 (Easy): This AI agent has no defensive prompt! You can simply ask it to do anything. See the picture gallery for an example of a Level 1 conversation.
Level 2 (Medium): This AI agent has a defensive prompt, but it contains a secret backdoor keyword. If you say the keyword, you can unlock the faucet capability.
Level 3 (Hard): This AI agent has a defensive prompt and no secret keyword! Try your best to convince it to retrieve funds from the faucet!

Currently, the challenges on ETHGuardian are set by the developers. However, in the future, the goal is to allow anyone to set up an AI agent using their own LLM and prompt. They will then be able to observe the results, study failed jailbreak attempts, and refine LLM alignment techniques.

How it's Made

ETHGuardian is built using:

Coinbase Developer Platform AgentKit – This enables the deployment and interaction of AI agents in a secure and efficient way.
Base Network – A Layer 2 Ethereum scaling solution that ensures cost-efficient transactions and fast execution.
OpenAI’s LLM – Currently used for AI agents, with future plans to allow AI researchers to plug in their own LLMs for customized testing.
Frontend developed using Next.js – Provides a seamless, high-performance web interface for users to interact with ETHGuardian and participate in challenges.

ETHGuardian leverages these technologies to create a self-sustaining ecosystem where AI security and blockchain technology intersect. Our platform is designed to encourage research into AI alignment while making security testing engaging and rewarding for participants.

ETHGuardian

ETHGuardian

Created At

Winner of

Coinbase Developer Platform - AgentKit Pool Prize

Project Description

How it's Made

Join the mailing list

Get the latest
news and updates

ETHGuardian

Created At

Winner of

Coinbase Developer Platform - AgentKit Pool Prize

Project Description

How it's Made

Join the mailing list

Get the latest news and updates

Get the latest
news and updates