Open Inference Layer offers open-source products, tools, and protocols to support decentralized AI
Open Inference Layer is a collection of open-source products, tools, and protocols built to support decentralized AI.
Created by Ando (https://andoai.xyz), an inference platform, Open Inference Layer is grounded in a simple belief: inference is the cornerstone of AI, and access to inference should be open, user-managed, and aligned with the people and applications that depend on it.
Open Inference Layer is releasing two key modules today.
Programmable Inference - https://github.com/Open-Inference-Layer/programable-inference
Programmable Inference is a basket-backed unit and settlement layer for AI inference.
INF is minted only against a redeemable USDC/EURC basket, so inference spend is not priced purely through USD or locked inside provider credits. Users and agents enter through Uniswap pools with one stablecoin, while a v4 hook completes the basket, mints or redeems through the vault, and returns a single asset.
This makes inference wallet-native, portable, and programmable: it can be held by users, agents, DAOs, or providers; routed across compatible inference markets; escrowed in MPP sessions; metered during model use; paid to providers; and refunded when unused.
In short: Programmable Inference turns AI inference from a provider-specific USD credit into a redeemable, basket-backed, wallet-native primitive for the global AI economy.
Ando Inference Wallet - https://github.com/Open-Inference-Layer/ando-inference-wallet
Ando Inference Wallet is a self-custody wallet and session layer for paid AI inference.
Ando lets users hold USDC in a browser extension, open an inference session, sign payment vouchers as model usage is metered, and sync that session directly into OpenWebUI. Instead of buying provider-specific credits or trusting a hosted account balance, users keep control of their wallet while authorizing spend only for the active inference session.
The wallet makes inference payment native and portable: users, agents, teams, or providers can fund sessions, route requests through compatible inference gateways, meter usage per message or token, settle with providers, and recover unused funds when a session ends.
For OpenWebUI users, Ando turns a normal chat model route into a crypto-paid inference session. The user starts a session in the extension, OpenWebUI routes chat through the Ando gateway, and the wallet manages the payment state behind the scenes.
In short: Ando Inference Wallet turns AI inference from a prepaid provider credit into a self-custody, wallet-native payment flow for open inference markets.
Programmable Inference is a smart contract system deployed on Sepolia and built around Ethereum, Uniswap v4 hooks, the Uniswap API, USDC, and EURC.
At the core is InferenceToken, an ERC-20 called INF. It cannot be freely minted by an owner. Minting and burning are restricted to InferenceVault, which enforces the basket rule:
1 INF = 0.5 USDC + 0.5 EURC
Users do not deposit into the vault directly. They enter and exit through two Uniswap v4 custom-accounting pools: USDC/INF and EURC/INF. InferencePoolHookV4 intercepts swaps, reads the live USDC/EURC reference rate from a Uniswap v4 pool, converts the missing basket leg, funds the vault with the exact basket, and mints INF back through the swap path. On redemption, it burns INF, releases the basket, converts the non-requested leg, and returns one stablecoin.
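The basket arithmetic behind a single-stablecoin entry can be sketched as follows. This is only an illustration of the 0.5 USDC + 0.5 EURC rule, not the hook's actual code: the function names, the 6-decimal representation of INF, and the 1.10 USDC-per-EURC reference rate are all assumptions, and fees, slippage, and rounding policy are ignored.

```typescript
// Illustrative basket math only; fees, slippage, and rounding policy are ignored.
// Assumption: USDC, EURC, and INF amounts are all expressed in 6-decimal base units.
const UNIT = 1_000_000n; // 1.0 in 6-decimal fixed point

interface MintQuote {
  infOut: bigint;      // INF minted
  usdcLeg: bigint;     // USDC deposited into the vault
  eurcLeg: bigint;     // EURC deposited into the vault
  usdcSwapped: bigint; // USDC converted to buy the EURC leg
}

// usdcPerEurc is a hypothetical parameter: USDC per 1 EURC, scaled by UNIT.
function quoteMintFromUsdc(usdcIn: bigint, usdcPerEurc: bigint): MintQuote {
  // x INF needs 0.5x USDC + 0.5x EURC, which costs 0.5x * (1 + rate) USDC in total,
  // so x = 2 * usdcIn / (1 + rate).
  const infOut = (2n * usdcIn * UNIT) / (UNIT + usdcPerEurc);
  const usdcLeg = infOut / 2n;
  const eurcLeg = infOut / 2n;
  const usdcSwapped = (eurcLeg * usdcPerEurc) / UNIT;
  return { infOut, usdcLeg, eurcLeg, usdcSwapped };
}

// Redemption in reverse: release both legs, convert the EURC leg, return USDC only.
function quoteRedeemToUsdc(infIn: bigint, usdcPerEurc: bigint): bigint {
  const usdcLeg = infIn / 2n;
  const eurcLeg = infIn / 2n;
  return usdcLeg + (eurcLeg * usdcPerEurc) / UNIT;
}

// Example with an assumed rate of 1.10 USDC per EURC and 105 USDC in.
const quote = quoteMintFromUsdc(105n * UNIT, 1_100_000n);
console.log(quote);
```

Under these assumptions, 105 USDC splits into a 50 USDC leg plus 55 USDC converted into 50 EURC, which backs 100 INF. Redeeming 100 INF at the same rate returns 105 USDC.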
The notably hacky part is the v4 settlement design. The INF pools have zero AMM liquidity; price does not come from LP reserves. Instead, the hook uses Uniswap v4 custom accounting to make the pool behave like a programmable mint/redeem surface. For sells, we seed a fully backed INF settlement buffer into the PoolManager so exact-input swaps can settle atomically while preserving vault backing.
We also built Sepolia deployment scripts; Foundry tests covering the token, vault, rate source, hook accounting, router actions, and failure modes; and a direct Universal Router smoke test demonstrating USDC -> INF and INF -> USDC execution onchain.
Partner tech used: Ethereum for settlement, Uniswap v4 hooks for programmable pool logic, Uniswap Universal Router for swap execution, Circle testnet USDC/EURC for basket backing, and Foundry for development and verification.
Ando Inference Wallet consists of a browser extension, an OpenWebUI integration, and an IPP-compatible inference payment flow, built around Ethereum Sepolia, USDC, and an OpenAI-compatible gateway.
The wallet is a Chrome/Brave extension built with WXT, TypeScript, Viem, and browser extension storage. It creates or imports a self-custody EVM wallet locally, encrypts the private key in the browser with a user password, reads ETH/USDC balances on Sepolia, and signs IPP vouchers for active inference sessions.
The session flow is split across three pieces. The extension opens an inference payment session by approving USDC, opening an InferenceStreamChannel on Sepolia, and signing the first voucher. OpenWebUI runs the Ando Pipe, which exposes Ando Session as a normal model route. The extension syncs the current session token, wallet address, channel id, and voucher into OpenWebUI user valves, so when the user chats, OpenWebUI forwards the request to the Ando gateway with the right payment headers.
The gateway is OpenAI-compatible: it accepts /ipp/v1/chat/completions, validates the Ando session bearer token, checks the IPP channel and signed voucher, forwards valid requests to OpenAI, meters usage, and exposes a receipt. When the session closes, the user signs a final voucher for the exact metered usage; the gateway can settle the provider payment onchain, and the wallet can withdraw unused funds back to the user.
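The session accounting described above can be sketched as a small in-memory ledger. This is a sketch under stated assumptions, not the gateway's implementation: the `Voucher` shape, the `SessionLedger` class, and the idea that each voucher carries a cumulative authorized amount are hypothetical, and signature verification and onchain settlement are left out entirely.

```typescript
// Hypothetical voucher shape; the real IPP voucher format is not described in the text.
interface Voucher {
  channelId: string;
  cumulativeAmount: bigint; // total USDC (6-decimal base units) authorized so far
}

// Hypothetical ledger for one session. Signature checks and onchain calls are omitted.
class SessionLedger {
  readonly channelId: string;
  readonly deposit: bigint;
  private authorized = 0n; // highest cumulative amount the user has signed for
  private metered = 0n;    // usage charged so far

  constructor(channelId: string, deposit: bigint) {
    this.channelId = channelId;
    this.deposit = deposit;
  }

  // Accept a new voucher only if it matches the channel, never decreases,
  // and stays within the deposit.
  acceptVoucher(v: Voucher): void {
    if (v.channelId !== this.channelId) throw new Error("wrong channel");
    if (v.cumulativeAmount < this.authorized) throw new Error("voucher decreased");
    if (v.cumulativeAmount > this.deposit) throw new Error("voucher exceeds deposit");
    this.authorized = v.cumulativeAmount;
  }

  // Charge usage only while it is covered by the latest accepted voucher.
  meter(cost: bigint): void {
    if (cost < 0n) throw new Error("negative cost");
    if (this.metered + cost > this.authorized) throw new Error("usage exceeds authorized spend");
    this.metered += cost;
  }

  // The final voucher must match the exact metered usage; the rest is refundable.
  close(finalVoucher: Voucher): { providerPayment: bigint; refund: bigint } {
    if (finalVoucher.channelId !== this.channelId) throw new Error("wrong channel");
    if (finalVoucher.cumulativeAmount !== this.metered) throw new Error("final voucher mismatch");
    return { providerPayment: this.metered, refund: this.deposit - this.metered };
  }
}

// Example: 10 USDC deposit, 1 USDC authorized, 0.75 USDC actually used.
const ledger = new SessionLedger("channel-1", 10_000_000n);
ledger.acceptVoucher({ channelId: "channel-1", cumulativeAmount: 1_000_000n });
ledger.meter(400_000n);
ledger.meter(350_000n);
console.log(ledger.close({ channelId: "channel-1", cumulativeAmount: 750_000n }));
```

In this example the provider would be paid 0.75 USDC and the remaining 9.25 USDC would be available for the wallet to withdraw.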
The notably hacky part is making OpenWebUI crypto-payment aware without forking OpenWebUI. We used a Pipe plugin plus a browser extension sync flow, so OpenWebUI still thinks it is talking to a normal model, while Ando injects the payment session state around the request path.
Partner tech used: Ethereum Sepolia for settlement, Circle testnet USDC for session deposits, Viem for wallet and contract interactions, OpenWebUI Pipes for model routing, OpenAI-compatible chat completions for inference, and WXT for the browser extension.

