Decentralised Community LLM inference on-chain, allowing users to infer very cheap.
Aight is a decentralized AI inference marketplace that connects buyers who need LLM API access with operators who run local Ollama rigs, using crypto-backed coordination for trust and payments.
Instead of relying on centralized model providers, Aight lets independent operators stake on Base Sepolia, register their rig, and expose inference capacity through the Aight gateway. Buyers can discover available rigs, rent one by funding an escrow-backed session, receive an API key, and immediately call OpenAI-compatible /v1/chat/completions endpoints. During a rental, the platform tracks live usage metrics (calls, prompt tokens, completion tokens, total tokens, last-used timestamp) so buyers can verify real compute consumption and operators can prove service delivery.
On-chain escrow logic governs settlement: rental funds are split hourly, with operator payout and protocol treasury share enforced by contract rules. This creates a transparent payment path from buyer deposit to operator earnings, with transaction receipts viewable on BaseScan for verifiable proof of stake, rental funding, and releases. The system is designed so settlement can be triggered permissionlessly and automated by a keeper process, reducing operator friction and improving payout reliability.
The product includes:
A modern web console for signup/login, buyer and operator workflows, wallet connection, and settlement visibility. A FastAPI gateway for auth, API key issuance, routing requests to active operator rigs, and usage accounting. A Solidity registry/escrow contract for staking, allocation safety, and payout splitting. Operator bootstrap scripts for quick rig pairing from external machines. In short, Aight turns unused local GPU/CPU inference capacity into a crypto-verifiable, rentable AI infrastructure layer—combining Web3 trust guarantees with practical, developer-friendly AI APIs.
I built Aight as three layers that talk over HTTPS and on-chain calls.
Frontend (buyers & operators) is a Next.js / React app with Tailwind for UI. RainbowKit + Wagmi + Viem handle wallets (e.g. WalletConnect/MetaMask), contract reads/writes, and Base Sepolia receipts—we surface BaseScan links for stakes, escrow deposits, and releases so users can verify money flow. The buyer console lists rigs, drives rentals, and shows live usage (calls, prompt/output tokens, totals). The operator console covers pairing, staking, rig status, settlement queue, and payout withdrawal.
Gateway is Python 3.12 with FastAPI and Uvicorn, containerized with Docker and served behind Caddy on AWS EC2 (TLS termination and reverse proxy to the app). It owns auth, session tokens, API key issuance, and OpenAI-compatible /v1/chat/completions. Inference goes through LiteLLM, which routes to Ollama on operator machines. Operators run a small Python + httpx client installed via a curl | bash bootstrap; rigs reach the gateway through public HTTPS (in demos, often via a tunnel) while Ollama stays local.
On-chain logic lives in Solidity (AightRegistry), developed and tested with Foundry. Operators stake; buyers fund escrows for rentals; the contract enforces hourly releases and operator vs treasury splits. We made releaseHourlyPayment permissionless so anyone—or our optional settlement keeper in the gateway—can trigger due payouts without operators babysitting txs; the keeper uses web3.py-style contract calls from env-configured keys.
Partner / ecosystem fit: Base gives cheap L2 txs and a clear block explorer story for hackathon judges; WalletConnect-class UX lowers friction for non-crypto-native testers.
Notable “hackathon” choices: We leaned on escrow + hourly accounting instead of micro-payments per token on-chain; usage proof lives in the gateway (aggregated per API key) while settlement stays contract-verifiable. DNS split (Vercel frontend vs EC2 API) required careful CORS and deploy discipline—that’s the boring glue that made the demo actually work end-to-end.

