Personalized product discovery on social media powered by crypto payments & AI vision models

Instant is a Chrome browser extension that revolutionizes how people shop from social media content. It uses advanced AI vision technology to analyze Instagram reels in real time, automatically identifying clothing and fashion items worn by creators and instantly finding similar products across major e-commerce platforms (including those in India!). Users can purchase these items using cryptocurrency, creating a seamless bridge between spontaneous inspiration and actual shopping.
When a user sees a reel they like, Instant can automatically capture a frame from the video and display the closest match and affordable alternatives with clear pricing, descriptions, and images. A first-time user connects a wallet, then pays in USDC on Polygon. Purchases are completed by programmatically sourcing or directly ordering from merchants. We built Instant on Polygon’s x402 payment framework, which enables micropayments between AI agents and seamless conversion once users connect through non-custodial wallet authentication.
According to some reports, there are 314 million stablecoin holders in India, which is the most in the world! As a result, there’s a massive untapped market of crypto-native users who want to spend their digital assets directly on everyday purchases. Instant is particularly valuable for users holding USDT, USDC, or other stablecoins who want to maintain their crypto holdings while still being able to access vast product catalogs across short-form content and nontraditional marketplaces.
Over time, the vision is to expand into a true aggregation layer for e-commerce, where users can find any product online, search for matches efficiently, and pay with whatever rail they prefer.
We built on Polygon's x402 payment framework using USDC on the Polygon Amoy testnet. Users connect via Rainbow SDK with non-custodial wallet authentication, creating a unique wallet-to-extension mapping that tracks their shopping journey across Instagram sessions. Our dual-agent architecture consists of a discovery agent (powered by the OpenAI Vision API for product identification) and a purchase agent (handling autonomous buying on platforms like Amazon), both connected through x402. When users find a product and hit "Purchase," they're redirected to the marketplace only after the agentic payment has gone through. The x402 framework ensures that our backend receives OpenAI API results only after the discovery agent pays for the service, creating proof-of-payment for AI processing. This lets crypto-native users spend USDC directly, bridging the gap between decentralized payments and centralized retail infrastructure through Polygon's efficient, low-cost transaction layer.
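To make the pay-before-results rule concrete, here is a minimal TypeScript sketch of a payment gate in the spirit of x402: a request without a valid payment gets HTTP 402 plus the payment requirements, and discovery runs only once payment verifies. This is a simplified illustration, not our production code and not the x402 SDK. The `x-payment` header name, the `PaymentRequirements` fields, and the `verifyPayment` and `runDiscovery` callbacks are all assumptions introduced for the example.

```typescript
// Hypothetical, simplified payment requirements; real x402 requirements carry more fields.
interface PaymentRequirements {
  network: string; // e.g. the Polygon Amoy testnet
  asset: string;   // e.g. USDC
  amount: string;  // price of one discovery call, as a decimal string
  payTo: string;   // receiving address
}

type GateResult =
  | { status: 402; requirements: PaymentRequirements }
  | { status: 200; result: string };

// Gate the discovery call behind a payment check.
// `verifyPayment` stands in for whatever verifies the signed payment payload;
// `runDiscovery` stands in for the paid AI call. Both are assumptions.
function gateDiscovery(
  headers: Record<string, string | undefined>,
  requirements: PaymentRequirements,
  verifyPayment: (payload: string, req: PaymentRequirements) => boolean,
  runDiscovery: () => string,
): GateResult {
  const payload = headers["x-payment"]; // assumed header name, lowercased as Node does
  if (payload === undefined || !verifyPayment(payload, requirements)) {
    // No payment or invalid payment: answer 402 and tell the caller what to pay.
    return { status: 402, requirements };
  }
  // Payment verified: only now is the paid service invoked.
  return { status: 200, result: runDiscovery() };
}
```

In this sketch the discovery agent would first receive the 402 response, pay, and then retry the same request with the payment payload attached. Because `runDiscovery` is never called on the 402 path, no AI results exist without a matching payment.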
Beyond x402, the part of the build we consider hacky is our product discovery mechanism. We find similar items through a SerpAPI integration across Amazon, Google Shopping, Myntra, and other local marketplaces, with rate limiting for performance optimization. After gathering a pool of about 25 potential product matches, the agent narrows it down to the 4-5 best matches based on a calculated similarity score. The score factors in visual attributes (color, pattern/texture, etc.), product metadata (item type, composition, style), and contextual factors (price, availability, etc.). The payment infrastructure consists of an Amazon Incentives API integration and crypto payment processing with multi-chain wallet support.
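The narrowing step can be sketched roughly as follows in TypeScript. The three score groups mirror the ones named above, but the `Candidate` shape, the 0-to-1 sub-scores, and the weights are illustrative assumptions, not the values we actually use.

```typescript
// Hypothetical candidate shape: each sub-score is assumed to be normalized to 0..1.
interface Candidate {
  title: string;
  visual: number;     // color, pattern/texture match
  metadata: number;   // item type, composition, style match
  contextual: number; // price, availability
}

interface Weights {
  visual: number;
  metadata: number;
  contextual: number;
}

// Assumed weights for illustration only.
const DEFAULT_WEIGHTS: Weights = { visual: 0.5, metadata: 0.3, contextual: 0.2 };

// Weighted average of the three sub-scores, normalized by the weight total.
function similarityScore(c: Candidate, w: Weights = DEFAULT_WEIGHTS): number {
  const total = w.visual + w.metadata + w.contextual;
  return (w.visual * c.visual + w.metadata * c.metadata + w.contextual * c.contextual) / total;
}

// Sort a copy of the pool by descending score and keep the best `limit` entries.
function topMatches(pool: Candidate[], limit = 5, w: Weights = DEFAULT_WEIGHTS): Candidate[] {
  return [...pool]
    .sort((a, b) => similarityScore(b, w) - similarityScore(a, w))
    .slice(0, limit);
}
```

With a pool of about 25 candidates, `topMatches(pool, 5)` would return the five highest-scoring items, leaving the input array untouched.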
On the conventional (non-crypto) side, Instant consists of a Chrome extension frontend communicating with a Node.js/Express backend via REST APIs, deployed with a modular service architecture for scalability. The frontend (extension) captures video frames in real time using the HTML5 Canvas API to extract Instagram reel frames, and uses context menu integration and popup UI windows for user state management. The backend consists of an image processing pipeline: after capturing video frames, we convert them to Base64 and send them to the OpenAI GPT-4 Vision API with custom prompts for structured item extraction.
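As a rough illustration of the last step, the TypeScript sketch below packages a captured frame into a chat-style vision request body, with the prompt as a text part and the frame as an image part. It assumes the frame arrives as a Base64 data URL (for example from `canvas.toDataURL("image/jpeg")` after drawing the reel's video element onto a canvas). The function name, the validation pattern, and the caller-supplied `model` string are assumptions made for this example, and the prompt shown in the usage note is a placeholder rather than our actual prompt.

```typescript
// Request body with one user message holding a text part and an image part.
interface VisionRequest {
  model: string;
  messages: Array<{
    role: "user";
    content: Array<
      | { type: "text"; text: string }
      | { type: "image_url"; image_url: { url: string } }
    >;
  }>;
}

// Accept only Base64 image data URLs, so a malformed capture fails early.
const DATA_URL_PATTERN = /^data:image\/(png|jpeg|webp);base64,[A-Za-z0-9+/]+=*$/;

// Build the request body; sending it over HTTP is left to the caller.
function buildVisionRequest(frameDataUrl: string, prompt: string, model: string): VisionRequest {
  if (!DATA_URL_PATTERN.test(frameDataUrl)) {
    throw new Error("expected a Base64 image data URL");
  }
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: frameDataUrl } },
        ],
      },
    ],
  };
}
```

A call might look like `buildVisionRequest(frame, "List each clothing item as JSON with type, color, and pattern.", model)`, where `model` is whichever vision-capable model identifier the backend is configured with.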

