Trade

A full-stack decentralized marketplace for AI training data with token-gated access control.

Created At

ETHOnline 2025

Project Description

Project Overview

Real-World Data Marketplace for AI Agents is a full-stack decentralized platform that enables secure, token-gated trading of AI training data. It combines blockchain technology, decentralized storage, and AI agents into a comprehensive ecosystem for data monetization and access control.

Core Problem & Solution

Problem: AI companies struggle to access high-quality training data due to:

  • Data silos and lack of standardization
  • Trust issues between data providers and consumers
  • No secure way to monetize valuable datasets
  • Difficulty verifying data quality and provenance

Solution: A decentralized marketplace that:

  • Tokenizes datasets as ERC-721 NFTs (DataCoins)
  • Uses Lighthouse for encrypted, token-gated storage
  • Implements AI agents for discovery, negotiation, and validation
  • Provides transparent provenance tracking via Blockscout

Technical Architecture

🏗️ System Components

1. Frontend (Next.js)

  • React-based marketplace interface
  • Wallet integration with wagmi/RainbowKit
  • Real-time chat with AI agents
  • Transaction verification and provenance display

2. Smart Contracts (Solidity)

  • DataCoin.sol: ERC-721 NFT representing data ownership
  • Marketplace.sol: Atomic swap marketplace with royalties
  • Token-gated access control
  • Validator attestation system

3. Backend API (Express.js)

  • Lighthouse integration for file storage
  • MCP adapter for Blockscout queries
  • Agent orchestration endpoints
  • Transaction logging and monitoring

4. AI Agents (Fetch.ai Integration)

  • Seller Agent: Handles dataset upload, minting, and listing
  • Buyer Agent: Discovers datasets and facilitates purchases
  • Validator Agent: Performs quality checks and creates attestations
  • MeTTa Knowledge Graph: Stores structured metadata

5. Storage & Verification

  • Lighthouse: Encrypted file storage with access control
  • Blockscout: Transaction verification and provenance tracking
  • 1MB.io: DataCoin tokenization platform

Key Features

🔐 Token-Gated Access Control

  • Datasets are encrypted and stored on Lighthouse
  • Only token holders can decrypt and access data
  • Granular permissions based on NFT ownership
  • Automatic access revocation on token transfer
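
The bullets above boil down to one on-chain check before any decryption key is released. A minimal sketch, assuming ethers v6 and a plain ERC-721 balanceOf call (the function and variable names here are illustrative, not the project's actual helpers):

// Sketch: verify NFT ownership before serving a decryption key
import { ethers } from 'ethers';

const ERC721_ABI = ['function balanceOf(address owner) view returns (uint256)'];

async function holdsDataCoin(user: string, nftAddress: string, rpcUrl: string): Promise<boolean> {
  const provider = new ethers.JsonRpcProvider(rpcUrl);
  const nft = new ethers.Contract(nftAddress, ERC721_ABI, provider);
  return (await nft.balanceOf(user)) > 0n; // any positive balance unlocks decryption
}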

🤖 AI Agent Ecosystem

  • Discovery: AI agents help users find relevant datasets
  • Negotiation: Automated price negotiation between buyers and sellers
  • Validation: Quality assessment and schema verification
  • Attestation: On-chain recording of validation results

📊 Provenance Tracking

  • Complete transaction history on Blockscout
  • Validator signatures and attestations
  • Quality scores and metadata
  • Transparent ownership transfers

💰 Economic Model

  • Royalties: 2.5% to original data creators
  • Platform Fees: 1% to marketplace operators
  • Atomic Swaps: Secure peer-to-peer transactions
  • Bulk Discounts: Negotiated pricing for large purchases
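
As a rough illustration of how the 2.5% royalty and 1% platform fee compose at settlement, here is a basis-point sketch (assumed math; Marketplace.sol's exact accounting is not shown):

// Fee-split sketch in basis points (an assumption, not the contract's exact code)
const ROYALTY_BPS = 250n;   // 2.5% to the original creator
const PLATFORM_BPS = 100n;  // 1% to the marketplace operator

function splitPayment(salePrice: bigint) {
  const royalty = (salePrice * ROYALTY_BPS) / 10_000n;
  const platformFee = (salePrice * PLATFORM_BPS) / 10_000n;
  return { royalty, platformFee, sellerProceeds: salePrice - royalty - platformFee };
}

// e.g. a 1 MATIC (10**18 wei) sale → 0.025 royalty, 0.01 platform fee, 0.965 to the seller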

Workflow Examples

Dataset Upload & Monetization

  1. Upload: Data provider uploads dataset to Lighthouse
  2. Encrypt: File is encrypted with access control
  3. Mint: DataCoin NFT is created via 1MB.io
  4. List: Token is listed on marketplace with pricing
  5. Store: Metadata is stored in MeTTa knowledge graph
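
Steps 1–2 might look like this with the Lighthouse SDK (a sketch assuming @lighthouse-web3/sdk; the response shape is abbreviated and assumed):

// Sketch: encrypted upload to Lighthouse; only addresses passing the ACL can later decrypt
import lighthouse from '@lighthouse-web3/sdk';

async function uploadDataset(path: string, apiKey: string, publicKey: string, signedMessage: string) {
  const res = await lighthouse.uploadEncrypted(path, apiKey, publicKey, signedMessage);
  return res.data[0].Hash; // CID referenced by the mint step and the ACL condition (shape assumed)
}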

Dataset Discovery & Purchase

  1. Query: User asks AI agent to find specific datasets
  2. Search: Agent queries MeTTa knowledge graph
  3. Recommend: Agent provides ranked recommendations
  4. Verify: User can verify provenance via Blockscout
  5. Purchase: Atomic swap transfers token and payment
  6. Access: Token holder can decrypt and download data
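
Step 5's atomic swap, sketched with ethers v6 against a hypothetical buy(tokenId) entry point (Marketplace.sol's actual signature may differ):

// Sketch: one transaction transfers payment in and the DataCoin NFT out
import { ethers } from 'ethers';

async function buyDataCoin(signer: ethers.Signer, marketplaceAddress: string, tokenId: bigint, listedPrice: bigint) {
  const marketplace = new ethers.Contract(
    marketplaceAddress,
    ['function buy(uint256 tokenId) payable'],
    signer
  );
  const tx = await marketplace.buy(tokenId, { value: listedPrice });
  await tx.wait(); // once mined, the buyer owns the NFT and can decrypt via Lighthouse
}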

Quality Validation

  1. Download: Validator agent retrieves dataset
  2. Integrity: Hash verification and corruption checks
  3. Schema: Data format and structure validation
  4. Quality: ML-based quality assessment
  5. Attestation: Results recorded on-chain with signature
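
Step 2 amounts to recomputing a digest and comparing it with the recorded value; a minimal Node.js sketch (the expected-hash source is assumed to come from the token's metadata):

// Sketch: hash verification for the Validator Agent's integrity step
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';

function verifyIntegrity(filePath: string, expectedSha256: string): boolean {
  const digest = createHash('sha256').update(readFileSync(filePath)).digest('hex');
  return digest === expectedSha256; // a mismatch means a corrupted or tampered dataset
}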

Technology Stack

Blockchain & Web3

  • Ethereum/Polygon Mumbai: Smart contract deployment
  • ERC-721: NFT standard for data tokens
  • Ethers.js: Blockchain interaction
  • Wagmi: React hooks for Web3

Storage & IPFS

  • Lighthouse: Encrypted file storage
  • IPFS: Decentralized file system
  • Access Control Lists: Token-gated permissions

AI & Agents

  • Fetch.ai: Agent framework
  • ASI:One: Chat interface integration
  • MeTTa: Knowledge graph for metadata
  • Python/Node.js: Agent implementations

Frontend & Backend

  • Next.js: React framework
  • TypeScript: Type-safe development
  • Tailwind CSS: Styling framework
  • Express.js: Backend API server

Innovation & Uniqueness

🔬 Technical Innovation

  • Token-Gated Encryption: dataset decryption is gated directly by ERC-721 ownership
  • AI Agent Integration: Automated discovery, negotiation, and validation
  • Provenance Transparency: Complete audit trail on blockchain
  • Quality Attestation: On-chain validation records

🎯 Market Impact

  • Data Democratization: Makes valuable datasets accessible
  • Creator Economy: Enables data monetization for researchers
  • Quality Assurance: Automated validation and verification
  • Trust & Transparency: Blockchain-based provenance tracking

🚀 Scalability

  • Modular Architecture: Easy to extend with new features
  • Agent Ecosystem: Pluggable AI agents for different use cases
  • Cross-Chain Ready: Designed for multi-chain deployment
  • API-First: Comprehensive API for third-party integrations

Use Cases

For Data Providers

  • Monetize valuable datasets
  • Maintain control over data access
  • Track usage and attribution
  • Receive royalties on resales

For Data Consumers

  • Access high-quality training data
  • Verify data provenance and quality
  • Negotiate fair pricing
  • Ensure data authenticity

For Validators

  • Earn rewards for quality assessment
  • Build reputation in the ecosystem
  • Contribute to data standards
  • Participate in governance

Future Roadmap

Phase 1: Core Marketplace

  • Basic buy/sell functionality
  • Token-gated access control
  • AI agent integration
  • Provenance tracking

Phase 2: Advanced Features

  • Automated quality validation
  • Price discovery mechanisms
  • Bulk trading capabilities
  • Cross-chain support

Phase 3: Ecosystem Growth

  • Third-party agent development
  • API marketplace
  • Governance token
  • Decentralized autonomous organization (DAO)

This project represents a significant step forward in creating a truly decentralized, AI-powered data marketplace that addresses the fundamental challenges of data access, quality, and monetization in the AI industry.

How it's Made

Real-World Data Marketplace for AI Agents

🏗️ Architecture Overview

This project is a complex full-stack application that integrates multiple cutting-edge technologies to create a seamless data marketplace experience. Here's how we built it:

🛠️ Technology Stack & Integration

Frontend: Next.js + Web3 Integration

// Wagmi configuration for Web3 connectivity (v2-style createConfig)
import { createConfig, http } from 'wagmi';
import { polygonMumbai } from 'wagmi/chains';

// RPC URL read from the environment (variable name illustrative; see Environment Configuration below)
const rpcUrl = process.env.NEXT_PUBLIC_RPC_URL_MUMBAI;

const config = createConfig({
  chains: [polygonMumbai],
  ssr: true,
  transports: {
    [polygonMumbai.id]: http(rpcUrl),
  },
});

Key Implementation Details:

  • Next.js 13.5.4 with TypeScript for type safety
  • Wagmi + RainbowKit for wallet connectivity (the snippet above uses the v2-style createConfig API)
  • Tailwind CSS v4.1.15 for modern, responsive design
  • Custom React hooks for blockchain state management

Smart Contracts: Solidity + OpenZeppelin

// DataCoin.sol - ERC-721 with custom functionality
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import "@openzeppelin/contracts/token/ERC721/ERC721.sol";
import "@openzeppelin/contracts/access/Ownable.sol";
import "@openzeppelin/contracts/utils/ReentrancyGuard.sol";

contract DataCoin is ERC721, Ownable, ReentrancyGuard {
    uint256 public nextId;

    mapping(uint256 => string) public tokenCID;
    mapping(uint256 => address) public tokenSeller;
    mapping(uint256 => uint256) public tokenPrice;

    event DataCoinMinted(uint256 indexed id, address indexed to, string cid);

    // name/symbol illustrative; constructor style follows OpenZeppelin v5
    constructor() ERC721("DataCoin", "DATA") Ownable(msg.sender) {}

    function mintDataCoin(address to, string calldata cid) external onlyOwner returns (uint256) {
        uint256 id = ++nextId;
        _safeMint(to, id);
        tokenCID[id] = cid;
        tokenSeller[id] = to;
        emit DataCoinMinted(id, to, cid);
        return id;
    }
}

Notable Features:

  • ERC-721 NFTs representing data ownership
  • Atomic swap marketplace with built-in royalties (2.5% to creators, 1% platform fee)
  • Reentrancy protection for secure transactions
  • Token-gated access control via Lighthouse integration

Backend: Express.js + Multi-Service Integration

// MCP Adapter for Blockscout integration
router.post('/query', async (req, res) => {
  const { queryType, target, params = {} } = req.body;
  let response;

  switch (queryType) {
    case 'transaction':
      response = await simulateTransactionQuery(target, params);
      break;
    case 'contract':
      response = await simulateContractQuery(target, params);
      break;
    // ... more query types
    default:
      return res.status(400).json({ error: `Unsupported queryType: ${queryType}` });
  }

  res.json(response);
});

Backend Architecture:

  • Express.js with TypeScript for API server
  • Modular route structure (lighthouse, mcp-adapter, mint)
  • Comprehensive logging system for MCP calls
  • Error handling with detailed error responses
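
A client call against the /query route above might look like this (host, port, and route prefix are illustrative; top-level await assumes an ESM context):

// Example request to the MCP adapter
const txHash = '0x' + '0'.repeat(64); // placeholder; substitute a real Mumbai tx hash

const res = await fetch('http://localhost:4000/mcp/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ queryType: 'transaction', target: txHash, params: {} }),
});
const result = await res.json(); // shape matches the simulate* helpers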

🔗 Partner Technology Integration

1. Lighthouse Storage Integration

// Token-gated access control setup
const acl = {
  conditions: [
    {
      id: 1,
      chain: 80001, // Mumbai testnet
      method: "balanceOf",
      standardContractType: "ERC721",
      contractAddress: contractAddress,
      returnValueTest: {
        comparator: ">",
        value: "0"
      },
      // balanceOf(address) takes only the caller's address;
      // per-token gating would use ownerOf(tokenId) instead
      parameters: [":userAddress"]
    }
  ],
  operator: "and"
};

Lighthouse Benefits:

  • Encrypted file storage with IPFS backend
  • Token-gated access control using ERC-721 ownership
  • Access control lists that verify NFT ownership
  • Decentralized storage without single points of failure

2. Blockscout Integration

// Blockscout SDK integration for transaction verification
const BLOCKSCOUT_BASE_URL = process.env.BLOCKSCOUT_BASE_URL ?? 'https://eth.blockscout.com'; // fallback instance for demo

interface TransactionData {
  hash: string;
  blockNumber: number;
  from: string;
  to: string;
  explorerUrl: string;
}

export async function getTx(txHash: string): Promise<TransactionData> {
  const response = await simulateBlockscoutCall('transaction', txHash);
  return {
    hash: txHash,
    blockNumber: response.blockNumber,
    from: response.from,
    to: response.to,
    explorerUrl: `${BLOCKSCOUT_BASE_URL}/tx/${txHash}`
  };
}

Blockscout Benefits:

  • Transaction verification and provenance tracking
  • Contract interaction monitoring
  • Real-time transaction status updates
  • Explorer integration for user transparency

3. Fetch.ai Agent Integration

// Buyer Agent - Dataset discovery and recommendation
class BuyerAgent {
  async processQuery(userQuery, userAddress) {
    const intent = await this.analyzeIntent(userQuery);
    const datasets = await this.searchDatasets(intent);
    const recommendations = await this.evaluateDatasets(datasets, intent);
    return this.generateResponse(recommendations, intent);
  }
}

Fetch.ai Benefits:

  • Intelligent dataset discovery using natural language processing
  • Automated price negotiation between buyers and sellers
  • Quality assessment and validation workflows
  • MeTTa knowledge graph for structured metadata storage

🧠 AI Agent Architecture

Multi-Agent System Design

// SellerAgent.processDataset - handles dataset upload and minting
async processDataset(datasetPath, metadata) {
  const uploadResult = await this.uploadToLighthouse(datasetPath, metadata);
  const aclResult = await this.setupAccessControl(uploadResult.cid, metadata);
  const mintResult = await this.mintDataCoin(uploadResult.cid, metadata);
  const listingResult = await this.createListing(mintResult, metadata);
  await this.storeInMetta(mintResult, metadata);
  return { uploadResult, aclResult, mintResult, listingResult };
}

Agent Responsibilities:

  • Seller Agent: Upload → Encrypt → Mint → List → Store metadata
  • Buyer Agent: Discover → Evaluate → Recommend → Facilitate purchase
  • Validator Agent: Download → Validate → Attest → Record on-chain
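
The Validator Agent's final "Record on-chain" step could look like the following sketch (the attest() interface here is hypothetical; the project's attestation contract is not shown):

// Sketch: Validator Agent writes its verdict on-chain
import { ethers } from 'ethers';

async function recordAttestation(
  validator: ethers.Signer,
  attestationAddress: string,
  tokenId: bigint,
  reportHash: string,   // bytes32 hex digest of the validation report
  qualityScore: number
) {
  const abi = ['function attest(uint256 tokenId, bytes32 reportHash, uint8 score)'];
  const attestation = new ethers.Contract(attestationAddress, abi, validator);
  const tx = await attestation.attest(tokenId, reportHash, qualityScore);
  await tx.wait(); // the signed verdict becomes part of the dataset's provenance
}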

MeTTa Knowledge Graph Integration

# Python integration for MeTTa knowledge graph
import requests
from typing import Any, Dict, List

class MettaClient:
    def __init__(self, endpoint: str):
        self.endpoint = endpoint.rstrip('/')
        self.session = requests.Session()

    def store_dataset(self, dataset: Dict[str, Any]) -> bool:
        response = self.session.post(f"{self.endpoint}/store", json=dataset)
        return response.status_code == 200

    def query_datasets(self, **filters) -> List[Dict[str, Any]]:
        response = self.session.post(f"{self.endpoint}/search", json=filters)
        return response.json().get('datasets', [])

🔧 Particularly Hacky Solutions

1. Token-Gated Decryption Simulation

// Simulated access control for demo purposes
router.post('/check-access', async (req, res) => {
  const { cid, userAddress } = req.body;
  
  // Hacky demo logic: allow access if address ends with even hex
  const lastChar = userAddress.trim().toLowerCase().slice(-1);
  const evenHex = ['0','2','4','6','8','a','c','e'];
  const hasAccess = evenHex.includes(lastChar);
  
  res.json({ hasAccess });
});

Why This Works:

  • Demo-friendly access control for testing
  • Deterministic based on wallet address
  • Easy to test without complex token verification
  • The handler mirrors the shape of a real token check, so on-chain verification can be swapped in without changing the API

2. Mock Data Generation for Agents

// Generate realistic mock datasets for agent responses
getMockDatasets(intent) {
  const mockDatasets = [
    {
      id: 1,
      name: 'Computer Vision Dataset',
      description: '50,000 labeled images for object detection',
      category: 'Computer Vision',
      price: '0.1',
      size: '2.5GB',
      format: 'Images',
      verified: true,
      cid: 'QmSampleImageDataset123',
      tokenId: 1
    }
    // ... more datasets
  ];
  // Filter by the category extracted from the user's intent, if any
  return intent?.category
    ? mockDatasets.filter(d => d.category === intent.category)
    : mockDatasets;
}

Benefits:

  • Realistic demo data for agent responses
  • Consistent user experience during development
  • Easy to customize for different scenarios
  • Fallback mechanism when external APIs fail

3. Blockscout Simulation Layer

// Simulate Blockscout API calls for demo purposes
const randomHex = (len: number): string =>
  [...Array(len)].map(() => Math.floor(Math.random() * 16).toString(16)).join('');

async function simulateBlockscoutCall(type: string, identifier: string): Promise<any> {
  await new Promise(resolve => setTimeout(resolve, 500)); // Simulate API delay

  if (type === 'transaction') {
    return {
      blockNumber: Math.floor(Math.random() * 1000000),
      from: '0x' + randomHex(40), // full 40-hex-char addresses
      to: '0x' + randomHex(40),
      status: 'success',
      confirmations: Math.floor(Math.random() * 100)
    };
  }
  throw new Error(`Unsupported query type: ${type}`);
}

Why This Approach:

  • No external dependencies during development
  • Consistent demo experience regardless of network conditions
  • Realistic data that matches expected API responses
  • Easy to replace with real API calls in production

🚀 Deployment & CI/CD

GitHub Actions Pipeline

# Multi-service deployment pipeline
jobs:
  frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Install dependencies
        run: cd frontend && npm ci
      - name: Build frontend
        run: cd frontend && npm run build
      - name: Upload build artifacts
        uses: actions/upload-artifact@v3
        with:
          name: frontend-build
          path: frontend/.next/

Pipeline Features:

  • Parallel builds for frontend, backend, and contracts
  • Artifact management for deployment
  • Security audits across all services
  • Performance testing and monitoring
  • Automatic deployment to Vercel

Environment Configuration

# Comprehensive environment setup
RPC_URL_MUMBAI=https://polygon-mumbai.g.alchemy.com/v2/YOUR_ALCHEMY_KEY
LIGHTHOUSE_API_KEY=your_lighthouse_api_key
BLOCKSCOUT_MCP_URL=https://your-mcp-endpoint.com
NEXT_PUBLIC_BLOCKSCOUT_INSTANCE_URL=https://your-autoscout-instance.blockscout.com
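
On the backend these values are loaded at startup; a minimal fail-fast guard, assuming dotenv:

// Abort early if required configuration is missing
import 'dotenv/config';

for (const key of ['RPC_URL_MUMBAI', 'LIGHTHOUSE_API_KEY', 'BLOCKSCOUT_MCP_URL']) {
  if (!process.env[key]) throw new Error(`Missing required env var: ${key}`);
}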

🔍 Notable Technical Decisions

1. TypeScript Everywhere

  • Frontend: Next.js with TypeScript
  • Backend: Express.js with TypeScript
  • Contracts: Solidity, which is statically typed by design
  • Agents: Node.js with TypeScript

2. Modular Architecture

  • Separate services for different concerns
  • API-first design for easy integration
  • Pluggable agents for extensibility
  • Microservice-ready structure

3. Comprehensive Logging

// MCP call logging for debugging and monitoring
import fs from 'fs';
import path from 'path';

const logMCPCall = (request: any, response: any) => {
  const logEntry = {
    timestamp: new Date().toISOString(),
    request,
    response,
    // callers stamp request.startTime (ms) before dispatching the MCP call
    duration: Date.now() - request.startTime
  };

  const logPath = path.join(__dirname, '../../logs/mcp.log');
  fs.appendFileSync(logPath, JSON.stringify(logEntry) + '\n');
};

4. Error Handling & Resilience

  • Graceful degradation when external services fail
  • Comprehensive error logging for debugging
  • Fallback mechanisms for demo purposes
  • User-friendly error messages

🎯 Production Readiness

Security Considerations

  • Reentrancy protection in smart contracts
  • Access control for sensitive operations
  • Input validation across all APIs
  • Secure key management for external services
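
As a concrete instance of the input-validation point above, a dependency-free guard for the MCP /query body (a sketch, not the project's actual middleware):

// Returns an error message for a bad body, or null if it is acceptable
function validateQueryBody(body: any): string | null {
  if (!['transaction', 'contract'].includes(body?.queryType)) return 'unsupported queryType';
  if (typeof body?.target !== 'string' || !/^0x([0-9a-fA-F]{40}|[0-9a-fA-F]{64})$/.test(body.target)) {
    return 'target must be a 0x-prefixed address or transaction hash';
  }
  return null;
}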

Scalability Design

  • Stateless backend for horizontal scaling
  • Database-agnostic architecture
  • Caching strategies for performance
  • CDN integration for static assets