Project Overview
Real-World Data Marketplace for AI Agents is a full-stack decentralized marketplace that enables secure, token-gated trading of AI training data. The platform combines blockchain technology, decentralized storage, and AI agents to create a comprehensive ecosystem for data monetization and access control.
Core Problem & Solution
Problem: AI companies struggle to access high-quality training data due to:
- Data silos and lack of standardization
 
- Trust issues between data providers and consumers
 
- No secure way to monetize valuable datasets
 
- Difficulty verifying data quality and provenance
 
Solution: A decentralized marketplace that:
- Tokenizes datasets as ERC-721 NFTs (DataCoins)
 
- Uses Lighthouse for encrypted, token-gated storage
 
- Implements AI agents for discovery, negotiation, and validation
 
- Provides transparent provenance tracking via Blockscout
 
Technical Architecture
🏗️ System Components
1. Frontend (Next.js)
- React-based marketplace interface
 
- Wallet integration with wagmi/RainbowKit
 
- Real-time chat with AI agents
 
- Transaction verification and provenance display
 
2. Smart Contracts (Solidity)
- DataCoin.sol: ERC-721 NFT representing data ownership
 
- Marketplace.sol: Atomic swap marketplace with royalties
 
- Token-gated access control
 
- Validator attestation system
 
3. Backend API (Express.js)
- Lighthouse integration for file storage
 
- MCP adapter for Blockscout queries
 
- Agent orchestration endpoints
 
- Transaction logging and monitoring
 
4. AI Agents (Fetch.ai Integration)
- Seller Agent: Handles dataset upload, minting, and listing
 
- Buyer Agent: Discovers datasets and facilitates purchases
 
- Validator Agent: Performs quality checks and creates attestations
 
- MeTTa Knowledge Graph: Stores structured metadata
 
5. Storage & Verification
- Lighthouse: Encrypted file storage with access control
 
- Blockscout: Transaction verification and provenance tracking
 
- 1MB.io: DataCoin tokenization platform
 
Key Features
🔐 Token-Gated Access Control
- Datasets are encrypted and stored on Lighthouse
 
- Only token holders can decrypt and access data
 
- Granular permissions based on NFT ownership
 
- Automatic access revocation on token transfer
 
🤖 AI Agent Ecosystem
- Discovery: AI agents help users find relevant datasets
 
- Negotiation: Automated price negotiation between buyers and sellers
 
- Validation: Quality assessment and schema verification
 
- Attestation: On-chain recording of validation results
 
📊 Provenance Tracking
- Complete transaction history on Blockscout
 
- Validator signatures and attestations
 
- Quality scores and metadata
 
- Transparent ownership transfers
 
💰 Economic Model
- Royalties: 2.5% of each sale to the original data creator (see the fee-split sketch after this list)
 
- Platform Fees: 1% to marketplace operators
 
- Atomic Swaps: Secure peer-to-peer transactions
 
- Bulk Discounts: Negotiated pricing for large purchases
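
The split above can be expressed directly in basis points. A minimal sketch of the payout math; the constant names and the 1 ETH example are illustrative, not taken from Marketplace.sol:

// Illustrative fee split for a single sale, expressed in basis points (bps)
const ROYALTY_BPS = 250;      // 2.5% to the original data creator
const PLATFORM_FEE_BPS = 100; // 1% to the marketplace operator

function splitSale(salePriceWei: bigint) {
  const royalty = (salePriceWei * BigInt(ROYALTY_BPS)) / 10_000n;
  const platformFee = (salePriceWei * BigInt(PLATFORM_FEE_BPS)) / 10_000n;
  const sellerProceeds = salePriceWei - royalty - platformFee;
  return { royalty, platformFee, sellerProceeds };
}

// Example: a 1 ETH sale pays 0.025 ETH royalty, 0.01 ETH platform fee, 0.965 ETH to the seller
splitSale(10n ** 18n);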
 
Workflow Examples
Dataset Upload & Monetization
1. Upload: Data provider uploads the dataset to Lighthouse

2. Encrypt: The file is encrypted with access-control conditions

3. Mint: A DataCoin NFT is created via 1MB.io

4. List: The token is listed on the marketplace with pricing

5. Store: Metadata is stored in the MeTTa knowledge graph (a sample record shape follows this list)
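
The record stored in the last step is not pinned down elsewhere in this overview; a plausible shape, mirroring the fields the agents use later in this document (every field name here is illustrative):

// Illustrative shape of a dataset record in the MeTTa knowledge graph
interface DatasetRecord {
  tokenId: number;       // DataCoin token id
  cid: string;           // Lighthouse/IPFS CID of the encrypted file
  name: string;
  description: string;
  category: string;      // e.g. "Computer Vision"
  price: string;         // listing price as a decimal string
  size: string;          // human-readable size, e.g. "2.5GB"
  format: string;
  verified: boolean;     // set once a validator attestation exists
  seller: string;        // creator address used for royalty payouts
}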
 
Dataset Discovery & Purchase
1. Query: The user asks an AI agent to find specific datasets

2. Search: The agent queries the MeTTa knowledge graph

3. Recommend: The agent returns ranked recommendations

4. Verify: The user can verify provenance via Blockscout

5. Purchase: An atomic swap transfers the token and payment (see the purchase sketch after this list)

6. Access: The token holder can decrypt and download the data
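
A minimal sketch of the purchase step from the buyer's side, assuming the marketplace exposes a payable buy(tokenId) function; the function name, ABI fragment, and address argument are assumptions, since Marketplace.sol is not shown in this document:

import { ethers } from 'ethers';

// Hypothetical ABI fragment; the real Marketplace.sol interface may differ
const marketplaceAbi = ['function buy(uint256 tokenId) payable'];

async function purchaseDataset(marketplaceAddress: string, tokenId: number, priceEth: string) {
  const provider = new ethers.providers.Web3Provider((window as any).ethereum);
  const signer = provider.getSigner();
  const marketplace = new ethers.Contract(marketplaceAddress, marketplaceAbi, signer);

  // The atomic swap: payment in, token out, in a single transaction
  const tx = await marketplace.buy(tokenId, { value: ethers.utils.parseEther(priceEth) });
  return tx.wait(); // the receipt can then be verified on Blockscout
}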
 
Quality Validation
1. Download: The validator agent retrieves the dataset

2. Integrity: Hash verification and corruption checks (see the hashing sketch after this list)

3. Schema: Data format and structure validation

4. Quality: ML-based quality assessment

5. Attestation: Results are recorded on-chain with the validator's signature
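
The integrity step boils down to recomputing a hash of the downloaded file and comparing it with the hash recorded at listing time. A minimal Node sketch; where the expected hash lives (on-chain or in MeTTa metadata) is an assumption:

import { createHash } from 'crypto';
import { readFile } from 'fs/promises';

// Recompute the dataset's SHA-256 and compare it with the hash recorded at listing time
async function verifyIntegrity(filePath: string, expectedSha256: string): Promise<boolean> {
  const data = await readFile(filePath);
  const actual = createHash('sha256').update(data).digest('hex');
  return actual === expectedSha256.toLowerCase();
}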
 
Technology Stack
Blockchain & Web3
- Ethereum/Polygon Mumbai: Smart contract deployment
 
- ERC-721: NFT standard for data tokens
 
- Ethers.js: Blockchain interaction
 
- Wagmi: React hooks for Web3
 
Storage & IPFS
- Lighthouse: Encrypted file storage
 
- IPFS: Decentralized file system
 
- Access Control Lists: Token-gated permissions
 
AI & Agents
- Fetch.ai: Agent framework
 
- ASI:One: Chat interface integration
 
- MeTTa: Knowledge graph for metadata
 
- Python/Node.js: Agent implementations
 
Frontend & Backend
- Next.js: React framework
 
- TypeScript: Type-safe development
 
- Tailwind CSS: Styling framework
 
- Express.js: Backend API server
 
Innovation & Uniqueness
🔬 Technical Innovation
- Token-Gated Encryption: NFT ownership directly gates dataset decryption via Lighthouse access conditions
 
- AI Agent Integration: Automated discovery, negotiation, and validation
 
- Provenance Transparency: Complete audit trail on blockchain
 
- Quality Attestation: On-chain validation records
 
🎯 Market Impact
- Data Democratization: Makes valuable datasets accessible
 
- Creator Economy: Enables data monetization for researchers
 
- Quality Assurance: Automated validation and verification
 
- Trust & Transparency: Blockchain-based provenance tracking
 
🚀 Scalability
- Modular Architecture: Easy to extend with new features
 
- Agent Ecosystem: Pluggable AI agents for different use cases
 
- Cross-Chain Ready: Designed for multi-chain deployment
 
- API-First: Comprehensive API for third-party integrations
 
Use Cases
For Data Providers
- Monetize valuable datasets
 
- Maintain control over data access
 
- Track usage and attribution
 
- Receive royalties on resales
 
For Data Consumers
- Access high-quality training data
 
- Verify data provenance and quality
 
- Negotiate fair pricing
 
- Ensure data authenticity
 
For Validators
- Earn rewards for quality assessment
 
- Build reputation in the ecosystem
 
- Contribute to data standards
 
- Participate in governance
 
Future Roadmap
Phase 1: Core Marketplace
- Basic buy/sell functionality
 
- Token-gated access control
 
- AI agent integration
 
- Provenance tracking
 
Phase 2: Advanced Features
- Automated quality validation
 
- Price discovery mechanisms
 
- Bulk trading capabilities
 
- Cross-chain support
 
Phase 3: Ecosystem Growth
- Third-party agent development
 
- API marketplace
 
- Governance token
 
- Decentralized autonomous organization (DAO)
 
This project represents a significant step forward in creating a truly decentralized, AI-powered data marketplace that addresses the fundamental challenges of data access, quality, and monetization in the AI industry.
Real-World Data Marketplace for AI Agents
🏗️ Architecture Overview
This project is a full-stack application that integrates smart contracts, encrypted storage, AI agents, and block-explorer tooling into a single data marketplace. Here's how we built it:
🛠️ Technology Stack & Integration
Frontend: Next.js + Web3 Integration
// Wagmi configuration for Web3 connectivity
import { createConfig, http } from 'wagmi';
import { polygonMumbai } from 'wagmi/chains';

const config = createConfig({
  chains: [polygonMumbai],
  ssr: true,
  transports: {
    [polygonMumbai.id]: http(rpcUrl),
  },
});
Key Implementation Details:
- Next.js 13.5.4 with TypeScript for type safety
 
- Wagmi v1.4.7 + RainbowKit for wallet connectivity
 
- Tailwind CSS v4.1.15 for modern, responsive design
 
- Custom React hooks for blockchain state management
 
Smart Contracts: Solidity + OpenZeppelin
// DataCoin.sol - ERC-721 with custom functionality
contract DataCoin is ERC721, Ownable, ReentrancyGuard {
    uint256 private nextId;                              // auto-incrementing token id

    mapping(uint256 => string) public tokenCID;          // IPFS CID of the encrypted dataset
    mapping(uint256 => address) public tokenSeller;      // original seller / creator
    mapping(uint256 => uint256) public tokenPrice;       // listing price in wei

    event DataCoinMinted(uint256 indexed id, address indexed to, string cid);

    function mintDataCoin(address to, string calldata cid) external onlyOwner returns (uint256) {
        uint256 id = ++nextId;
        _safeMint(to, id);
        tokenCID[id] = cid;
        tokenSeller[id] = to;
        emit DataCoinMinted(id, to, cid);
        return id;
    }
}
Notable Features:
- ERC-721 NFTs representing data ownership
 
- Atomic swap marketplace with built-in royalties (2.5% to creators, 1% platform fee)
 
- Reentrancy protection for secure transactions
 
- Token-gated access control via Lighthouse integration
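
On the application side, the mintDataCoin function shown above can be driven from the Node backend once the Lighthouse upload succeeds. A minimal ethers.js sketch; the DEPLOYER_PRIVATE_KEY and DATACOIN_ADDRESS variable names and the artifact import path are assumptions:

import { ethers } from 'ethers';
// Hypothetical path to the compiled contract artifact
import DataCoinArtifact from '../artifacts/contracts/DataCoin.sol/DataCoin.json';

async function mintDataCoin(to: string, cid: string): Promise<number> {
  const provider = new ethers.providers.JsonRpcProvider(process.env.RPC_URL_MUMBAI);
  const owner = new ethers.Wallet(process.env.DEPLOYER_PRIVATE_KEY!, provider);
  const dataCoin = new ethers.Contract(process.env.DATACOIN_ADDRESS!, DataCoinArtifact.abi, owner);

  const tx = await dataCoin.mintDataCoin(to, cid); // onlyOwner, so signed by the deployer key
  const receipt = await tx.wait();

  // Read the new token id from the DataCoinMinted event
  const minted = receipt.events?.find((e: any) => e.event === 'DataCoinMinted');
  if (!minted) throw new Error('DataCoinMinted event not found in receipt');
  return minted.args.id.toNumber();
}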
 
Backend: Express.js + Multi-Service Integration
// MCP Adapter for Blockscout integration
router.post('/query', async (req, res) => {
  const { queryType, target, params = {} } = req.body;
  let response;

  switch (queryType) {
    case 'transaction':
      response = await simulateTransactionQuery(target, params);
      break;
    case 'contract':
      response = await simulateContractQuery(target, params);
      break;
    // ... more query types
    default:
      return res.status(400).json({ error: `Unknown queryType: ${queryType}` });
  }

  res.json(response);
});
Backend Architecture:
- Express.js with TypeScript for API server
 
- Modular route structure (lighthouse, mcp-adapter, mint)
 
- Comprehensive logging system for MCP calls
 
- Error handling with detailed error responses
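
From the frontend, the adapter above is just another JSON endpoint. A hedged example of calling it; the /api/mcp/query path is an assumption about how the router is mounted:

// Query the MCP adapter for a transaction's details
async function queryTransaction(txHash: string) {
  const res = await fetch('/api/mcp/query', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ queryType: 'transaction', target: txHash }),
  });
  if (!res.ok) throw new Error(`MCP adapter error: ${res.status}`);
  return res.json();
}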
 
🔗 Partner Technology Integration
1. Lighthouse Storage Integration
// Token-gated access control setup
const acl = {
  conditions: [
    {
      id: 1,
      chain: 80001, // Mumbai testnet
      method: "balanceOf",
      standardContractType: "ERC721",
      contractAddress: contractAddress,
      returnValueTest: {
        comparator: ">",
        value: "0"
      },
      // ERC-721 balanceOf takes only the owner address;
      // a per-token check would use ownerOf(tokenId) instead
      parameters: [":userAddress"]
    }
  ],
  operator: "and"
};
Lighthouse Benefits:
- Encrypted file storage with IPFS backend
 
- Token-gated access control using ERC-721 ownership
 
- Access control lists that verify NFT ownership
 
- Decentralized storage without single points of failure
 
2. Blockscout Integration
// Blockscout SDK integration for transaction verification
export async function getTx(txHash: string): Promise<TransactionData> {
  const response = await simulateBlockscoutCall('transaction', txHash);
  return {
    hash: txHash,
    blockNumber: response.blockNumber,
    from: response.from,
    to: response.to,
    explorerUrl: `${BLOCKSCOUT_BASE_URL}/tx/${txHash}`
  };
}
Blockscout Benefits:
- Transaction verification and provenance tracking
 
- Contract interaction monitoring
 
- Real-time transaction status updates
 
- Explorer integration for user transparency
 
3. Fetch.ai Agent Integration
// Buyer Agent - Dataset discovery and recommendation
class BuyerAgent {
  async processQuery(userQuery, userAddress) {
    const intent = await this.analyzeIntent(userQuery);
    const datasets = await this.searchDatasets(intent);
    const recommendations = await this.evaluateDatasets(datasets, intent);
    return this.generateResponse(recommendations, intent);
  }
}
Fetch.ai Benefits:
- Intelligent dataset discovery using natural language processing
 
- Automated price negotiation between buyers and sellers
 
- Quality assessment and validation workflows
 
- MeTTa knowledge graph for structured metadata storage
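
The analyzeIntent call in the BuyerAgent snippet above is the natural-language entry point. A deliberately simple keyword-matching sketch of what it could look like; the category list and return shape are illustrative, not the actual implementation:

// Naive intent extraction: map query keywords to known dataset categories
const CATEGORY_KEYWORDS: Record<string, string[]> = {
  'Computer Vision': ['image', 'vision', 'object detection', 'photo'],
  'NLP': ['text', 'language', 'sentiment', 'corpus'],
  'Time Series': ['sensor', 'time series', 'forecast'],
};

function analyzeIntent(userQuery: string) {
  const q = userQuery.toLowerCase();
  const category = Object.keys(CATEGORY_KEYWORDS)
    .find(cat => CATEGORY_KEYWORDS[cat].some(kw => q.includes(kw))) ?? null;
  const maxPrice = q.match(/under\s+([\d.]+)\s*(eth|matic)/)?.[1] ?? null;
  return { category, maxPrice, rawQuery: userQuery };
}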
 
🧠 AI Agent Architecture
Multi-Agent System Design
// Seller Agent - Handles dataset upload and minting
async processDataset(datasetPath, metadata) {
  const uploadResult = await this.uploadToLighthouse(datasetPath, metadata);   // encrypt + pin to Lighthouse
  const aclResult = await this.setupAccessControl(uploadResult.cid, metadata); // token-gate the CID
  const mintResult = await this.mintDataCoin(uploadResult.cid, metadata);      // mint the DataCoin NFT
  const listingResult = await this.createListing(mintResult, metadata);        // list on the marketplace
  await this.storeInMetta(mintResult, metadata);                               // index in the MeTTa graph
  return { uploadResult, aclResult, mintResult, listingResult };
}
Agent Responsibilities:
- Seller Agent: Upload → Encrypt → Mint → List → Store metadata
 
- Buyer Agent: Discover → Evaluate → Recommend → Facilitate purchase
 
- Validator Agent: Download → Validate → Attest → Record on-chain
 
MeTTa Knowledge Graph Integration
# Python integration for MeTTa knowledge graph
from typing import Any, Dict, List
import requests

class MettaClient:
    def __init__(self, endpoint: str):
        self.endpoint = endpoint.rstrip('/')
        self.session = requests.Session()

    def store_dataset(self, dataset: Dict[str, Any]) -> bool:
        response = self.session.post(f"{self.endpoint}/store", json=dataset)
        return response.status_code == 200

    def query_datasets(self, **filters) -> List[Dict[str, Any]]:
        response = self.session.post(f"{self.endpoint}/search", json=filters)
        return response.json().get('datasets', [])
🔧 Particularly Hacky Solutions
1. Token-Gated Decryption Simulation
// Simulated access control for demo purposes
router.post('/check-access', async (req, res) => {
  const { cid, userAddress } = req.body;
  
  // Hacky demo logic: allow access if address ends with even hex
  const lastChar = userAddress.trim().toLowerCase().slice(-1);
  const evenHex = ['0','2','4','6','8','a','c','e'];
  const hasAccess = evenHex.includes(lastChar);
  
  res.json({ hasAccess });
});
Why This Works:
- Demo-friendly access control for testing
 
- Deterministic based on wallet address
 
- Easy to test without complex token verification
 
- Keeps the production request/response shape, so a real token check can replace it without touching the frontend (see the sketch below)
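
When the heuristic is swapped out, the same route can perform a real on-chain ownership check. A minimal sketch that keeps the original request/response shape; the RPC and contract-address environment variable names are assumptions:

import { ethers } from 'ethers';

const erc721Abi = ['function balanceOf(address owner) view returns (uint256)'];

// Production variant of /check-access: gate on actual DataCoin ownership
router.post('/check-access', async (req, res) => {
  const { userAddress } = req.body;
  const provider = new ethers.providers.JsonRpcProvider(process.env.RPC_URL_MUMBAI);
  const dataCoin = new ethers.Contract(process.env.DATACOIN_ADDRESS!, erc721Abi, provider);

  const balance = await dataCoin.balanceOf(userAddress);
  res.json({ hasAccess: balance.gt(0) });
});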
 
2. Mock Data Generation for Agents
// Generate realistic mock datasets for agent responses
getMockDatasets(intent) {
  const mockDatasets = [
    {
      id: 1,
      name: 'Computer Vision Dataset',
      description: '50,000 labeled images for object detection',
      category: 'Computer Vision',
      price: '0.1',
      size: '2.5GB',
      format: 'Images',
      verified: true,
      cid: 'QmSampleImageDataset123',
      tokenId: 1
    }
    // ... more datasets
  ];

  // Intent-based filtering could be applied here; for the demo, return everything
  return mockDatasets;
}
Benefits:
- Realistic demo data for agent responses
 
- Consistent user experience during development
 
- Easy to customize for different scenarios
 
- Fallback mechanism when external APIs fail
 
3. Blockscout Simulation Layer
// Simulate Blockscout API calls for demo purposes
async function simulateBlockscoutCall(type: string, identifier: string): Promise<any> {
  await new Promise(resolve => setTimeout(resolve, 500)); // Simulate API delay

  // Generate a well-formed 20-byte hex address for the mock payload
  const randomAddress = () =>
    '0x' + Array.from({ length: 40 }, () => Math.floor(Math.random() * 16).toString(16)).join('');

  if (type === 'transaction') {
    return {
      hash: identifier,
      blockNumber: Math.floor(Math.random() * 1000000),
      from: randomAddress(),
      to: randomAddress(),
      status: 'success',
      confirmations: Math.floor(Math.random() * 100)
    };
  }

  return { identifier, type, status: 'unknown' }; // fallback for other query types
}
Why This Approach:
- No external dependencies during development
 
- Consistent demo experience regardless of network conditions
 
- Realistic data that matches expected API responses
 
- Easy to replace with real API calls in production (see the sketch below)
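
When it is time to drop the simulation, the shim can be replaced by a call to the explorer's REST API. A hedged sketch assuming a Blockscout v2-style /api/v2/transactions endpoint; field names should be checked against the specific instance:

// Replace simulateBlockscoutCall with a real explorer lookup
async function fetchTransaction(txHash: string) {
  const base = process.env.NEXT_PUBLIC_BLOCKSCOUT_INSTANCE_URL;
  const res = await fetch(`${base}/api/v2/transactions/${txHash}`);
  if (!res.ok) throw new Error(`Blockscout returned ${res.status}`);
  const tx = await res.json();
  return {
    hash: txHash,
    blockNumber: tx.block,   // field names follow Blockscout's v2 schema and may vary per instance
    from: tx.from?.hash,
    to: tx.to?.hash,
    status: tx.status,
    explorerUrl: `${base}/tx/${txHash}`,
  };
}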
 
🚀 Deployment & CI/CD
GitHub Actions Pipeline
# Multi-service deployment pipeline
jobs:
  frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18
      - name: Install dependencies
        run: cd frontend && npm ci
      - name: Build frontend
        run: cd frontend && npm run build
      - name: Upload build artifacts
        uses: actions/upload-artifact@v3
        with:
          name: frontend-build
          path: frontend/.next/
Pipeline Features:
- Parallel builds for frontend, backend, and contracts
 
- Artifact management for deployment
 
- Security audits across all services
 
- Performance testing and monitoring
 
- Automatic deployment to Vercel
 
Environment Configuration
# Comprehensive environment setup
RPC_URL_MUMBAI=https://polygon-mumbai.g.alchemy.com/v2/YOUR_ALCHEMY_KEY
LIGHTHOUSE_API_KEY=your_lighthouse_api_key
BLOCKSCOUT_MCP_URL=https://your-mcp-endpoint.com
NEXT_PUBLIC_BLOCKSCOUT_INSTANCE_URL=https://your-autoscout-instance.blockscout.com
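
A small fail-fast check at startup keeps a missing key from surfacing as a confusing runtime error later. A sketch using the variable names from the sample above (the helper itself is an assumption, not part of the repo):

// Fail fast if a required environment variable is missing
const REQUIRED_ENV = ['RPC_URL_MUMBAI', 'LIGHTHOUSE_API_KEY', 'BLOCKSCOUT_MCP_URL'] as const;

export function assertEnv(): void {
  const missing = REQUIRED_ENV.filter(name => !process.env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}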
🔍 Notable Technical Decisions
1. TypeScript Everywhere
- Frontend: Next.js with TypeScript
 
- Backend: Express.js with TypeScript
 
- Contracts: Solidity, which is statically typed by the compiler
 
- Agents: Node.js with TypeScript
 
2. Modular Architecture
- Separate services for different concerns
 
- API-first design for easy integration
 
- Pluggable agents for extensibility
 
- Microservice-ready structure
 
3. Comprehensive Logging
// MCP call logging for debugging and monitoring
import fs from 'fs';
import path from 'path';

const logMCPCall = (request: any, response: any) => {
  const logEntry = {
    timestamp: new Date().toISOString(),
    request,
    response,
    duration: Date.now() - request.startTime // caller stamps startTime when the request arrives
  };

  const logPath = path.join(__dirname, '../../logs/mcp.log');
  fs.appendFileSync(logPath, JSON.stringify(logEntry) + '\n');
};
4. Error Handling & Resilience
- Graceful degradation when external services fail
 
- Comprehensive error logging for debugging
 
- Fallback mechanisms for demo purposes (see the wrapper sketch after this list)
 
- User-friendly error messages
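
The graceful-degradation and fallback bullets above can be captured in one small helper: try the real service, log the failure, and fall back to demo data so the flow keeps working. The helper name and the example call are illustrative:

// Try the real service first; on failure, log and fall back to demo data
async function withFallback<T>(
  label: string,
  real: () => Promise<T>,
  fallback: () => T | Promise<T>
): Promise<T> {
  try {
    return await real();
  } catch (err) {
    console.error(`[${label}] external call failed, using fallback:`, err);
    return await fallback();
  }
}

// Example: real Blockscout lookup with the simulation layer as the fallback
// const tx = await withFallback('blockscout', () => getTx(hash), () => simulateBlockscoutCall('transaction', hash));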
 
🎯 Production Readiness
Security Considerations
- Reentrancy protection in smart contracts
 
- Access control for sensitive operations
 
- Input validation across all APIs
 
- Secure key management for external services
 
Scalability Design
- Stateless backend for horizontal scaling
 
- Database-agnostic architecture
 
- Caching strategies for performance
 
- CDN integration for static assets