gh0st.market

A privacy-first job network for verifiable web data collection


Created At: ETHGlobal Buenos Aires

Project Description

The Problem We're Solving

The modern data economy runs on web data that lives behind logins, paywalls, and account gates—think Crunchbase company profiles, LinkedIn insights, SaaS analytics dashboards, or proprietary B2B databases. This data is incredibly valuable for investors, growth teams, AI training pipelines, and market intelligence platforms.

But accessing it programmatically is a nightmare:

  • Scraping is broken: Proxy networks get blocked, headless browsers get fingerprinted, CAPTCHAs multiply, and sites actively fight automation. You spend more time maintaining infrastructure than building products.
  • Trust is zero: When you pay a data vendor, you have no idea if the data is real, fabricated, or stale. There's no cryptographic proof that data actually came from the source claimed.
  • Compliance is risky: Fake accounts, credential sharing, and ToS violations create legal exposure. Enterprises can't touch these solutions.
  • AI agents are stuck: The next generation of autonomous AI agents needs real-time web data, but those agents can't authenticate into services or prove their outputs are genuine.

Our Solution: A Decentralized Data Access Layer with Cryptographic Proofs

gh0st.market creates a two-sided marketplace where data requesters connect with authorized operators (humans or AI agents) who already have legitimate access to target platforms. The magic? Every data delivery is accompanied by a zk-TLS proof—cryptographic evidence that the data genuinely came from the claimed source, without revealing credentials or session details.

How It Works

  1. Requesters Define Jobs. A requester creates a Job Spec—a reusable template that defines:
  • Target domain (e.g., crunchbase.com)
  • URL pattern with placeholders (e.g., https://crunchbase.com/organization/{{slug}})
  • Data extraction instructions for AI agents
  • Expected output schema
  • Validation rules

Then they create individual Jobs referencing that spec, with concrete inputs (e.g., {slug: "anthropic"}) and an escrowed bounty in ETH or any ERC-20 token. The bounty is locked in the smart contract until work is verified.
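The spec/job split above can be sketched in a few lines of TypeScript. This is an illustrative model, not protocol code: `JobSpecTemplate` and `fillPlaceholders` are hypothetical names, and the `{{placeholder}}` syntax mirrors the URL pattern described in the text.

```typescript
// Hypothetical sketch of the Job Spec (template) / Job (instance) relationship.
interface JobSpecTemplate {
  targetDomain: string;
  urlPattern: string; // e.g. "https://crunchbase.com/organization/{{slug}}"
}

// Substitute concrete job inputs into the spec's URL pattern.
function fillPlaceholders(pattern: string, inputs: Record<string, string>): string {
  return pattern.replace(/\{\{(\w+)\}\}/g, (_, key) => {
    const value = inputs[key];
    if (value === undefined) throw new Error(`missing input: ${key}`);
    return encodeURIComponent(value);
  });
}

const spec: JobSpecTemplate = {
  targetDomain: "crunchbase.com",
  urlPattern: "https://crunchbase.com/organization/{{slug}}",
};

const jobUrl = fillPlaceholders(spec.urlPattern, { slug: "anthropic" });
// jobUrl === "https://crunchbase.com/organization/anthropic"
```

One spec, many jobs: each job only supplies the inputs and the bounty.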

  2. Workers Approve & Execute. Workers—who already have subscriptions, accounts, or access to target platforms—browse available job specs and approve the ones they can fulfill. They set minimum bounty thresholds so they only see jobs worth their time.

The gh0st browser extension acts as their AI-powered work environment:

  • Monitors the blockchain for new jobs matching approved specs
  • Opens a dedicated worker tab and navigates to target URLs
  • Collects the requested data fields
  • Generates a zk-TLS proof via vlayer that cryptographically attests the HTTPS session really hit that domain and the response matches what's being submitted
  • Submits the result + proof to the smart contract
  3. Trustless Settlement. The JobRegistry smart contract:
  • Verifies the zk-TLS proof against the spec's target domain
  • Confirms the proof is valid
  • Automatically releases the escrowed bounty to the worker
  • Records the result payload on-chain

No middleman. No disputes. No trust required.
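The settlement steps above can be sketched as a minimal in-memory model (illustrative TypeScript only; the real logic lives in the JobRegistry contract, and the `proof` object here stands in for on-chain zk-TLS verification):

```typescript
// Illustrative settlement flow: verify proof against the spec's
// target domain, then release escrow to the worker.
type JobStatus = "open" | "completed";

interface EscrowedJob {
  bounty: bigint; // escrowed amount in wei
  targetDomain: string;
  status: JobStatus;
  worker?: string;
}

function settleJob(
  job: EscrowedJob,
  proof: { domain: string; valid: boolean },
  worker: string,
  payout: (to: string, amount: bigint) => void,
): boolean {
  if (job.status !== "open") return false;
  // Reject proofs that are invalid or attest to the wrong domain.
  if (!proof.valid || proof.domain !== job.targetDomain) return false;
  job.status = "completed";
  job.worker = worker;
  payout(worker, job.bounty);
  return true;
}
```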

Technical Architecture

Smart Contracts (Solidity + Foundry)

  • JobRegistry.sol: Core protocol contract managing job specs, jobs, escrow, and proof-verified payouts
  • ProofVerifier.sol: Interface for vlayer zk-TLS verification (mock for hackathon, production-ready interface)
  • Supports multiple tokens: Native ETH and any ERC-20 (USDC, WBTC, etc.)
  • Gas-efficient batch queries: getJobSpecsRange() and getJobsRange() for frontend pagination
  • Event-driven architecture: All state changes emit events for efficient indexing

Web Application (Next.js 16 + React 19)

  • Dynamic Labs integration: Seamless wallet connection with email/social fallbacks
  • Dual-role UX: Single app serves both requesters and workers with role toggle
  • Real-time blockchain state: wagmi v3 + TanStack Query for reactive contract reads
  • Requestor Dashboard: Create specs, post jobs, track completion status
  • Worker Dashboard: Browse specs, approve with min-bounty filters, monitor active tasks

Browser Extension (Plasmo + TypeScript)

  • Worker Engine: Sophisticated state machine managing job queue, auto-mode, and parallel execution
  • Local Database: Drizzle ORM with SQLite for followed specs, active jobs, and earnings history
  • Web ↔ Extension Protocol: Typed message passing (GH0ST_*) for seamless integration
  • vlayer Client: Abstracted zk-TLS proof generation with mock fallback for development
  • Privacy-first: Credentials never leave the worker's browser; only proofs are shared

zk-TLS Integration (vlayer)

The cryptographic backbone that makes this trustless:

  • Proves a TLS session occurred with a specific domain
  • Attests that response data matches what's being claimed
  • Zero-knowledge: verifier learns nothing about credentials, cookies, or session tokens
  • On-chain verifiable: smart contracts can validate proofs directly

Why This Matters

For AI Agent Builders

gh0st.market is infrastructure for the agentic web. AI agents need real-time data from authenticated sources, but they can't hold credentials safely or prove their outputs are genuine. With gh0st, agents can:

  • Request verified data from any web source
  • Trust results without trusting the provider
  • Pay programmatically via smart contracts

For Data Teams

Stop maintaining brittle scraping infrastructure. Instead:

  • Post jobs describing what you need
  • Get verified results with cryptographic provenance
  • Pay only for successful, proven deliveries

For Workers & Operators

Monetize access you already have:

  • Use subscriptions you're paying for anyway
  • Set your own prices and filters
  • Work anonymously—your identity is never revealed
  • Let AI agents work for you in auto-mode

What Makes This a Strong Hackathon Project

  1. Full-Stack Implementation. This isn't a mockup—it's a working system spanning smart contracts, a production-quality web app, and a browser extension with real state management.

  2. Novel Architecture. Combining zk-TLS proofs with a job marketplace is genuinely new. We're not just "blockchain + scraping"—we're creating verifiable data infrastructure.

  3. Real Market Need. The web data market is $5B+ and growing. Every AI company, hedge fund, and growth team struggles with this problem. We're building picks and shovels for the AI gold rush.

  4. Privacy-First Design. Both sides stay pseudonymous. Requesters don't reveal what data they're collecting at scale; workers don't reveal their credentials. Only proofs and payments hit the chain.

  5. Extensible Foundation. The Job Spec system is a protocol primitive. Anyone can create specs for any website. The ecosystem grows organically as workers approve new domains.

  6. AI-Native. Built from day one for AI agents to participate—as requesters posting jobs or as workers executing them autonomously.

The Vision

gh0st.market is the HTTP of authenticated web data. Just as APIs standardized how services talk to each other, we're standardizing how AI agents and applications access human-permissioned web data with cryptographic trust.

Imagine:

  • AI assistants that can fetch your real portfolio data and prove it's accurate
  • Market intelligence platforms with verifiable, real-time competitor data
  • Research tools that can cite sources with cryptographic provenance
  • Autonomous agents that earn revenue by monetizing access their operators already have

We're not building a scraping tool. We're building the trust layer for the web data economy.

How it's Made

How We Built gh0st.market: The Technical Deep Dive

Architecture Overview

gh0st.market is a three-part system that had to work together seamlessly: smart contracts handling escrow and verification, a web application for both requesters and workers, and a browser extension that actually executes jobs and generates proofs. Getting these pieces to communicate reliably was the core engineering challenge.


Smart Contracts: Foundry + Solidity

We chose Foundry over Hardhat for its speed and native Solidity testing. The contract architecture centers on two key abstractions:

The JobSpec / Job Pattern

Rather than having requesters define everything per-job, we separated templates (JobSpecs) from instances (Jobs):

struct JobSpec {
    string mainDomain;         // "crunchbase.com"
    string notarizeUrl;        // "https://crunchbase.com/organization/{{orgSlug}}"
    string promptInstructions; // AI extraction instructions
    string outputSchema;       // Expected JSON schema
    string inputSchema;        // Placeholder types
    address creator;
    bool active;
}

struct Job {
    uint256 specId;       // Reference to template
    string inputs;        // Concrete values: {"orgSlug": "anthropic"}
    address token;        // ETH (address(0)) or ERC-20
    uint256 bounty;
    JobStatus status;
    string resultPayload;
    address worker;
}

This means anyone can create a reusable spec for a domain, and the ecosystem benefits from shared templates. Workers approve specs once, then see all matching jobs automatically.

Multi-Token Escrow

We wanted to support both native ETH and stablecoins (USDC) from day one:

function createJob(CreateJobParams calldata params) external payable {
    if (params.token == address(0)) {
        // Native ETH - must match msg.value
        if (msg.value != params.bounty) revert InvalidBounty();
    } else {
        // ERC-20 - pull tokens via transferFrom
        if (msg.value != 0) revert TokenMismatch();
        IERC20(params.token).safeTransferFrom(msg.sender, address(this), params.bounty);
    }
    // ... create job
}

On payout, the same logic reverses—ETH via call{value} or ERC-20 via safeTransfer. We use OpenZeppelin's SafeERC20 and ReentrancyGuard to prevent the obvious attack vectors.

The Proof Verifier Interface

The contract calls an external IProofVerifier to validate zk-TLS proofs:

interface IProofVerifier {
    function verifyProof(
        bytes calldata proof,
        string calldata targetDomain
    ) external view returns (bool valid);
}

Hacky but necessary: For the hackathon, our ProofVerifier.sol is a mock that always returns true. The interface is production-ready for vlayer integration—we just swap the implementation address. This let us build the full flow without blocking on proof generation complexity.

Batch Query Optimization

Frontends need to display lists of specs and jobs. Rather than N+1 RPC calls, we added range queries:

function getJobSpecsRange(uint256 from, uint256 to)
    external
    view
    returns (JobSpec[] memory specs)
{
    uint256 length = to - from;
    specs = new JobSpec[](length);
    for (uint256 i = 0; i < length; i++) {
        specs[i] = _specs[from + i];
    }
}

Two RPC calls (get count, then get range) instead of potentially hundreds.
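On the frontend side, the count-then-range pattern reduces to computing page boundaries. A small sketch (`pageRanges` is a hypothetical helper, not code from the app):

```typescript
// Given a total count (from one RPC call), compute the [from, to)
// ranges to pass to getJobSpecsRange in fixed-size pages.
function pageRanges(total: number, pageSize: number): Array<[number, number]> {
  const ranges: Array<[number, number]> = [];
  for (let from = 0; from < total; from += pageSize) {
    ranges.push([from, Math.min(from + pageSize, total)]);
  }
  return ranges;
}

// e.g. 250 specs fetched 100 at a time:
// pageRanges(250, 100) → [[0, 100], [100, 200], [200, 250]]
```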


Web Application: Next.js 16 + React 19 + wagmi v3

Why This Stack

  • Next.js 16 with App Router for file-based routing and React Server Components
  • React 19 for the latest concurrent features
  • wagmi v3 + viem for type-safe contract interactions (wagmi v3 just released—we're on the bleeding edge)
  • TanStack Query for caching and background refetching
  • Dynamic Labs for wallet connection with social login fallbacks

wagmi CLI for Type Generation

This was a huge DX win. We use wagmi generate to produce fully-typed hooks from our contract ABIs:

// wagmi.config.ts
export default defineConfig({
  out: 'src/generated.ts',
  contracts: [
    {
      name: 'JobRegistry',
      abi: jobRegistryAbi,
    },
  ],
});

Now useReadContract and useWriteContract have full TypeScript inference for function names, argument types, and return types. No more ABI typos at runtime.

Event-Based Data Fetching

Here's where it gets interesting. We needed to show "all jobs created by this user" but the contract only stores jobs by ID, not by creator. Solution: query events.

export function useUserJobs(userAddress: `0x${string}` | undefined) {
  const publicClient = usePublicClient();

return useQuery({
  queryKey: ["userJobs", userAddress],
  queryFn: async () => {
    // Get all JobCreated events filtered by requester
    const logs = await publicClient.getLogs({
      address: JOB_REGISTRY_ADDRESS,
      event: parseAbiItem(
        "event JobCreated(uint256 indexed jobId, uint256 indexed specId, address indexed requester, address token, uint256 bounty)"
      ),
      args: { requester: userAddress },
      fromBlock: DEPLOYMENT_BLOCK, // Skip genesis blocks
      toBlock: "latest",
    });

    // Fetch full job details for each
    return Promise.all(logs.map(async (log) => {
      const job = await publicClient.readContract({
        address: JOB_REGISTRY_ADDRESS,
        abi: jobRegistryAbi,
        functionName: "getJob",
        args: [log.args.jobId],
      });
      return { ...job, id: log.args.jobId };
    }));
  },
});

}

We store DEPLOYMENT_BLOCK in a generated config file so we don't scan from block 0 on Sepolia (which would timeout).
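Many public RPCs also cap the span a single eth_getLogs call may cover, so a scan from DEPLOYMENT_BLOCK to the latest block may need to be split. A sketch of that chunking (the `maxSpan` cap is an assumption about the RPC provider; `blockChunks` is a hypothetical helper):

```typescript
// Split [fromBlock, toBlock] into inclusive chunks no wider than maxSpan,
// suitable for issuing one getLogs call per chunk.
function blockChunks(
  fromBlock: bigint,
  toBlock: bigint,
  maxSpan: bigint,
): Array<[bigint, bigint]> {
  const chunks: Array<[bigint, bigint]> = [];
  for (let start = fromBlock; start <= toBlock; start += maxSpan) {
    const end = start + maxSpan - 1n < toBlock ? start + maxSpan - 1n : toBlock;
    chunks.push([start, end]);
  }
  return chunks;
}
```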

Dynamic Labs Integration

Dynamic gives us wallet connection with a much better UX than raw RainbowKit:

export function Web3Provider({ children }: { children: React.ReactNode }) {
  return (
    <DynamicContextProvider
      settings={{
        environmentId: process.env.NEXT_PUBLIC_DYNAMIC_ENV_ID!,
        walletConnectors: [EthereumWalletConnectors],
      }}
    >
      <DynamicWagmiConnector>
        <WagmiProvider config={wagmiConfig}>
          <QueryClientProvider client={queryClient}>
            {children}
          </QueryClientProvider>
        </WagmiProvider>
      </DynamicWagmiConnector>
    </DynamicContextProvider>
  );
}

Users can connect with MetaMask, WalletConnect, or even email—Dynamic handles the embedded wallet creation. This dramatically lowers the barrier for non-crypto-native users.

Dual-Role Architecture

The same app serves requesters and workers. We handle this with a role toggle that preserves navigation context:

function getEquivalentPath(currentPath: string, newRole: Role): string {
  // /requestor/jobSpecs/123/jobs → /worker/jobSpecs/123/jobs
  const segments = currentPath.split('/');
  segments[1] = newRole;
  return segments.join('/');
}

Both roles see different sidebars and slightly different UIs, but share components like JobSpecCard and DashboardLayout.


Browser Extension: Plasmo + Worker Engine Architecture

This is where most of the complexity lives. The extension needs to:

  1. Store worker preferences (approved specs, min bounties)
  2. Listen for new jobs on-chain
  3. Execute jobs in a controlled browser tab
  4. Generate zk-TLS proofs
  5. Submit results back to the contract
  6. Communicate status to the web app in real-time

Why Plasmo

Plasmo is a framework for building browser extensions with React. It handles:

  • Manifest generation
  • Hot reload during development
  • TypeScript compilation
  • Content script injection
  • Background service worker bundling

We can write the popup as a React component and Plasmo handles the Chrome extension boilerplate.

Local Database with Drizzle ORM

Workers need persistent state that survives browser restarts. We use Drizzle ORM with SQLite (via sql.js compiled to WASM):

// db/schema.ts
export const followedSpecs = sqliteTable("followed_specs", {
  id: integer("id").primaryKey({ autoIncrement: true }),
  specId: integer("spec_id").notNull(),
  walletAddress: text("wallet_address").notNull(),
  mainDomain: text("main_domain").notNull(),
  minBounty: real("min_bounty").default(0),
  autoClaim: integer("auto_claim", { mode: "boolean" }).default(false),
});

export const activeJobs = sqliteTable("active_jobs", {
  jobId: text("job_id").notNull().unique(),
  status: text("status", {
    enum: ["pending", "navigating", "collecting", "generating_proof", "submitting", "completed", "failed"],
  }).notNull().default("pending"),
  progress: integer("progress").default(0),
  // ...
});

This gives us type-safe queries and migrations without a server.

The Worker Engine State Machine

The core of the extension is workerEngine.ts—a state machine that orchestrates everything:

export interface WorkerEngine {
  start(): void;                       // Begin listening for jobs
  stop(): void;                        // Pause everything
  openWorkerTab(): Promise<number>;    // Create dedicated execution tab
  setApprovedSpecs(specIds: Set<number>, minBountyBySpec: Map<number, number>): void;
  setAutoMode(enabled: boolean): void; // Auto-process queue
  processNextJob(): Promise<JobResult | null>; // Manual trigger
  getStatus(): WorkerStatus;

// Event subscriptions
onStatusChange(cb: (status: WorkerStatus) => void): () => void;
onProgress(cb: (progress: JobProgress) => void): () => void;
onJobComplete(cb: (result: JobResult) => void): () => void;

}

The engine coordinates three sub-modules:

  • JobListener: Polls the blockchain for new jobs matching approved specs
  • JobQueue: Priority queue of jobs waiting to be processed
  • QueueProcessor: Actually executes jobs in the worker tab
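To make the JobQueue concrete, here is a minimal bounty-ordered queue sketch. It is illustrative only; the actual implementation may prioritize differently (e.g. by spec, age, or auto-claim flags).

```typescript
// Hypothetical bounty-priority queue: highest-paying job dequeues first.
interface QueuedJob {
  jobId: string;
  bounty: bigint; // in wei
}

class JobQueue {
  private items: QueuedJob[] = [];

  enqueue(job: QueuedJob): void {
    this.items.push(job);
    // Keep the array sorted with the highest bounty at the front.
    this.items.sort((a, b) => (a.bounty === b.bounty ? 0 : a.bounty > b.bounty ? -1 : 1));
  }

  dequeue(): QueuedJob | undefined {
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }
}
```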

The Worker Tab Pattern: Jobs execute in a dedicated browser tab (/worker/runner), not in a headless context. This is intentional—it uses the worker's real browser profile with real cookies and sessions. The extension controls this tab via chrome.tabs APIs, navigating it to target URLs and extracting data.

Web ↔ Extension Communication

The web app needs to know if the extension is installed, query worker preferences, and receive job progress updates. We built a typed message protocol:

// Message types with GH0ST_ prefix for namespacing
export type WebToExtensionMessage =
  | { type: "GH0ST_PING" }
  | { type: "GH0ST_START_JOB"; payload: StartJobPayload }
  | { type: "GH0ST_QUERY"; payload: QueryPayload }
  | { type: "GH0ST_FOLLOW_SPEC"; payload: FollowSpecPayload };

export type ExtensionToWebMessage =
  | { type: "GH0ST_PONG"; payload: { version: string } }
  | { type: "GH0ST_JOB_PROGRESS"; payload: JobProgressPayload }
  | { type: "GH0ST_JOB_COMPLETED"; payload: JobCompletedPayload };

Communication flows through a content script injected into the web app:

Web App → window.postMessage → Content Script → chrome.runtime.sendMessage → Background Script
Background Script → chrome.tabs.sendMessage → Content Script → window.postMessage → Web App

Hacky detail: We track which tabs have the web app open (connectedTabs Set) so we can broadcast progress updates to all of them. When a tab closes, we clean it up via chrome.tabs.onRemoved.
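Because window.postMessage delivers arbitrary messages from any script on the page, the content script needs a runtime guard before trusting anything. A sketch (the message shape is assumed from the GH0ST_ types above; the guard itself is illustrative):

```typescript
// Runtime check that an untrusted window message matches the
// GH0ST_ web-to-extension protocol before forwarding it.
const WEB_TO_EXTENSION_TYPES = new Set([
  "GH0ST_PING",
  "GH0ST_START_JOB",
  "GH0ST_QUERY",
  "GH0ST_FOLLOW_SPEC",
]);

function isWebToExtensionMessage(
  data: unknown,
): data is { type: string; payload?: unknown } {
  return (
    typeof data === "object" &&
    data !== null &&
    typeof (data as { type?: unknown }).type === "string" &&
    WEB_TO_EXTENSION_TYPES.has((data as { type: string }).type)
  );
}
```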

vlayer Client Abstraction

For zk-TLS proof generation, we abstracted behind an interface:

export interface IVlayerClient {
  generateProof(request: ProofRequest): Promise<ProofResult>;
  verifyProof(proof: string, domain: string): Promise<boolean>;
}

export function createVlayerClient(): IVlayerClient {
  const useMock = process.env.PLASMO_PUBLIC_USE_MOCK === 'true';

if (useMock) {
  return new MockVlayerClient(); // Returns fake proofs instantly
}

return new VlayerClient({ clientId, secret }); // Real vlayer integration

}

The mock client lets us develop the full flow without vlayer credentials. In production, we swap to the real client—same interface, real proofs.


Notable Hacks & Clever Solutions

  1. Generated Contract Config

Deployment addresses change between local Anvil and Sepolia. We generate a config file at deploy time:

// Generated by deploy script
export const JOB_REGISTRY_ADDRESS = "0x5FbDB2315678afecb367f032d93F642f64180aa3";
export const DEPLOYMENT_BLOCK = 12345678n;
export const CHAIN_ID = 11155111;

The web app imports this, so switching networks is just a redeploy + regenerate.

  2. Mock Extension Mode for Development

Testing extension features without building/installing the extension constantly:

// In useExtensionStatus hook
if (localStorage.getItem("gh0st_extension_mock") === "true") {
  return {
    connected: true,
    version: "dev",
    activeTask: { jobId: "0x...", status: "collecting", progress: 45 },
  };
}

Set a localStorage flag and the web app pretends the extension is connected with an active task.

  3. Optimistic UI with Event Refetch

When a user creates a job spec, we don't wait for blockchain confirmation to update the UI. We show a toast, then refetch via events once confirmed:

useEffect(() => {
  if (isSpecCreated) {
    refetchSpecs(); // Re-query events to get the new spec
    setIsCreateSpecModalOpen(false);
  }
}, [isSpecCreated, refetchSpecs]);

  4. Extension Setup Flow

The popup has two modes: setup (first run) and operational. We persist config in chrome.storage.local:

export async function saveConfig(config: ExtensionConfig): Promise<void> {
  await chrome.storage.local.set({ gh0st_config: config });
}

export async function hasConfig(): Promise<boolean> {
  const result = await chrome.storage.local.get("gh0st_config");
  return !!result.gh0st_config;
}

On first open, workers enter their RPC URL, contract address, and a private key for signing submissions. The key never leaves local storage.

  5. Job Listener with Polling + Deduplication

The extension polls for new jobs but needs to avoid re-queueing jobs it's already seen:

// In jobListener.ts
const seenJobIds = new Set<string>();

function onNewJob(job: Job) {
  const jobKey = job.id.toString();
  if (seenJobIds.has(jobKey)) return;
  seenJobIds.add(jobKey);

// Check against approved specs and min bounty
if (!approvedSpecIds.has(Number(job.specId))) return;
const minBounty = minBountyBySpec.get(Number(job.specId)) || 0;
if (parseFloat(formatEther(job.bounty)) < minBounty) return;

onJobFound(job);

}
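One caveat with the filter above: comparing bounties via parseFloat(formatEther(...)) can lose precision for large values, since JavaScript floats carry only ~15 significant digits. A sketch of the same check done entirely in wei with bigint (`etherToWei` is a hypothetical helper; viem's parseEther serves the same purpose):

```typescript
// Convert a decimal ether string to wei without floating point.
function etherToWei(ether: string): bigint {
  const [whole, frac = ""] = ether.split(".");
  // Pad/truncate the fractional part to exactly 18 decimal places.
  const fracPadded = (frac + "0".repeat(18)).slice(0, 18);
  return BigInt(whole) * 10n ** 18n + BigInt(fracPadded);
}

// Exact comparison: job bounty (already in wei) vs. a min bounty in ether.
function meetsMinBounty(bountyWei: bigint, minEther: string): boolean {
  return bountyWei >= etherToWei(minEther);
}
```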


Partner Technologies

| Technology   | How It Helped |
|--------------|---------------|
| Dynamic Labs | Wallet connection with email/social fallback—critical for onboarding non-crypto users |
| vlayer       | zk-TLS proof infrastructure—the cryptographic core that makes trustless verification possible |
| Foundry      | Fast Solidity compilation and testing; forge test runs our full suite in ~2 seconds |
| Plasmo       | Made browser extension development feel like building a React app instead of fighting Chrome APIs |
| wagmi v3     | Type-safe contract hooks with TanStack Query integration—caught multiple bugs at compile time |


What We'd Do Differently

  1. Use an indexer: Event-based queries work but get slow as job count grows. The Graph or Ponder would scale better.
  2. WebSocket subscriptions: Polling for new jobs works but adds latency. A WebSocket connection to an RPC with eth_subscribe would be real-time.
  3. Multi-chain from day one: We hardcoded Sepolia. Abstracting chain config earlier would make multi-chain deployment trivial.

Lines of Code

  • Contracts: ~600 lines of Solidity + ~600 lines of tests
  • Web App: ~4,000 lines of TypeScript/React
  • Extension: ~2,500 lines of TypeScript

All written in 48 hours. We're proud of how complete the system is—not just a demo, but a working protocol with real escrow, real payments, and a real browser automation pipeline.
