Mind Your Data

AI model training that preserves biometric data ownership and privacy.

Created At

ETHGlobal Prague

Project Description

800 million people suffer from diabetes. We manufactured open-hardware near-infrared spectroscopy (NIRS) devices, which we want to use to track glucose levels non-invasively. To achieve this, we want to collect continuous glucose monitor (CGM) data alongside readings from the NIRS devices, and train on this user data to improve the performance of the non-invasive glucose monitor we are designing. We also want to increase competition in neuroscience by creating open hardware that we teach others to manufacture, so that we are not middlemen interested in extracting profit.

At the same time, we want to maximize the privacy and security of users by (1) ensuring that they can set any price they choose for access to their data and (2) ensuring that they can choose never to share their data, and instead only allow model trainers to fine-tune models on it.

Our project addresses two critical challenges in the data economy and machine learning: (1) selling data, which we solve by using a balance of a user-chosen NFT as the decryption condition, and (2) efficient federated training that is transparent to file owners: their data does not leave their device, and the training code is shown to them upfront.

  1. Cross-Chain NFT File Gating. We've built a system that enables file access control across multiple blockchain networks using NFT ownership verification. Users can gate access to their files by requiring a positive balance in an NFT contract they choose.

  2. Privacy-Preserving Biometric Computing & AI Training Without Data Exposure.

The system we built allows model trainers to send Python-based code requests, and allows data owners to review these computational requests and execute them on their own data without ever sharing their files. Once received, the .py scripts can be run in the browser via WebAssembly. Machine learning researchers and companies can train more accurate and robust biometric AI models by accessing computational results from diverse, real-world biometric data without ever seeing the actual biometric information. This addresses the fundamental tension between AI advancement and privacy protection that has limited progress in biometric AI applications. It also prevents the classic data-harvesting practice in which users pay for a service while companies receive valuable files that they can repackage and sell to third parties.
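The request-and-review flow described above can be sketched roughly as follows. This is a minimal, hypothetical illustration using only the Python standard library: the names `TRAINER_SCRIPT`, `local_update`, and `run_request_locally`, the linear model, and the sample readings are all assumptions made for the example, not the project's actual code. The data owner executes an approved script against local readings and returns only a gradient.

```python
# Hypothetical sketch: a data owner runs a trainer-supplied script locally
# and shares only the computed result (a gradient), never the raw readings.

# Script text as it might arrive from a model trainer (assumed format:
# it must define a function `local_update(weights, samples)`).
TRAINER_SCRIPT = '''
def local_update(weights, samples):
    """Mean-squared-error gradient for a linear model glucose ~ w * nirs + b."""
    w, b = weights
    n = len(samples)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
    return [grad_w, grad_b]
'''


def run_request_locally(script_text, weights, samples, approved):
    """Execute a reviewed script on local data and return only its output."""
    if not approved:
        # The owner declined after reading the code: nothing runs, nothing is shared.
        raise PermissionError("request rejected by data owner")
    namespace = {}
    exec(script_text, namespace)  # runs on the owner's device only
    return namespace["local_update"](weights, samples)


# Made-up (NIRS signal, CGM glucose) pairs that stay on the owner's device.
local_samples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

gradient = run_request_locally(TRAINER_SCRIPT, [0.0, 0.0], local_samples, approved=True)
print(gradient)  # only this list of two numbers would be sent back to the trainer
```

Note that `exec` provides no isolation by itself; the sketch only illustrates the data flow, in which the script travels to the data and only the result travels back.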

How it's Made

We forked CypherShare and extended it with cross-chain capabilities using LayerZero's ONFT contracts deployed on Hedera and Polygon Amoy, secured by TACo encryption for NFT-gated file access. This allows individuals to upload and encrypt their data, and to set as the decryption condition a positive balance of an NFT they designate.
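To make the gating rule concrete, here is a minimal sketch of the balance check behind such a decryption condition. It is not the TACo or LayerZero API: `BalanceCondition`, `may_decrypt`, the chain labels, and the in-memory ledger are hypothetical names introduced for illustration, and the on-chain balance query is replaced by an injected function so the logic can run offline.

```python
# Hypothetical sketch of balance-based gating: decryption is allowed only if
# the requester holds at least one NFT from the contract the owner designated.
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class BalanceCondition:
    chain: str            # e.g. "hedera" or "polygon-amoy" (labels are assumptions)
    contract: str         # NFT contract address chosen by the file owner
    min_balance: int = 1  # "positive balance" means at least one token


def may_decrypt(condition: BalanceCondition, requester: str,
                balance_of: Callable[[str, str, str], int]) -> bool:
    """Return True when the requester's NFT balance satisfies the condition.

    `balance_of(chain, contract, address)` stands in for an on-chain
    balance query; it is injected so the logic can be tested offline.
    """
    balance = balance_of(condition.chain, condition.contract, requester)
    return balance >= condition.min_balance


# In-memory stand-in for on-chain state: (chain, contract, holder) -> balance.
FAKE_LEDGER = {("polygon-amoy", "0xNFT", "0xAlice"): 2}


def fake_balance_of(chain: str, contract: str, address: str) -> int:
    return FAKE_LEDGER.get((chain, contract, address), 0)


condition = BalanceCondition(chain="polygon-amoy", contract="0xNFT")
print(may_decrypt(condition, "0xAlice", fake_balance_of))  # holder
print(may_decrypt(condition, "0xBob", fake_balance_of))    # non-holder
```

In this sketch the check runs locally, which is only useful for reasoning about the rule; in the system described here, the condition is enforced as part of TACo decryption.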

For privacy-preserving computation, we use Pyodide and WebAssembly to execute Python scripts directly in users' browsers, enabling AI training and biometric analysis while ensuring that sensitive data never leaves the client device. Only computational results are shared, never the raw biometric data.

This creates a trustless system where digital ownership meets zero-data-exposure computation.
