DeML

A blockchain marketplace where people get paid for helping train ML models without having to share their data with anyone.

Created At

ETHGlobal New York

Winner of

🥉 Cartesi — Best MVP

Project Description

Data is the new oil. Powerful machine learning models are trained on people’s data. Today, people usually hand their private data to a centralized entity in exchange for some freemium service. They are not paid for the data that powers these applications, and they must rely on the central authority to protect their privacy. Imagine a world where people don’t need to share their local data with a central entity, can still contribute to training a global machine learning model by using that data locally, and, even better, get paid for doing so. In this project, we use blockchain to achieve that.

How it's Made

In this project, we created a blockchain model marketplace where model sponsors, who want to train ML models using people’s data (for example, health data to predict health conditions), can post their jobs. A posting essentially says: I need this type of data to train a specific model, and in exchange I will pay X tokens. In the listing, the model sponsor provides the initial model to be trained, a model trainer executable (which clients run on their local node to update the model using their local data), and the reward for the job.
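As a rough illustration of what such a listing might carry, here is a minimal Python sketch. The `JobListing` class, its field names, and the `accept` method are hypothetical and chosen only for this example; the project's actual contract or data layout is not described above.

```python
from dataclasses import dataclass, field


@dataclass
class JobListing:
    """Hypothetical marketplace listing posted by a model sponsor."""
    sponsor: str            # sponsor's wallet address (placeholder value below)
    data_type: str          # kind of data requested, e.g. "health"
    initial_model_cid: str  # IPFS reference to the initial model
    trainer_cid: str        # IPFS reference to the trainer executable
    reward_tokens: int      # reward offered for the job
    accepted_by: list = field(default_factory=list)  # clients that took the job

    def accept(self, client: str) -> None:
        """Record that a client has accepted this job."""
        if client in self.accepted_by:
            raise ValueError("client already accepted this job")
        self.accepted_by.append(client)


# Placeholder addresses and CIDs, for illustration only.
listing = JobListing(
    sponsor="0xSponsor",
    data_type="health",
    initial_model_cid="<initial-model-cid>",
    trainer_cid="<trainer-cid>",
    reward_tokens=100,
)
listing.accept("0xClient")
```

Keeping only IPFS references in the listing, rather than the model and trainer themselves, matches the description above, where the large artifacts live on IPFS.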

Clients (nodes) can browse the listings in the marketplace and decide whether to participate. A client that decides to participate accepts the job from the marketplace. The client then loads the initial model and the trainer executable from IPFS (Filecoin) and runs the trainer executable on their local node with their local data. Once the job completes, the new model is written to IPFS. We use Cartesi to run a Python service for model training and validation, given {model_cid, data_cid}, where each cid is a unique IPFS file reference string. The Cartesi machine listens to the network, and upon receiving a message (sent by creating a blockchain transaction), the corresponding reward is issued to the client’s wallet.

We also tried Filecoin’s Lilypad for distributed execution of the ML compute but got stuck on the smart contract part of the Lilypad integration. We still think Lilypad would work well.

Note that model training happens on the client side, and the client’s data never leaves the client node. Only the new model weights are delivered to the model sponsor in exchange for the reward. This protects the user’s data privacy while rewarding the client for the effort of training the model on their data.
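The flow above can be sketched end to end in plain Python. Everything here is an assumption made for illustration: the in-memory `STORE` stands in for IPFS, a SHA-256 hex digest stands in for a real CID, the one-parameter model and the `train_locally` and `settle` functions are hypothetical, and the validation rule (pay only if a new, stored model was produced) is a placeholder rather than what the Cartesi service actually checks.

```python
import hashlib
import json

# In-memory stand-in for IPFS: content is stored under a hash of its bytes.
# Real IPFS CIDs use a different encoding; the sha256 hex digest is a placeholder.
STORE = {}


def put(obj) -> str:
    """Store a JSON-serializable object and return its content identifier."""
    data = json.dumps(obj, sort_keys=True).encode()
    cid = hashlib.sha256(data).hexdigest()
    STORE[cid] = data
    return cid


def get(cid: str):
    """Load an object back from the store by its identifier."""
    return json.loads(STORE[cid])


def train_locally(model_cid: str, local_data) -> str:
    """Client side: one gradient-descent pass for y = w * x on private data."""
    model = get(model_cid)
    w, lr = model["w"], model["lr"]
    for x, y in local_data:
        w -= lr * 2 * (w * x - y) * x  # gradient of the squared error
    return put({"w": w, "lr": lr})     # only the new weights are published


def settle(old_cid, new_cid, reward, client, balances) -> bool:
    """Validator side (assumed rule): pay only if a new, stored model exists."""
    if new_cid == old_cid or new_cid not in STORE:
        return False
    balances[client] = balances.get(client, 0) + reward
    return True


initial_cid = put({"w": 0.0, "lr": 0.1})
private_data = [(1.0, 2.0), (2.0, 4.0)]  # stays on the client, never stored
new_cid = train_locally(initial_cid, private_data)
balances = {}
paid = settle(initial_cid, new_cid, 100, "0xClient", balances)
```

Note that `private_data` is only ever passed to `train_locally`; the store receives the updated weights and nothing else, which is the privacy property the project relies on.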

For wallet connection and authentication, we used WalletConnect (super easy to use and integrate). After completing the task, clients can see the reward in their wallet. We also integrated Web3Inbox so that model sponsors can easily communicate with potential clients about any questions or discussions.
