We developed a protocol for verifiable machine unlearning that addresses AI privacy concerns by letting users securely and efficiently verify that their data has actually been removed from a model. Backdoor attacks, combined with rigorous validation via hypothesis testing and zero-knowledge proofs, make the process transparent.
This project addresses the pervasive issue of privacy in artificial intelligence (AI) systems. Here's a detailed breakdown of what the project entails:
Problem Statement: AI users face significant privacy concerns. Users find it difficult to detect whether their privacy has been compromised, while AI service providers struggle with the laborious task of removing sensitive user data from their datasets.
Inspiration and Methodology: The project draws inspiration from the research paper titled 'Athena: Probabilistic Verification of Machine Unlearning'. It proposes a protocol to achieve verifiable machine unlearning, a process crucial for safeguarding user privacy.
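The core idea behind backdoor-based verification is that a user stamps a secret trigger pattern onto their data before submitting it: a model trained on that data learns the trigger, and after genuine unlearning, the learned association fades. A minimal sketch of trigger embedding, assuming image-like NumPy arrays; the patch location, trigger value, and target label here are illustrative choices, not values from the Athena paper:

```python
import numpy as np

def embed_backdoor(samples: np.ndarray, labels: np.ndarray,
                   target_label: int, trigger_value: float = 1.0) -> tuple:
    """Stamp a small trigger patch onto each sample and relabel it.

    samples: (n, h, w) array of images with values in [0, 1].
    A model trained on this poisoned data learns to map the trigger
    patch to `target_label`; if the data is later unlearned, that
    mapping should disappear.
    """
    poisoned = samples.copy()
    poisoned[:, -3:, -3:] = trigger_value          # 3x3 patch in the corner
    poisoned_labels = np.full_like(labels, target_label)
    return poisoned, poisoned_labels

# Example: poison 10 random 28x28 "images" with target label 7
rng = np.random.default_rng(0)
x, y = rng.random((10, 28, 28)), rng.integers(0, 10, 10)
px, py = embed_backdoor(x, y, target_label=7)
```

Because only the data owner knows the trigger, the provider cannot selectively preserve or erase the backdoor without actually retraining on (or unlearning) the submitted data.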
Protocol Overview:
Role of Privacy Avengers: Privacy Avengers are individuals who champion data privacy. They play a crucial role in conducting backdoor attacks, monitoring the model's behavior to confirm that unlearning has actually occurred, and checking the integrity of inference results.
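Verification is probabilistic: Privacy Avengers query the deployed model with triggered inputs and run a hypothesis test on how often the backdoor's target label comes back. A minimal sketch using a one-sided binomial tail test; the base rate, significance level, and function names are illustrative assumptions, not parameters from the paper:

```python
from math import comb

def binomial_p_value(successes: int, trials: int, p0: float) -> float:
    """P(X >= successes) under H0: the trigger fires at base rate p0."""
    return sum(comb(trials, k) * p0**k * (1 - p0)**(trials - k)
               for k in range(successes, trials + 1))

def unlearning_violated(hits: int, queries: int,
                        base_rate: float = 0.1, alpha: float = 0.01) -> bool:
    """Reject 'data was unlearned' if the trigger fires far above chance.

    base_rate: probability a clean model emits the target label anyway
    (e.g. 1/num_classes for a 10-class task). A p-value below alpha
    means the backdoor survived, so the provider likely did not
    unlearn the data.
    """
    return binomial_p_value(hits, queries, base_rate) < alpha

# Example: 40 of 50 triggered queries returned the target label
print(unlearning_violated(40, 50))  # → True: backdoor clearly persists
```

The test is one-sided by design: a surviving backdoor can only inflate the hit rate, so only the upper tail is evidence of a violation.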
Final Outcome: Upon successful completion of the protocol, strong statistical evidence is obtained about whether machine unlearning actually took place. Privacy Avengers receive rewards for their contributions, ensuring accountability and incentivizing participation.
Safety Measures:
Overall, this project offers a comprehensive solution to the complex challenge of ensuring privacy in AI systems through a meticulously designed protocol supported by advanced techniques and community participation.
Protocol Design: We began by outlining the protocol based on the principles laid out in the research paper 'Athena: Probabilistic Verification of Machine Unlearning'. This involved conceptualizing the steps involved in verifiable machine unlearning and defining the roles of Privacy Avengers.
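The steps we outlined can be summarized as an ordered sequence of phases. The phase names below are our own labels for the stages described above, not terminology from the Athena paper:

```python
from enum import Enum, auto

class Phase(Enum):
    """High-level stages of the verification protocol (illustrative labels)."""
    EMBED_BACKDOOR = auto()      # user poisons their own data with a trigger
    SUBMIT_DATA = auto()         # data goes to the AI service provider
    REQUEST_UNLEARNING = auto()  # user invokes their right to erasure
    CHALLENGE_MODEL = auto()     # Privacy Avengers query with triggered inputs
    HYPOTHESIS_TEST = auto()     # decide whether the backdoor effect persists
    VERIFY_INFERENCE = auto()    # zero-knowledge proof that responses came
                                 # from the provider's claimed model
    SETTLE_REWARDS = auto()      # reward verifiers on a confirmed outcome

PROTOCOL_ORDER = list(Phase)     # Enum preserves declaration order
```

Keeping the phases explicit made it easier to assign each one to a role: users own the first three, Privacy Avengers the middle three, and the incentive layer the last.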
Implementation:
Technologies Used: