
NudiBranch

Preserving diverse LLMs ensures quality, reduces bias, and embraces cultural diversity.


Created At: ETHGlobal Bangkok

Winner of:

- Blockscout - Blockscout Explorer Big Pool Prize (Prize Pool)
- Protocol Labs - Storacha 2nd place

Project Description

Problems

The lack of diversity and rigorous evaluation in LLMs is a growing concern. Key data points include:

- Cultural bias: up to 35% of AI-generated content may carry cultural biases due to limited datasets, impacting inclusivity across languages and regions.
- Non-English content accuracy: over-reliance on homogeneous data has led to 40% of non-English content being inaccurately processed, leaving millions of global users underserved.
- Economic impact: AI bias and misinterpretation issues cost major corporations an estimated $80 billion annually, covering both the cost of correcting wrong outputs from AI and automation systems and the marketing and rebranding spend needed to restore trust.

Solution – Diverse Model Evaluation through FID and Human Insights

This framework combines objective and subjective metrics to promote high-quality, unbiased LLMs.

  1. Implementing FID for Objective Quality and Diversity Assessment

  FID (Fréchet Inception Distance) is a proven metric for assessing the quality of generated content, particularly in GAN applications. It measures the distributional similarity between generated and real-world content, identifying quality gaps in LLM outputs. By ensuring diversity across language and cultural dimensions, FID can help meaningfully reduce inaccuracy rates.
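FID compares the mean and covariance of two embedding distributions. A minimal sketch, assuming NumPy/SciPy are available and that real and generated outputs have already been mapped to feature vectors (the embedding step itself is not shown):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feat_real, feat_gen):
    """Fréchet distance between two sets of feature embeddings.

    feat_real, feat_gen: arrays of shape (n_samples, n_features).
    """
    mu1, mu2 = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    s1 = np.cov(feat_real, rowvar=False)
    s2 = np.cov(feat_gen, rowvar=False)
    # Matrix square root of the covariance product; discard any tiny
    # imaginary component introduced by numerical error.
    covmean = sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2.0 * covmean))
```

Identical distributions score near zero; the further apart the two distributions drift, the larger the value, which is what makes it usable as a quality-gap signal.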

  2. Incorporating Human Subjective Evaluation for Fidelity and Alignment

  Combining FID with human evaluations addresses fidelity and alignment issues. Diverse panels of reviewers provide nuanced insights, helping models better align with cultural and contextual expectations. Studies indicate that incorporating human reviews can considerably improve alignment accuracy, ensuring reliable and relevant outputs.
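One simple way to blend the objective and subjective signals is a weighted average of a normalized FID quality term and the mean panel rating. The weight `w_fid` and the `fid_scale` normalizer below are illustrative assumptions, not values from the project:

```python
def combined_score(fid_value, human_scores, w_fid=0.5, fid_scale=50.0):
    """Blend objective FID with subjective panel ratings into [0, 1].

    w_fid and fid_scale are illustrative tuning knobs, not project values.
    """
    # Lower FID is better, so map it to a quality score in (0, 1].
    fid_quality = 1.0 / (1.0 + fid_value / fid_scale)
    # Human ratings are assumed already normalized to [0, 1].
    human_quality = sum(human_scores) / len(human_scores)
    return w_fid * fid_quality + (1.0 - w_fid) * human_quality
```

In practice the weights would be tuned per deployment; the point is only that both signals feed one comparable score.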

  3. Text-to-Image with LLM Mint for Content Generation and Ownership

  By minting LLMs as NFTs and providing text-to-image generation capabilities, this approach encourages diverse user participation, leverages a decentralized network to reduce bias, and enables the creation and ownership of culturally diverse content.

How it's Made

1. Model Evaluation through FID and Human Insights

We adopted a method combining FID and human evaluations for LLM assessment. FID objectively assessed quality and diversity, while World verified that evaluators were real humans rather than bots. Reviews were then conducted by evaluators from diverse backgrounds, and tokens were distributed to them to encourage fair and accurate evaluations.
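The token-incentive step could be sketched as splitting a reward pool among verified evaluators in proportion to how closely each rating tracks the panel consensus. The inverse-distance weighting here is an illustrative scheme, not the project's actual tokenomics:

```python
from statistics import median

def reward_evaluators(ratings, pool=1000.0):
    """Split a token pool among evaluators by closeness to consensus.

    ratings: {evaluator_id: numeric rating}. The weighting scheme is
    illustrative, not the project's real reward formula.
    """
    consensus = median(ratings.values())
    # Evaluators nearer the consensus rating earn a larger share.
    weights = {e: 1.0 / (1.0 + abs(r - consensus)) for e, r in ratings.items()}
    total = sum(weights.values())
    return {e: pool * w / total for e, w in weights.items()}
```

A consensus-weighted payout like this discourages careless or adversarial ratings, since outliers earn a smaller share of the pool.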

2. Text-to-Image with LLM Mint

We implemented a system where each LLM can be minted as an NFT, giving NFT holders access to text-to-image prompt capabilities. Generation is powered by Hyperbolic, a decentralized GPU network, providing robust, distributed processing for high-quality image creation. The generated images are stored on Storacha by Protocol Labs for decentralized, reliable storage. Each image is then minted as a unique NFT, creating a seamless ecosystem where users can generate, store, and own digital content with blockchain-backed authenticity and provenance.
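The generate → store → mint flow above can be sketched as a small orchestration function. The three callables are placeholders standing in for the Hyperbolic, Storacha, and NFT-contract integrations; their real SDK signatures are not shown in the source, so nothing here should be read as an actual API:

```python
def generate_and_mint(prompt, generate_image, store_image, mint_nft):
    """Hypothetical end-to-end flow: text prompt -> image -> CID -> NFT.

    generate_image/store_image/mint_nft are placeholder callables for
    the Hyperbolic, Storacha, and smart-contract steps, not real SDKs.
    """
    image_bytes = generate_image(prompt)   # decentralized GPU inference
    cid = store_image(image_bytes)         # content-addressed storage
    # Mint an NFT whose metadata points at the stored, addressable image.
    return mint_nft({"prompt": prompt, "image": f"ipfs://{cid}"})
```

Keeping the three integrations behind plain callables also makes the pipeline easy to test with stubs before wiring in the live services.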
