[Proposal] GPU Sponsorship Request to Create a Gen AI DL Model for the 3D Avatars Project

Title: GPU Sponsorship Request to Create a Gen AI DL Model for the 3D Avatars Project
Author: DK Ahn
Date posted: 2023/11/22

Mission and Value Alignment

GoodGang Labs is a team of talented developers, engineers, and artists dedicated to pushing the boundaries of 3D avatar technologies. Our mission is to connect human to human, and human to AI, through avatars. We are focusing on deep learning technologies for generating natural gestures and facial expressions for 3D avatars that will help our users freely express themselves online.

As our models become increasingly complex and require large amounts of video, audio, and text data, the computational resources required to train and deploy the models have become a significant bottleneck. Access to AI Network’s powerful GPUs is essential to continue progressing and maintain a competitive edge in the rapidly evolving AI landscape. Using AI Network’s GPU resources, we wish to develop generative AIs to create natural and dynamic avatar expressions.

Relevant Links:

Request Amount of GPU

Write a detailed breakdown of your project’s requirements.

  • You need to provide the machine specifications you requested below.
    1. How many GPUs do you need, and of what type?
      • A100 40G 1ea, 5 months
    2. What is the machine used for?
      • Training deep learning models for 3D avatars to power our video chatting platform

Scope of Work

  • Goals
    • To create a generative AI deep learning model that naturally and dynamically animates 3D avatars based on the user’s text or audio (speech) input.
  • Features
    • Develop key generative deep learning technologies described below:
      • Text2Avatar: Take the user’s text input and generate the avatar’s audio, facial expressions, and body gesture animations.
      • Speech2Avatar: Take the user’s audio (speech) input and generate the 3D avatar’s facial expressions and body gesture animations.
    • Iterate the model architecture to reduce the number of model parameters using optimization methods, so that the model can be used on a wider range of devices.
  • Tasks
    • Experiment with prototypes and seek open-source solutions to minimize development time.
    • Gather relevant data that is licensed for commercial use, and preprocess it.
    • Train the deep learning model and seek a lightweight architecture for real-time inference.
    • Apply deep learning model optimization methods (e.g. pruning, knowledge distillation) to create smaller versions of the model while retaining high performance.
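To make the pruning task above concrete, here is a minimal sketch of magnitude-based weight pruning in NumPy. The function name `magnitude_prune` and the global-threshold scheme are our own illustration, not the actual production pipeline; real pruning would typically be applied per-layer inside the training framework.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights so that roughly
    `sparsity` fraction of the entries become zero."""
    flat = np.abs(weights).flatten()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask
```

After pruning, the model is usually fine-tuned for a few epochs so the surviving weights can compensate for the removed ones; knowledge distillation can then train a smaller student model against the pruned teacher.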

Timeline

The following timeline provides an overview of the key phases and milestones for the development of our generative AI system for natural and dynamic avatar expressions:

Week 1-2: Prototyping

  • Define project requirements and objectives.
  • Research existing open-source models and methods for generating time-series outputs using text and audio inputs.
  • Develop initial prototypes of the AI system, focusing on individual components (e.g., facial expression generation, body gesture generation, speech analysis).

Week 3-5: Data Collection and Preprocessing

  • Identify and curate relevant datasets for training and validating the generative AI models.
  • Collect additional data, if necessary, through partnerships, crowd-sourced initiatives, or custom recording sessions.
  • Ensure data diversity and representation to avoid biases and improve the robustness of the AI system.
  • Clean and preprocess the collected data, including text and audio standardization, audio segmentation, and feature extraction.
  • Augment the data using techniques such as noise injection, pitch shift, and data mixing to increase the training dataset’s size and variety.
  • Split the data into training, validation, and testing sets to enable model evaluation and performance monitoring.
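The augmentation and split steps above can be sketched as follows. This is a simplified illustration with hypothetical helper names (`add_noise`, `split_dataset`); the real preprocessing pipeline would also handle pitch shifting, segmentation, and feature extraction.

```python
import numpy as np

def add_noise(audio: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Noise-injection augmentation: add Gaussian noise at a target SNR (in dB)."""
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise

def split_dataset(items, rng: np.random.Generator, ratios=(0.8, 0.1, 0.1)):
    """Shuffle and split a dataset into train / validation / test subsets."""
    idx = rng.permutation(len(items))
    n_train = int(ratios[0] * len(items))
    n_val = int(ratios[1] * len(items))
    train = [items[i] for i in idx[:n_train]]
    val = [items[i] for i in idx[n_train:n_train + n_val]]
    test = [items[i] for i in idx[n_train + n_val:]]
    return train, val, test
```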

Week 6-15: Model Training

  • Develop and train generative deep learning models using the preprocessed data and secured GPU resources.
  • Experiment with various model architectures, hyperparameters, and training strategies to optimize performance and minimize overfitting.
  • Monitor training progress and make adjustments as needed to ensure convergence and stability of the generative models.
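One common way to implement the monitoring step above is early stopping on the validation loss, which guards against overfitting. A minimal sketch (the class name and defaults are illustrative, not our exact training configuration):

```python
class EarlyStopping:
    """Stop training when validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```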

Week 15-20: Model Validation & Iteration

  • Evaluate the performance of the trained generative models on the validation and testing datasets using quantitative metrics (e.g., Fréchet Inception Distance, Perceptual Path Length) and qualitative assessments (e.g., alpha user feedback, internal reviews).
  • Identify areas for improvement and iterate on the models, addressing any issues with realism, diversity, or naturalness of the generated avatar expressions.
  • Conduct pilot tests and demonstrations with users to gather feedback on the AI system’s usability, effectiveness, and impact on communication and social experiences in virtual environments.
  • Upon completing these milestones, we will continue refining and expanding our AI system based on the feedback and insights gained during the validation phase. Additionally, we will explore potential partnerships, licensing opportunities, and integration of our technology into existing platforms and applications within the DAO ecosystem and beyond.
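As a rough illustration of the quantitative evaluation mentioned above, the Fréchet distance between two sets of feature activations can be computed as below. This is a simplified NumPy-only sketch (in the standard FID setup the features come from a pretrained Inception network; here `frechet_distance` just takes generic feature arrays):

```python
import numpy as np

def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussian fits of two (n_samples, n_features) sets:
    ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 * sqrtm(C_a @ C_b))."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr(sqrtm(C_a C_b)) equals the sum of square roots of the (real,
    # nonnegative) eigenvalues of C_a C_b, avoiding an explicit matrix sqrt.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sum(np.sqrt(np.maximum(eigvals.real, 0.0)))
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b) - 2.0 * trace_sqrt)
```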

Success Metrics and KPIs

Model Performance Metrics:

  • Accuracy: Measure how often the model’s predictions match the intended gesture or facial expression.
  • Loss Value: Track the loss value during training and validation to ensure the model is converging and not overfitting.
  • Processing Time: Time taken by the model to predict and render a gesture or facial expression in real-time.
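The processing-time KPI above can be measured with a simple timing harness like the following sketch. The `measure_latency` helper is our own illustration; the `infer_fn` argument stands in for whatever inference call the deployed model exposes.

```python
import time

def measure_latency(infer_fn, inputs, warmup: int = 3, runs: int = 10) -> float:
    """Average per-sample wall-clock latency of `infer_fn` over `inputs`."""
    for x in inputs[:warmup]:  # warm-up passes (caches, lazy init, etc.)
        infer_fn(x)
    start = time.perf_counter()
    for _ in range(runs):
        for x in inputs:
            infer_fn(x)
    elapsed = time.perf_counter() - start
    return elapsed / (runs * len(inputs))
```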

User Experience & Realism:

  • Perceived Naturalness: Use user surveys or feedback tools to gauge how “natural” users find the generated gestures and expressions.
  • Expression Diversity: The variety of gestures and facial expressions the model can produce without repetitions or unnatural transitions.
  • User Satisfaction Surveys: Periodic surveys to understand if users find the avatars’ gestures and facial expressions align with their intent and feelings.

Benefits for Resource Providers (Runo Holders)

2% additional rewards to GoodGang Runo holders who contribute to our product through marketing or user feedback. (Details will be announced after sales.)

Thanks for your proposal. I’m going to start the 1st round of voting. Discord

This proposal was approved in the 1st round, so we will move to the 2nd round of voting with the expert developers.

https://snapshot.org/#/ainetwork.eth/proposal/0x01c9d417cfa24ed78b3e080675e1ce712ea85419d917351f136f44989537859e

The proposal was accepted since 4 developers approved it in the 2nd round of voting. Snapshot

We will release 6 High GPU Runo for the project. We will contact you directly about the next steps.