Summary of S4-260121: Avatar Evaluation Framework and Objective Metrics
Introduction
This contribution addresses Objectives 2 and 3 of the Avatar Communication Phase 2 SID (SP-251663), which concern QoE metrics, evaluation frameworks, and evaluation criteria for animation techniques. The document proposes a practical evaluation methodology designed to deliver repeatable, automated, and vendor-neutral results based on a core principle: evaluate what the user actually sees by measuring quality from rendered video output rather than internal system parameters.
Evaluation Framework
Design Principles
The framework is built on three key principles:
- Black-box evaluation: Metrics computed from rendered output video, not internal system states, ensuring cross-vendor comparability
- Reproducibility: Fixed test content, deterministic rendering conditions, and standardized capture workflows for consistent results
- Automation: All metrics computable without human intervention for large-scale testing
Testbed Architecture
The proposed testbed comprises five key components:
- Stimulus player: Feeds the avatar system under test with animation streams (blendshape weights, landmarks, joint poses)
- Render configuration: Locks camera intrinsics, lighting, background, and resolution to eliminate variability (a configuration sketch follows the list)
- Capture module: Records rendered frames using lossless/visually lossless compression with frame-accurate timestamps
- Network emulator: Applies controlled latency, jitter, bandwidth limits, and packet loss for transport testing
- Metrics engine: Computes frame-level and clip-level objective metrics from captured assets
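To make the "locked" render configuration concrete, the following minimal Python sketch serializes a fixed rendering setup that every system under test would load. All field names and values here are hypothetical illustrations, not taken from the contribution:

```python
from dataclasses import dataclass, asdict
import json

@dataclass(frozen=True)
class RenderConfig:
    """Fixed rendering conditions shared by all systems under test (hypothetical fields)."""
    resolution: tuple          # (width, height) in pixels
    fps: float                 # target rendering frame rate
    focal_length_px: float     # camera intrinsic: focal length in pixels
    principal_point: tuple     # camera intrinsic: (cx, cy) in pixels
    lighting_preset: str       # identifier of a standardized lighting setup
    background: str            # e.g. a fixed solid color or environment map ID

# One shared configuration, written to disk and distributed to every vendor.
config = RenderConfig(
    resolution=(1920, 1080),
    fps=30.0,
    focal_length_px=1400.0,
    principal_point=(960.0, 540.0),
    lighting_preset="studio_neutral_01",
    background="#808080",
)

with open("render_config.json", "w") as f:
    json.dump(asdict(config), f, indent=2)
```

Distributing one such file to all vendors ensures that metric differences reflect the animation pipeline rather than rendering conditions.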
Objective Metrics for Avatar Evaluation
The contribution proposes metrics across three quality dimensions:
Visual Quality Metrics
- PSNR (dB): Peak signal-to-noise ratio between reference and test frames
- SSIM (0-1): Structural similarity index between reference and test frames (a computation sketch for both metrics follows)
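A minimal Python sketch of how frame-level PSNR and SSIM could be computed with scikit-image, assuming the reference and test clips have already been decoded into aligned uint8 RGB frame arrays; the function names are illustrative, not part of the contribution:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def frame_psnr_ssim(ref: np.ndarray, test: np.ndarray) -> tuple[float, float]:
    """Compute PSNR (dB) and SSIM (0-1) for one aligned pair of uint8 RGB frames."""
    psnr = peak_signal_noise_ratio(ref, test, data_range=255)
    ssim = structural_similarity(ref, test, channel_axis=-1, data_range=255)
    return psnr, ssim

def clip_scores(ref_frames, test_frames) -> tuple[float, float]:
    """Average frame-level scores into clip-level scores."""
    scores = [frame_psnr_ssim(r, t) for r, t in zip(ref_frames, test_frames)]
    psnrs, ssims = zip(*scores)
    return float(np.mean(psnrs)), float(np.mean(ssims))
```

Averaging frame-level values into clip-level scores matches the framework's split between frame-level and clip-level objective metrics.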
Animation Quality Metrics
These metrics are computed from the rendered video by extracting facial landmarks and body skeletons (a computation sketch follows the list):
- Lip Vertex Error (LVE) (pixels or mm): RMS error of mouth landmarks; critical for lip-sync evaluation
- Facial Distance Deviation (FDD) (pixels or mm): Deviation of expression-related landmark distances; measures facial expression accuracy
- Motion Vertex Error (MVE) (pixels or mm): RMS error of body joint positions; evaluates full-body animation fidelity
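All three metrics reduce to RMS or mean deviations over landmark and joint sets extracted from the rendered frames. A minimal numpy sketch under that assumption; the array layouts and landmark index sets are hypothetical and depend on the detector actually used:

```python
import numpy as np

def rms_landmark_error(ref_pts: np.ndarray, test_pts: np.ndarray) -> float:
    """RMS Euclidean distance between matched landmark sets.

    ref_pts, test_pts: (num_frames, num_points, 2) arrays in pixels
    (or mm, if a calibrated camera model is available).
    """
    dists = np.linalg.norm(ref_pts - test_pts, axis=-1)  # per-point distances
    return float(np.sqrt(np.mean(dists ** 2)))

# Hypothetical mouth landmark indices; the actual set depends on the
# landmark model (e.g. indices 48-67 in a 68-point face annotation).
MOUTH_IDX = list(range(48, 68))

def lip_vertex_error(ref_face: np.ndarray, test_face: np.ndarray) -> float:
    """LVE: RMS error over mouth landmarks only."""
    return rms_landmark_error(ref_face[:, MOUTH_IDX], test_face[:, MOUTH_IDX])

def facial_distance_deviation(ref_face, test_face, pairs) -> float:
    """FDD: mean absolute deviation of expression-related landmark distances.

    pairs: list of (i, j) landmark index pairs chosen to reflect expressions,
    e.g. eye-opening or mouth-corner spans (hypothetical choice).
    """
    def dists(pts):
        return np.stack([np.linalg.norm(pts[:, i] - pts[:, j], axis=-1)
                         for i, j in pairs], axis=-1)
    return float(np.mean(np.abs(dists(ref_face) - dists(test_face))))

def motion_vertex_error(ref_joints: np.ndarray, test_joints: np.ndarray) -> float:
    """MVE: RMS error over body joint positions."""
    return rms_landmark_error(ref_joints, test_joints)
```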
Temporal and Synchronization Metrics
Proposed for a second evaluation phase due to measurement complexity; a measurement sketch follows the list:
- Rendering Frame Rate (FPS): Computed from frame timestamp deltas
- Dropped Frame Ratio (%): Percentage of missing or repeated frame indices
- Motion-to-Photon Latency (ms): Time from input motion event to visible response
- End-to-End Latency (ms): Total delay from sender capture to receiver presentation
- Audio-Visual Sync Offset (ms): Offset between mouth motion and corresponding audio via cross-correlation
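A minimal numpy sketch of how the simpler temporal metrics might be computed from captured timestamps and signals. The function names are illustrative; motion-to-photon and end-to-end latency need instrumented event timestamps at sender and receiver and are omitted here:

```python
import numpy as np

def rendering_fps(timestamps_ms: np.ndarray) -> float:
    """Mean frame rate from frame-accurate capture timestamps (ms)."""
    deltas = np.diff(timestamps_ms)
    return 1000.0 / float(np.mean(deltas))

def dropped_frame_ratio(frame_indices: np.ndarray, expected_count: int) -> float:
    """Fraction of expected frames that are missing or repeated
    (multiply by 100 for the percentage reported by the metric)."""
    unique = np.unique(frame_indices)
    return 1.0 - len(unique) / expected_count

def av_sync_offset_ms(mouth_motion: np.ndarray, audio_envelope: np.ndarray,
                      sample_period_ms: float) -> float:
    """AV sync offset via cross-correlation of a mouth-opening signal
    (e.g. lip landmark distance per frame) and the audio amplitude
    envelope, both resampled to the same rate."""
    a = mouth_motion - mouth_motion.mean()
    b = audio_envelope - audio_envelope.mean()
    xcorr = np.correlate(a, b, mode="full")
    lag = int(np.argmax(xcorr)) - (len(b) - 1)  # sign shows which stream leads
    return lag * sample_period_ms
```

The cross-correlation approach assumes both signals are resampled to a common rate before comparison; the sign of the resulting lag indicates whether video leads or trails the audio.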
Test Content
Standardized animation streams should cover:
- Neutral speech: Clear visemes and steady head motion for baseline lip sync
- Expressive speech: Emotions (happiness, surprise, concern) for facial expression testing
- Conversational turn-taking: Gaze shifts, nods, backchannel gestures
- Non-verbal body motion: Pointing, waving, posture changes, idle animation
Each test set should contain reference audio, reference animation streams, and reference rendered video from both a high-quality reference pipeline and the source capture.
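As an illustration, one such test set could be described by a manifest like the following; the file names and fields are hypothetical, not defined by the contribution:

```python
import json

# Hypothetical manifest for one standardized test set.
test_set = {
    "id": "expressive_speech_01",
    "category": "expressive_speech",  # one of the four categories above
    "reference_audio": "audio/ref_48k.wav",
    "animation_streams": {
        "blendshape_weights": "anim/blendshapes.csv",
        "landmarks": "anim/landmarks.csv",
        "joint_poses": "anim/joints.csv",
    },
    "reference_video": {
        "reference_pipeline": "video/ref_pipeline.mkv",  # high-quality reference render
        "source_capture": "video/source_capture.mkv",    # original capture footage
    },
}

with open("test_set_manifest.json", "w") as f:
    json.dump(test_set, f, indent=2)
```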
Proposals
The contribution proposes to:
- Adopt an objective evaluation approach based on rendered video output as the primary evaluation method for reproducibility
- Include the proposed metric set (visual quality, animation fidelity, temporal performance) in TR 26.813
- Define a normative capture workflow using lossless recording, timecode embedding, and reference alignment for consistent metric computation across implementations