# Summary of S4-260095: Neural Network Based Video Codec Architecture and Support for Error Resilience

## Document Overview

This contribution proposes documenting neural network-based codec (NNC) architectures and their error resilience capabilities in the 6G Media study (FS_6G_MED). The document focuses on two specific NNC implementations: DVC and GRACE codecs, highlighting their potential relevance for 6G deployments targeting 2030.

## Main Technical Contributions

### DVC Codec Architecture

The document describes the DVC (Deep Video Compression) codec proposed by Guo Lu et al. (2019), which represents a hybrid approach to neural network-based video coding:

**Key Architecture Features:**
- Replaces traditional video coding components with neural network equivalents while maintaining the overall predictive coding architecture
- Uses CNN models to estimate optical flow for motion estimation and to compress the resulting motion information
- Implements neural network-based motion compensation to generate predicted frames
- Maintains functional similarity between traditional and NNC components

**Joint Optimization Approach:**
The codec jointly trains/optimizes multiple components:
- Motion estimation
- Motion compensation
- Residual compression
- Quantization and bit-rate estimation
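One way to picture joint optimization is as a single rate-distortion loss shared by all components. The sketch below is a minimal illustration, assuming a Lagrangian of the form `lam * D + R` with mean squared error as the distortion and an estimated bit count for the motion and residual streams as the rate. The function names, the `lam` value, and the toy numbers are hypothetical and are not taken from the contribution.

```python
def mse(frame, recon):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(frame, recon)) / len(frame)


def rd_loss(frame, recon, motion_bits, residual_bits, lam):
    """Lagrangian rate-distortion loss: lam * D + R.

    R is expressed in bits per pixel and sums the (assumed) estimated
    bit counts of the motion and residual streams.
    """
    distortion = mse(frame, recon)
    rate = (motion_bits + residual_bits) / len(frame)
    return lam * distortion + rate


# Toy 4-pixel frame and its reconstruction (illustrative values only).
frame = [0.0, 0.5, 1.0, 0.5]
recon = [0.0, 0.5, 0.5, 0.5]
loss = rd_loss(frame, recon, motion_bits=2.0, residual_bits=6.0, lam=16.0)
```

Because every component contributes to either the distortion or the rate term, gradients of this one scalar can in principle update all of them together, which is the sense in which the training is joint.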

**Performance:**
- Achieves results competitive with H.264 and H.265
- Publicly available source code and research paper
- Similar approaches adopted in industry (Deep Render codec in FFmpeg and VLC)

### GRACE Codec and Error Resilience Extensions

The document presents the GRACE codec (Yihua Cheng et al. 2025) as an extension of DVC with enhanced error resilience:

**Channel-Aware Training:**
- Jointly trains encoder and decoder under simulated packet loss conditions
- Enables codec awareness of specific loss patterns
- Implements channel-aware source coding design
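Training under simulated packet loss can be sketched as corrupting the encoder output before it reaches the decoder. The example below is a minimal sketch, assuming loss is modelled by zeroing each latent element independently with probability `loss_rate`; the function name, the element-wise loss model, and the chosen rate are assumptions for illustration, not details stated in the contribution.

```python
import random


def simulate_packet_loss(latent, loss_rate, rng):
    """Zero each latent element independently with probability loss_rate.

    Assumption: element-wise random masking stands in for packet loss.
    """
    return [0.0 if rng.random() < loss_rate else v for v in latent]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
latent = [1.0] * 1000   # stand-in for one frame's encoder output
damaged = simulate_packet_loss(latent, loss_rate=0.3, rng=rng)
observed = damaged.count(0.0) / len(damaged)  # fraction actually zeroed
```

In a training loop, the decoder would receive `damaged` rather than `latent`, so the reconstruction loss would reward encoder and decoder weights that tolerate missing elements.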

**Technical Implementation:**
- Encodes each frame as a tensor split into independently decodable sub-tensors
- Uses arithmetic coding mapped to packets
- Tested across a wide range of loss rates
- Includes lighter profiles (GRACE-lite) for mobile devices
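The idea of independently decodable sub-tensors can be illustrated with a toy packetizer. The sketch below assumes a strided interleaving of tensor elements across packets and omits arithmetic coding entirely; the function names and the interleaving rule are hypothetical, chosen only to show that a lost packet removes scattered elements rather than a contiguous region.

```python
def packetize(latent, num_packets):
    """Interleave elements so packet p holds indices p, p+n, p+2n, ..."""
    return [latent[p::num_packets] for p in range(num_packets)]


def depacketize(packets, length):
    """Rebuild the latent; a lost packet (None) leaves zeros at its indices."""
    n = len(packets)
    latent = [0.0] * length
    for p, payload in enumerate(packets):
        if payload is None:
            continue  # lost packet: its positions stay zero
        latent[p::n] = payload
    return latent


latent = [float(i) for i in range(1, 9)]  # toy 8-element tensor
packets = packetize(latent, num_packets=4)
packets[1] = None                         # simulate losing one packet
recovered = depacketize(packets, len(latent))
# recovered == [1.0, 0.0, 3.0, 4.0, 5.0, 0.0, 7.0, 8.0]
```

Each packet here can be placed back without reference to any other packet, so the decoder can proceed with whatever subset arrives.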

**Performance Validation:**
- User study with 240 crowdsourced participants
- Tested 61 videos under realistic conditions
- Used Google Congestion Control (GCC) to emulate WebRTC congestion control
- Channel conditions: LTE and broadband traces (0.2-8 Mbps, 100ms end-to-end delay)
- Mean opinion scores (MOS) up to 38% higher than H.264/H.265 with AL-FEC and error concealment

**Key Performance Improvements:**
- Exceptional reduction in tail latency
- Reduced non-rendered frames
- Reduced stalls per second
- Improved video smoothness

**Hardware Requirements:**
- Original GRACE: NVIDIA A40 GPU (31.2-51.2 fps)
- GRACE-lite: Real-time capable on current mobile devices

### Identified Limitations

**Content Specificity:**
- NNC performance may be content-specific due to training data dependencies

**Reconstruction Challenges:**
- Potential reconstruction failures due to non-bit-exact arithmetic operations in GPU frameworks
- Issues with floating-point arithmetic and convolution operations
- Currently under discussion in ISO/IEC JTC 1/SC 29 (the subcommittee for coding of audio, picture, multimedia and hypermedia information)
- Resolving this issue is identified as a potential key enabler for future NNC codec adoption
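The bit-exactness concern stems from a general property of floating-point arithmetic: addition is not associative, so the same values summed in a different order can yield a different result. The sketch below is a minimal illustration with hand-picked values; the `accumulate` helper and the idea of an index order standing in for a kernel's scheduling are assumptions for illustration, not a description of any specific GPU framework.

```python
def accumulate(values, order):
    """Sum values in the given index order, as a kernel might schedule them."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


values = [1e16, 1.0, -1e16]
forward = accumulate(values, [0, 1, 2])    # (1e16 + 1.0) - 1e16 -> 0.0
reordered = accumulate(values, [0, 2, 1])  # (1e16 - 1e16) + 1.0 -> 1.0
mismatch = forward != reordered            # True: same inputs, different sums
```

If an encoder and a decoder accumulate in different orders, values that feed the entropy model may differ between the two sides, which is one plausible route to the reconstruction failures noted above.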

## Proposals

The document makes two specific proposals:

1. **Documentation Request:** Document NNC features and their application to error-resilient AI traffic in the 6G Media TR (FS_6G_MED), based on clauses 2 and 3 of the contribution

2. **Use Case Consideration:** Include the use case of NNC with channel-aware source coding training in AI traffic characteristics

## Text Proposal Structure

The contribution includes specific text proposals for:
- **Change 1:** Addition of two references to the normative references section
- **Change 2:** New clause 6.2.4.X under Work topic #2d (AI Traffic Characteristics) containing the technical description of DVC and GRACE codecs, including architecture diagrams and performance characteristics