# Summary of S4-260098: Demonstration of Real-Time AI Codec Transmission in WebRTC

## Document Overview

**Source:** Huawei, HiSilicon  
**Meeting:** SA4 #135, Goa, India (9-13 Feb 2026)  
**Work Item:** FS_6G_MED / Rel-20  
**Purpose:** Demonstration of AI codec for real-time AI traffic over WebRTC

## Main Technical Contribution

This document presents a practical demonstration of end-to-end AI media delivery over WebRTC: an AI-codec video streaming system that carries the encoded video in RTP. The demonstration shows that real-time AI codec-based traffic can be transmitted over WebRTC infrastructure.

## Implementation Framework

### Tools and Components

The implementation utilizes three key tools:

- **aiortc**: Python-native WebRTC/ORTC library serving as the foundational media transport framework
- **Wireshark**: Network protocol analyzer for capturing and inspecting RTP traffic traces for performance auditing
- **clumsy**: Network simulation utility for injecting controlled packet loss and jitter for resilience testing

## Technical Implementation Steps

### Step 1: AI Video Codec Registration in aiortc

- Extended the original aiortc framework, which supported only legacy codecs (VP8, H264)
- Registered the new AI video codec, including:
  - Codec name
  - Encoding function
  - Decoding function
- Enabled codec recognition during SDP negotiation
- Mapped encoding/decoding functions to transmitter and receiver operations
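The registration idea above can be sketched as a small name-to-functions registry. This is a minimal illustration and does not use aiortc's actual internals: `CodecEntry`, `register_codec`, `select_codec`, and the codec name `AI-VIDEO` are all hypothetical, and `zlib` merely stands in for the neural encoder and decoder.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class CodecEntry:
    """One registered codec: a name plus its encode/decode callables."""
    name: str                          # name matched during negotiation (hypothetical)
    encode: Callable[[bytes], bytes]   # used on the transmitter side
    decode: Callable[[bytes], bytes]   # used on the receiver side


# Hypothetical registry; it does not mirror aiortc's internal tables.
CODEC_REGISTRY: Dict[str, CodecEntry] = {}


def register_codec(entry: CodecEntry) -> None:
    """Add a codec so that it can be recognised during negotiation."""
    key = entry.name.lower()
    if key in CODEC_REGISTRY:
        raise ValueError(f"codec already registered: {entry.name}")
    CODEC_REGISTRY[key] = entry


def select_codec(offered_names: Iterable[str]) -> Optional[CodecEntry]:
    """Return the first offered codec that is registered, or None."""
    for name in offered_names:
        entry = CODEC_REGISTRY.get(name.lower())
        if entry is not None:
            return entry
    return None


# zlib is only a placeholder for the neural encoder/decoder pair.
register_codec(CodecEntry("AI-VIDEO", zlib.compress, zlib.decompress))
codec = select_codec(["VP9", "ai-video"])
```

Here `select_codec` skips the unregistered `VP9` offer and returns the `AI-VIDEO` entry, whose `encode` and `decode` callables would then be bound to the sending and receiving paths.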

### Step 2: Custom RTP Payload Format Design

**Encoding Process:**
- Video is converted to a bitstream frame by frame through encoder neural network processing followed by entropy encoding
- Codec-specific metadata is carried in the RTP payload header

**Payload Format Structure:**
```
[[Latent Shape | Hyperprior Byte Length | Latent Byte Length] | [Hyperprior Bytes | Latent Bytes]]
```

**Payload Components:**
- **Latent Shape**: Shape of the latent representation
- **Hyperprior Byte Length**: Length of hyperprior parameter bytes (used for probability distributions in entropy coding)
- **Latent Byte Length**: Length of latent representation bytes
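As a rough sketch of how such a payload could be serialized, the example below packs the three header fields ahead of the two byte strings. The field widths are assumptions, since the summary does not give them: the latent shape is taken to be two 16-bit values (height, width), each length a 32-bit value, and everything is in network byte order. The names `pack_payload` and `unpack_payload` are hypothetical.

```python
import struct
from typing import Tuple

# Assumed layout: latent height, latent width (uint16 each),
# hyperprior byte length, latent byte length (uint32 each), big-endian.
HEADER_FMT = "!HHII"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 12 bytes under these assumptions


def pack_payload(latent_shape: Tuple[int, int],
                 hyperprior: bytes, latent: bytes) -> bytes:
    """Build [header | hyperprior bytes | latent bytes] for one frame."""
    height, width = latent_shape
    header = struct.pack(HEADER_FMT, height, width, len(hyperprior), len(latent))
    return header + hyperprior + latent


def unpack_payload(payload: bytes) -> Tuple[Tuple[int, int], bytes, bytes]:
    """Split a payload back into (latent_shape, hyperprior, latent)."""
    if len(payload) < HEADER_SIZE:
        raise ValueError("payload shorter than header")
    height, width, hyper_len, latent_len = struct.unpack_from(HEADER_FMT, payload, 0)
    body = payload[HEADER_SIZE:]
    if len(body) != hyper_len + latent_len:
        raise ValueError("declared lengths do not match payload size")
    return (height, width), body[:hyper_len], body[hyper_len:]


payload = pack_payload((16, 24), b"\x01\x02\x03", b"\xaa\xbb")
shape, hyper_bytes, latent_bytes = unpack_payload(payload)
```

Carrying explicit lengths lets the receiver split the two entropy-coded streams without scanning for delimiters, which is presumably why both lengths appear in the header.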

### Step 3: RTP Packing, Transmission, and Unpacking

**Transmission Side:**
- Large payloads fragmented due to MTU limitations
- aiortc automatically appends standard RTP Header to each fragment
- RTP packets transmitted with congestion control

**Reception Side:**
- RTP packets buffered and reorganized per frame by aiortc
- Packets parsed according to agreed format
- Video frame restoration through entropy decoding and decoder neural network processing
- Error-resilient codec compensates for potential packet loss
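The fragmentation and per-frame reassembly described above might look roughly like the following. This is a simplified sketch, not aiortc's packetizer: the `Packet` class, the 1200-byte fragment size, and the convention that the receiver knows the frame's first sequence number are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_FRAGMENT = 1200  # assumed per-packet payload budget, below a typical MTU


@dataclass
class Packet:
    seq: int         # 16-bit RTP-style sequence number
    timestamp: int   # shared by all fragments of one frame
    marker: bool     # set on the last fragment of the frame
    payload: bytes


def fragment(frame_payload: bytes, timestamp: int, first_seq: int,
             max_size: int = MAX_FRAGMENT) -> List[Packet]:
    """Split one frame payload into packets of at most max_size bytes."""
    chunks = [frame_payload[i:i + max_size]
              for i in range(0, len(frame_payload), max_size)] or [b""]
    last = len(chunks) - 1
    return [Packet((first_seq + i) & 0xFFFF, timestamp, i == last, chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(packets: List[Packet], first_seq: int) -> Optional[bytes]:
    """Rebuild a frame payload, or return None if any fragment is missing."""
    # Order by distance from first_seq so sequence wraparound is handled.
    ordered = sorted(packets, key=lambda p: (p.seq - first_seq) & 0xFFFF)
    if not ordered or not ordered[-1].marker:
        return None  # nothing received, or the final fragment was lost
    for offset, pkt in enumerate(ordered):
        if (pkt.seq - first_seq) & 0xFFFF != offset:
            return None  # gap in the sequence: a fragment was lost
    return b"".join(p.payload for p in ordered)


frame = bytes(range(256)) * 12                       # 3072-byte dummy frame
packets = fragment(frame, timestamp=90000, first_seq=65534)
restored = reassemble(list(reversed(packets)), first_seq=65534)
```

In this sketch an incomplete frame simply yields `None`; an error-resilient codec would instead attempt to decode from whichever fragments did arrive.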

### Step 4: Traffic Trace Analysis

**Testing Methodology:**
- Random packet loss simulated using clumsy software
- Wireshark captures received packets at receiver
- Analysis based on RTP Header fields: Timestamps, Sequence Numbers, Marker Bits
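For reference, the three fields named above sit in the 12-byte fixed RTP header defined by RFC 3550. The sketch below extracts them from raw packet bytes, for example from a payload exported out of a capture; `RtpHeader` and `parse_rtp_header` are hypothetical names, and CSRC entries and header extensions are ignored.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    byte0, byte1, seq, ts, ssrc = struct.unpack_from("!BBHII", packet, 0)
    return RtpHeader(
        version=byte0 >> 6,          # top two bits of the first byte
        marker=bool(byte1 & 0x80),   # top bit of the second byte
        payload_type=byte1 & 0x7F,   # remaining seven bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Hand-built example packet: version 2, marker set, dynamic payload type 96.
raw = struct.pack("!BBHII", 0x80, 0x80 | 96, 1234, 90000, 0xDEADBEEF) + b"data"
header = parse_rtp_header(raw)
```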

**Traffic Characteristics Analyzed:**
- Per-frame packet loss
- Performance of restored video frames
- Packet size distribution
- Packet arrival patterns
- Packet success rate requirements
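One possible way to derive per-frame loss figures from a captured trace is to group packets by RTP timestamp and look for gaps in the sequence numbers. The sketch below assumes the trace has already been reduced to `(sequence_number, timestamp, marker)` tuples and, for brevity, ignores sequence-number wraparound; the function names are hypothetical. A lost first fragment cannot be detected with this method alone.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[int, int, bool]  # (sequence number, RTP timestamp, marker bit)


def per_frame_stats(records: Iterable[Record]) -> Dict[int, dict]:
    """Group packets by timestamp and count gaps inside each frame."""
    frames = defaultdict(set)
    marker_seen = set()
    for seq, ts, marker in records:
        frames[ts].add(seq)          # a set discards duplicate packets
        if marker:
            marker_seen.add(ts)
    stats = {}
    for ts in sorted(frames):
        seqs = frames[ts]
        span = max(seqs) - min(seqs) + 1   # assumes no wraparound in a frame
        stats[ts] = {
            "received": len(seqs),
            "gaps": span - len(seqs),      # packets missing inside the frame
            "marker_seen": ts in marker_seen,
        }
    return stats


def overall_loss_rate(records: Iterable[Record]) -> float:
    """Fraction of sequence numbers missing between the first and last seen."""
    seqs = {seq for seq, _, _ in records}
    if not seqs:
        return 0.0
    expected = max(seqs) - min(seqs) + 1
    return 1.0 - len(seqs) / expected


trace = [(1, 1000, False), (2, 1000, False), (3, 1000, True),
         (4, 4000, False), (6, 4000, True)]   # packet 5 was dropped
stats = per_frame_stats(trace)
loss = overall_loss_rate(trace)
```

For this toy trace, the frame at timestamp 4000 reports one gap and the overall loss rate is one packet in six.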

## Demo Implementation Details

**Current Implementation Status:**
- Actual AI codec deployed (preliminary version)
- Uses the bmshj2018_factorized model [R1] instead of Grace to achieve a moderate frame rate on CPU
- Low-resolution video used due to computational constraints
- End-to-end link feasibility proven

**Demo Versions Provided:**
1. **With packet loss**: Packet loss is simulated using clumsy with RTP retransmission enabled; the loss causes slight stuttering (error recovery is not yet implemented)
2. **Without packet loss**: Clean transmission demonstration

## Proposals

1. Take this approach into account as it demonstrates real-time AI codec-based traffic over WebRTC
2. Consider the feasibility of this approach for generating traces for real-time AI traffic

## Reference

[R1] https://arxiv.org/abs/1802.01436 (bmshj2018_factorized model)