Demonstration of Real-Time AI Codec Transmission in WebRTC
Source: Huawei, HiSilicon
Meeting: SA4 #135, Goa, India (9-13 Feb 2026)
Work Item: FS_6G_MED / Rel-20
Purpose: Demonstration of AI codec for real-time AI traffic over WebRTC
This document presents a practical demonstration of end-to-end AI media delivery over WebRTC: an AI-codec video streaming system that carries its bitstream in RTP. The demonstration shows that real-time AI codec-based traffic can be transmitted over WebRTC infrastructure.
The implementation utilizes three key tools:
Encoding Process:
- Each video frame converted to a bitstream by encoder neural network processing followed by entropy encoding
- Codec-specific metadata carried in the RTP payload header
Payload Format Structure:
[[Latent Shape | Hyperprior Byte Length | Latent Byte Length] | [Hyperprior Bytes | Latent Bytes]]
Payload Components:
- Latent Shape: Shape of the latent representation
- Hyperprior Byte Length: Length of hyperprior parameter bytes (used for probability distributions in entropy coding)
- Latent Byte Length: Length of latent representation bytes
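The payload layout above can be sketched as a pack/parse pair. The source does not give field widths or byte order, so the ones here are assumptions chosen for illustration: a four-dimensional latent shape stored as unsigned 16-bit integers, two unsigned 32-bit length fields, and network byte order. The function names are hypothetical.

```python
import struct

# Assumed layout (not specified in the source): latent shape as four
# unsigned 16-bit integers, then two unsigned 32-bit byte lengths,
# all in network byte order.
HEADER_FMT = "!4HII"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 16 bytes


def pack_payload(latent_shape, hyperprior_bytes, latent_bytes):
    """Build [shape | hyperprior length | latent length | hyperprior | latent]."""
    header = struct.pack(
        HEADER_FMT, *latent_shape, len(hyperprior_bytes), len(latent_bytes)
    )
    return header + hyperprior_bytes + latent_bytes


def parse_payload(payload):
    """Split a payload back into (latent_shape, hyperprior_bytes, latent_bytes)."""
    *shape, hyper_len, latent_len = struct.unpack_from(HEADER_FMT, payload, 0)
    body = payload[HEADER_SIZE:]
    if len(body) != hyper_len + latent_len:
        raise ValueError("payload length does not match header fields")
    return tuple(shape), body[:hyper_len], body[hyper_len:]
```

The length check in `parse_payload` also gives the receiver a cheap way to notice a truncated or incomplete frame before it reaches the entropy decoder.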
Transmission Side:
- Large payloads fragmented due to MTU limitations
- aiortc automatically appends standard RTP Header to each fragment
- RTP packets transmitted with congestion control
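The fragmentation step can be sketched as below. This is an illustration only, not the demo's code: the 1200-byte fragment size is an assumed budget that leaves room for headers under a typical 1500-byte MTU, and the function name is hypothetical. Setting the marker flag on the last fragment of a frame follows the usual RTP video convention; the RTP header itself is added by aiortc, as stated above.

```python
def fragment(payload: bytes, max_fragment_size: int = 1200):
    """Split one frame payload into (chunk, marker) pairs.

    marker is True only on the last fragment of the frame.
    max_fragment_size is an assumed value, not taken from the source.
    """
    if max_fragment_size <= 0:
        raise ValueError("max_fragment_size must be positive")
    chunks = [
        payload[i:i + max_fragment_size]
        for i in range(0, len(payload), max_fragment_size)
    ] or [b""]  # an empty payload still yields one (empty) fragment
    last = len(chunks) - 1
    return [(chunk, index == last) for index, chunk in enumerate(chunks)]
```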
Reception Side:
- RTP packets buffered and reorganized per frame by aiortc
- Packets parsed according to agreed format
- Video frame restoration through entropy decoding and decoder neural network processing
- Error-resilient codec intended to compensate for potential packet loss (error recovery not yet implemented in the current demo)
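As a rough illustration of per-frame reorganization, the sketch below groups packets by RTP timestamp, orders them by sequence number, and joins the payloads of frames that have no sequence gap and end with a marker packet. It is a simplified stand-in for what aiortc does internally, not its API: the `RtpPacket` type and function name are hypothetical, 16-bit sequence wraparound is ignored, and a lost leading fragment is not detected here (a length check on the parsed payload would catch it).

```python
from collections import defaultdict
from typing import NamedTuple


class RtpPacket(NamedTuple):
    """Hypothetical minimal view of a received RTP packet."""
    sequence_number: int
    timestamp: int
    marker: bool
    payload: bytes


def reassemble_frames(packets):
    """Return {timestamp: frame_bytes} for frames that look complete."""
    by_timestamp = defaultdict(list)
    for packet in packets:
        by_timestamp[packet.timestamp].append(packet)

    frames = {}
    for timestamp, group in by_timestamp.items():
        group.sort(key=lambda p: p.sequence_number)
        seqs = [p.sequence_number for p in group]
        # A gap between consecutive sequence numbers means a lost fragment.
        contiguous = all(b - a == 1 for a, b in zip(seqs, seqs[1:]))
        if contiguous and group[-1].marker:
            frames[timestamp] = b"".join(p.payload for p in group)
    return frames
```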
Testing Methodology:
- Random packet loss simulated using clumsy software
- Wireshark captures received packets at receiver
- Analysis based on RTP Header fields: Timestamps, Sequence Numbers, Marker Bits
Traffic Characteristics Analyzed:
- Packet loss per frame
- Performance of restored video frames
- Packet size distribution
- Packet arrival patterns
- Packet success rate requirements
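One way to derive such statistics from captured RTP header fields is sketched below. It assumes the capture has already been exported as (sequence number, timestamp, marker) tuples in arrival order and that all packets belong to one RTP stream; the function names and the returned dictionary keys are hypothetical. Loss is estimated from gaps in the sequence numbers after unwrapping the 16-bit counter.

```python
from collections import Counter


def unwrap_sequence_numbers(seqs):
    """Extend 16-bit RTP sequence numbers so they keep growing across wraparound."""
    extended = []
    for seq in seqs:
        if not extended:
            extended.append(seq)
            continue
        delta = (seq - extended[-1]) % 65536
        if delta >= 32768:  # treat as a reordered (earlier) packet
            delta -= 65536
        extended.append(extended[-1] + delta)
    return extended


def loss_summary(headers):
    """Summarize loss from (sequence_number, timestamp, marker) tuples."""
    headers = list(headers)
    if not headers:
        return {"expected": 0, "received": 0, "lost": 0,
                "loss_rate": 0.0, "packets_per_frame": {}}
    extended = set(unwrap_sequence_numbers(h[0] for h in headers))
    expected = max(extended) - min(extended) + 1
    received = len(extended)  # duplicates (e.g. retransmissions) count once
    return {
        "expected": expected,
        "received": received,
        "lost": expected - received,
        "loss_rate": (expected - received) / expected,
        "packets_per_frame": dict(Counter(h[1] for h in headers)),
    }
```

Packets lost before the first or after the last captured packet are invisible to this estimate, so it is a lower bound on the true loss.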
Current Implementation Status:
- Actual AI codec deployed (preliminary version)
- Uses bmshj2018_factorized model [R1] instead of Grace to reach a moderate frame rate on CPU
- Low-resolution video used due to computational constraints
- End-to-end link feasibility proven
Demo Versions Provided:
1. With packet loss: Loss simulated using clumsy; RTP retransmission enabled; packet loss still causes slight stuttering because error recovery is not yet implemented
2. Without packet loss: Clean transmission demonstration
[R1] https://arxiv.org/abs/1802.01436 (bmshj2018_factorized model)