Summary of S4-260094: Media Related Real-Time AI Traffic Characteristics
Document Overview
This is a pseudo Change Request (pCR) from Huawei/HiSilicon to TSG-SA WG4 Meeting #135, proposing to add a new clause on end-to-end real-time multi-modal AI traffic characteristics to a 6G media-related TR. The document follows the methodology established in TR 26.926 for traffic modeling and quality evaluation.
Main Objective
The document aims to characterize AI traffic for 6G use cases in real-time video conferencing and robotics by defining end-to-end architecture, procedures, content coding models, and delivery mechanisms for real-time AI inference applications.
Technical Contributions
End-to-End Architecture (Clause 6.2.6.X.1)
- Core Concept: Multimodal Large Language Models (MLM) incorporating different AI encoders/decoders for various modalities (text, image, video, audio)
- Architecture Components:
- UE/client implements AI encoding and packetization
- Application Server (AS) implements AI decoding
- Media-related AI service request/response model
- Key Innovation: Introduction of "native AI data units" - a new media format generated by AI encoders that can be used for media reconstruction, generation, and comprehension
- Compatibility Handling: An AI decoder at the AS is needed only if the UE's AI encoder is not compatible with the AS's AI model; otherwise, the AS processes the encoded data directly
Basic Procedures (Clause 6.2.6.X.2)
The document defines a 10-step call flow, condensed here into setup (items 1 and 2) and the operational flow (item 3):
1. UE connects and provides supported AI encoder information
2. AS configures AI model and corresponding decoder
3. Operational flow includes:
- Media data collection and AI encoding at UE
- Packetization using native or customized packet format
- Transmission to AS
- Optional AI decoding at AS (only if the UE's AI encoder is not compatible with the AS's AI model)
- Media-related response generation and transmission back to UE
- Response decoding and presentation at UE
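The setup part of the call flow (the UE reports its encoders, the AS configures the model and, if needed, a decoder) can be sketched as a small selection routine. This is a minimal sketch, not the procedure from the pCR: the names `AsConfig` and `configure_session` and the encoder identifiers are hypothetical, and the preference for a directly compatible encoder is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsConfig:
    encoder: str            # encoder the UE is asked to use
    needs_as_decoder: bool  # True if the AS must decode before inference


def configure_session(ue_encoders, model_native_encoders, as_decoders):
    """Pick an encoder from the UE's list and decide whether the AS must decode."""
    # Assumption: prefer an encoder whose output the AI model consumes directly.
    for enc in ue_encoders:
        if enc in model_native_encoders:
            return AsConfig(enc, needs_as_decoder=False)
    # Otherwise fall back to an encoder the AS can decode before inference.
    for enc in ue_encoders:
        if enc in as_decoders:
            return AsConfig(enc, needs_as_decoder=True)
    raise ValueError("no common AI encoder between UE and AS")


# Hypothetical encoder identifiers, for illustration only.
config = configure_session(["enc-a", "enc-b"], {"enc-b"}, {"enc-a"})
```

In this example the AS selects `enc-b` and skips the decoding step, which corresponds to the "processed directly" branch of the compatibility handling.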
Content Coding Model (Clause 6.2.6.X.3)
Two types of AI encoders are defined:
Type 1: Reconstruction-Oriented AI Encoders
- Examples: DVC, GRACE codec
- GRACE Codec Details:
- Input: 2×H×W×3 tensors (two consecutive frames)
- Encoder: Analyzes inter-frame differences (motion vectors and residuals), maps to compact latent representation
- Resilience mechanism: Latent randomly split into multiple chunks, individually entropy coded to prevent error propagation
- Decoder: Entropy-decodes the received chunks, reassembles the latent, and sets lost chunks to zeros, so quality degrades gracefully with loss instead of showing a cliff effect
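The resilience mechanism can be illustrated with a toy latent vector. The sketch below is not the GRACE implementation: entropy coding and the neural encoder/decoder are left out, the latent is a flat list of floats, and the assumption that both sides derive the same random mapping from a shared `seed` is introduced here for illustration.

```python
import random


def split_latent(latent, num_chunks, seed):
    """Randomly assign latent elements to chunks; the seed fixes the mapping."""
    indices = list(range(len(latent)))
    random.Random(seed).shuffle(indices)
    # Chunk k carries every num_chunks-th index of the shuffled order.
    index_map = [indices[k::num_chunks] for k in range(num_chunks)]
    chunks = [[latent[i] for i in idx] for idx in index_map]
    return chunks, index_map


def reassemble(received, index_map, length):
    """Rebuild the latent; elements of lost chunks (None) stay at zero."""
    latent = [0.0] * length
    for chunk, idx in zip(received, index_map):
        if chunk is None:  # chunk lost in transit
            continue
        for value, i in zip(chunk, idx):
            latent[i] = value
    return latent


latent = [float(v) for v in range(1, 13)]
chunks, index_map = split_latent(latent, num_chunks=3, seed=7)
chunks[1] = None  # simulate the loss of one chunk
restored = reassemble(chunks, index_map, len(latent))
```

Because the lost elements are scattered across the latent rather than clustered, each loss removes a little information everywhere, which is the intuition behind the graceful degradation.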
Type 2: AI Model Processing-Oriented Encoders
- Examples: VILA-U, Liquid, Chameleon, Emu3, VQGAN
- Processing Flow:
- Pre-processing to predefined sizes (256×256 or 512×512 pixels, RGB)
- Feature extraction via CNN or Transformer layers
- Quantization to AI data units
- Joint optimization with associated AI decoder
- Benefits:
- Distributed AI workload with privacy-sensitive offloading
- Direct AI model processing without decompression/re-encoding
- Reduced data size, latency, and bandwidth
- Unified format for multiple modalities
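The quantization step of the processing flow can be sketched as a nearest-neighbour codebook lookup, the approach used by VQGAN-style tokenizers. The summary does not specify the quantizer, so this choice is an assumption, and the codebook and feature values below are toy data rather than anything from the pCR.

```python
def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(f, codebook[k]))
        for f in features
    ]


def dequantize(indices, codebook):
    """Look the indices up again, as a jointly trained decoder or AI model would."""
    return [codebook[k] for k in indices]


# Toy 2-D codebook and features; a real tokenizer uses learned, high-dimensional entries.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
features = [(0.1, -0.2), (3.5, 4.2), (0.9, 1.3)]
data_units = quantize(features, codebook)
```

Only the integer indices need to be transmitted, which is where the reduction in data size comes from, and the AI model can consume them without a decompression and re-encoding step.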
Content Delivery Model (Clause 6.2.6.X.4)
- Protocol Selection: RTP over UDP for real-time delivery
- Packetization Approaches:
For Reconstruction-Oriented Encoders:
- Latent chunks treated as NALUs
- NALU aggregation or fragmentation for MTU (typically 1500 bytes)
- Customized NALU headers for AI codec characteristics
- Standard RTP/UDP/IP header structure
For AI Model Processing Encoders:
- AI data units grouped as payload with customized payload header
- Group size determined by protocol overhead and integration efficiency
- AI data unit group limited to single IP packet size
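The grouping of AI data units into RTP payloads that each fit a single IP packet can be sketched as follows. The 12-byte fixed RTP header layout is standard. The 2-byte payload header carrying a unit count, the 16-bit unit size, the IPv4 transport without options, and the use of the marker bit to flag the last packet of a frame are all assumptions made for this illustration.

```python
import struct

MTU = 1500
IPV4_UDP_RTP_OVERHEAD = 20 + 8 + 12   # IPv4 + UDP + fixed RTP header, no options
PAYLOAD_HEADER = struct.Struct("!H")  # hypothetical: number of data units in the packet
UNIT = struct.Struct("!H")            # assumption: each AI data unit is a 16-bit index


def rtp_header(seq, timestamp, ssrc, marker, payload_type=96):
    """Build the 12-byte fixed RTP header (version 2, no padding, extension, or CSRC)."""
    return struct.pack("!BBHII", 0x80, (marker << 7) | payload_type,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


def packetize(units, timestamp, ssrc, first_seq=0, mtu=MTU):
    """Group the data units of one frame into RTP packets that fit one IP packet each."""
    budget = mtu - IPV4_UDP_RTP_OVERHEAD - PAYLOAD_HEADER.size
    per_packet = budget // UNIT.size
    groups = [units[i:i + per_packet] for i in range(0, len(units), per_packet)]
    packets = []
    for n, group in enumerate(groups):
        last = n == len(groups) - 1  # marker bit set on the final packet of the frame
        body = b"".join(UNIT.pack(u) for u in group)
        packets.append(rtp_header(first_seq + n, timestamp, ssrc, int(last))
                       + PAYLOAD_HEADER.pack(len(group)) + body)
    return packets


# Example: 1024 data units (e.g. a 32x32 token grid) for one frame.
packets = packetize(list(range(1024)), timestamp=90000, ssrc=0x1234)
```

A larger group amortizes the 40 bytes of IP/UDP/RTP overhead over more data units, which is one side of the group-size trade-off.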
AI Transmission Characteristics (Clause 6.2.6.X.5)
Three key characteristics identified:
1. Data Bursts and Periodicity
- Burst pattern linked to intrinsic framerate of multi-modal media
- Uplink: Periodicity matches video frame rate
- Downlink: Related to AI model inference speed
- Data rate depends on AI encoder output dimension and quantization parameters
2. Low Latency Requirements
- Tight end-to-end latency for conferencing and robotics
- Network latency budget constrained by AS processing time for large AI models
3. Error Resilience
- Packet success rate requirement linked to AI service characteristics
- Error Tolerance Examples:
- Autoregressive models: Can predict missing AI data units; reasonable quality maintained even with data unit loss
- GRACE codec: Trained for error resilience; maintains good SSIM with packet errors
- GenAI applications: Payload error rates of up to 20% are tolerable with UE-side recovery
Differentiated Importance
- Cross-modality: Image AI data units more error-tolerant than text
- Intra-modality: Positional importance for image data units (preceding units more critical than subsequent ones)
Example KPIs for GenAI Applications
| Traffic Type | Burst Size | Max Latency | Service Bit Rate | Delay | Payload Error Rate |
|--------------|------------|-------------|------------------|-------|-------------------|
| Image GenAI | 15 KB | 15 ms | 8 Mbps | 20 ms | ≤20% |
| Video GenAI | 1.5 MB | 100 ms | 120 Mbps | 20 ms | ≤20% |
| Chatbot | 0.5 KB | 20 ms | 200 kbps | 30 ms | ≤20% |
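As a quick arithmetic check on the table, the time needed to send one burst at the service bit rate can be computed per row. Assuming decimal units (1 KB = 1000 bytes), the result equals the Max Latency column for all three rows. Reading Max Latency as the burst transfer time is an observation about the numbers, not something the summary states.

```python
def transfer_time_ms(burst_bytes, bit_rate_bps):
    """Time to send one burst at the service bit rate, in milliseconds."""
    return burst_bytes * 8 * 1000 / bit_rate_bps


# (burst size in bytes, service bit rate in bit/s), decimal units assumed.
kpis = {
    "Image GenAI": (15_000, 8_000_000),
    "Video GenAI": (1_500_000, 120_000_000),
    "Chatbot": (500, 200_000),
}

for name, (burst, rate) in kpis.items():
    print(f"{name}: {transfer_time_ms(burst, rate)} ms")
```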
Evaluation Methodology (Clause 6.2.6.X.6)
- Simulation of packet loss and jitter
- P-Trace derivation from RTP header information (Sequence Number, Timestamp, Marker Bit)
- Packet size obtained at UDP layer
- Packet arrival time recorded at receiving port
- AI service-specific quality evaluation based on successful task completion
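The derivation of trace entries from received packets can be sketched by parsing the fixed RTP header. The `TraceEntry` record layout and the helper names are hypothetical and not the P-Trace format of TR 26.926. The loss detection is deliberately simple and ignores sequence-number wrap-around.

```python
import struct
from typing import NamedTuple


class TraceEntry(NamedTuple):
    arrival_time: float  # seconds, recorded at the receiving port
    size: int            # UDP payload size in bytes
    seq: int             # RTP sequence number
    timestamp: int       # RTP timestamp
    marker: bool         # RTP marker bit


def trace_entry(udp_payload, arrival_time):
    """Extract the trace fields from one received RTP packet."""
    if len(udp_payload) < 12:
        raise ValueError("shorter than the fixed RTP header")
    b0, b1, seq, ts, _ssrc = struct.unpack("!BBHII", udp_payload[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return TraceEntry(arrival_time, len(udp_payload), seq, ts, bool(b1 >> 7))


def lost_sequence_numbers(entries):
    """Sequence numbers missing between the lowest and highest one received."""
    if not entries:
        return []
    seen = {e.seq for e in entries}
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]
```

Comparing arrival times against RTP timestamps in such a trace gives the jitter, and the missing sequence numbers give the packet loss that feeds the service-specific quality evaluation.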
Summary and Network Implications (Clause 6.2.6.X.7)
The document concludes that AI traffic characteristics can be leveraged in 3GPP networks to improve transmission efficiency:
- RAN awareness of latency requirements, packet arrival patterns, error tolerance, and differentiated importance
- Enhanced operations: Improved scheduling and HARQ operations
- System capacity: Potential to increase supported number of UEs
References Added
The document adds seven new normative/informative references including:
- TR 22.870 (6G Use Cases)
- TR 26.926 (Traffic Models and Quality Evaluation)
- Various academic papers on neural codecs (GRACE, Liquid, DVC)
- RP-253288 on AI services for 6G