Overview of RAN2#133 Inputs on AI Traffic Characteristics
Document Purpose and Context
This document summarizes contributions submitted to RAN2#133 on AI traffic characteristics. The RAN#110 plenary assigned RAN-2 to lead the AI traffic characteristics work in RAN and to coordinate with SA WG4 (SA4), so this overview aims to align SA4 and RAN-2 work at an early stage. It recommends prioritizing discussion of the key dependencies identified by RAN-2.
Key Traffic Characteristics Identified Across Contributions
Common Traffic Patterns
Multiple contributions converge on the following AI traffic characteristics:
- Bursty and aperiodic nature: Nearly universal observation across contributions
- Uplink-heavy traffic: Particularly emphasized for mobile AI applications
- Unpredictable bandwidth requirements: Dynamic and variable data rates
- Small packet sizes: Frequent transmission of small data units
- Multi-modal traffic: Synchronization requirements across different modalities
- Asymmetrical traffic patterns: Different characteristics for UL vs DL
- Error tolerance: Variable across different AI applications and data types
- Token-based communication: Specific characteristics for tokenized AI traffic
Latency Characteristics
- Delay-sensitive traffic: Strict end-to-end latency requirements
- Low latency for initial packets: Critical for interactive applications
- Variable packet delay budgets: Dependent on application type
- Interactive with elastic latency: Some flexibility in certain scenarios
AI Traffic Categorization Approaches
By Real-Time Requirements
Several contributions propose categorization based on timing:
- Real-time vs Non-real-time: Most common distinction
- Interactive vs Non-interactive: Request/response patterns
By AI Codec Usage
Multiple contributions distinguish:
- AI codec traffic: Native AI representation formats
- Non-AI codec traffic: Traditional encoding methods
Combined with the real-time distinction, this yields three traffic types:
- Type 1: Real-time AI application with non-AI codec
- Type 2: Real-time AI application with AI codec
- Type 3: Non-real-time AI application
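The Type 1/2/3 split can be read as a mapping from two axes (real-time or not, AI codec or not) onto three categories. The following is a minimal sketch of that reading, not a definition taken from any contribution. The names `AITrafficType` and `classify_ai_traffic` are hypothetical, and treating Type 3 as codec-independent is an assumption based on the fact that the Type 3 description does not mention a codec.

```python
from enum import Enum


class AITrafficType(Enum):
    # Descriptions copied from the list above; the enum itself is illustrative.
    TYPE_1 = "Real-time AI application with non-AI codec"
    TYPE_2 = "Real-time AI application with AI codec"
    TYPE_3 = "Non-real-time AI application"


def classify_ai_traffic(real_time: bool, uses_ai_codec: bool) -> AITrafficType:
    """Map the two categorization axes onto Type 1/2/3 (illustrative only)."""
    if not real_time:
        # Assumption: Type 3 covers non-real-time traffic regardless of codec.
        return AITrafficType.TYPE_3
    return AITrafficType.TYPE_2 if uses_ai_codec else AITrafficType.TYPE_1


if __name__ == "__main__":
    for rt in (True, False):
        for codec in (True, False):
            kind = classify_ai_traffic(rt, codec)
            print(f"real_time={rt}, ai_codec={codec} -> {kind.name}")
```

Under this reading the codec axis only matters for real-time traffic, which is one reason the timeline of the SA4 AI codec study is relevant to the real-time categories.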
By Service Class
Peng Cheng Lab (R2-2600153) proposes detailed service classes:
- Service Class A: Generative AI and AI Agent Traffic (Token-Streaming Inference)
- Service Class B: Perception/Analytics AI (Uplink-Intensive Inference), including Split Inference
- Service Class C: Federated/Distributed Learning and Training Traffic (Bulk, Synchronized Uploads)
- Composite Class D: AI-Enhanced Immersive Communication (XR + Digital Twin + AI Components)
By Use Case
- Agentic (continuous) vs Non-agentic (bursty): Meta/Qualcomm et al. (R2-2600480)
- Chatbot, Live AI, AI assistant: Ericsson (R2-2600885)
- Intermediate data type: From TR 26.927 (NEC R2-2600552)
By Data Type
- Training data, Model data, Inference data: CATT (R2-2600242)
- Token vs non-token: Multiple contributions
- Modality-based importance vs sequence-based importance: Ofinno (R2-2600853)
Release Strategy: 5G Rel-20 vs 6G
5G Rel-20 Focus
Strong consensus on prioritizing:
- Uplink enhancements: Primary focus for Rel-20
- Non-real-time applications: Particularly chatbot/GenAI use cases
- Burstiness and unpredictability handling: Leveraging XR Phase 4 work
- AI traffic awareness in RAN: Enable service-aware handling
6G Scope
Broader scope proposed for 6G:
- Real-time uplink and downlink: Full bidirectional support
- Unified framework: Comprehensive AI traffic handling
- Native AI communication: AI-native RAN traffic support
- Flexible QoS: Dynamic adaptation to AI traffic patterns
- Downlink non-real-time: Extended coverage beyond Rel-20
QoS and RAN Enhancement Proposals
QoS Mechanisms
- Dynamic QoS support: Constrained latency handling (ZTE R2-2600164)
- Flexible QoS framework: 6G requirement for AI traffic adaptation
- Context-aware traffic flow: Enable RAN awareness (Nvidia R2-2600925)
- Enhanced reliability: Beyond current 5G capabilities (Samsung R2-2600389)
- PDU Set concept reuse: Leverage XR mechanisms (vivo R2-2600074)
Uplink Enhancements
- Irregular burst support: Handle unpredictable UL patterns
- Delay-bound data bursts: Resource-efficient handling without over-provisioning
- Small packet transmission in RRC inactive: Efficiency improvement
- UE-assisted uplink reporting prediction: Proactive resource allocation
- Multi-modal synchronization for uplink: Coordinate different data streams
Scheduling and Resource Management
- Service awareness at L2: Enable intelligent scheduling decisions
- UE-based coordination: Context and dependency awareness
- Error-tolerant token transmission: Exploit AI traffic characteristics
- Token importance differentiation: Priority-based handling
- Downlink scheduling enhancement: Network-side optimizations
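To make the idea of token importance differentiation combined with error tolerance more concrete, the sketch below picks which token-carrying PDUs to send when the available resources cannot carry all of them. It is a toy illustration only: the `TokenPdu` fields, the "higher value means more important" convention, and the greedy byte-budget rule are all assumptions made here, and how importance is defined and mapped to PDUs is among the open questions to SA4.

```python
from dataclasses import dataclass


@dataclass
class TokenPdu:
    # Hypothetical fields; token-to-PDU mapping and importance levels
    # are not defined in the summarized contributions.
    seq: int          # position in the token sequence
    size_bytes: int   # PDU size
    importance: int   # assumed convention: higher means more important


def select_for_transmission(pdus: list[TokenPdu], budget_bytes: int) -> list[TokenPdu]:
    """Greedily keep the most important PDUs that fit in a byte budget."""
    chosen = []
    remaining = budget_bytes
    # Most important first; ties are broken by sequence order.
    for pdu in sorted(pdus, key=lambda p: (-p.importance, p.seq)):
        if pdu.size_bytes <= remaining:
            chosen.append(pdu)
            remaining -= pdu.size_bytes
    # Return the kept PDUs in their original sequence order.
    return sorted(chosen, key=lambda p: p.seq)


if __name__ == "__main__":
    queue = [TokenPdu(0, 50, 1), TokenPdu(1, 50, 3), TokenPdu(2, 50, 2)]
    kept = select_for_transmission(queue, budget_bytes=100)
    print([p.seq for p in kept])
```

Dropping the least important PDUs is only acceptable if the application tolerates their loss, which is why error tolerance and importance are raised together in the questions to SA4.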
Multi-modality Support
- Multi-modal synchronization: Beyond MMSID for QoS control
- PDU Set binding for AI traffic: Token set/burst handling
- Dependency structure handling: Inter-stream coordination
Power Efficiency
- Energy savings for continuous agentic AI: Long-duration applications
- Tethering and multi-device support: Multi-access scenarios
- RRC state optimization: Balance latency and power consumption
Explicit SA4 Dependencies and Coordination Requests
Traffic Characteristics Clarification
Multiple contributions request SA4 input on:
- Token communication characteristics:
  - Token importance levels and granularity
  - Error tolerance properties
  - Token-to-PDU mapping
  - Dependency between tokens
  - Whether tokenization increases/decreases data size
  - Visibility of tokens to RAN
- Packet-level characteristics:
  - Packet delay budget
  - Packet size distributions
  - Packet arrival rates and patterns (streaming vs bursts)
  - Packet error rate tolerance
  - Packet importance variability
- Data compression characteristics: Impact on traffic patterns
- Multi-modality aspects: Synchronization requirements and characteristics
AI Codec Study Coordination
Several contributions explicitly reference or request coordination on:
- TR 26.847 alignment: Token communication definitions
- AI representation format clarification: Scope and characteristics
- AI codec vs non-AI codec traffic: Differentiation and handling
- Timeline and scope of SA4 AI codec study: Critical for RAN-2 planning
- Trace data provision: For derivation of packet size, arrival rate, delay budget, success rate
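As an illustration of how trace data could feed the requested parameters, the sketch below derives a mean packet size, an average arrival rate, and a success rate from a list of packet records. It is a minimal example under stated assumptions: the `PacketRecord` fields are a hypothetical trace format, and "success" is defined here as delivery within a given delay budget, which is one possible interpretation rather than a definition from the contributions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class PacketRecord:
    # Hypothetical trace fields; an actual SA4 trace format may differ.
    send_time_s: float             # time the packet was generated
    recv_time_s: Optional[float]   # None if the packet was never delivered
    size_bytes: int


def summarize_trace(trace: list[PacketRecord], delay_budget_s: float) -> dict:
    """Derive simple per-trace statistics (illustrative only)."""
    if len(trace) < 2:
        raise ValueError("need at least two packets to estimate an arrival rate")
    times = sorted(p.send_time_s for p in trace)
    duration = times[-1] - times[0]
    if duration <= 0:
        raise ValueError("trace must span a positive time interval")
    # Assumption: a packet succeeds if it arrives within the delay budget.
    on_time = sum(
        1
        for p in trace
        if p.recv_time_s is not None
        and p.recv_time_s - p.send_time_s <= delay_budget_s
    )
    return {
        "mean_packet_size_bytes": mean(p.size_bytes for p in trace),
        # Average rate: number of inter-arrival gaps divided by the time span.
        "arrival_rate_pps": (len(trace) - 1) / duration,
        "success_rate": on_time / len(trace),
    }


if __name__ == "__main__":
    trace = [
        PacketRecord(0.0, 0.01, 100),
        PacketRecord(1.0, 1.5, 200),
        PacketRecord(2.0, None, 300),
    ]
    print(summarize_trace(trace, delay_budget_s=0.1))
```

An average rate hides burstiness, so a fuller analysis would also look at the distribution of inter-arrival times and packet sizes rather than only their means.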
Service Type and Application Clarification
Requests for SA4 input on:
- Service types definition: Categories and characteristics
- Use case traffic patterns: Specific application behaviors
- Intermediate data characteristics: From TR 26.927
- End-to-end latency requirements: Impact on RAN design
- Traffic encryption: Whether packets are encrypted at application layer
PDU Set and Annotation
- PDU Set annotation: Importance and token information
- PDU Set binding for AI traffic: Token set/burst definitions
- PDU Set handling: AI-specific requirements
Specific Questions to SA4
- China Telecom (R2-2600685): More details on tokens and service types
- Spreadtrum/UNISOC (R2-2600673): Token-to-PDU mapping, importance granularity definition, UE processing requirements
- Panasonic (R2-2600757): Packet delay budget, packet size, error tolerance of token traffic
- Lenovo (R2-2600745): Confirm AI traffic characteristics including data compression, error tolerance, token importance, multimodality, burstiness, unpredictability
- CMCC et al. (R2-2600965): Whether token traffic characteristics align with TR 26.847
- Apple (R2-2600446): Input on token communication, delay budget, relative priority
- Nokia (R2-2600315): PDU Set annotation, importance and token information for AI traffic
- Fujitsu (R2-2600347): Tokenized AI feedback
- Samsung (R2-2600389): PDU Set handling of AI-related traffic
- HONOR (R2-2600515): Mobile AI arrival patterns (streaming or bursts) and corresponding characteristics
- NEC (R2-2600552): Intermediate data traffic characteristics from TR 26.927
- Peng Cheng Lab (R2-2600153): PDU Set binding for AI traffic, dependency structure, traffic model input
- OPPO et al. (R2-2600206): Timeline and scope of SA4 study, whether token/packet characteristics are in scope, AI representation format clarification
- Sharp (R2-2600183): Token traffic characteristics support
- CATT (R2-2600242): Burst traffic confirmation, end-to-end latency impact, encryption status, importance and error tolerance modeling, RAN visibility of tokens
- vivo (R2-2600074): Burst characteristics, end-to-end latency, traffic encryption, error tolerance, AI token characteristics, token visibility to RAN
- Huawei/HiSilicon (R2-2600148): Trace data for packet characteristics, whether AI codec apps have error tolerance and variable packet importance
Liaison and Coordination Proposals
Several contributions propose formal coordination:
- OPPO et al. (R2-2600206): Inform SA4 about RAN-2 decisions and progress, get timeline/scope information
- vivo (R2-2600074): Inform SA4 that RAN-2 leads AI traffic work in RAN
- Peng Cheng Lab (R2-2600153): Send LS to SA2/SA4 to clarify service awareness points
- Ericsson (R2-2600885): Coordinate with SA (not only SA4 but SA in general)
Divergent Views and Open Issues
Traffic Model Approach
- Qualcomm et al. (R2-2600138): Adopt XR traffic models for real-time, MBB models for non-real-time
- AT&T (R2-2600890): Proposes text-based conversational GenAI traffic model (suggests RAN1 scope)
- Ericsson (R2-2600885): Cautions against optimizing for specific AI applications
Scope and Prioritization
- MediaTek (R2-2600901): Stop referring to tokenizer, enhance UL, wait for AI codec study in SA4
- CATT (R2-2600242): Prioritize network inference in RAN-2
- Lenovo (R2-2600745): Rel-20 XR Phase 4 focus on uplink, unified framework in 6G
- Hanbat Univ (R2-2600409): Include AI native RAN traffic and RedCap
Error Tolerance Determination
- NTT/Docomo (R2-2600978): RAN-2 to proactively determine error tolerance based on AI task and source data type, with concrete test results provided
- Multiple others: Request SA4 to define error tolerance characteristics
XR Relationship
- Ericsson (R2-2600885): Need to understand difference between XR and AI traffic
- Fujitsu (R2-2600347): Need gap analysis from XR
- Multiple others: Propose reusing XR mechanisms (PDU Set, traffic models)
Technical Contributions Summary by Topic
Token Communication (17 contributions)
Nvidia, Ofinno, China Telecom, Spreadtrum/UNISOC, Panasonic, Lenovo, CMCC et al., Apple, Nokia, Fujitsu, Samsung, HONOR, Peng Cheng Lab, OPPO et al., Sharp, CATT, vivo
Key aspects: Importance differentiation, error tolerance, dependency, compression, RAN visibility, PDU mapping
Burst Traffic Handling (20+ contributions)
Nearly universal recognition of bursty, aperiodic traffic requiring specific RAN enhancements
Uplink Enhancement (15+ contributions)
Strong consensus on Rel-20 focus for uplink mobile AI traffic with burstiness, unpredictability, and interactive characteristics
Multi-modality (8 contributions)
Fraunhofer, Meta/Qualcomm et al., ZTE, Peng Cheng Lab, Samsung, Lenovo, HONOR, Nokia
Key aspects: Synchronization, MMSID usage, multi-device scenarios
Error Tolerance (12 contributions)
Ofinno, China Telecom, Spreadtrum/UNISOC, Panasonic, Lenovo, CMCC et al., NTT/Docomo, Fujitsu, Samsung, CATT, vivo, Huawei/HiSilicon
Key aspects: Variable tolerance, task-dependent, token-specific, importance-based
Service Awareness (6 contributions)
Nvidia, Nokia, ZTE, Peng Cheng Lab, Xiaomi, Huawei/HiSilicon
Key aspects: Context-aware flow, L2 scheduling, UE-assisted coordination
Dynamic QoS (7 contributions)
ZTE, Meta/Qualcomm et al., Samsung, Nokia, Apple, HONOR, Huawei/HiSilicon
Key aspects: Flexible adaptation, constrained latency, relative priorities
Recommendations
The document recommends taking into account the explicit dependencies raised by RAN-2 and prioritizing their discussion, particularly:
- Token communication characteristics and RAN visibility
- Packet-level traffic characteristics (size, arrival patterns, delay budgets)
- Error tolerance properties and importance differentiation
- AI codec study timeline and scope alignment
- PDU Set binding and annotation for AI traffic
- Multi-modality synchronization requirements
- Traffic model inputs and trace data provision