6GMedia Work Topic 2: Characteristics of AI-Enabled Applications
This contribution from InterDigital addresses Work Topic 2 of the 6GMedia study, focusing on key characteristics of XR and AI-enabled mobile applications and services. The document proposes use cases and elaborates on the requirements for their interoperable and widespread deployment.
The document identifies several representative use cases, which are analyzed in Table 1.
Key Observations:
- Observation 1: AI-enabled applications are highly heterogeneous and multimodal, encompassing video, image, audio, text, haptics, and sensor data; they exchange AI/ML data including prompts, model parameters, and compressed or uncompressed intermediate data (embeddings)
Table 1 maps each use case to its data types and relevant formats/codecs:

| Use case | Data types and relevant formats/codecs |
| --- | --- |
| AR | UL: video, audio, prompts, inference data; DL: video, audio, dynamic 3D media, haptics, spatial descriptions. Requires MPEG haptics, scene description enhancements, and dynamic mesh/Gaussian splat codecs |
| Real-time object detection | Feature representations, MPEG-7 descriptors, MPEG FCM |
| Speech recognition / conversational AI | ULBC, tokens, embeddings |
| Model learning/updates | ONNX, GGUF, MPEG NNC formats |
| Avatar communication | Upcoming MPEG avatar, Gaussian splat, and mesh codecs |
| Context-aware recommendation | W3C Media Annotations, MPEG-7 descriptors |
Observations and Proposals:
- Proposal 1: SA4 should study support of additional media modalities and codecs/enhancements for 6G
- Proposal 2: SA4 should define terminology for AI/ML data (features, tokens, embeddings, latent, intent) and study relevant AI representation formats and interchangeable formats/codecs
- Observation 2: Some applications require remote AI-based spatial computing functions (TR 26.819)
- Proposal 3: SA4 should identify and study spatial computing functions that benefit from off-device processing
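To make the terminology proposal above concrete, the distinct AI/ML data categories could be captured in a typed interchange envelope. The following sketch is purely illustrative: the names `AIDataType` and `AIDataUnit` and the field layout are hypothetical, not drawn from any SA4 specification.

```python
from dataclasses import dataclass
from enum import Enum

class AIDataType(Enum):
    # Terminology the proposal asks SA4 to define (hypothetical encoding here)
    FEATURE = "feature"       # intermediate feature maps, e.g. MPEG FCM payloads
    TOKEN = "token"           # discrete token IDs produced by a tokenizer
    EMBEDDING = "embedding"   # dense vector representations
    LATENT = "latent"         # compressed latent-space data
    INTENT = "intent"         # high-level user/application intent

@dataclass
class AIDataUnit:
    """Hypothetical transport envelope for AI/ML data carried alongside media."""
    data_type: AIDataType
    media_format: str          # e.g. "ONNX", "GGUF", "MPEG-NNC", or "raw"
    timestamp_us: int          # presentation time, for synchronization with media
    payload: bytes

# Example: a raw 16-byte embedding captured at t = 1 s
unit = AIDataUnit(AIDataType.EMBEDDING, "raw", 1_000_000, b"\x00" * 16)
```

A shared envelope of this kind would let a QoS framework distinguish, say, latency-critical token streams from bulk model updates without inspecting payloads.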
Traffic Characteristics:
- Applications are uplink-heavy with greatly varying characteristics across modalities
- Continuous video capture results in high-rate, periodic uplink traffic
- Audio/sensor data generates lower-rate, aperiodic, bursty transmissions
- Traffic composition changes dynamically based on user behavior, interaction patterns, mobility, and environmental factors
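The mixed uplink patterns above can be sketched with a toy traffic model: periodic high-rate video frames alongside Poisson-distributed sensor bursts. All parameters (frame size, frame rate, burst size, event rate) are invented for illustration, not taken from the contribution.

```python
import random

def video_uplink(duration_s, fps=30, frame_bytes=60_000):
    """Periodic, high-rate uplink: one frame every 1/fps seconds."""
    return [(k / fps, "video", frame_bytes) for k in range(int(duration_s * fps))]

def sensor_uplink(duration_s, rate_hz=2.0, burst_bytes=400, rng=None):
    """Aperiodic, bursty, low-rate uplink modeled as a Poisson process."""
    rng = rng or random.Random(0)
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate_hz)   # exponential inter-arrival times
        if t >= duration_s:
            return events
        events.append((t, "sensor", burst_bytes))

def uplink_bitrate_bps(events, duration_s):
    """Average uplink bitrate of an event list in bits per second."""
    return 8 * sum(size for _, _, size in events) / duration_s

# Merged timeline over 10 s: the composition shifts if either source changes,
# mirroring the dynamic traffic mix described above.
mix = sorted(video_uplink(10) + sensor_uplink(10))
```

Even this crude model shows why a single static QoS profile fits poorly: the video component dominates average bitrate, while the sensor component drives the tail of inter-arrival behavior.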
Table 2 characterizes the requirements of each use case:

| Use case | Data rate | Latency | Reliability | QoE-based adaptation need |
| --- | --- | --- | --- | --- |
| AR | High | Real-time | Mid | High |
| Real-time object detection | High | Real-time | Mid | High |
| Avatar communication | High | Real-time | Mid | High |
| Speech recognition / conversational AI | Mid | Real-time | Mid | Mid |
| Context-aware recommendation | Mid | Real-time | Mid | Mid |
| Model learning/updates | High | Non-real-time | Mid | Low |
Key Observations:
- Observation 3: The diversity of applications and modalities makes evaluating and classifying traffic characteristics challenging
- Observation 4: Temporal dependency and synchronization are required between media modalities and AI data for real-time/delay-bound AI inference
- Observation 5: Applications are characterized by uplink-intensive, bursty or continuous, multi-modal traffic with diverse latency sensitivity and QoE impact
- Observation 6: Current QoS frameworks lack the application/context awareness, granularity, and adaptability needed for dynamic 6G network conditions
Proposals:
- Proposal 4: SA4 should develop generic QoS and QoE mechanisms suitable across diverse traffic patterns
- Proposal 5: SA4 should study QoS framework enhancements enabling finer granularity and context awareness
- Proposal 6: SA4 should specify procedures for real-time QoE-based adaptation of multimodal media and define QoE metrics for real-time/delay-bound AI inference
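One way the real-time QoE-based adaptation envisioned above could operate is a ladder of modality configurations stepped by measured latency against the inference delay budget. The ladder levels, bitrates, thresholds, and the choice to shed haptics first are all invented for illustration.

```python
# Hypothetical QoE-driven adaptation: downscale or shed modalities when the
# measured uplink latency exceeds the budget for delay-bound AI inference.
LADDER = [  # per-modality bitrates in kbps -- illustrative values only
    {"video": 8000, "audio": 128, "haptics": 64},
    {"video": 3000, "audio": 64, "haptics": 64},
    {"video": 1000, "audio": 64, "haptics": 0},   # drop haptics under pressure
]

def adapt(level, measured_latency_ms, budget_ms=50, headroom_ms=15):
    """Return the new ladder index given the measured latency."""
    if measured_latency_ms > budget_ms and level < len(LADDER) - 1:
        return level + 1                  # degrade: latency budget violated
    if measured_latency_ms < budget_ms - headroom_ms and level > 0:
        return level - 1                  # recover: ample headroom available
    return level
```

The headroom term provides hysteresis so the controller does not oscillate between levels when latency hovers near the budget, a property any real QoE metric loop would need.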
Key Points:
- Transport protocols (QUIC-based, HTTP/3-based) are rapidly evolving to suit AI-enabled use cases
- These evolutions substantially impact traffic characteristics including latency, reliability, and resource utilization
- Rel-19 SA2 specified techniques for delivering Media Related Information (MRI) when XRM traffic is end-to-end encrypted (QUIC)
- TS 23.501 clause 5.37.9 specifies options for relaying MRI over the N6 interface
- Rel-18/19 SA4 specified solutions in TS 26.522 enabling RTP senders to transmit MRI using RTP header extensions
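RTP header extensions of the kind used for MRI follow the general one-byte-header element format of RFC 8285. The sketch below packs such an extension block; the element ID (5) and the "remaining delay budget" payload are hypothetical, not the TS 26.522 encoding.

```python
import struct

def one_byte_header_extension(elements):
    """Pack RFC 8285 one-byte-header extension elements.

    elements: list of (id, data) with 1 <= id <= 14 and 1 <= len(data) <= 16.
    Returns the extension block: 0xBEDE profile, length in 32-bit words, body.
    """
    body = b""
    for ext_id, data in elements:
        assert 1 <= ext_id <= 14 and 1 <= len(data) <= 16
        # Each element: one byte of (ID << 4 | length-1), then the data itself
        body += bytes([(ext_id << 4) | (len(data) - 1)]) + data
    body += b"\x00" * (-len(body) % 4)            # pad to a 32-bit boundary
    return struct.pack("!HH", 0xBEDE, len(body) // 4) + body

# Hypothetical MRI element: 2-byte remaining delay budget (48 ms), ID 5
ext = one_byte_header_extension([(5, struct.pack("!H", 48))])
```

Because the extension travels in the RTP header rather than the (possibly encrypted) payload, on-path network functions can read such metadata without breaking end-to-end media encryption, which is the motivation behind the MRI work summarized above.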
Observations and Proposals:
- Observation 7: New transport protocols impact media transmission reliability, latency, and traffic characteristics
- Proposal 7: SA4 should characterize the impact of QUIC-based protocols on AI data delivery and traffic characteristics, especially for real-time/delay-bound applications
- Observation 8: SA4 has specified RTP-based MRI solutions in TS 26.522
- Proposal 8: SA4 should study integration of SA2-defined QUIC-based transport extensions into the media delivery architecture, leveraging the FS_Q4RTC-MED study
Key Characteristics:
- AI-enabled services deployed across smartphones, AI glasses, smartwatches, fitness devices, companion compute devices
- Services involve continuous sensing, media capture/processing, on-device/distributed AI inference, and frequent network data exchange
- Services are inherently multi-device with different devices contributing sensing, media, compute, display, or connectivity functions
- This introduces QoS/QoE challenges for modality/format adaptation, coordination of AI processing with partial or full offload, and traffic correlation across UEs
Figure 1 illustrates UE tethering where AI-enabled services are delivered across multiple user devices relying on a tethered UE for cellular connectivity and coordination.
Observations and Proposals:
- Observation 9: AI-enabled services increasingly operate across heterogeneous multi-device setups associated with the same user; modalities and AI processing may be distributed across devices
- Observation 10: Existing system assumptions are UE-centric and do not address the QoS/QoE requirements of multi-device scenarios
- Proposal 9: SA4 should study the impact of multi-device scenarios on the QoS and QoE framework
- Observation 11: QoS enhancement and QoE-driven dynamic media adaptation need to operate across heterogeneous multi-device setups
- Proposal 10: SA4 should consider heterogeneous multi-device setups in the QoE metrics definition and QoS enhancement study for real-time/delay-bound AI inference
The document proposes to discuss and agree on all proposals as part of the 6GMedia study and document them in a new section 6.X of the TR. The contribution emphasizes three main areas requiring SA4 attention:
1. Support for heterogeneous and multimodal media types including AI/ML data
2. Enhanced QoS/QoE frameworks with finer granularity and context awareness
3. Multi-device scenario support for AI-enabled services