S4-260195 - AI Summary

CR on AIML processing in IMS calls


3GPP CR 0608 - AI/ML Processing in IMS Calls

Change Request Overview

Specification: TS 26.114 v19.2.0
Category: B (Addition of feature)
Release: Rel-20
Work Item: AIML_IMS-MED

This CR introduces normative procedures, formats, and signaling for AI/ML-assisted media processing in DCMTSI clients (data channel capable clients of the Multimedia Telephony Service for IMS, MTSI).


Main Technical Contributions

1. General Framework and Architecture (AD.1, AD.2, AD.3)

Key Definitions

  • AI/ML application: Data channel application providing AI/ML assisted media processing during IMS sessions
  • AI/ML processing task: Well-defined AI/ML functions (e.g., speech-to-text, translation, noise suppression, scene description)
  • AI/ML model: Parameters and metadata required for inference execution
  • AI/ML inference engine: Local UE execution environment (e.g., WebNN-aligned runtime)
  • AI/ML metadata: Data derived from media streams with timing and binding information
  • Task manifest: UTF-8 JSON describing supported tasks and candidate models
  • Model card: UTF-8 JSON describing model identity, format, artifacts, I/O conventions, runtime requirements
  • Model artifact: Downloadable model binary and auxiliary files

Terminal Architecture Requirements

DCMTSI clients must support:
- Media engine functions for RTP-based audio/video
- Data channel client (bootstrap and application data channels per clauses 6.2.10, 6.2.13)
- AI/ML application execution environment (e.g., web runtime)
- AI/ML inference engine for local model execution
- Capability discovery function (execution devices, operators, data types, resource limits)
- Model validation function (integrity/authenticity verification via SHA-256 and digital signatures)
- Binding and synchronization function (associates AI/ML tasks/metadata to RTP streams using SDP identifiers and media time anchors)

Reference Architecture

  • UE establishes Bootstrap Data Channel (BDC) to MF for retrieving DC application lists, AI/ML applications, and model artifacts via HTTP
  • DCSF and repositories (e.g., DCAR) provide provisioning of AI/ML applications and models
  • Application Data Channel (ADC) may be established to DC AS for task control, policy exchange, and metadata delivery
  • IMS Media Function does not perform inference or process RTP media for AI/ML purposes

2. Call Flows (AD.4)

AD.4.1 AI/ML Application and Model Delivery for Device Inferencing

Procedure:

  1. MMTel service establishment
  2. BDC establishment between UE and MF (per TS 23.228, clause AC.7.1)
  3. UE requests application list from MF via HTTP over BDC; MF forwards to DCSF
  4. DCSF creates user-specific DC application list (JSON/HTML) with:
    • Generic app info (description, ID, URL)
    • AI-specific info (AI feature tag, task descriptions)
  5. DCSF provides URL to application list; UE downloads list with metadata
  6. User selects app based on AI service description
  7. UE requests selected app from MF
  8. MF fetches AI application from DCSF
  9. AI application downloaded to UE via BDC with AI task metadata (task manifest)
  10. User presented with AI task list (with annotations from task metadata, execution endpoint info)
  11. UE informs MF of the selected tasks/models via:
    • BDC: HTTP GET with task/model URLs
    • ADC: AI Model Selection Request with model URNs
  12. MF fetches AI models from:
    • 12a: DCAR via DCSF
    • 12b: DC AS (alternative)
  13. UE downloads AI models from MF via:
    • BDC: HTTP response with model resources
    • ADC: AI Model Selection Response with model data
  14. Tasks executed for inference in UE
  15. User/UE may reselect AI tasks during session using received metadata

Editor's Note: Clarification needed on whether MF understands AI task nature, application handling types, and large model handling.

AD.4.2 On-Device Inferencing and Split Inference Operation

  • User/application selects AI/ML processing task during session
  • AI/ML application performs local capability discovery and selects compatible model artifact
  • Inference engine configured and task bound to RTP media streams using binding rules (clause AD.8)
  • If DC AS coordination required:
    • UE establishes application data channels (clause 6.2.13)
    • Associates with AI/ML application using a=3gpp-req-app SDP attribute
    • Exchanges capability, task, configuration, status via "3gpp-ai" subprotocol (clause AD.9.2)
  • Derived AI/ML metadata used for local rendering and/or transmitted over ADC
  • Metadata includes RTP stream identifier (mid) and media time anchor for alignment with RTP playout

Note: Split inference may use on-device inference for one task (e.g., STT) and DC AS for another (e.g., translation) while keeping RTP media unchanged.


3. Capabilities (AD.5)

AD.5.1 UE Capabilities

DCMTSI clients must determine and expose to AI/ML application:
- Supported execution devices (CPU, GPU, NPU, accelerators)
- Supported operator sets and data types (per local inference framework)
- Resource limits (memory constraints, concurrent task limits)
- Availability of audio/video media access points (e.g., decoded media frames)

Web runtime capability discovery may align with WebNN. Capability summary may be conveyed to DC AS using capability message type (clause AD.9.2).
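
The capability items listed above could be collected into a single summary and serialized for the capability message type. The following is a minimal sketch under stated assumptions: the class name UeCapabilities and every field name are hypothetical, since the CR summary does not define the JSON field names of the capability message; only the "type" key follows the example message shown in AD.9.2.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class UeCapabilities:
    """Hypothetical container for the capability items listed in AD.5.1."""
    devices: list[str]            # execution devices, e.g. ["cpu", "gpu", "npu"]
    operators: list[str]          # operators supported by the local inference framework
    data_types: list[str]         # e.g. ["float32", "int8"]
    max_memory_mib: int           # memory limit available to inference
    max_concurrent_tasks: int     # concurrent task limit
    media_access_points: list[str] = field(default_factory=list)


def capability_message(caps: UeCapabilities) -> bytes:
    """Serialize a 'capability' message as a UTF-8 encoded JSON object."""
    body = {"type": "capability", **asdict(caps)}
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


if __name__ == "__main__":
    caps = UeCapabilities(["cpu", "npu"], ["Conv", "MatMul"], ["float32", "int8"], 512, 2,
                          ["decoded_audio_frames"])
    print(capability_message(caps).decode("utf-8"))
```

Keeping the summary in one typed structure lets the same object drive both local artifact selection and the message sent to the DC AS.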

AD.5.2 Network Capabilities

DC AS supporting AI/ML processing may provide:
- Repositories and discovery information for AI/ML applications/models
- Policy information (restrictions on tasks, model usage, data retention)
- Application data channels for coordination with AI/ML application

Note: Network-side inference capabilities are outside Phase 1 scope.


4. AI/ML Formats (AD.6)

Mandatory Model Format:
- ONNX format conforming to ONNX version 1.16.0
- Minimum required opset version: 18
- Encoding: ONNX Protocol Buffers representation


5. Task Manifest and Model Card (AD.7)

AD.7.1 Task Manifest

UTF-8 JSON object included with AI/ML application delivery, containing:
- List of supported tasks and optional subtasks with human-readable descriptions
- For each task: candidate model identifiers (model_id, model_version_id) and model card resource reference
- Task-specific configuration parameters including RTP stream mid binding requirements
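
To make the structure concrete, a task manifest might look roughly like the object below. This is an illustrative sketch only: model_id and model_version_id come from the summary above, while every other key (tasks, task_id, description, models, model_card, config), the model name, and the example.invalid URL are assumptions introduced here, not names taken from the CR.

```python
import json

# Hypothetical task manifest; key names other than model_id and
# model_version_id are assumptions made for illustration.
task_manifest = {
    "tasks": [
        {
            "task_id": "stt",
            "description": "Speech-to-text on the received audio stream",
            "models": [
                {
                    "model_id": "example-stt",
                    "model_version_id": "1.0.0",
                    "model_card": "https://example.invalid/cards/example-stt.json",
                }
            ],
            # Task-specific configuration, including the RTP stream mid binding.
            "config": {"mid": "audio"},
        }
    ]
}


def candidate_models(manifest: dict, task_id: str) -> list[tuple[str, str, str]]:
    """Return (model_id, model_version_id, model_card) for each candidate of one task."""
    for task in manifest["tasks"]:
        if task["task_id"] == task_id:
            return [(m["model_id"], m["model_version_id"], m["model_card"])
                    for m in task["models"]]
    return []


# The manifest is delivered as a UTF-8 encoded JSON object.
encoded = json.dumps(task_manifest).encode("utf-8")
```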

AD.7.2 Model Card

UTF-8 JSON object provided for each candidate model, including:
- Model identifier and version identifier
- Model format specification (ONNX version, minimum opset, IR version)
- Model I/O description:
  - Tensor element type and shape
  - Dynamic axes, layout, normalization conventions
- Execution constraints:
  - Required operator support
  - Required data types
  - Quantization convention
  - Minimum resource requirements
- Downloadable model artifacts:
  - Artifact URI, size, content type
  - Integrity information (SHA-256 digest)
  - Optional digital signature and key identifier

AD.7.2.1 JSON Schema for Model Card

Comprehensive JSON schema provided defining structure for:
- model_card_version: Schema version (semver pattern)
- identity: model_id, model_version_id, name, description, publisher, license, timestamps, tasks, languages, tags
- format: type (const: "onnx"), onnx_version (const: "1.16.0"), min_opset (≥18), onnx_ir_version, encoding (enum: "protobuf")
- artifacts: Array of downloadable artifacts with:
  - artifact_id, uri, content_type, size_bytes, sha256
  - Optional compression (none/gzip/zstd)
  - Optional signature (alg, kid, sig)
  - variant (precision, quantization, preferred_devices, max_latency_ms)
  - selection_constraints (requires_webnn, requires_ops, requires_data_types, min_memory_mib, min_peak_scratch_mib)
- io: inputs/outputs (tensorSpec arrays), preprocessing (audio/text), postprocessing (stt/tts), output_application_format
- runtime: min_memory_mib, min_peak_scratch_mib, max_concurrent_instances, required_operator_sets, required_data_types, webnn preferences, device_preference
- selection_policy: strategy (min_latency/min_energy/best_accuracy/balanced/custom), fallback_order

tensorSpec definition:
- name, element_type (float32/float16/int8/int32/uint8/bool)
- shape (array with integers or strings for dynamic axes)
- Optional layout and dynamic_axes mapping
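
The constraints on the format object and on tensorSpec can be checked without a full JSON Schema validator. The sketch below hand-codes only the constraints stated in this summary; it assumes that type, onnx_version, min_opset, and encoding are all required in format, which the summary does not confirm, and the function names are hypothetical.

```python
ELEMENT_TYPES = {"float32", "float16", "int8", "int32", "uint8", "bool"}


def check_format(fmt: dict) -> list[str]:
    """Return a list of violations of the 'format' constraints (empty if none)."""
    errors = []
    if fmt.get("type") != "onnx":
        errors.append("format.type must be 'onnx'")
    if fmt.get("onnx_version") != "1.16.0":
        errors.append("format.onnx_version must be '1.16.0'")
    min_opset = fmt.get("min_opset")
    if not isinstance(min_opset, int) or isinstance(min_opset, bool) or min_opset < 18:
        errors.append("format.min_opset must be an integer >= 18")
    if fmt.get("encoding") != "protobuf":
        errors.append("format.encoding must be 'protobuf'")
    return errors


def check_tensor_spec(spec: dict) -> list[str]:
    """Return a list of violations of the tensorSpec constraints (empty if none)."""
    errors = []
    name = spec.get("name")
    if not isinstance(name, str) or not name:
        errors.append("name must be a non-empty string")
    if spec.get("element_type") not in ELEMENT_TYPES:
        errors.append("element_type is not one of the allowed values")
    shape = spec.get("shape")
    # Each dimension is either a fixed integer size or a string naming a dynamic axis.
    valid_dim = lambda d: (isinstance(d, int) and not isinstance(d, bool)) or isinstance(d, str)
    if not isinstance(shape, list) or not all(valid_dim(d) for d in shape):
        errors.append("shape must be an array of integers or strings")
    return errors
```

For example, a spec with shape [1, "num_samples"] passes, where the string names a dynamic axis.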

AD.7.3 Model Artifact Selection and Validation

Procedure:
1. UE performs capability discovery (devices, operators, data types, memory limits)
2. UE filters artifacts satisfying selection_constraints against UE capabilities
3. UE selects preferred artifact based on selection_policy and device_preference
4. UE downloads selected artifact URI via HTTP over BDC
5. UE verifies artifact using SHA-256 digest from model card
6. UE should verify digital signature when provided
7. UE instantiates inference engine and binds model I/O per model card (io.preprocessing, io.inputs, io.outputs, io.postprocessing)
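
Steps 2, 3, and 5 of the procedure above can be sketched as follows. This is an illustration under assumptions, not the CR's algorithm: the shape of the caps dictionary is hypothetical, device ranking stands in for the full selection_policy, the sha256 field is assumed to be a hex string, and download (step 4) and signature verification (step 6) are omitted.

```python
import hashlib


def satisfies(constraints: dict, caps: dict) -> bool:
    """Step 2: check one artifact's selection_constraints against UE capabilities."""
    if constraints.get("requires_webnn") and not caps.get("webnn", False):
        return False
    if not set(constraints.get("requires_ops", [])) <= set(caps["operators"]):
        return False
    if not set(constraints.get("requires_data_types", [])) <= set(caps["data_types"]):
        return False
    return constraints.get("min_memory_mib", 0) <= caps["max_memory_mib"]


def select_artifact(artifacts: list[dict], caps: dict, device_preference: list[str]):
    """Steps 2-3: filter eligible artifacts, then pick the one whose preferred
    device ranks earliest in device_preference and is available on the UE."""
    eligible = [a for a in artifacts if satisfies(a.get("selection_constraints", {}), caps)]

    def rank(artifact: dict) -> int:
        devices = artifact.get("variant", {}).get("preferred_devices", [])
        ranks = [device_preference.index(d) for d in devices
                 if d in device_preference and d in caps["devices"]]
        return min(ranks, default=len(device_preference))

    return min(eligible, key=rank, default=None)


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Step 5: compare the artifact's SHA-256 digest with the model card value."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()
```

Returning None when no artifact is eligible leaves the fallback decision (for example, offering a different task or model) to the AI/ML application.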


6. Negotiation, Signaling, and Media Time Binding (AD.8)

AD.8.1 Binding to RTP Streams

  • AI/ML tasks operating on RTP media bound to RTP streams using SDP "mid" identifier
  • Task configuration and AI/ML metadata messages include relevant mid value

AD.8.2 Media Time Binding for AI/ML Metadata

  • AI/ML metadata over ADC may experience different delay/jitter vs. RTP media
  • To avoid drift, metadata messages shall include media time anchor derived from RTP media clock of stream identified by mid
  • For audio tasks, media time anchor may use:
    • NTP-based timestamp associated with RTP stream + duration in audio samples, OR
    • RTP timestamp
  • Time anchor representation must be consistent within session for given task
  • When DC AS forwards AI/ML metadata between endpoints, DC AS shall preserve mid binding and media time anchor for receiver alignment with RTP playout
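
Both anchor representations reduce to simple arithmetic on the stream's media clock. The sketch below shows one way a receiver might convert them to seconds; the function and parameter names are hypothetical, and the only external fact relied on is that RTP timestamps are 32-bit counters that wrap around (RFC 3550).

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit and wrap around


def rtp_elapsed_seconds(rtp_ts: int, base_rtp_ts: int, clock_rate: int) -> float:
    """Media time elapsed between a reference RTP timestamp and an anchor,
    tolerating a single 32-bit wraparound of the timestamp counter."""
    return ((rtp_ts - base_rtp_ts) % RTP_TS_MODULUS) / clock_rate


def segment_duration_seconds(dur_samples: int, sample_rate: int) -> float:
    """Duration of an audio segment expressed in samples of the media clock."""
    return dur_samples / sample_rate
```

With a 16 kHz audio clock (an assumed rate), a segment of 16000 samples lasts one second, and an anchor 16000 ticks after the reference timestamp lies one second into the stream.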

7. Data Channel Transport (AD.9)

AD.9.1 Bootstrap Data Channel Transport

  • BDC uses HTTP subprotocol (clause 6.2.10)
  • AI/ML applications, task manifests, model cards, model artifacts retrieved via HTTP GET over BDC
  • DCMTSI client shall not transmit user media over BDC

AD.9.2 Application Data Channel Transport

Subprotocol: "3gpp-ai" for AI/ML control and metadata
Message Format: UTF-8 encoded JSON objects

Generic Message Types:
- capability: UE inference capability summary
- task: AI/ML processing task selection and model identifiers
- configuration: Task configuration parameters including media stream mid binding and media time anchor representation
- status: Lifecycle state and error reporting
- metadata: Derived AI/ML metadata bound to media stream (mid) and media time

The detailed message schema is specified by the AI/ML application. For cross-vendor interoperability, the schema should be standardized per task.
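
A receiver on the application data channel might decode and sanity-check incoming messages along these lines. This is a minimal sketch: the "type" key matches the example message in this clause, the mid check reflects AD.8.1, and everything else (function name, error handling) is an assumption because the detailed schema is left to the application.

```python
import json

MESSAGE_TYPES = {"capability", "task", "configuration", "status", "metadata"}


def decode_message(payload: bytes) -> dict:
    """Decode one UTF-8 JSON message and check its generic message type."""
    msg = json.loads(payload.decode("utf-8"))
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")
    if msg.get("type") not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg.get('type')!r}")
    # Metadata is bound to an RTP stream, so it must identify the stream by mid.
    if msg["type"] == "metadata" and "mid" not in msg:
        raise ValueError("metadata message must carry the mid of the bound stream")
    return msg
```

Rejecting unknown types early keeps task-specific handlers free of generic validation.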

Example metadata message:

{
  "type": "metadata",
  "task": "stt",
  "mid": "audio",
  "segmentId": 1842,
  "ntpTs": 381245120,
  "durSamples": 16000,
  "text": "...",
  "conf": 0.87
}
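
Because metadata over the ADC can arrive earlier or later than the media it describes, a receiver could hold messages like the one above until playout reaches their anchor. The sketch below is one possible approach, not a procedure from the CR; the class name is hypothetical, and it assumes the playout position is expressed in the same units as the ntpTs field, which this summary does not specify.

```python
import heapq


class MetadataAligner:
    """Buffers metadata messages and releases them in anchor order."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, dict]] = []
        self._seq = 0  # tie-breaker so messages with equal anchors keep arrival order

    def push(self, msg: dict) -> None:
        """Queue a metadata message keyed by its media time anchor."""
        heapq.heappush(self._heap, (msg["ntpTs"], self._seq, msg))
        self._seq += 1

    def pop_due(self, playout_ts: int) -> list[dict]:
        """Return all queued messages whose anchor is at or before playout_ts."""
        due = []
        while self._heap and self._heap[0][0] <= playout_ts:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

A renderer would call pop_due with the current playout position each time it presents an audio frame for the stream identified by mid.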

Summary

This CR establishes a comprehensive normative framework for AI/ML-assisted media processing in DCMTSI, covering:
- Complete architecture with on-device and split inference support
- Detailed call flows for application/model delivery and runtime operation
- Capability discovery mechanisms for UE and network
- Standardized ONNX model format requirements
- Rich metadata structures (task manifests and model cards with JSON schemas)
- Deterministic model selection and validation procedures
- Media time binding mechanisms for metadata synchronization
- Data channel transport protocols for control and metadata exchange

The framework enables AI/ML tasks (STT, translation, TTS, noise suppression, scene description) while maintaining compatibility with existing DCMTSI media handling.

Document Information
Source: Qualcomm Inc.
Type: CR
Title: CR on AIML processing in IMS calls
Agenda item: 10.5
Agenda item description: AI_IMS-MED (Media aspects for AI/ML in IMS services)
Secretary remarks: Title modified on 2/3/2026. Original title: [AIML_IMS-MED] CR on AIML processing in IMS calls. Source modified on 2/3/2026. Original source: Qualcomm Atheros, Inc.
Release: Rel-20
Specification: 26.114
Version: 19.2.0
Related WIs: AIML_IMS-MED
CR number: 0608
CR category: B
Clauses affected: Annex AD (new).
CN: True
ME: True
Contact: Imed Bouazizi
Uploaded: 2026-02-03T21:49:01.107000
Contact ID: 84417
TDoc Status: noted
Reservation date: 03/02/2026 17:07:55
Agenda item sort order: 52