S4-260183 - AI Summary

[AIML_IMS-MED] Negotiation messages for split inferencing


3GPP Change Request Summary: Split Inferencing Negotiation Messages

Document Overview

This contribution (S4-260183) proposes additional messages and associated metadata to enable split inferencing for AI/ML applications in IMS-based media services. It builds upon and updates contribution S4aR260009, with specific focus on defining the differences between device inferencing and split inferencing scenarios.

Main Technical Contributions

1. Negotiation Message Summary Table (Section A.4.2)

Key Addition: Introduction of Table A.4.2-1, which summarizes all negotiation messages for split inferencing call flows.

The table defines the following message pairs with their associated metadata:

  • Application Discovery Messages:
      • AI_APPLICATION_DISCOVERY_REQUEST (HTTP GET) - carries family/type of AI/ML applications
      • AI_APPLICATION_DISCOVERY_RESPONSE (HTTP RESPONSE) - returns list of AI/ML applications
  • Application Selection Messages:
      • AI_APPLICATION_REQUEST (HTTP GET) - carries URN of selected application
      • AI_APPLICATION_RESPONSE (HTTP RESPONSE) - returns selected application binary and metadata
  • Split Model List Messages:
      • MODELS_LIST_REQUEST (HTTP POST) - carries UE capabilities
      • MODELS_LIST_RESPONSE (HTTP RESPONSE) - returns candidate AI/ML models and partitionings
  • Split Inference Configuration Messages:
      • AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST (HTTP POST) - carries URN(s) of selected models and submodel partitioning
      • SPLIT_INFERENCE_CONFIGURATION_AI_RESPONSE (HTTP RESPONSE) - returns selected models/submodels binary and metadata
  • Model Selection Messages:
      • AI_MODEL_SELECTION_REQUEST - carries URN(s) of selected models/submodels
      • AI_MODEL_SELECTION_RESPONSE - returns selected models/submodels binary and metadata
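
The request/response pairing above can be pictured as a small lookup table. The following is a minimal sketch, not the contribution's own representation: the message names and HTTP methods are copied from the list as given, while the `MessagePair` class, the `NEGOTIATION_FLOW` list and the `methods_by_request` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MessagePair:
    """One request/response pair from the summary (hypothetical container)."""
    request: str
    response: str
    http_method: Optional[str]  # None where the summary lists no HTTP method
    request_carries: str


# Names and methods are copied as the summary lists them; the ordering
# follows the list above and is not claimed to be the normative call flow.
NEGOTIATION_FLOW: List[MessagePair] = [
    MessagePair("AI_APPLICATION_DISCOVERY_REQUEST",
                "AI_APPLICATION_DISCOVERY_RESPONSE",
                "GET", "family/type of AI/ML applications"),
    MessagePair("AI_APPLICATION_REQUEST",
                "AI_APPLICATION_RESPONSE",
                "GET", "URN of selected application"),
    MessagePair("MODELS_LIST_REQUEST",
                "MODELS_LIST_RESPONSE",
                "POST", "UE capabilities"),
    MessagePair("AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST",
                "SPLIT_INFERENCE_CONFIGURATION_AI_RESPONSE",
                "POST", "URN(s) of selected models and submodel partitioning"),
    MessagePair("AI_MODEL_SELECTION_REQUEST",
                "AI_MODEL_SELECTION_RESPONSE",
                None, "URN(s) of selected models/submodels"),
]


def methods_by_request(flow: List[MessagePair]) -> Dict[str, Optional[str]]:
    """Map each request name to the HTTP method listed for it."""
    return {pair.request: pair.http_method for pair in flow}
```

A lookup such as `methods_by_request(NEGOTIATION_FLOW)["MODELS_LIST_REQUEST"]` then yields `"POST"`, which reflects that the UE capabilities travel in the request body.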

2. Common Metadata Information (Section A.4.3)

A.4.3.1 Application Metadata

  • Defines characteristics and requirements of applications and associated AI/ML media processing tasks
  • Includes performance, accuracy, energy constraints, and supported models
  • New for split inferencing: Indicates the supported split and remote inference modes and whether the model supports partitioning

A.4.3.2 Endpoint Capabilities Metadata

Introduces separation between static and dynamic capabilities:

  • Static capabilities: Fixed or infrequently changing properties
      • Processing architecture
      • Peak compute capacity
      • Supported AI/ML frameworks
      • Available execution engines (CPU, GPU, NPU)
      • Supported numerical precisions
      • Hardware acceleration features
  • Dynamic capabilities: Runtime-dependent characteristics
      • Available memory
      • Current compute load
      • Energy mode
      • Battery level
      • Accelerator availability

This separation enables both long-term compatibility checks and short-term runtime optimization.
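
One way to picture how the static/dynamic split serves these two purposes is a two-stage check: a compatibility test against static capabilities, followed by a runtime test against dynamic ones. This is a minimal sketch under stated assumptions. All class names, field names, units and thresholds below are hypothetical and are not taken from the contribution.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class StaticCapabilities:
    """Fixed or rarely changing properties (hypothetical field names)."""
    frameworks: FrozenSet[str]
    engines: FrozenSet[str]      # e.g. {"CPU", "GPU", "NPU"}
    precisions: FrozenSet[str]   # e.g. {"float32", "float16"}


@dataclass
class DynamicCapabilities:
    """Runtime-dependent properties (hypothetical field names and units)."""
    available_memory_mb: float
    compute_load: float          # fraction in 0.0..1.0
    battery_level: float         # fraction in 0.0..1.0


@dataclass(frozen=True)
class ModelRequirements:
    """What a candidate model or submodel needs (hypothetical)."""
    framework: str
    precision: str
    memory_mb: float


def is_compatible(static: StaticCapabilities, req: ModelRequirements) -> bool:
    """Long-term check: can this endpoint run the model at all?"""
    return req.framework in static.frameworks and req.precision in static.precisions


def can_run_now(dynamic: DynamicCapabilities, req: ModelRequirements,
                max_load: float = 0.8, min_battery: float = 0.2) -> bool:
    """Short-term check: are runtime conditions currently good enough?

    The thresholds are illustrative defaults, not values from the source.
    """
    return (dynamic.available_memory_mb >= req.memory_mb
            and dynamic.compute_load <= max_load
            and dynamic.battery_level >= min_battery)
```

In this sketch the static check can be cached per endpoint, while the dynamic check would be repeated whenever load, memory or battery state changes.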

A.4.3.3 Model Information Metadata

  • Describes functional, structural, and performance characteristics of AI/ML models
  • Includes supported tasks, input/output specifications, resource requirements, latency/energy metrics
  • New: Indicates whether the model supports partitioning

3. Split Inferencing-Specific Metadata (Section A.4.3.4)

A.4.3.4.1 Submodel Partitioning Metadata

Major technical contribution: Comprehensive metadata structure for describing model partitioning for split inferencing.

Key metadata elements:

| Field | Description |
|-------|-------------|
| submodelsPartitioningIdentifier | URN identifying the partitioning configuration |
| submodelComposition | Array of submodel objects (1..N) |
| submodelIdentifier | URN of individual submodel |
| endpointType | Execution location (UE, SERVER, EDGE, CLOUD, CUSTOM) |
| subtaskTypeIdentifier | Subtask type supported by submodel |
| submodelType | Role in pipeline (HEAD, INTERMEDIATE1, INTERMEDIATE2, TAIL) |
| size | Submodel file size in MB |
| submodelInputs/Outputs | Tensor specifications (ID, type, shape) |
| outputAccuracy | Trained accuracy percentage |
| subModelDataType | Data type (Uint8, Float32, Float16) |

Tensor specifications include:
- tensorID - identifier for input/output tensor
- tensorType - data type (integer, float32, float16)
- tensorShape - tensor dimensions (e.g., (1,3,300,300))

JSON example provided: The contribution includes a complete example showing a HEAD submodel on the UE and a TAIL submodel on the DCAS for an object detection task.
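
The contribution's JSON example is not reproduced in this summary. The following is an illustrative sketch of what a two-submodel partitioning description might look like when built from the field names in the table above. Everything else is an assumption: the URNs, sizes, tensor IDs and shapes are invented, `submodelInputs` and `submodelOutputs` are assumed to be the two fields behind the "submodelInputs/Outputs" row, the TAIL endpoint is written as `SERVER` because DCAS is not among the listed `endpointType` values, and `boundary_mismatches` is a hypothetical helper.

```python
import json

# Hypothetical partitioning description: all values are placeholders.
partitioning = {
    "submodelsPartitioningIdentifier": "urn:example:partitioning:objdet:1",
    "submodelComposition": [
        {
            "submodelIdentifier": "urn:example:submodel:objdet:head",
            "endpointType": "UE",
            "submodelType": "HEAD",
            "size": 12.5,  # MB
            "submodelInputs": [
                {"tensorID": "image", "tensorType": "float32",
                 "tensorShape": [1, 3, 300, 300]},
            ],
            "submodelOutputs": [
                {"tensorID": "features", "tensorType": "float16",
                 "tensorShape": [1, 256, 38, 38]},
            ],
        },
        {
            "submodelIdentifier": "urn:example:submodel:objdet:tail",
            "endpointType": "SERVER",
            "submodelType": "TAIL",
            "size": 48.0,  # MB
            "submodelInputs": [
                {"tensorID": "features", "tensorType": "float16",
                 "tensorShape": [1, 256, 38, 38]},
            ],
            "submodelOutputs": [
                {"tensorID": "detections", "tensorType": "float32",
                 "tensorShape": [1, 100, 6]},
            ],
        },
    ],
}


def boundary_mismatches(composition):
    """Return each index i where the outputs of submodel i do not match
    the inputs of submodel i + 1 in tensor type and shape."""
    def signature(tensors):
        return [(t["tensorType"], tuple(t["tensorShape"])) for t in tensors]

    return [
        i for i in range(len(composition) - 1)
        if signature(composition[i]["submodelOutputs"])
        != signature(composition[i + 1]["submodelInputs"])
    ]


serialized = json.dumps(partitioning, indent=2)
```

A check of this kind illustrates why the tensor specifications matter: the intermediate tensor produced by the HEAD submodel has to be consumable by the next submodel in the pipeline.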

4. Negotiation Message Format (Section A.4.5)

Generic message structure defined:

Table 5: AI Metadata Messages Format

  • messages: Array of Message objects (1..n)
  • Each message follows Message data type specification

Table 6: Metadata Message Data Type

| Field | Type | Cardinality | Description |
|-------|------|-------------|-------------|
| id | string | 1..1 | Unique identifier within data channel session |
| type | number | 1..1 | Message subtype identifier |
| payload | object | 1..1 | Type-dependent message payload |
| sessionId | string | 1..1 | Associated multimedia session identifier |
| sendingAtTime | number | 0..1 | Wall clock transmission time |
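
As a rough illustration of the two tables above, the sketch below builds a `messages` array from objects carrying the five listed fields and omits the optional `sendingAtTime` when it is not set. This is a sketch under assumptions: the `Message` dataclass and the `wrap` helper are hypothetical, the numeric `type` value used in the usage note is a placeholder because the summary does not give the numbering, and JSON is assumed as the encoding.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Message:
    """Mirrors the fields of the Message data type table (hypothetical class)."""
    id: str                                 # 1..1
    type: int                               # 1..1, numeric subtype identifier
    payload: Dict[str, Any]                 # 1..1, depends on the message type
    sessionId: str                          # 1..1
    sendingAtTime: Optional[float] = None   # 0..1

    def to_dict(self) -> Dict[str, Any]:
        data = asdict(self)
        # An optional field with cardinality 0..1 is left out when unset.
        if data["sendingAtTime"] is None:
            del data["sendingAtTime"]
        return data


def wrap(messages: List[Message]) -> str:
    """Serialize messages into the top-level 'messages' array (1..n)."""
    if not messages:
        raise ValueError("at least one message is required (cardinality 1..n)")
    return json.dumps({"messages": [m.to_dict() for m in messages]})
```

For example, `wrap([Message(id="m-1", type=1, payload={}, sessionId="s-1")])` would produce a single-element `messages` array without a `sendingAtTime` field; the value `1` carries no meaning beyond this example.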

Defined message types:
- MODELS_LIST_REQUEST
- MODELS_LIST_RESPONSE
- SPLIT_INFERENCE_CONFIGURATION_REQUEST
- AI_APPLICATION_DISCOVERY_REQUEST
- AI_APPLICATION_DISCOVERY_RESPONSE
- AI_APPLICATION_REQUEST
- AI_APPLICATION_RESPONSE
- AI_SERVER_CONFIGURATION_REQUEST
- AI_SERVER_CONFIGURATION_RESPONSE
- AI_MODEL_SELECTION_REQUEST
- AI_MODEL_SELECTION_RESPONSE

Summary of Changes

The CR introduces three main changes:

  1. Complete message taxonomy for split inferencing negotiation with HTTP protocol mapping
  2. Comprehensive metadata definitions covering applications, endpoint capabilities, models, and split-specific partitioning information
  3. Generic message format for AI metadata exchange over data channels with extensible type system

The contribution is intended to enable end-to-end negotiation of split inferencing capabilities between the UE and remote endpoints. Its main emphasis is the submodel partitioning metadata, which allows AI/ML model execution to be distributed across network nodes.

Document Information
Source: InterDigital Finland Oy
Type: discussion
For: Agreement
Title: [AIML_IMS-MED] Negotiation messages for split inferencing
Agenda item: 10.5
Agenda item description: AI_IMS-MED (Media aspects for AI/ML in IMS services)
Release: Rel-20
Contact: Stephane Onno
Uploaded: 2026-02-03T19:11:22.680000
Contact ID: 84864
TDoc Status: merged
Is revision of: S4aR260010
Reservation date: 03/02/2026 16:32:54
Agenda item sort order: 52