[AIML_IMS-MED] Negotiation messages for split inferencing
This contribution (S4-260183) proposes additional messages and associated metadata to enable split inferencing for AI/ML applications in IMS-based media services. It builds upon and updates contribution S4aR260009, with specific focus on defining the differences between device inferencing and split inferencing scenarios.
Key Addition: Introduction of Table A4.2-1 summarizing all negotiation messages for split inferencing call flows.
The table defines the following message pairs with their associated metadata:
Application Discovery Messages:
AI_APPLICATION_DISCOVERY_REQUEST (HTTP GET) - carries family/type of AI/ML applications
AI_APPLICATION_DISCOVERY_RESPONSE (HTTP RESPONSE) - returns list of AI/ML applications
Application Selection Messages:
AI_APPLICATION_REQUEST (HTTP GET) - carries URN of selected application
AI_APPLICATION_RESPONSE (HTTP RESPONSE) - returns selected application binary and metadata
Split Model List Messages:
MODELS_LIST_REQUEST (HTTP POST) - carries UE capabilities
MODELS_LIST_RESPONSE (HTTP RESPONSE) - returns candidate AI/ML models and partitionings
Split Inference Configuration Messages:
AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST (HTTP POST) - carries URN(s) of selected models and submodel partitioning
AI_SPLIT_INFERENCE_CONFIGURATION_RESPONSE (HTTP RESPONSE) - returns selected models/submodels binary and metadata
Model Selection Messages:
AI_MODEL_SELECTION_REQUEST - carries URN(s) of selected models/submodels
AI_MODEL_SELECTION_RESPONSE - returns selected models/submodels binary and metadata
The contribution also introduces a separation between static and dynamic capabilities:
Static capabilities: Fixed device characteristics such as hardware acceleration features
Dynamic capabilities: Runtime-dependent characteristics
This separation enables both long-term compatibility checks and short-term runtime optimization.
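As a minimal sketch, the static/dynamic split described above could be carried in the UE-capabilities body of a MODELS_LIST_REQUEST. All field names and values here are illustrative assumptions, not taken from the specification:

```python
import json

# Illustrative UE capabilities payload for a MODELS_LIST_REQUEST (HTTP POST).
# Field names are assumptions for illustration; the spec defines the real ones.
ue_capabilities = {
    "staticCapabilities": {
        # Long-lived, hardware-bound characteristics used for compatibility checks
        "hardwareAcceleration": ["GPU", "NPU"],
        "maxModelSizeMB": 512,
    },
    "dynamicCapabilities": {
        # Runtime-dependent characteristics sampled at request time,
        # used for short-term runtime optimization
        "availableMemoryMB": 1024,
        "batteryLevelPercent": 78,
    },
}

def build_models_list_request(capabilities: dict) -> str:
    """Serialize the capabilities into a MODELS_LIST_REQUEST body."""
    return json.dumps({"ueCapabilities": capabilities})

body = build_models_list_request(ue_capabilities)
```

Keeping the two capability groups in separate objects lets the network re-query only the dynamic block during a session without re-validating static compatibility.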
Major technical contribution: Comprehensive metadata structure for describing model partitioning for split inferencing.
Key metadata elements:
| Field | Description |
|-------|-------------|
| submodelsPartitioningIdentifier | URN identifying the partitioning configuration |
| submodelComposition | Array of submodel objects (1..N) |
| submodelIdentifier | URN of individual submodel |
| endpointType | Execution location (UE, SERVER, EDGE, CLOUD, CUSTOM) |
| subtaskTypeIdentifier | Subtask type supported by submodel |
| submodelType | Role in pipeline (HEAD, INTERMEDIATE1, INTERMEDIATE2, TAIL) |
| size | Submodel file size in MB |
| submodelInputs/Outputs | Tensor specifications (ID, type, shape) |
| outputAccuracy | Trained accuracy percentage |
| subModelDataType | Data type (Uint8, Float32, Float16) |
Tensor specifications include:
- tensorID - identifier for input/output tensor
- tensorType - data type (integer, float32, float16)
- tensorShape - tensor dimensions (e.g., (1,3,300,300))
JSON example provided: a complete example showing the HEAD submodel on the UE and the TAIL submodel on the DCAS for an object detection task.
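A sketch of what such partitioning metadata could look like, built from the field names in the table above; the URNs, subtask identifiers, sizes, and tensor shapes are placeholder values, not the contribution's actual example:

```python
import json

# Illustrative partitioning metadata for a HEAD (UE) / TAIL (server) split.
# Field names follow the table above; all values are placeholders.
partitioning = {
    "submodelsPartitioningIdentifier": "urn:example:partitioning:objdet:1",
    "submodelComposition": [
        {
            "submodelIdentifier": "urn:example:submodel:head",
            "endpointType": "UE",
            "subtaskTypeIdentifier": "feature-extraction",  # placeholder
            "submodelType": "HEAD",
            "size": 12,  # submodel file size in MB
            "subModelDataType": "Float16",
            "submodelInputs": [
                {"tensorID": "in0", "tensorType": "float16",
                 "tensorShape": [1, 3, 300, 300]},
            ],
            "submodelOutputs": [
                {"tensorID": "feat0", "tensorType": "float16",
                 "tensorShape": [1, 256, 38, 38]},  # placeholder shape
            ],
        },
        {
            "submodelIdentifier": "urn:example:submodel:tail",
            "endpointType": "SERVER",
            "subtaskTypeIdentifier": "object-detection",  # placeholder
            "submodelType": "TAIL",
            "size": 48,
            "outputAccuracy": 91.5,  # trained accuracy, percent (placeholder)
            "subModelDataType": "Float32",
        },
    ],
}

doc = json.dumps(partitioning, indent=2)
```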
Generic message structure defined:
messages: Array of Message objects (1..n)
| Field | Type | Cardinality | Description |
|-------|------|-------------|-------------|
| id | string | 1..1 | Unique identifier within data channel session |
| type | number | 1..1 | Message subtype identifier |
| payload | object | 1..1 | Type-dependent message payload |
| sessionId | string | 1..1 | Associated multimedia session identifier |
| sendingAtTime | number | 0..1 | Wall clock transmission time |
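The generic envelope above can be sketched as a small builder; the five fields match the table, while the use of a UUID for `id` is an implementation assumption (the spec only requires uniqueness within the data channel session):

```python
import json
import time
import uuid

# Minimal sketch of the generic message envelope defined above.
# Only the five listed fields are populated; sendingAtTime (0..1) is optional.
def make_message(msg_type: int, payload: dict, session_id: str,
                 include_timestamp: bool = True) -> dict:
    message = {
        "id": str(uuid.uuid4()),   # unique within the data channel session (assumed UUID)
        "type": msg_type,          # message subtype identifier
        "payload": payload,        # type-dependent message payload
        "sessionId": session_id,   # associated multimedia session identifier
    }
    if include_timestamp:
        message["sendingAtTime"] = time.time()  # wall clock transmission time
    return message

# Top-level structure: messages is an array of Message objects (1..n)
batch = {"messages": [make_message(1, {"ueCapabilities": {}}, "sess-42")]}
wire = json.dumps(batch)
```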
Defined message types:
- MODELS_LIST_REQUEST
- MODELS_LIST_RESPONSE
- SPLIT_INFERENCE_CONFIGURATION_REQUEST
- AI_APPLICATION_DISCOVERY_REQUEST
- AI_APPLICATION_DISCOVERY_RESPONSE
- AI_APPLICATION_REQUEST
- AI_APPLICATION_RESPONSE
- AI_SERVER_CONFIGURATION_REQUEST
- AI_SERVER_CONFIGURATION_RESPONSE
- AI_MODEL_SELECTION_REQUEST
- AI_MODEL_SELECTION_RESPONSE
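The message types above can be modeled as an enumeration together with a request-to-response pairing for validating exchanges. The numeric values produced by `auto()` are placeholders; the spec assigns the actual `type` subtype numbers:

```python
from enum import Enum, auto

# The defined message types as an enumeration. auto() numbering is a
# placeholder; the spec assigns the real numeric subtype values.
class AimlMessageType(Enum):
    MODELS_LIST_REQUEST = auto()
    MODELS_LIST_RESPONSE = auto()
    SPLIT_INFERENCE_CONFIGURATION_REQUEST = auto()
    AI_APPLICATION_DISCOVERY_REQUEST = auto()
    AI_APPLICATION_DISCOVERY_RESPONSE = auto()
    AI_APPLICATION_REQUEST = auto()
    AI_APPLICATION_RESPONSE = auto()
    AI_SERVER_CONFIGURATION_REQUEST = auto()
    AI_SERVER_CONFIGURATION_RESPONSE = auto()
    AI_MODEL_SELECTION_REQUEST = auto()
    AI_MODEL_SELECTION_RESPONSE = auto()

# Request-to-response pairing for the message pairs listed above
# (SPLIT_INFERENCE_CONFIGURATION_REQUEST has no response in this list).
RESPONSE_FOR = {
    AimlMessageType.MODELS_LIST_REQUEST:
        AimlMessageType.MODELS_LIST_RESPONSE,
    AimlMessageType.AI_APPLICATION_DISCOVERY_REQUEST:
        AimlMessageType.AI_APPLICATION_DISCOVERY_RESPONSE,
    AimlMessageType.AI_APPLICATION_REQUEST:
        AimlMessageType.AI_APPLICATION_RESPONSE,
    AimlMessageType.AI_SERVER_CONFIGURATION_REQUEST:
        AimlMessageType.AI_SERVER_CONFIGURATION_RESPONSE,
    AimlMessageType.AI_MODEL_SELECTION_REQUEST:
        AimlMessageType.AI_MODEL_SELECTION_RESPONSE,
}
```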
The CR introduces three main changes, which together enable complete end-to-end split inferencing capability negotiation between the UE and remote endpoints, with particular emphasis on the submodel partitioning metadata that allows flexible distribution of AI/ML model execution across network nodes.