Summary of 3GPP Technical Document S4-260181
Document Overview
This document revises S4aR260012 and proposes additional details for the negotiation messages and associated metadata that support AI/ML-based media services (AIML_IMS-MED). It provides JSON-formatted metadata examples and aligns the text with the call flow agreed in S4aR260014.
Main Technical Contributions
1. Negotiation Message Summary Table (Section A.4.2)
The document introduces Table A4.2-1, which defines the complete set of negotiation messages for local inferencing call flows. Key updates include:
- AI_APPLICATION_DISCOVERY_REQUEST/RESPONSE: Discovery of AI/ML application families/types, optionally filtered by UE capabilities
- AI_APPLICATION_REQUEST/RESPONSE: Selection of a specific AI/ML application by URN; the response returns the application binary data and metadata
- CANDIDATE_MODELS_LIST_REQUEST/RESPONSE: Renamed from the previous version; the UE sends its capabilities and receives a list of candidate models
- AI_MODEL_SELECTION_REQUEST/RESPONSE: Selection of one or more models by URN; the response returns the model binary data and metadata
Each message is mapped to possible HTTP operations (GET or POST requests and the corresponding responses) and to its associated metadata parameters.
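The request/response pairing above can be sketched in a few lines of Python. This is only an illustration: the enum and the helper expected_response are hypothetical names, and the message names are copied from the list above rather than from Table A4.2-1 itself.

```python
from enum import Enum


class NegotiationMessage(str, Enum):
    """Message names as listed in this summary (not copied from the table)."""
    AI_APPLICATION_DISCOVERY_REQUEST = "AI_APPLICATION_DISCOVERY_REQUEST"
    AI_APPLICATION_DISCOVERY_RESPONSE = "AI_APPLICATION_DISCOVERY_RESPONSE"
    AI_APPLICATION_REQUEST = "AI_APPLICATION_REQUEST"
    AI_APPLICATION_RESPONSE = "AI_APPLICATION_RESPONSE"
    CANDIDATE_MODELS_LIST_REQUEST = "CANDIDATE_MODELS_LIST_REQUEST"
    CANDIDATE_MODELS_LIST_RESPONSE = "CANDIDATE_MODELS_LIST_RESPONSE"
    AI_MODEL_SELECTION_REQUEST = "AI_MODEL_SELECTION_REQUEST"
    AI_MODEL_SELECTION_RESPONSE = "AI_MODEL_SELECTION_RESPONSE"


def expected_response(request: NegotiationMessage) -> NegotiationMessage:
    """Return the response message that pairs with a request message."""
    suffix = "_REQUEST"
    if not request.name.endswith(suffix):
        raise ValueError(f"{request.name} is not a request message")
    # Swap the trailing _REQUEST for _RESPONSE and look the member up by name.
    return NegotiationMessage[request.name[: -len(suffix)] + "_RESPONSE"]
```

A receiver could use such a helper to check that the reply it gets matches the request it sent; the actual mapping to HTTP methods is defined per message in the table and is not reproduced here.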
2. Metadata Information Definitions (Section A.4.3)
A.4.3.1 Application Metadata
Defines characteristics and requirements of AI/ML applications including:
- applicationIdentifier: URN-based identification
- taskList: Task type identifiers and the supported task types (e.g., ASR, TTS, Translation)
- Performance constraints:
  - maximumTaskInferenceLatency (milliseconds)
  - minimumTaskInferenceAccuracy
  - maximumLocalEnergyConsumption (joules)
  - taskAccuracy (e.g., mAP score)
- taskOperationalCharacteristics: computeIntensity, memoryFootprint, latencySensitivity, energySensitivity
- associatedModels: List of models with modelName and modelDescription
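As a rough illustration of how the application metadata might look as JSON, the sketch below builds one instance in Python and checks measured values against its constraints. The nesting, the key taskTypeIdentifier, the URN, the model name, and all values are assumptions made for this example; only the remaining field names come from the list above.

```python
import json

# Illustrative instance only: structure and values are assumed, not normative.
application_metadata = {
    "applicationIdentifier": "urn:example:aiml:app:speech-translation",  # hypothetical URN
    "taskList": [
        {
            "taskTypeIdentifier": "ASR",          # hypothetical key name
            "maximumTaskInferenceLatency": 200,   # milliseconds
            "minimumTaskInferenceAccuracy": 0.90,
            "maximumLocalEnergyConsumption": 5.0,  # joules
            "taskOperationalCharacteristics": {
                "computeIntensity": "high",
                "memoryFootprint": "medium",
                "latencySensitivity": "high",
                "energySensitivity": "low",
            },
        }
    ],
    "associatedModels": [
        {"modelName": "asr-small", "modelDescription": "Compact ASR model"}
    ],
}


def meets_constraints(task: dict, latency_ms: float, accuracy: float, energy_j: float) -> bool:
    """Check measured values against the task's performance constraints."""
    return (
        latency_ms <= task["maximumTaskInferenceLatency"]
        and accuracy >= task["minimumTaskInferenceAccuracy"]
        and energy_j <= task["maximumLocalEnergyConsumption"]
    )


# Serialize to the JSON form that would be exchanged.
encoded = json.dumps(application_metadata, indent=2)
```

For example, meets_constraints(application_metadata["taskList"][0], 150, 0.95, 3.0) holds, while a measured latency of 250 ms would violate the 200 ms limit.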
A.4.3.2 Endpoint Capabilities Metadata
Separates capabilities into static and dynamic categories:
Static Capabilities (fixed/infrequently changed):
- endpointIdentifier
- flopsProcessingCapabilities (peak compute in FLOPS)
- macOpProcessingCapabilities (MAC operations)
- supportedAiMlFrameworks
- accelerationSupported (boolean)
- supportedEngines (CPU, GPU, NPU)
- supportedPrecision (FP32, FP16, INT8)
Dynamic Capabilities (runtime-dependent):
- availableMemorySize
- currentComputeLoad
- energyMode (Eco/Balanced/Performance)
- batteryLevel
- acceleratorAvailability
This separation enables both long-term compatibility checks and short-term runtime optimization.
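The two-stage use of static and dynamic capabilities can be sketched as follows. The field names follow the lists above; the units (MB, percent, a 0 to 1 load fraction), the sample values, the thresholds, and the helper names is_compatible and is_runnable_now are assumptions made for this example.

```python
# Static capabilities: checked once, since they rarely change.
static_capabilities = {
    "endpointIdentifier": "ue-001",              # hypothetical identifier
    "flopsProcessingCapabilities": 2.0e12,
    "supportedAiMlFrameworks": ["ONNX", "TFLite"],
    "accelerationSupported": True,
    "supportedEngines": ["CPU", "GPU", "NPU"],
    "supportedPrecision": ["FP32", "FP16", "INT8"],
}

# Dynamic capabilities: sampled at run time, before each inference session.
dynamic_capabilities = {
    "availableMemorySize": 512,    # MB (assumed unit)
    "currentComputeLoad": 0.35,    # fraction of capacity (assumed scale)
    "energyMode": "Balanced",
    "batteryLevel": 64,            # percent (assumed unit)
    "acceleratorAvailability": True,
}


def is_compatible(static: dict, framework: str, precision: str) -> bool:
    """Long-term check: can this endpoint ever run such a model?"""
    return (
        framework in static["supportedAiMlFrameworks"]
        and precision in static["supportedPrecision"]
    )


def is_runnable_now(dynamic: dict, required_memory_mb: int,
                    min_battery: int = 20, max_load: float = 0.8) -> bool:
    """Short-term check: are enough resources available right now?"""
    return (
        dynamic["availableMemorySize"] >= required_memory_mb
        and dynamic["batteryLevel"] >= min_battery
        and dynamic["currentComputeLoad"] <= max_load
    )
```

The static check can be cached for the lifetime of a session, whereas the dynamic check has to be repeated whenever the endpoint state may have changed.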
A.4.3.3 Model Information Metadata
Comprehensive model characterization including:
- Identification: modelIdentifier (URN), taskIdentifier (supports multi-task models)
- Model properties: modelSize (MB), format, formatVersion, framework, frameworkVersion
- Input/Output specifications:
  - inputMediaIdentifier, inputType, inputShape
  - outputIdentifier, outputType, outputShape, outputAccuracy
- Performance metrics:
  - targetInferenceLatency (with hardwarePlatformIdentifier)
  - flopsProcessingCapabilities
  - macOpProcessingCapabilities
  - energyEstimation (joules, platform-specific)
- Data types: modelDataType (Uint8, Float32, Float16)
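To show how platform-specific latency and multi-task identifiers could drive model selection, here is a minimal sketch. The list-of-entries layout for targetInferenceLatency, the latency key, the URNs, the platform name, and the selection rule (lowest latency among eligible models) are all assumptions for illustration, not the procedure defined in the document.

```python
from typing import Optional

# Illustrative candidate list; structure and values are assumed.
candidate_models = [
    {
        "modelIdentifier": "urn:example:aiml:model:asr-small",  # hypothetical URN
        "taskIdentifier": ["ASR"],
        "modelSize": 40,                 # MB
        "framework": "ONNX",
        "modelDataType": "Uint8",
        "targetInferenceLatency": [
            {"hardwarePlatformIdentifier": "platform-a", "latency": 80},  # ms
        ],
    },
    {
        "modelIdentifier": "urn:example:aiml:model:asr-large",  # hypothetical URN
        "taskIdentifier": ["ASR", "Translation"],  # multi-task model
        "modelSize": 300,                # MB
        "framework": "ONNX",
        "modelDataType": "Float16",
        "targetInferenceLatency": [
            {"hardwarePlatformIdentifier": "platform-a", "latency": 150},  # ms
        ],
    },
]


def latency_on(model: dict, platform: str) -> Optional[float]:
    """Return the target latency for a platform, or None if not listed."""
    for entry in model["targetInferenceLatency"]:
        if entry["hardwarePlatformIdentifier"] == platform:
            return entry["latency"]
    return None


def select_model(models: list, task: str, platform: str,
                 max_latency_ms: float, max_size_mb: float) -> Optional[str]:
    """Pick the lowest-latency model that supports the task within the limits."""
    eligible = []
    for model in models:
        latency = latency_on(model, platform)
        if (task in model["taskIdentifier"]
                and latency is not None
                and latency <= max_latency_ms
                and model["modelSize"] <= max_size_mb):
            eligible.append((latency, model["modelIdentifier"]))
    return min(eligible)[1] if eligible else None
```

A model without a latency entry for the requested platform is skipped here, which reflects the point that latency figures are only meaningful for the hardware platform they were measured on.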
3. Generic Negotiation Message Format (Section A.4.4)
Defines a transport-protocol-independent message format for AI metadata exchange over data channels:
Messages Container:
- Array of Message objects (1..n cardinality)
Message Data Type includes:
- id: Unique identifier within data channel session scope
- type: Message subtype enumeration:
  - CANDIDATE_MODELS_REQUEST
  - CANDIDATE_MODELS_RESPONSE
  - AI_APPLICATION_DISCOVERY_REQUEST/RESPONSE
  - AI_APPLICATION_REQUEST/RESPONSE
  - AI_MODEL_SELECTION_REQUEST/RESPONSE
- payload: Type-dependent message content
- sessionId: Associated multimedia session identifier
- sendingAtTime: Wall clock timestamp (optional)
Because the format does not rely on any transport-specific feature, the same messages can be carried over various transport protocols (e.g., HTTP) without additional constraints.
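The generic message format can be sketched as a small Python data class plus a serializer for the container. The top-level key "messages", the ISO 8601 timestamp string, the sample identifiers, and the helper name to_wire are assumptions for this example; the five field names come from the Message data type described above.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Message:
    id: str                               # unique within the data channel session
    type: str                             # e.g. "CANDIDATE_MODELS_REQUEST"
    payload: dict                         # content depends on the message type
    sessionId: str                        # associated multimedia session
    sendingAtTime: Optional[str] = None   # optional wall clock timestamp


def to_wire(messages: List[Message]) -> str:
    """Serialize a Messages container (1..n Message objects) to JSON."""
    if not messages:
        raise ValueError("container must hold at least one message")
    ids = [m.id for m in messages]
    if len(set(ids)) != len(ids):
        raise ValueError("message ids must be unique")
    body = []
    for message in messages:
        entry = asdict(message)
        if entry["sendingAtTime"] is None:
            del entry["sendingAtTime"]    # omit the optional field when unset
        body.append(entry)
    return json.dumps({"messages": body})  # "messages" key is an assumption


request = Message(
    id="msg-1",
    type="CANDIDATE_MODELS_REQUEST",
    payload={"endpointIdentifier": "ue-001"},   # hypothetical payload
    sessionId="session-42",
    sendingAtTime="2026-01-01T12:00:00+00:00",  # ISO 8601 format assumed
)
wire = to_wire([request])
```

The resulting string could be sent as an HTTP body or over a data channel unchanged, which is the transport independence the format aims for.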
Key Design Principles
- Separation of concerns: Application, endpoint, and model metadata are independently defined
- Static vs. dynamic distinction: Enables efficient capability negotiation and runtime adaptation
- Protocol independence: Generic message format supports multiple transport options
- Comprehensive metadata: Covers functional, performance, energy, and accuracy requirements
- Multi-task support: Models can serve multiple AI/ML tasks
- Platform-specific metrics: Latency and energy measurements tied to hardware platforms