←

Read-only Review: 10.5

AI_IMS-MED (Media aspects for AI/ML in IMS services)

Meeting:
Generated: 2026-04-07 09:47:54

Show columns:

TDoc Number	Title	Source	Summary	Proposals	Comments
S4-260100 (pdf)	Network, QoS and UE Considerations for client side inferencing AIML/IMS	Huawei Tech.(UK) Co.. Ltd	Network, QoS and UE Considerations for Client Side Inferencing AIML/IMS 1. Introduction This contribution addresses network-related issues in the previously discussed call flow for client/UE side inferencing (S4aR260004a). The main concerns relate to steps 12-16 of the draft call flow, which involve model download and deployment for UE-based AI inferencing. 2. Network Related Issues 2.1 Model Size Problem Identification: - TR 26.927 indicates models are approximately 40 MB (Table 6.6.2-1) - Current publicly available models for practical use cases are significantly larger (100+ GB) - Example: Hunyuan Image generation model set is 169 GB (available on Hugging Face) - Simple language models (e.g., single language translation) are approximately 100 MB Required Action: Details on supported model sizes and required response times need to be defined. 2.2 Network QoS Support Problem Identification: - For real-time request-response (500 ms or even 1000 ms), current mobile networks cannot support required bit-rates - Example calculation: 100 GB model with 1000 ms response time requires ~800 Gbps - Such bit-rates are not realistic in current mobile networks Required Actions: - Define supported model size and transfer time requirements - Identify appropriate QoS profile (5QI) - If no suitable 5QI exists, request SA2 to update 5QI specifications for this use case 2.3 Compression and UE Support Problem Identification: - TR 26.927 details NN compression with 2-20% compression ratios - Even with compression, resulting bit-rates remain infeasible for mobile networks - No UE capabilities for NN codec support have been defined - Cannot assume UE support for such capabilities Required Action: Clarify whether NNC is required for client-side inferencing and document related requirements. 2.4 Protocol Support Issue Problem Identification: - S4aR260004a mentions HTTP for download - HTTP/TCP is suboptimal for large, quick data downloads due to: - TCP slow start - Congestion control introducing additional latency - Tail latency from head-of-line blocking Proposed Solutions: - Consider alternative protocols: - RTP protocol with 3GPP burst QoS - QUIC (has bindings to 5G XRM framework for improved QoS support) - Leverage 3GPP XRM QoS support for bursty data transfer (HTTP/3 with QUIC or RTP) 2.5 Caching and Bandwidth Wastage Problem Identification: - Current call flow indicates model download for every request - No explicit caching or model update mechanism - Results in: - Huge bandwidth wastage - Impossible network bit-rate requirements in current mobile networks Required Action: Include model updates and caching mechanisms in call flow rather than requesting new model from network each time. 3. Suggested Way Forward The contribution emphasizes that the intention is not to exclude UE inferencing (as agreed for the work item), but to clarify limitations and requirements before agreeing to a CR detailing such call flows. Proposed Actions: Scope Limitation: Add note that client-side inferencing only works for simple cases: Explicitly exclude complex VLM/LLM Define maximum model size limits Specify applicable use cases for smaller models Latency Requirements: Clarify end-to-end latency requirements and derive required bit-rate/latency and loss profiles Protocol Clarification: Clarify correct protocol usage (typically not HTTP/TCP) to support the use case with required latency SA2 Coordination: Ask SA2: How such bursts can be supported Whether new QoS profile is needed or if existing profiles suffice Codec Support: Clarify required neural network codec support (if any) for the UE Caching Mechanism: Add caching and model update mechanisms in call flow to avoid downloading model for each task	Proposal 1: Add a note that this can only work for very simple cases excluding complex VLM/LLM explicitly in the text and limited to a model size, and what use cases this can be used for that can use such smaller models. Proposal 2: Clarify end-end latency requirements and derive required bit-rate/latency and loss profiles Proposal 3: Clarify the correct protocol usage to support this use case and the required latency, typically not HTTP/TCP. Proposal 4: Ask SA2 how such burst can be supported and if a new QoS profile is needed or if existing. Proposal 5: Clarify the required support of neural network codec if any for the UE Proposal 6: Consider adding caching and model updates in the call flow to avoid downloading a model for each task.	manager: [Technical] The core premise in §2.2 (downloading a 100 GB model within 500–1000 ms) is not a realistic or relevant requirement for UE-side inferencing; the call flow should instead assume pre-provisioned/on-device models or background download over minutes/hours, otherwise the derived “800 Gbps” conclusion is a strawman that will derail the discussion. [Technical] §2.1 cites TR 26.927 Table 6.6.2-1 (~40 MB) but then jumps to “public models 100+ GB” without mapping to the IMS/AIML use cases under discussion; the contribution needs to distinguish between (i) UE inference models intended for mobile deployment and (ii) data-center class generative models, otherwise the “required action” to define supported sizes is ungrounded. [Technical] The text conflates “real-time request-response latency” with “model transfer time” (§2.2); in most architectures the model download is not on the critical path of a single inference transaction, so QoS/latency requirements should be split into (a) inference transaction latency and (b) model acquisition/update latency. [Technical] Requesting SA2 to “update 5QI specifications” (§2.2) is premature and underspecified: the contribution does not identify whether the model transfer is best treated as GBR/non-GBR, what packet delay budget/jitter/loss are needed, or whether existing 5QIs (e.g., for TCP-based data) are insufficient; without concrete QoS characteristics, SA2 cannot act. [Technical] §2.4’s protocol critique is internally inconsistent: proposing RTP for “large, quick data downloads” is atypical and ignores reliability, congestion control, and content integrity needs; if the issue is TCP behavior, the more relevant comparison is HTTP/2 vs HTTP/3 (QUIC) and/or segmented download with application-layer pacing rather than RTP. [Technical] The claim that QUIC “has bindings to 5G XRM framework for improved QoS support” (§2.4) is vague and risks being incorrect/misleading in 3GPP terms; if the intent is to leverage 5G QoS (5QI/ARP/reflective QoS) or ATSSS, the contribution should reference the specific 3GPP mechanisms and how they apply to QUIC flows. [Technical] §2.3 cites “2–20% compression ratios” from TR 26.927 but does not clarify whether this refers to bitrate reduction, model size reduction, or accuracy trade-offs; without specifying the compression target and acceptable quality loss, the conclusion “still infeasible” is not technically supported. [Technical] The “No UE capabilities for NN codec support have been defined” point (§2.3) is valid, but the required action is incomplete: it should propose where UE capability signaling would live (e.g., NAS/IMS/UE capability exchange) and what minimum interoperability baseline is assumed if NNC is optional. [Technical] §2.5 asserts the call flow “indicates model download for every request,” but does not quote the exact steps 12–16 behavior; if the original flow already implies model reuse or versioning, this criticism may be inaccurate—please pinpoint the exact normative/diagram text that mandates per-request download. [Technical] The proposed “scope limitation” to exclude “complex VLM/LLM” (§3.1) is not actionable without objective criteria (parameter count, model size on disk, compute class, or use-case categories); otherwise it becomes a subjective exclusion that is hard to standardize. [Technical] The contribution focuses almost entirely on downlink throughput but omits other critical feasibility constraints for UE inferencing (compute, memory footprint, thermal/power, storage, and model integrity/attestation); these are central to whether UE-side inferencing is viable and should be at least acknowledged if the goal is “network, QoS and UE considerations.” [Editorial] Several “Required Action(s)” are phrased as open-ended requests (“need to be defined”, “clarify correct protocol usage”) without proposing concrete spec text, assumptions, or a target spec/TR clause; as written it reads more like a discussion note than a contribution ready to drive a CR. [Editorial] Terminology is inconsistent/unclear (e.g., “client/UE side inferencing”, “client-side inferencing”, “UE-based AI inferencing”, “AIML/IMS”) and should be aligned with the agreed WI terminology and the referenced flow (S4aR260004a) to avoid ambiguity about the architecture being critiqued. 2026-02-09 04:01
S4-260112 (pdf)	[AI_IMS-MED] AI/ML media processing and task updating	Nokia	Summary of S4-260112: AI/ML Media Processing and Task Updating Document Overview This contribution proposes updates to AI/ML media processing procedures and task updating call flows for IMS Data Channel (DC) applications. It builds upon TR 26.927 and TS 23.228 Annex AC, incorporating agreements from SA4#134 (S4-252075) and addressing feedback from SA2's reply LS on AIML for Media. Main Technical Contributions 1. Refinement of AI/ML Task Processing Call Flows Issues Identified with TR 26.927 Architectural ambiguity: Split media processing location (UE vs network) was unclear DC AS introduction timing: Not properly specified after ADC establishment Confusing step numbering: Parallel options (5a, 5b) caused confusion MRF references: MRF should be removed as SA2 clarified it doesn't play a role in Data Channel (removed from TS 23.228) Incomplete task update procedures: Steps 9-10 lacked detail on how UE updates AI/ML inference tasks Updated Call Flow Structure (Steps 1-23) The revised flow incorporates common call flows agreed in S4-252075: Initial Setup (Steps 1-13): - UE1 registers to IMS with AI/ML capability indication - MMTel session establishment between UE1 and UE2 - Bootstrap Data Channel (BbDC) establishment between UE1 and MF - DCSF creates DC application list based on: - Subscription list filter - UE static capabilities - Application list includes AI service information (e.g., intelligent translation service) - UE1 downloads application list and selects application - Application Data Channel (AaDC) establishment between UE1 and DC AS - Task selection and AI/ML model selection Media Processing Execution (Steps 14-16): - Media session runs over MMTel session - UE1 executes selected task and transmits input media streams - Network runs inference and forwards processed streams to UE2 (or UE1, or both depending on application) - Different alternatives supported based on inference location (local/remote/split) 2. Task Reselection and Update Mechanisms Task Reselection (Step 17) Trigger: New actions in applications or other triggers during session Process: UE1 reselects tasks from previously downloaded task metadata Flow: Returns to step 10 (task selection from app manifest) Task Update (Steps 17-23) Use Case: New requirements during running IMS session not fulfilled by downloaded tasks Example: New callee (UE3) joins call speaking new language requiring additional translation Update Procedure: - Step 17: UE1 sends UPDATE Task request over ADC with: - Task ID - New parameters - Start time (when to apply new parameters) - Optional additional parameters Steps 18-19: MF checks request and reconfigures task MF may reject invalid requests MF may establish new application DC or media flows if needed MF may stop existing flows no longer needed MF forwards UPDATE Task request to DC AS if needed DC AS reconfigures task according to new parameters Alternative Execution Paths (Steps 20-22): - Alt a - Local Inference: - DC AS sends UPDATE Task response (including new models) to UE1 via MF - UE1 runs updated inference task locally Alt b - Remote/Split Inference: DC AS sends UPDATE Task response to UE1 via MF UE1 transmits media streams to network for inference Network runs inference and forwards processed streams to UE2 Step 23: Remote UE (UE2) informed when task updates impact it 3. Task Control Messages 3.1 START Task Message Purpose: Request to start an inference task (for split or remote inference) Message Content: - `id`: Message identifier - `type`: "urn:3gpp:aiml:start-task" - `task_id`: Task identifier (e.g., "speech-to-speech-translation") - `parameters`: Task-specific parameters (e.g., inputLanguage, outputLanguage) - `input`: Protocol and media stream identifier (mid from SDP) - `output`: Protocol and media stream identifier - `timestamp`: Timestamp of request Response Message: - `task_session_id`: Unique identifier for specific task instance - `response_code`: Status (e.g., "200 OK") - Echoes task_id and parameters Media Stream Identification: Uses "mid" identifier from RFC 8843 as included in SDP offer/answer. Multiple RTP streams identified by comma-separated mid values. 3.2 UPDATE Task Message Purpose: Update existing task that has already been started (requires prior 200 OK response to START Task) Use Cases: - Update model, parameters, input or output of existing task - Indicate new input/output stream (e.g., new UE added to call) Message Content: - `id`: Message identifier - `type`: "urn:3gpp:aiml:update-task" - `task_id`: Task identifier - `task_session_id`: References task from START Task Response - `parameters`: Updated parameters - `output`: Updated output stream information - `timestamp`: Timestamp of request Response Message: - `task_session_id`: Same as in request - `response_code`: Status indication - Confirms task_id Key Technical Clarifications Inference Location Flexibility The specification supports three inference deployment models: 1. Local inference: AI model downloaded and executed in UE 2. Remote inference: Inference executed in network (MF) 3. Split inference: Inference split between UE and network Message Exchange Protocol Task control messages exchanged over Application Data Channel (AaDC) Messages use structured format with JSON-like syntax Unique identifiers (task_session_id) maintain task context across updates Network Entity Roles DCSF: Creates and filters DC application list based on subscription and UE capabilities MF: Manages media flows, coordinates with DC AS, executes inference tasks DC AS: Provides AI applications and models, reconfigures tasks MRF: Explicitly removed from procedures (per SA2 clarification) Editorial Notes Network functional entity for inference task execution depends on SA2's reply LS Further details on message formats to be provided in future contributions	Extracted Proposals Based on my review of the document, there are no explicit proposals in this 3GPP contribution. The document contains sections titled "Introduction," "Discussion," and "Proposal," but the "Proposal" section (Section 3) does not contain any text formatted as a proposal. Instead, it only states: "We propose to add the following changes to the main CR, or base CR to TS 26.264." This is followed by technical content showing proposed changes to specifications (marked as "First change" and "Second change"), but these are not formatted as formal proposals in any of the standard formats (Proposal X:, Proposal X., Proposal:, etc.). The document is a discussion and agreement document that proposes technical changes to specifications, but does not contain formally marked proposals in the expected format.	manager: [Technical] The contribution still assigns AI/ML inference execution to the MF (“MF manages media flows… executes inference tasks”), but in SA2/TS 23.228 the MF role and its capabilities for AI/ML execution are not clearly standardized; you need to align the execution entity with the agreed IMS DC architecture (e.g., DC AS vs other function) and avoid introducing compute semantics for MF without normative backing. [Technical] The call flow mixes BbDC to MF and AaDC to DC AS while also stating “Task control messages exchanged over AaDC,” yet UPDATE is sent “over ADC” and may be forwarded by MF; the spec impact is unclear unless you precisely define which DC (BbDC vs AaDC) carries which control messages and whether MF is a DC endpoint or only a relay. [Technical] “Split inference” is claimed as supported, but no concrete procedure is provided for partitioning, synchronization, or model/component distribution between UE and network (e.g., how intermediate representations are transported, how latency/jitter is handled, and how the split point is negotiated), making the split option currently non-actionable. [Technical] The UPDATE procedure introduces “start time (when to apply new parameters)” but there is no definition of the time base (RTP timestamp, NTP, IMS time, or relative time), nor how both UE and network ensure consistent switchover, which is critical to avoid media glitches and inconsistent translation states. [Technical] The task control messages define `input`/`output` using SDP `mid`, but `mid` identifies an m-line, not a specific direction/source in multi-party or mixer scenarios; for conferencing/add/remove participants you likely need additional identifiers (SSRC, RID, participant identity, or DC application-level stream IDs) and rules for multiple simultaneous inputs/outputs. [Technical] The document says multiple RTP streams are identified by “comma-separated mid values,” which is not aligned with typical JSON typing and lacks constraints (ordering, uniqueness, mapping to codecs); you should define an array structure and normative mapping to SDP offer/answer and subsequent re-INVITE/UPDATE procedures. [Technical] The UPDATE use case “new UE added to call” implies new media descriptions and potentially new `mid` values, but the procedure does not specify the required SIP/SDP signaling (re-INVITE/UPDATE) and how task update sequencing relates to SDP completion (race conditions between task update and new m-line establishment). [Technical] The contribution states “UE1 registers to IMS with AI/ML capability indication,” but does not specify the mechanism (feature tags, IMS media feature tags, SIP header, or 3GPP-defined capability exchange); without a concrete method it risks conflicting with existing IMS capability negotiation procedures. [Technical] The response codes are shown as HTTP-like (“200 OK”) inside JSON messages; unless the DC protocol explicitly reuses HTTP semantics, you should define a 3GPP-specific result code space or map to existing DC/IMS error handling to avoid ambiguity and inconsistent implementations. [Technical] Security and authorization are not addressed for downloading models and updating tasks (e.g., ensuring only authorized UEs can request model delivery, integrity of models, privacy of media sent for inference, and policy controls based on subscription list filtering), which is essential given the introduction of AI model distribution and media offload. [Technical] Step 23 “Remote UE (UE2) informed when task updates impact it” is underspecified: it does not define whether UE2 is informed via SIP/SDP, DC messaging, or application signaling, nor what UE2 must do (e.g., render changes, accept new media, consent), leaving interoperability gaps. [Editorial] The document references “TR 26.927 and TS 23.228 Annex AC” but does not provide exact clause numbers for the proposed modifications; for a contribution intended to update specs, you should cite the precise sections/figures being replaced and provide proposed normative text or updated call-flow diagrams. [Editorial] Terminology is inconsistent: “ADC,” “AaDC,” “Application Data Channel,” and “BbDC” are used interchangeably in places; you should normalize terms and abbreviations and ensure they match the definitions in TS 23.228/related DC specs. [Editorial] The message format is described as “JSON-like syntax” with URN `type` values, but no formal schema, field cardinality, optionality, or extensibility rules are given; even at TR level, a consistent pseudo-schema would reduce ambiguity and prevent incompatible interpretations. 2026-02-09 04:02
S4-260118 (pdf)	[AIML_IMS-MED] Base CR for TR 26.114	Samsung Electronics Iberia SA	3GPP Technical Document Summary: CR 0607 to TS 26.114 Document Information CR Number: 0607 Specification: TS 26.114 v19.2.0 Category: B (addition of feature) Release: Rel-20 Work Item: AIML_IMS-MED Source: Samsung Electronics Iberia SA Purpose and Rationale This Change Request introduces stage 3 specifications for AI/ML processing capabilities in IMS services. The CR addresses the missing technical specifications for AI/ML data delivery and signaling mechanisms required to support AI/ML-enhanced IMS services in Release 20. Main Technical Contributions 1. References, Terms, and Abbreviations (Clauses 2, 3.1, 3.3) Updates to include AI/ML-specific terminology, definitions, and abbreviations relevant to IMS services. Specific content marked as Editor's Notes for future completion. 2. New Annex AC: AI/ML Assisted Media Processing for MTSI A comprehensive new normative annex is introduced covering all aspects of AI/ML integration with MTSI. AC.1 Introduction Provides introductory material on AI/ML capabilities in IMS services. AC.2 Terminal Architecture Defines updates to terminal architecture to accommodate: - Inference engine - AI/ML models - Intermediate data handling AC.3 End-to-End Reference Architecture Potential updates to the end-to-end reference architecture for AI/ML support. Notes indicate possible liaison requirements with SA2. 3. AI/ML Call Flows (AC.4) AC.4.1 AI/ML Model Delivery for Device Inferencing Detailed 15-step call flow for AI/ML model delivery and execution: Key Steps: 1. Session Establishment: MMTel service establishment 2. Bootstrap Data Channel (BDC) Setup: Established between UE and MF per TS 23.228 3. Application Discovery: UE requests application list via HTTP over BDC 4. Application List Creation: DCSF generates user-specific DC application list with metadata including: - Generic app information (description, ID, URL) - AI-specific information (AI feature tags, task descriptions) 5. Application Selection: User selects app based on AI service descriptions 6-9. Application Download: Selected AI application downloaded from DCSF via MF to UE, including AI task metadata (task manifest) 10. Task Selection: User presented with AI task list and selects desired tasks 11. Model Request: Selected tasks and models communicated to MF via: - BDC: HTTP GET with task/model URLs - ADC: AI Model Selection Request with model URNs 12. Model Retrieval: MF fetches AI models from either: - 12a: DCAR via DCSF - 12b: DC AS 13. Model Download: UE downloads AI models from MF via: - BDC: HTTP response with AI models as resource - ADC: AI Model Selection Response with model data 14. Inference Execution: Tasks executed on UE 15. Task Reselection: User/UE may reselect tasks during session using received metadata Open Issues Identified: - Whether MF needs to understand AI task semantics (FFS) - Application types that can be handled - Large model handling mechanisms AC.4.2 Network Inferencing Placeholder for network-based inference scenarios. AC.4.3 Split Inferencing Placeholder for distributed inference scenarios across UE and network. 4. AI/ML Capabilities (AC.5) Defines capabilities and requirements for: - AC.5.1 UE Capabilities: Device-side AI/ML requirements - AC.5.2 Network Capabilities: Network-side AI/ML requirements 5. AI/ML Formats (AC.6) Specification of formats for: - AI/ML models - Intermediate data 6. AI/ML Metadata (AC.7) Definition of necessary metadata structures for AI/ML operations, including task manifests referenced in the call flows. 7. Negotiation and Signaling (AC.8) Procedures for: - Model delivery negotiation - Inferencing coordination - General AI/ML media processing signaling 8. Data Channel Transport (AC.9) Specification of AI/ML data transport mechanisms: - What data to transport over BDC (Bootstrap Data Channel) - What data to transport over ADC (Application Data Channel) - Transport procedures and protocols Key Technical Entities MF: Media Function DCSF: Data Channel Selection Function DCAR: Data Channel Application Repository DC AS: Data Channel Application Server BDC: Bootstrap Data Channel ADC: Application Data Channel Implementation Status Most technical content is marked with Editor's Notes, indicating this is a skeleton CR establishing the structure for future detailed specifications. The most complete section is AC.4.1 (AI/ML model delivery for device inferencing), which provides a concrete call flow example.	Based on my analysis of the provided 3GPP document, there are no proposals explicitly marked as such in this document. The document is a Change Request (CR) form for TS 26.114 related to AI/ML for IMS services. While it contains several "Editor's Notes" indicating areas that need to be updated or clarified, these are not formatted as proposals. The document primarily consists of: A CR form header with metadata Reason for change, summary of change, and consequences sections Multiple editor's notes describing future work needed A call flow diagram (AC.4.1) with numbered steps None of these elements are explicitly labeled as "Proposal" in any of the formats specified (Proposal X:, Proposal X., Proposal:, Proposal., or Proposal ).	manager: [Technical] The CR is largely non-actionable as a normative change: “most technical content is marked with Editor’s Notes” and multiple placeholders (AC.4.2/AC.4.3) mean TS 26.114 would gain an annex that cannot be implemented or verified, which is not appropriate for a Category B CR. [Technical] The proposed call flow in AC.4.1 relies on entities/procedures (MF, DCSF, DCAR, DC AS, BDC/ADC) “per TS 23.228” without ensuring those entities and interfaces are actually defined for IMS MTSI in the referenced specs; TS 26.114 cannot normatively introduce new core network functions and HTTP-based procedures without tight alignment to SA2/CT. [Technical] Step 11/13 introduces “ADC: AI Model Selection Request/Response” signaling, but TS 26.114 does not define a new application-layer signaling protocol on ADC; the CR must specify whether this is SIP/SDP, MSRP, HTTP, or a new payload format, and how it is negotiated and secured. [Technical] The flow mixes “application download” and “model download” via MF as an intermediary (steps 6–9, 12–13) without clarifying whether MF is acting as a proxy/cache, whether this is allowed by IMS architecture, and how content integrity, authorization, and accounting are handled end-to-end. [Technical] Security and trust are missing: there is no normative requirement for model signing, integrity verification, provenance, or authorization (UE trusting MF/DCSF/DCAR), which is critical when downloading executable models and “AI applications” into the terminal. [Technical] The CR does not define how AI/ML capabilities are negotiated within MTSI session establishment (e.g., SDP attributes, feature tags, SIP option-tags), yet claims “negotiation and signaling (AC.8)”—without this, interoperability and backward compatibility with non-AI/ML UEs cannot be ensured. [Technical] The relationship to existing MTSI media processing and data channel usage in TS 26.114 is unclear: the annex appears to introduce new “AI/ML assisted media processing” but does not specify how it interacts with existing codecs, RTP/RTCP, media handling, and data channel procedures already standardized. [Technical] AC.4.1 is user-driven (“User selects app/tasks”) but TS 26.114 is a stage 3 protocol spec; the CR should define UE behavior and protocol triggers independent of UI assumptions, and specify machine-driven selection/updates for unattended operation. [Technical] Large model handling is explicitly FFS, yet model size impacts transport choice, fragmentation, resumption, caching, and latency; without at least baseline requirements (chunking, range requests, resume, max sizes), the proposed BDC/ADC transport is incomplete. [Technical] The CR introduces “AI/ML formats” (AC.6) and “intermediate data” but does not constrain formats (e.g., ONNX/TFLite) or define MIME types, encapsulation, versioning, and compatibility rules; this will lead to non-interoperable implementations. [Technical] The end-to-end reference architecture (AC.3) is described as “potential updates” and “possible liaison requirements with SA2,” indicating architectural dependencies are unresolved; TS 26.114 should not proceed with normative annex text until SA2 architecture and responsibilities are agreed. [Editorial] The annex is described as “comprehensive new normative annex,” but the presence of Editor’s Notes, placeholders, and FFS items contradicts “normative”; the CR should clearly separate informative background from normative requirements and remove unfinished notes before approval. [Editorial] Terminology is inconsistent/undefined in the summary: “AI application,” “task manifest,” “AI feature tags,” “model URNs,” and “intermediate data” need precise definitions in clauses 3.1/3.3 and consistent use throughout AC.4–AC.9. [Editorial] The step numbering and dual-path descriptions (e.g., 12a/12b, BDC vs ADC in steps 11 and 13) need clearer conditional logic (“shall/should/may” conditions) to avoid ambiguity about which transport and signaling are mandatory vs optional for compliance. 2026-02-09 04:03
S4-260127 (pdf)	[AIML_IMS-MED] Further details on DC app list	Samsung Electronics Iberia SA	Further Details on DC Application List Introduction This contribution consolidates relevant text from existing 3GPP specifications (TS 23.228, TS 29.176, and TS 26.114) regarding the Data Channel (DC) application list request mechanism, particularly focusing on root URL replacement procedures. The document aims to clarify how these procedures are already defined in the context of Bootstrap Data Channel (BDC) setup and proposes their reuse for AIML_IMS-MED work. Relevant Specifications Overview Bootstrap Data Channel Setup Signalling (TS 23.228 Clause AC.7.1) The specification defines the complete BDC establishment procedure for person-to-person use cases where the Media Function (MF) anchors the bootstrap data channel: Key Steps: Steps 1-2: UE#1 sends SIP INVITE with initial SDP containing bootstrap DC offers. IMS AS validates user subscription and determines if DCSF notification is required. Steps 3-6: IMS AS notifies DCSF via `Nimsas_SessionEventControl_Notify` with session parameters. DCSF determines policy, reserves MDC1 media information for both originating and terminating sides, and responds with `Nimsas_MediaControl_MediaInstruction` containing: MDC1 media endpoint addresses DC Stream ID Replacement HTTP URL representing the application list offered via MDC1 interface Steps 7-10: IMS AS selects MF and invokes `Nmf_MRM_Create` to allocate DC media resources. The request includes information for both Mb and MDC1 interfaces. MF responds with negotiated media resource information. Steps 11-19: SDP negotiation completes through terminating network, with similar DCSF/MF resource allocation on terminating side. Bootstrap data channels are established. Steps 20-24: Critical application list request flow: UEs send application request messages to MF via established bootstrap data channel MF replaces the root URL with the replacement URL received in step 8 MF forwards message to DCSF media endpoint DCSF provides application list and DC applications to UEs based on capabilities and choices through MF Either UE may select applications from local or remote DCSF (subject to DCSF policy) Media Control Service Operation (TS 23.228 Clause AA.2.4.3.2) The `Nimsas_MediaControl_MediaInstruction` service operation defines the MediaInstructionSet structure: DC Media Specification includes: Media proxy configuration (HTTP or UDP) MDC1/MDC2 media endpoint address Replacement HTTP URL per stream ID allocated by application layer representing the application list (e.g., graphical user interface) provided to IMS subscriber via MDC1 interface (used only in BDC establishment) Data Channel Mapping and Configuration information DC Interworking indication Data Channel Port and SCTP Port Media Instructions supported: - TerminateMedia - OriginateMedia - TerminateAndOriginateMedia - UpdateMedia - DeleteMedia - RejectMedia MF Resource Management (TS 29.176 Clause 5.2.2.2) The `Nmf_MRM_Create` service operation defines how NF service consumer (IMS AS) requests media context creation: For DC media resource type, the request includes: Media proxy configuration in `mediaProxyConfig` attribute Data channel mapping and configuration in `streams` attribute (SCTP stream ID, subprotocol, order, maxRetry, maxTime, priority) Remote SCTP and DTLS endpoint information in `remoteDcEndpoint` Optional maximum message size For bootstrap data channel specifically: Remote MDC1 media specification in `remoteMdc1Endpoint` attribute within `Mdc1Info` data type Replacement HTTP URL for each streamId allocated by application layer representing the application list offered to IMS subscriber via MDC1 interface For P2A/A2P application data channel: Remote MDC2 media specification in `remoteMdc2Endpoint` attribute within `Mdc2Info` data type Data Channel Application Definition (TS 26.114 Clause 6.2.10.1) Data channel application consists of: - HTML web page including JavaScript(s) - Optionally image(s) and style sheet(s) Bootstrap data channel is defined as: - Data channel used to retrieve DC application(s) for DCMTSI client - Data channel stream ID below 1000 - Uses HTTP protocol as data channel subprotocol - Application accessible at HTTP root ("/") URL describes GUI and logic for further data channel usage - Authority (host) part of URL and "Host" HTTP header shall be ignored on reception and set to empty value by DCMTSI client Discussion The complete flow for application list handling is already specified: DCSF provides replacement HTTP URL to IMS AS in MediaInstructionSet (representing application list) IMS AS forwards replacement HTTP URL to MF during resource allocation via `Nmf_MRM_Create` UE sends HTTP GET request for application list to MF via bootstrap data channel MF performs root URL replacement using the replacement HTTP URL received from IMS AS MF forwards request to DCSF media endpoint (MDC1) DCSF provides application list and selected DC applications to UE through MF The specifications explicitly state that "the details of how to provide the application list to the UE and how to use it by the UE are not defined in TS 23.228," but the transport mechanism and URL replacement procedures are fully defined. Proposal From the UE perspective, the following procedures are already well-defined in TS 23.228 as part of BDC setup signalling: Request of an application list Download of the application list Request of a selected application Download of the selected application For AIML_IMS-MED work: Reuse these existing procedures and HTTP protocol for the same purposes Capability exchange negotiation between UE and MF (e.g., for task and/or model selection) should happen after the selection and download of a DC application Capability exchange should occur via an application data channel established for that specific application This approach leverages existing standardized mechanisms and maintains consistency with current IMS DC architecture.	Proposal 1: From the perspective of a UE, it is clear that the procedures and service operations concerning the request of an application list, download of the application list, request of a selected application, and download of the selected application, are already well defined in TS 23.228 as part of the BDC setup signalling procedures. Proposal 2: We proposed to re-use these procedures and the HTTP protocol for the same purposes in AIML_IMS-MED. Proposal 3: The negotiation of any capability exchange between the UE and the MF (e.g. for task and/or model selection) should happen after the selection and download of a DC app, via an application data channel application for that app.	manager: [Technical] The contribution asserts that “request/download of the application list” and “request/download of a selected application” are “already well-defined in TS 23.228”, but TS 23.228 explicitly says the details of providing/using the application list are not defined; you need to distinguish clearly between (a) transport anchoring and URL rewriting behavior vs (b) application-layer semantics (HTTP resources, formats, selection procedure), which are not fully specified. [Technical] The “root URL replacement” description is oversimplified: TS 26.114 states the application is accessible at HTTP root “/” and that the authority/Host are ignored by the DCMTSI client, but the contribution doesn’t explain how MF should rewrite absolute URLs, relative paths, query strings, fragments, or non-GET methods—without this, “replacement URL” handling is ambiguous and may break real HTTP interactions. [Technical] The document claims MF replaces the root URL “with the replacement URL received in step 8”, but in the earlier flow you describe DCSF providing it in steps 3–6 and IMS AS forwarding it in `Nmf_MRM_Create`; the step numbering and ownership of the URL parameter are inconsistent and could mislead normative behavior (which interface carries what, and when). [Technical] Reuse for AIML_IMS-MED is proposed without checking applicability constraints from TS 26.114 (e.g., bootstrap DC uses HTTP subprotocol, stream ID < 1000, Host header handling); if AIML_IMS-MED needs different subprotocols, larger stream IDs, or non-HTTP payloads, the “reuse as-is” claim is not technically justified. [Technical] The proposal “capability exchange negotiation between UE and MF should happen after selection and download of a DC application” conflicts with the stated DCSF policy/capability-based provisioning (“based on capabilities and choices”); if capability affects which application list entries are offered, it may need to occur before or during application list retrieval, not strictly after. [Technical] The contribution treats “application list offered via MDC1 interface” as a single replacement URL per stream, but doesn’t address multi-stream bootstrap cases or multiple application lists (e.g., per service, per media component); the mapping between streamId and replacement URL needs to be explicit to avoid cross-stream confusion. [Technical] Security and authorization implications of URL rewriting are not discussed: if MF rewrites requests to DCSF endpoints, the contribution should at least acknowledge how TLS/DTLS termination, origin policy, and access control are preserved (otherwise “reuse” may introduce new trust assumptions for AIML content delivery). [Technical] The statement “Either UE may select applications from local or remote DCSF (subject to DCSF policy)” is presented as part of the “already specified” flow, but no concrete normative reference or condition is given; this is a potentially significant behavioral claim that should be backed by exact clause text or removed. [Editorial] Several interface names/operations are used without consistent capitalization or exact naming (e.g., `Nimsas_SessionEventControl_Notify`, `Nimsas_MediaControl_MediaInstruction`, `Nmf_MRM_Create`); for a standards contribution, cite the exact service operation names and clause numbers to avoid misquoting. [Editorial] The contribution mixes “MDC1 media endpoint address”, “DCSF media endpoint”, and “MDC1 interface” loosely; tighten terminology to match TS 23.228/29.176 data types (e.g., `remoteMdc1Endpoint`, `Mdc1Info`) so it’s clear whether you mean IP:port, an HTTP URL, or an NF service endpoint. [Editorial] The “Discussion” section claims “transport mechanism and URL replacement procedures are fully defined” but earlier you quote that details are not defined; rephrase to avoid overclaiming and explicitly scope what is defined (signalling parameters and MF forwarding behavior) vs what remains unspecified (content format, selection logic, HTTP resource structure). [Editorial] The document reads like a consolidation note but ends with a “Proposal” that would require normative changes for AIML_IMS-MED; it should state explicitly whether you are requesting spec changes, a work item alignment, or merely recording that existing mechanisms could be reused. 2026-02-09 04:03
S4-260129 (pdf)	[AIML_IMS-MED] Call flow for split inferencing	Samsung Electronics Iberia SA	Summary of S4-260129: Call Flow for Split Inferencing Document Information Source: Samsung Electronics Co., Ltd. Meeting: TSG-SA WG4 Meeting #135 (February 2026, Goa, India) Work Item: AIML_IMS-MED Purpose: Approval of call flow for split inferencing Main Technical Contribution This document proposes a detailed call flow for split inferencing in IMS-based AI/ML services, where AI model execution is distributed between the UE and network elements (MF - Media Function). The contribution is intended for inclusion in clause AC.4.3 of the base Change Request. Split Inferencing Call Flow Session Establishment and Bootstrap (Steps 1-2) MMTel service establishment Bootstrap Data Channel (BDC) establishment between UE and MF per TS 23.228, clause AC.7.1 Application Discovery and Selection (Steps 3-6) Application List Request: UE requests available DC applications from MF via HTTP over BDC MF Routing: MF replaces root URL with replacement URL and forwards to DCSF Application Metadata Creation: DCSF creates user-specific DC application list based on: User subscription information Application metadata including: Generic app information (description, app ID, URL) AI-specific information (AI feature tag indicating requirements, AI task descriptions) Metadata Delivery: DCSF provides application list URL and metadata to UE via MF User Selection: User selects application based on AI service description and AI task annotations Application Download (Steps 7-9) UE requests selected application from MF MF fetches AI application from DCSF Application downloaded to UE via BDC along with AI task metadata (expressed as task manifest per clause AC.7) AI Task Selection and Configuration (Steps 10-13) Task Presentation: User presented with list of AI tasks supported by application, including: Annotations from AI task metadata Task description information Information on execution endpoints supported by each task/subtask User Task Selection: User selects desired AI task(s) Application Data Channel: Established between UE and DC AS per TS 23.228, clause AC.7.2 Split Configuration Decision: UE identifies which tasks/AI models to execute locally vs. in network based on: User-selected AI tasks AI task metadata UE capabilities Configuration Request: UE requests split inference configuration from network, identifying AI models for UE and network execution Model Distribution and Configuration Response (Steps 14-16) Requirements Check: MF verifies requirements for network-side AI tasks/models; MF reallocation if requirements not met Model Fetching: MF obtains AI models for both UE and network execution from either: DCAR via DCSF (step 15a), or DC AS (step 15b - alternative) Configuration Response: MF sends response to UE including AI models for UE execution Inference Execution (Steps 17-22) SDP Re-negotiation: Associates media/data/intermediate data flows between UE and MF with corresponding tasks UE Inference: Tasks designated for UE execution are performed Data Transfer to Network: Output (media/data/intermediate data) from UE tasks sent to MF Network Inference: MF executes tasks designated for network execution Result Delivery: MF sends output (results or intermediate data for further UE processing) to UE Optional Further UE Processing: UE may execute additional tasks as part of selected AI task(s) Dynamic Task Reselection (Step 23) User/UE may reselect AI tasks during session using AI task metadata from step 9 On reselection, flow returns to step 12 (split configuration decision) Key Technical Features Metadata Framework Application metadata includes both generic and AI-specific information AI task metadata (task manifest) provides detailed information on: Task descriptions Execution endpoint options Requirements for split execution Flexibility in Execution Distribution UE determines split configuration based on capabilities and metadata Network validates requirements and may reallocate MF resources Dynamic task reselection supported during active session Model Distribution Options Multiple sources for AI model retrieval (DCAR via DCSF or DC AS) Models distributed to both UE and network as needed for split execution Media/Data Flow Management SDP re-negotiation ensures proper association of data flows with tasks Support for intermediate data exchange between UE and network for multi-stage inference pipelines	Extracted Proposals Proposal: It is proposed to include the contents of clause 2 into clause AC.4.3 of the base CR.	manager: [Technical] The flow makes the UE the primary decision point for split placement (“UE identifies which tasks/AI models to execute locally vs. in network” in steps 10–13), but it does not define how network policy, subscription restrictions, privacy constraints, or charging constraints can override UE choice; this risks contradicting typical IMS/service control principles and needs explicit arbitration rules and rejection/alternative handling. [Technical] “MF replaces root URL with replacement URL and forwards to DCSF” (step 4) is underspecified and potentially unsafe: there is no normative mechanism described for URL rewriting, origin validation, or integrity protection, and it is unclear how this aligns with TS 23.228 procedures for BDC/DC application routing without creating an open redirect or content substitution risk. [Technical] The document relies on “HTTP over BDC” for application list and download (steps 3–9) but does not state whether TLS is mandatory, how server authentication is done (MF vs DCSF vs DC AS), and how end-to-end integrity of the application package/task manifest is ensured; without this, the call flow is incomplete for a security-sensitive “download code + model” procedure. [Technical] The “AI models distributed to both UE and network” (steps 14–16) lacks any versioning, compatibility, or hash/signature binding between (a) the downloaded application, (b) the task manifest, and (c) the model artifacts; this omission can lead to mismatched model/app execution and makes rollback/update behavior undefined. [Technical] Step 14 (“MF verifies requirements… MF reallocation if requirements not met”) is not consistent with typical IMS functional splits unless the MF has explicit resource management hooks; the contribution should specify what “requirements” are (compute, GPU, latency, locality), how they are signaled, and what the UE sees when reallocation fails (error codes, alternative MF, or fallback to UE-only). [Technical] The flow introduces “execution endpoints supported by each task/subtask” (step 10) but does not define how endpoint selection maps to actual IMS entities (MF vs DC AS vs other) and how the UE discovers reachable endpoints; this is critical for interoperability and should be tied to explicit identifiers and procedures in the referenced clauses. [Technical] “Application Data Channel established between UE and DC AS per TS 23.228, clause AC.7.2” (step 12) is inserted after application download via MF, but the call flow does not explain why the UE needs a separate channel to DC AS if MF remains the mediator for model/configuration; the roles of MF vs DC AS vs DCSF in control vs data plane are ambiguous. [Technical] The SDP re-negotiation step (17) is too vague: it claims to “associate media/data/intermediate data flows … with corresponding tasks” but does not specify the SDP attributes, identifiers, or mapping rules needed to bind a given m-line/data channel to a task/subtask, nor how this interacts with existing IMS media negotiation and BDC/DC constructs. [Technical] Dynamic task reselection (step 23) “returns to step 12” but ignores the need to re-negotiate media/data flows, update model placement, and potentially revoke/replace already delivered models; without explicit state handling, this can cause inconsistent task-to-flow bindings and resource leaks. [Technical] The model retrieval alternatives (step 15a via DCAR/DCSF vs 15b via DC AS) are presented without selection criteria, authorization checks, or consistency requirements; the spec text would need to define when each path is allowed and how the UE can trust provenance regardless of source. [Technical] The contribution assumes “user selects application” and “user selects AI task(s)” (steps 6 and 11), but does not address non-interactive/automated selection, accessibility, or policy-driven selection; for IMS service behavior, the UE behavior should not be purely UI-driven in normative flow. [Editorial] References to “TS 23.228, clause AC.7.1/AC.7.2/AC.7” and “clause AC.4.3” are not verifiable as written because AC numbering appears to be document-internal; the contribution should cite stable clause numbers/titles or clearly state these are draft annex clauses in the CR baseline. [Editorial] Terminology is inconsistent/unclear: “MF (Media Function)” is not a standard IMS functional entity name in TS 23.228, and “DC application,” “BDC,” “DCSF,” and “DCAR” are used without definitions in this summary; the call flow text should align naming with the spec’s defined entities and abbreviations. [Editorial] Several steps mix normative behavior with descriptive language (“may,” “optional,” “identifies,” “presented”) without indicating which parts are mandatory for interoperability; the proposed clause should separate normative requirements (shall) from informative UI/implementation examples. 2026-02-09 04:04
S4-260180 (pdf)	[AIML_IMS-MED] Call flow for split inferencing	InterDigital Finland Oy	Comprehensive Summary of S4-260180: Call Flow for Split Inferencing Document Overview This change request proposes updates to the AIML call flow for split inferencing in IMS-based media services. It revises the previously agreed device inferencing call flow (S4aR260014) to accommodate split inferencing scenarios where AI model execution is partitioned between the UE and network-based DC AS (Data Channel Application Server). Main Technical Contributions 1. Split Inferencing Capability Indication Key Addition: - The UE now indicates split inferencing availability in the application request message sent to the MF (Media Function) when requesting the application list via the Bootstrap Data Channel (BDC) - This allows the network to understand the UE's capability to participate in distributed AI inference 2. Enhanced Application and Task Selection Application Metadata Enhancements: - Application-related metadata now includes: - Generic app information (description, app ID, URL) - AI-specific information including AI feature tags indicating AI requirements - AI task-related descriptions for user-informed selection Task Metadata: - AI task metadata is delivered with the application, potentially expressed as a task manifest - Task list presented to users includes annotations from AI task metadata - Execution endpoints supported by each task and subtask are now exposed to enable split inference decisions 3. Model Partitioning Framework Partitioning List Introduction: The CR introduces a comprehensive partitioning framework: Request Phase (Step 10): - UE requests both a model list and a partitioning list from DCAS - UE provides its capability metadata to enable appropriate partitioning options Partitioning Metadata Definition: The partitioning list/submodel partitioning metadata specifies: - Submodel identifiers - unique identification of model partitions - Execution endpoints - where each submodel executes (UE vs. network) - Input/output tensor characteristics - data interfaces between submodels - Operational characteristics - performance and resource requirements Download Phase (Step 12): - UE downloads both the model list and partitioning list corresponding to its capabilities 4. User-Driven Partition Selection Selection Criteria (Step 13): - User is presented with lists of both models and partitions supported by the UE - User selects desired AI model(s) and partition - Partition selection may be based on: - Load distribution preferences - Battery impact considerations - Other task execution preferences 5. Split Inference Configuration and Execution Configuration Phase (Step 14): - UE configures split inference with DCAS by selecting: - A specific model - A specific partition - From these selections, the corresponding submodel(s) to be executed are derived Server-Side Preparation (Step 15): - DCAS prepares the server-side execution context - DCAS registers the sub-model(s) and associated metadata with the selected partitioning Configuration Confirmation (Step 16): - DCAS indicates whether the requested configuration is accepted - DCAS confirms readiness to execute the server-side sub-model(s) Submodel Deployment (Steps 17-18): - Selected tasks/models and corresponding AI submodels are communicated to DCAS - UE downloads the AI submodel(s) corresponding to subtasks to be executed on the device side Execution (Step 19): - Tasks identified for split inference between UE and DCAS are executed in a distributed manner Key Differences from Device Inferencing The main distinctions from pure device inferencing include: Distributed execution model - inference split across UE and network Partitioning metadata - new information element defining how models are divided Negotiation phase - explicit configuration of split points and execution distribution Submodel management - separate handling of device-side and server-side model components Execution coordination - mechanisms for DCAS to prepare and confirm readiness for server-side execution Open Issues The document notes one FFS (For Further Study) item: - How device capabilities are sent to obtain an accurate list of models (noted after Step 6)	Proposal: We propose to add the following change to the base CR	manager: [Technical] The proposal introduces a “partitioning list/submodel partitioning metadata” but does not define how the UE and DCAS ensure cryptographic binding between the selected model version and the partition definition (to prevent mismatched or tampered submodels/partitions), which is essential once execution is split across trust domains. [Technical] Steps 10–16 add a negotiation/configuration phase but do not specify acceptance/rejection causes or fallback behavior (e.g., if DCAS rejects a partition, whether UE retries with another partition, falls back to device-only, or aborts), leaving the call flow incomplete for interoperable behavior. [Technical] The flow makes partition selection “user-driven” (Step 13), but split-point selection is typically constrained by latency, uplink bandwidth, privacy policy, and DCAS load; without normative constraints or policy control (operator/AS policy vs. user preference), the procedure risks being non-deployable or inconsistent with operator-controlled IMS service behavior. [Technical] The “execution endpoints supported by each task and subtask” and “operational characteristics” are introduced without clarifying whether these are static metadata, dynamically updated (e.g., DCAS load), or per-session; if dynamic, the document needs a mechanism for freshness/validity and update triggers. [Technical] The proposal adds UE capability indication to the MF during BDC application list request, but it is unclear why MF (rather than DCAS or an AIML-specific function) is the correct termination point for split-inference capability negotiation; this risks misplacing AIML-specific logic into a generic media function and creating architectural inconsistency. [Technical] The FFS “how device capabilities are sent to obtain an accurate list of models” is not a minor detail: capability exchange is foundational to Steps 10–12 (model/partition list derivation) and needs at least a baseline definition (capability categories, granularity, and privacy considerations) to avoid incompatible implementations. [Technical] The partitioning metadata includes “input/output tensor characteristics,” but the call flow does not address how tensor data is transported over the data channel (format, compression, quantization, framing) nor how interoperability is ensured between UE and DCAS runtimes. [Technical] Steps 17–18 state “Selected tasks/models and corresponding AI submodels are communicated to DCAS” while also saying UE downloads device-side submodels; it is ambiguous whether DCAS also downloads/hosts its submodels, whether they are pre-provisioned, or whether UE triggers DCAS-side model acquisition—this impacts feasibility and timing. [Technical] The proposal does not address session continuity and reconfiguration (e.g., UE mobility, radio degradation, battery drop) where the split point may need to change mid-session; without a re-negotiation/update procedure, split inference will be brittle in real networks. [Technical] No explicit handling is described for privacy/security constraints when tensors or intermediate features are sent to the network (which can leak sensitive information); the partitioning framework should at least indicate how privacy requirements influence allowable partitions. [Editorial] The contribution summary references “revises previously agreed device inferencing call flow (S4aR260014)” but does not clearly identify the exact spec clause(s), figure numbers, or step numbers being changed, making it hard to review consistency and impacts. [Editorial] Terminology is inconsistent/undefined (MF, BDC, DC AS/DCAS, “application request message,” “task manifest,” “partitioning list”); the CR should align with existing 3GPP term definitions and use one consistent acronym per entity. [Editorial] Several steps use non-normative phrasing (“may be based on…”, “potentially expressed as…”) for core interoperability aspects (task manifest, selection criteria), which should be tightened or explicitly scoped as informative to avoid ambiguous requirements. 2026-02-09 04:07
S4-260181 (pdf)	[AIML_IMS-MED] Negotiation messages	InterDigital Finland Oy	Summary of 3GPP Technical Document S4-260181 Document Overview This is a revision of S4aR260012 proposing additional details for negotiation messages and associated metadata in support of AI/ML-based media services (AIML_IMS-MED). The document provides JSON-formatted metadata examples and updates to align with the agreed call flow from S4aR260014. Main Technical Contributions 1. Negotiation Message Summary Table (Section A.4.2) The document introduces Table A4.2-1 which defines the complete set of negotiation messages for local inferencing call flows. Key updates include: AI_APPLICATION_DISCOVERY_REQUEST/RESPONSE: Discovery of AI/ML application families/types with optional UE capability filtering AI_APPLICATION_REQUEST/RESPONSE: Selection of specific AI/ML application with URN, returning application binary data and metadata CANDIDATE_MODELS_LIST_REQUEST/RESPONSE: Renamed from previous version, exchanges UE capabilities for list of candidate models AI_MODEL_SELECTION_REQUEST/RESPONSE: Model selection using URN(s), returning model binary data and metadata Each message is mapped to possible HTTP protocol operations (GET, POST, RESPONSE) and associated metadata parameters. 2. Metadata Information Definitions (Section A.4.3) A.4.3.1 Application Metadata Defines characteristics and requirements of AI/ML applications including: applicationIdentifier: URN-based identification taskList: Contains task type identifiers, supported task types (ASR, TTS, Translation) Performance constraints: maximumTaskInferenceLatency (milliseconds) minimumTaskInferenceAccuracy maximumLocalEnergyConsumption (joules) taskAccuracy (e.g., mAP score) taskOperationalCharacteristics: computeIntensity, memoryFootprint, latencySensitivity, energySensitivity associatedModels: List of models with modelName and modelDescription A.4.3.2 Endpoint Capabilities Metadata Separates capabilities into static and dynamic categories: Static Capabilities (fixed/infrequently changed): - endpointIdentifier - flopsProcessingCapabilities (peak compute in FLOPS) - macOpProcessingCapabilities (MAC operations) - supportedAiMlFrameworks - accelerationSupported (boolean) - supportedEngines (CPU, GPU, NPU) - supportedPrecision (FP32, FP16, INT8) Dynamic Capabilities (runtime-dependent): - availableMemorySize - currentComputeLoad - energyMode (Eco/Balanced/Performance) - batteryLevel - acceleratorAvailability This separation enables both long-term compatibility checks and short-term runtime optimization. A.4.3.3 Model Information Metadata Comprehensive model characterization including: Identification: modelIdentifier (URN), taskIdentifier (supports multi-task models) Model properties: modelSize (MB), format, formatVersion, framework, frameworkVersion Input/Output specifications: inputMediaIdentifier, inputType, inputShape outputIdentifier, outputType, outputShape, outputAccuracy Performance metrics: targetInferenceLatency (with hardwarePlatformIdentifier) flopsProcessingCapabilities macOpProcessingCapabilities energyEstimation (joules, platform-specific) Data types: modelDataType (Uint8, Float32, Float16) 3. Generic Negotiation Message Format (Section A.4.4) Defines a transport-protocol-independent message format for AI metadata exchange over data channels: Messages Container: - Array of Message objects (1..n cardinality) Message Data Type includes: - id: Unique identifier within data channel session scope - type: Message subtype enumeration: - CANDIDATE_MODELS_REQUEST - CANDIDATE_MODELS_RESPONSE - AI_APPLICATION_DISCOVERY_REQUEST/RESPONSE - AI_APPLICATION_REQUEST/RESPONSE - AI_MODEL_SELECTION_REQUEST/RESPONSE - payload: Type-dependent message content - sessionId: Associated multimedia session identifier - sendingAtTime: Wall clock timestamp (optional) This format provides flexibility for various transport protocols (e.g., HTTP) without imposing specific constraints. Key Design Principles Separation of concerns: Application, endpoint, and model metadata are independently defined Static vs. dynamic distinction: Enables efficient capability negotiation and runtime adaptation Protocol independence: Generic message format supports multiple transport options Comprehensive metadata: Covers functional, performance, energy, and accuracy requirements Multi-task support: Models can serve multiple AI/ML tasks Platform-specific metrics: Latency and energy measurements tied to hardware platforms	Extracted Proposals Based on my analysis of the document, there are no explicit proposals in this 3GPP contribution. The document contains a "Proposal" section (Section 3), but it describes what "is proposed to update" rather than using the standard proposal format (e.g., "Proposal 1:", "Proposal:", etc.). The text in Section 3 reads: "It is proposed to update the base CR by - defining the set of negotiation messages description corresponding to the local inferencing call flow - adding a description of the associated metadata for applications, endpoint capabilities, and AI/ML models. - adding a generic negotiation message format for AI metadata exchange" This is a descriptive statement about what the document proposes to do, rather than a formally numbered or formatted proposal that would typically appear in the "Conclusions" section of a 3GPP contribution. The document appears to be a Change Request (CR) that provides technical details and examples for negotiation messages and metadata, but does not contain explicitly formatted proposals following the standard 3GPP proposal conventions.	manager: [Technical] The contribution introduces a “transport-protocol-independent” container while simultaneously mapping each negotiation message to HTTP GET/POST/RESPONSE in Table A4.2-1; the spec needs a clear normative statement on whether HTTP is mandatory, optional, or merely an example, otherwise implementers will diverge. [Technical] Several metadata fields are not operationally well-defined for interoperability (e.g., `minimumTaskInferenceAccuracy`, `taskAccuracy`, `outputAccuracy`, `energyEstimation`): there is no mandated metric definition, dataset/benchmark reference, confidence interval, or measurement conditions, so endpoints cannot compare values consistently. [Technical] The static/dynamic capability split is sensible, but the document does not define update triggers, validity timers, or how dynamic values (e.g., `currentComputeLoad`, `acceleratorAvailability`, `batteryLevel`) are refreshed and correlated to a specific decision point, risking stale negotiation decisions. [Technical] Units and ranges are inconsistent or missing across key parameters (e.g., `flopsProcessingCapabilities`, `macOpProcessingCapabilities`, `availableMemorySize`, `currentComputeLoad`); without explicit units (GFLOPS vs FLOPS, bytes vs MB, load as % vs normalized), negotiation logic is ambiguous. [Technical] The proposal returns “application binary data” and “model binary data” in responses but does not specify integrity/authenticity mechanisms (hash, signature), versioning, or licensing/authorization checks; this is a major gap given executable/model distribution security requirements. [Technical] URN-based identifiers (`applicationIdentifier`, `modelIdentifier`, task identifiers) are introduced without defining the URN namespace, registration/ownership model, and collision handling; interoperability across vendors depends on a normative identifier scheme. [Technical] The message type list in A.4.4 includes `CANDIDATE_MODELS_REQUEST/RESPONSE`, while the summary table uses `CANDIDATE_MODELS_LIST_REQUEST/RESPONSE`; this naming mismatch will cause implementers to treat them as different procedures unless aligned. [Technical] The call-flow alignment claim (“agreed call flow from S4aR260014”) is not backed by explicit sequencing rules (e.g., whether discovery is mandatory before application request, whether model selection can be repeated, error handling); the procedure needs normative state machine/ordering constraints. [Technical] `sessionId` is described as “multimedia session identifier” but the document does not specify which 3GPP identifier it maps to (IMS dialog identifiers, SDP session, MSRP/RTC data channel association, etc.), making correlation across signaling planes unclear. [Technical] `sendingAtTime` uses “wall clock timestamp” without defining format (e.g., RFC 3339), time base, and clock synchronization assumptions; if used for ordering/latency, it will be unreliable across endpoints. [Technical] The endpoint capability fields overlap and may be contradictory (`accelerationSupported` boolean vs `supportedEngines` including NPU/GPU, plus `acceleratorAvailability` dynamic); rules are needed to resolve inconsistencies and define what “acceleration” precisely means. [Technical] Model I/O descriptors (`inputType`, `inputShape`, `outputType`, `outputShape`) lack a normative schema (tensor layout, channel order, sample rate for audio, language tags for ASR/TTS/translation), so two endpoints may “match” a model but still be incompatible. [Editorial] The document mixes “AIML_IMS-MED”, “AI/ML-based media services”, and “local inferencing call flows” terminology without a stable definition section; consistent naming and abbreviations are needed to avoid scope confusion. [Editorial] Several parameter names are verbose or inconsistent in style (`maximumTaskInferenceLatency` vs `targetInferenceLatency`, `modelDataType` values like `Uint8` vs typical `uint8`), and the JSON examples should be aligned to a single naming convention and enumerated value casing. 2026-02-09 04:08
S4-260182 (pdf)	[AI_IMS-MED] Adaptive Model Delivery	Nokia	Summary of S4-260182: Adaptive Model Delivery for IMS DC Applications 1. Introduction This contribution revises previous documents (S4-251799, S4aR250211) on adaptive model delivery, incorporating the agreed call flow for device inferencing from S4aR260014 (agreed in SA4#134). The work builds upon TR 26.927 which documented AI/ML model delivery procedures. 2. Discussion 2.1 Background and Motivation The document addresses the critical challenge of timely model delivery for UE-centric inference in IMS DC-based AI/ML applications. Key points: Real-time nature of multimedia communication sessions makes startup latency particularly problematic Delayed inference startup adversely affects QoE and service usefulness Adaptive model delivery can mitigate these challenges 2.2 Adaptive Model Delivery Concept Based on TR 26.927 clause 5.2.2.2: Reduces startup latency by delivering a smaller, lower precision but inference-ready model first Subsequently updates to higher precision through model updates Bit-incremental model update approach was evaluated in TR 26.847 clause 5.4 2.3 Reference Call Flows The document references two agreed high-level call flows: General AIML IMS DC Call Flow (from S4-252075) Key steps include: 1. MMTel service establishment 2. BDC establishment between UE and MF 3. DCSF creates DC application list based on subscription filter and UE static capabilities 4. Application list includes AI service information 5. User selects app based on AI service 6. App download via BDC 7. Task selection and model variant selection 8. ADC establishment 9. Three inferencing modes: Local, Remote, or Split Device Inferencing Call Flow (from S4aR260014) Detailed 15-step procedure including: - Application discovery with AI_APPLICATION_DISCOVERY_REQUEST/RESPONSE messages - Application metadata including AI feature tags and task descriptions - Task manifest delivery - Model selection and delivery via BDC or ADC - Support for task reselection during session 3. Technical Proposal 3.1 New Clause: AI/ML Model Delivery to DCMTSI Client 3.1.1 General Model Delivery Procedure Figure X.X-1: Basic Model Delivery over IMS DC 14-step procedure: 0. UE1 registers to IMS with AI/ML capability indication 1. MMTEL session establishment 2. IMS AS allocates DC resources 3. Session established between UE1 and UE2 4. Bootstrap Data Channel (bDC) establishment 5. DCSF creates subscriber-specific application list 6. Application list delivery over bDC 7. App selection and download with app manifest (includes inference tasks and model lists) 8. UE2 side DC procedures 9-10. Application data channel establishment with DC AS 11-12. Model selection and delivery (from DC AS or DCAR via DCSF) 13. Media exchange over MMTEL session 14. Inference execution on local or remote media 3.1.2 Adaptive Model Delivery Procedure Figure X.Y-2: Adaptive Model Delivery over IMS DC Enhanced procedure building on basic delivery: Steps 1-10: Same as basic model delivery, with lower precision model selection in step 10 Step 11: Request for updatable model via MF Steps 12a/12b: Model delivery from either: - Option a: DCAR via DCSF - Option b: DC-AS Step 13: Model download to UE Step 14: Inference loop starts and continues Step 15: UE requests model update via MF Steps 16a/16b: Model update delivery from either: - Option a: DCAR via DCSF - Option b: DC-AS Step 17: Model update download via MF Step 18: UE applies model update to initial model Step 19: Inference continues with potential for further updates 3.2 Key Technical Features Two-stage delivery: Initial lower precision model followed by precision updates Dual source support: Models and updates can be sourced from either DCAR (via DCSF) or DC-AS Continuous inference: Inference can continue while model updates are applied Flexible model selection: Selection can be performed by UE, MF, or DC AS Session-aware: Procedure integrated with IMS DC session lifecycle Editor's Notes and Open Issues The referenced S4aR260014 document contains an Editor's Note indicating: - Whether MF needs to understand AI task semantics requires clarification (FFS) - Application type handling needs clarification - Large model handling procedures need clarification	Extracted Proposals Based on my analysis of the document, there are no explicit proposals in this contribution. The document contains sections for "Discussion" and "Proposal" (Section 3), but Section 3 only states "It is proposed to add the following changes to the base CR of AIML_IMS-MED" followed by technical content describing procedures and call flows. This is a description of proposed changes rather than a formally formatted proposal statement. The document appears to be a technical contribution discussing adaptive model delivery procedures, but it does not contain any text explicitly formatted as: - "Proposal X: " - "Proposal X. " - "Proposal: " - "Proposal. " - "Proposal " or similar variations that would indicate a formal proposal statement.	manager: [Technical] The adaptive procedure introduces “request for updatable model via MF” (step 11) and “UE requests model update via MF” (step 15) but does not define any new IMS DC/MF protocol messages, parameters, or mapping to existing procedures, so the flow is not implementable or verifiable against TS 26.114/26.927-derived mechanisms. [Technical] The contribution is inconsistent on the transport for model delivery/updates: basic flow says delivery via BDC or ADC (and “from DC AS or DCAR via DCSF”), while adaptive steps 12/17 say “download via MF”; it must be clarified whether MF is in the media path, control-only, or simply an endpoint label, otherwise it conflicts with the IMS DC architecture where bDC/ADC are the defined data channels. [Technical] “Lower precision model” and “bit-incremental model update” are referenced, but no normative constraints are given on model compatibility (e.g., base model ID, update applicability, versioning, delta format, rollback), risking interoperability failures when applying updates (step 18). [Technical] The proposal claims “inference can continue while model updates are applied,” but does not specify atomicity/synchronization requirements (e.g., when the new model becomes active, how to avoid mixed-parameter inference, buffering, or dual-model execution), which is critical for real-time IMS sessions. [Technical] “Selection can be performed by UE, MF, or DC AS” is too open-ended without a defined decision procedure and signaling of the selected variant; this creates ambiguous authority and potential mismatch between what is requested, delivered, and executed (especially with two sources: DCAR vs DC-AS). [Technical] Dual-source delivery (DCAR via DCSF vs DC-AS) lacks rules for precedence, consistency, and trust (e.g., how UE validates that an update from DCAR matches the base model from DC-AS), which is essential for security and correctness. [Technical] The flow mixes UE1/UE2 roles but does not clearly state which UE performs inference and which endpoints receive model delivery; step 8 (“UE2 side DC procedures”) is particularly unclear and could contradict the stated “UE-centric inference” objective. [Technical] The initial steps mention “UE1 registers to IMS with AI/ML capability indication” but no capability container, registration mechanism, or reference to an existing IMS/SDP/DC capability exchange is provided; without this, DCSF filtering and app list creation (step 5/6) is underspecified. [Technical] The proposal does not address session continuity and reselection interactions already mentioned from S4aR260014 (task reselection during session) with adaptive updates—e.g., what happens to pending updates when task/model changes mid-session. [Technical] Large model handling is explicitly FFS in the referenced editor’s note, yet adaptive delivery is motivated by startup latency; without at least a baseline for chunking/resume, partial delivery, or caching, the proposed procedure does not resolve the core latency problem in realistic model sizes. [Editorial] The contribution uses inconsistent terminology and acronyms (MF, DC-AS, DC AS, DCAR, DCSF, bDC/BDC, ADC, DCMTSI client) without a definitions section or consistent casing, making it hard to align with existing spec terms. [Editorial] Step numbering is inconsistent (basic flow includes step “0” and adaptive has “12a/12b, 16a/16b”), and references to “Figure X.X-1 / X.Y-2” are placeholders; for a spec contribution/CR, the exact clause/figure numbers and stable step numbering are needed. [Editorial] The document cites TR 26.927 and TR 26.847 for concepts but does not identify the target normative specification and exact clauses to be added/modified (e.g., TS 26.114), which is necessary to assess consistency and avoid duplicating or contradicting existing normative text. 2026-02-09 04:08
S4-260183 (pdf)	[AIML_IMS-MED] Negotiation messages for split inferencing	InterDigital Finland Oy	3GPP Change Request Summary: Split Inferencing Negotiation Messages Document Overview This contribution (S4-260183) proposes additional messages and associated metadata to enable split inferencing for AI/ML applications in IMS-based media services. It builds upon and updates contribution S4aR260009, with specific focus on defining the differences between device inferencing and split inferencing scenarios. Main Technical Contributions 1. Negotiation Message Summary Table (Section A.4.2) Key Addition: Introduction of Table A4.2-1 summarizing all negotiation messages for split inferencing call flows. The table defines the following message pairs with their associated metadata: Application Discovery Messages: `AI_APPLICATION_DISCOVERY_REQUEST` (HTTP GET) - carries family/type of AI/ML applications `AI_APPLICATION_DISCOVERY_RESPONSE` (HTTP RESPONSE) - returns list of AI/ML applications Application Selection Messages: `AI_APPLICATION_REQUEST` (HTTP GET) - carries URN of selected application `AI_APPLICATION_RESPONSE` (HTTP RESPONSE) - returns selected application binary and metadata Split Model List Messages: `MODELS_LIST_REQUEST` (HTTP POST) - carries UE capabilities `MODELS_LIST_RESPONSE` (HTTP RESPONSE) - returns candidate AI/ML models and partitionings Split Inference Configuration Messages: `AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST` (HTTP POST) - carries URN(s) of selected models and submodel partitioning `SPLIT_INFERENCE_CONFIGURATION_AI_RESPONSE` (HTTP RESPONSE) - returns selected models/submodels binary and metadata Model Selection Messages: `AI_MODEL_SELECTION_REQUEST` - carries URN(s) of selected models/submodels `AI_MODEL_SELECTION_RESPONSE` - returns selected models/submodels binary and metadata 2. Common Metadata Information (Section A.4.3) A.4.3.1 Application Metadata Defines characteristics and requirements of applications and associated AI/ML media processing tasks Includes performance, accuracy, energy constraints, and supported models New for split inferencing: Indicates supported split and remote inference modes and whether model supports partitioning A.4.3.2 Endpoint Capabilities Metadata Introduces separation between static and dynamic capabilities: Static capabilities: Fixed or infrequently changing properties Processing architecture Peak compute capacity Supported AI/ML frameworks Available execution engines (CPU, GPU, NPU) Supported numerical precisions Hardware acceleration features Dynamic capabilities: Runtime-dependent characteristics Available memory Current compute load Energy mode Battery level Accelerator availability This separation enables both long-term compatibility checks and short-term runtime optimization. A.4.3.3 Model Information Metadata Describes functional, structural, and performance characteristics of AI/ML models Includes supported tasks, input/output specifications, resource requirements, latency/energy metrics New: Indicates whether model supports partitioning 3. Split Inferencing-Specific Metadata (Section A.4.3.4) A.4.3.4.1 Submodel Partitioning Metadata Major technical contribution: Comprehensive metadata structure for describing model partitioning for split inferencing. Key metadata elements: \| Field \| Description \| \|-------\|-------------\| \| `submodelsPartitioningIdentifier` \| URN identifying the partitioning configuration \| \| `submodelComposition` \| Array of submodel objects (1..N) \| \| `submodelIdentifier` \| URN of individual submodel \| \| `endpointType` \| Execution location (UE, SERVER, EDGE, CLOUD, CUSTOM) \| \| `subtaskTypeIdentifier` \| Subtask type supported by submodel \| \| `submodelType` \| Role in pipeline (HEAD, INTERMEDIATE1, INTERMEDIATE2, TAIL) \| \| `size` \| Submodel file size in MB \| \| `submodelInputs/Outputs` \| Tensor specifications (ID, type, shape) \| \| `outputAccuracy` \| Trained accuracy percentage \| \| `subModelDataType` \| Data type (Uint8, Float32, Float16) \| Tensor specifications include: - `tensorID` - identifier for input/output tensor - `tensorType` - data type (integer, float32, float16) - `tensorShape` - tensor dimensions (e.g., (1,3,300,300)) JSON Example provided: Complete example showing HEAD submodel on UE and TAIL submodel on DCAS for object detection task. 4. Negotiation Message Format (Section A.4.5) Generic message structure defined: Table 5: AI Metadata Messages Format `messages`: Array of Message objects (1..n) Each message follows Message data type specification Table 6: Metadata Message Data Type \| Field \| Type \| Cardinality \| Description \| \|-------\|------\|-------------\|-------------\| \| `id` \| string \| 1..1 \| Unique identifier within data channel session \| \| `type` \| number \| 1..1 \| Message subtype identifier \| \| `payload` \| object \| 1..1 \| Type-dependent message payload \| \| `sessionId` \| string \| 1..1 \| Associated multimedia session identifier \| \| `sendingAtTime` \| number \| 0..1 \| Wall clock transmission time \| Defined message types: - `MODELS_LIST_REQUEST` - `MODELS_LIST_RESPONSE` - `SPLIT_INFERENCE_CONFIGURATION_REQUEST` - `AI_APPLICATION_DISCOVERY_REQUEST` - `AI_APPLICATION_DISCOVERY_RESPONSE` - `AI_APPLICATION_REQUEST` - `AI_APPLICATION_RESPONSE` - `AI_SERVER_CONFIGURATION_REQUEST` - `AI_SERVER_CONFIGURATION_RESPONSE` - `AI_MODEL_SELECTION_REQUEST` - `AI_MODEL_SELECTION_RESPONSE` Summary of Changes The CR introduces three main changes: Complete message taxonomy for split inferencing negotiation with HTTP protocol mapping Comprehensive metadata definitions covering applications, endpoint capabilities, models, and split-specific partitioning information Generic message format for AI metadata exchange over data channels with extensible type system The contribution enables complete end-to-end split inferencing capability negotiation between UE and remote endpoints, with particular emphasis on submodel partitioning metadata that allows flexible distribution of AI/ML model execution across network nodes.	Extracted Proposals Proposal It is proposed to update the base CR by defining the set of negotiation messages corresponding to the inferencing call flow adding a description of the associated metadata for applications, endpoint capabilities, AI/ML models, and submodel partitioning. adding a generic negotiation message format for AI metadata exchange including negotiation message between local and remote endpoints for split inferencing.	manager: [Technical] The contribution mixes two transport paradigms without a clear normative mapping: messages are described as HTTP GET/POST in Table A4.2-1 while Section A.4.5 defines a generic “AI metadata exchange over data channels”; the spec needs an explicit statement whether these are alternative transports, layered (HTTP payload carried in data channel), or separate procedures, otherwise interoperability will break. [Technical] Message taxonomy appears internally inconsistent: Table A.4.2 introduces `AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST` and `SPLIT_INFERENCE_CONFIGURATION_AI_RESPONSE`, while Section A.4.5 lists `SPLIT_INFERENCE_CONFIGURATION_REQUEST` (without “AI_”) and no matching `..._AI_RESPONSE`; naming and pairing must be aligned and uniquely defined. [Technical] There is functional overlap/ambiguity between `AI_MODEL_SELECTION_REQUEST/RESPONSE` and `AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST/RESPONSE` (both “carry URN(s) of selected models/submodels” and return binaries/metadata); the procedure needs a clear separation of purpose (e.g., selection vs partitioning vs download) and state machine/ordering constraints. [Technical] The proposal introduces `AI_SERVER_CONFIGURATION_REQUEST/RESPONSE` in Section A.4.5 but it is not described in the negotiation summary table nor in the earlier metadata sections; either add the missing procedure/metadata or remove it to avoid undefined behavior. [Technical] The “type” field is defined as a number (message subtype identifier) but no registry/enum values are provided for the listed message types; without normative numeric assignments and extensibility rules, independent implementations cannot interoperate. [Technical] `sessionId` is described as “multimedia session identifier” but there is no definition of which IMS identifier is used (SIP Call-ID, dialog identifiers, MSRP session, etc.) and how it binds to the data channel/HTTP exchange; this is critical for correlating negotiation to the correct media session. [Technical] The endpoint execution location values (`UE, SERVER, EDGE, CLOUD, CUSTOM`) are not tied to any 3GPP-defined entity (e.g., UE, IMS AS, MEC, DN) and “CUSTOM” is non-interoperable; the spec should either reference 3GPP architecture terms or define discovery/addressing and trust implications. [Technical] Partitioning metadata is underspecified for correctness: `submodelType` enumerates `HEAD, INTERMEDIATE1, INTERMEDIATE2, TAIL`, which hard-limits the number of partitions and cannot represent arbitrary N-way splits; it should be an ordered list with an index/graph structure rather than fixed labels. [Technical] Tensor metadata is inconsistent: `subModelDataType` uses `Uint8, Float32, Float16` while tensorType is described as `integer, float32, float16`; the data type vocabulary must be unified and should include signedness/bitwidth and quantization parameters if Uint8 is allowed. [Technical] `outputAccuracy` as a single “trained accuracy percentage” is not meaningful across tasks/datasets and is not comparable between partitionings; if kept, it needs a defined metric, evaluation dataset identifier, and conditions, otherwise it risks misleading selection logic. [Technical] Capability metadata separation into static/dynamic is reasonable, but the proposal lacks update/refresh rules (e.g., when dynamic capabilities are reported, validity timers, thresholds) and lacks units for key fields (compute capacity, memory, load), making negotiation non-deterministic. [Technical] The messages that “return selected application binary and metadata” / “return selected models/submodels binary and metadata” do not specify integrity/authenticity mechanisms (hash, signature, provenance) or versioning; for executable model binaries this is a security and lifecycle gap. [Editorial] Several identifiers are inconsistent in casing and spelling (`sendingAtTime` vs typical `sentAtTime`; `submodelsPartitioningIdentifier` vs `submodelPartitioningIdentifier`; `subModelDataType` camel-case mismatch), which will cause implementer confusion in JSON schema. [Editorial] The document references “Table 5” and “Table 6” in Section A.4.5 while earlier it introduces “Table A4.2-1”; table numbering should be consistent with the annex/section numbering conventions of the target specification. [Editorial] The contribution repeatedly uses “HTTP RESPONSE” as a message name rather than a defined response message type; if the intent is to define application-layer messages, the response should be named consistently (e.g., `..._RESPONSE`) and HTTP status/error handling should be specified separately. 2026-02-09 04:09
S4-260184 (pdf)	[AI_IMS_MED]On Application Manifest for AIML applications	Nokia, Samsung Electronics Co., Ltd	Summary of S4-260184: Application Manifest for AIML Applications 1. Introduction This contribution proposes IMS Data Channel (DC) application metadata for AI/ML applications. The document merges metadata elements from S4aR250213 and S4aR250208 based on previous RTC SWG discussions and email exchanges. It addresses comments from RTC Telco Post SA4#134-2 regarding the origin and transfer of the AIML application manifest. 2. Main Technical Contributions 2.1 General Framework for AI/ML Support over Data Channel The contribution defines AI/ML DC applications as IMS DC applications that: - Interact with AI/ML models (e.g., performing inference on UE) - Communicate AI/ML data - Support different inference paradigms: local inference, remote inference, and split inference Key architectural elements: - DCSF (via MF) provides policy and subscription-appropriate data channel applications to UE - DC Application Repository (DCAR) stores verified data channel applications - DCSF downloads applications from DCAR for distribution to UE - DCMTSI client uses metadata to select appropriate toolchains or execution environments 2.2 Base Application Manifest Structure The manifest contains essential information for AI/ML DC applications: Core elements: - baseUrl: URI template for downloading models with format: `baseurl/$taskId$/$version$/$framework$/$subtask$/$variant$/model.$format$` - tasks: Array of AI tasks enabled by the application - taskParameters: Configuration parameters for different conditions - models: Array of AI/ML model objects with metadata Task-level metadata includes: - taskId: Unique identifier - taskName/description: Human-readable task identifier (e.g., "Speech-to-speech Translation") - version: Task version number - capabilityIndex: Minimum capability requirements - executionCandidate: Supported endpoint locations (e.g., UE or MF) 2.3 Task Input/Output Specification Task inputs (taskInputs): - taskInputId: Unique identifier - media-type: Input media type - route-to: Specifies subtaskInputId for data routing Task outputs (taskOutputs): - taskOutputId: Unique identifier - media-type: Output media type - from: Specifies subtaskOutputId for output data origin 2.4 Model Metadata Each model object contains: - id: Unique model identifier - version: Model version/variant - capabilityIndex: Minimum capability requirements - url: Model download location - latency: Maximum latency requirement (milliseconds) - accuracy: Minimum accuracy requirement (metrics/value/direction - FFS) 2.5 Subtask Metadata (Extension Parameters) For tasks comprising multiple subtasks, the manifest includes detailed subtask information: Subtask-level parameters: - id: Unique subtask identifier - function: Description of subtask function - capabilityIndex: Capability requirements (matches AI model capability) - executionTarget: Intended endpoint location - executionFallback: Alternative endpoint when primary unavailable Subtask inputs (subtaskInputs): - subtaskInputId: Unique identifier - pipe-type: Logic for multiple data inputs (0=first available, 1=wait for all) - media-type: Input media type - from: Origin subtaskOutputId or taskInputId Subtask outputs (subtaskOutputs): - subtaskOutputId: Unique identifier - media-type: Output media type - route-to: Destination subtaskInputId or taskOutputId Subtask AI model parameters: - id, capabilityIndex, url, latency, accuracy (as per main model metadata) - contextSize: Maximum input data amount the model can process (typically in tokens) 3. Open Issues Several aspects remain FFS (For Further Study): - Editor's Note: Definition of AI/ML task may be needed (referencing TS 26.927) - Editor's Note: Whether all fields in tables are needed and their definitions - Editor's Note: Capability index definition and usage - Editor's Note: Clear definition of accuracy metrics - Editor's Note: Pipe-type parameter needs further clarification - Model metadata specification alignment with TR 26.927 4. Document Type This is a text proposal for the AI_IMS_MED work item, proposing new clauses (marked as "All New Text") to be added to the base CR.	Extracted Proposals Based on my review of the document, there are no explicit proposals in this 3GPP contribution. The document contains a "Proposal" section (Section 2), but it does not contain text formatted as "Proposal X:", "Proposal:", or similar proposal markers. Instead, Section 2 only states: "We propose to add the following change to the base CR of AI_IMS_MED." This is followed by technical specification text describing changes to be made to the document, but no formal numbered or unnumbered proposals are listed. The document does not have a "Conclusions" section that would typically contain a list of proposals.	manager: [Technical] The contribution introduces new entities and flows (DCSF via MF, DCAR, DCMTSI client selecting toolchains) without anchoring them to existing normative architecture and reference points in the target spec; it is unclear which functions are already defined vs newly assumed, risking inconsistency with IMS DC framework definitions. [Technical] The manifest’s `baseUrl` URI template (`baseurl/$taskId$/$version$/$framework$/$subtask$/$variant$/model.$format$`) hard-codes path semantics and variables that are not defined elsewhere (e.g., $framework$ , $variant$ , $format$ ) and conflicts with also having per-model `url`; the spec needs one authoritative download addressing scheme and clear precedence rules. [Technical] The proposal mixes “task version” and “model version/variant” but does not define compatibility rules (e.g., which model versions satisfy a given task version), nor how the UE/MF selects among multiple models for the same task/subtask. [Technical] `capabilityIndex` is used at task, subtask, and model levels but is explicitly FFS; without a defined capability taxonomy, comparison method, and negotiation procedure, the manifest cannot be used interoperably and will lead to vendor-specific interpretation. [Technical] Execution placement fields are inconsistent and under-specified: task-level `executionCandidate` (UE or MF) vs subtask-level `executionTarget`/`executionFallback`; there is no defined resolution algorithm when task and subtask preferences conflict, nor how split inference is represented end-to-end. [Technical] The routing model (`route-to`, `from`) references IDs across task/subtask scopes (e.g., “subtaskOutputId or taskInputId”) but does not define namespace rules, uniqueness constraints, or validation, making it ambiguous and error-prone for complex graphs. [Technical] `pipe-type` semantics (“0=first available, 1=wait for all”) are too simplistic for multi-input synchronization (ordering, buffering, timeouts, partial availability) and lack normative behavior, which is critical for real-time media/DC processing. [Technical] Latency is specified as “maximum latency requirement (milliseconds)” per model, but there is no definition of what latency measures (inference only vs end-to-end including transport), nor how it is enforced/used in selection across UE vs MF execution. [Technical] Accuracy is left FFS (“metrics/value/direction”) yet is included as a selection parameter; without a standardized metric definition per task type (e.g., WER, BLEU, MOS proxy), it cannot be compared across models and undermines interoperability. [Technical] `contextSize` is described as “typically in tokens,” which is model/framework-specific; the manifest needs explicit units and encoding assumptions (tokens vs bytes vs samples) and how the sender/receiver ensures compliance. [Technical] Media typing fields (`media-type`) are introduced for inputs/outputs but there is no linkage to existing 3GPP media type registries/usages (e.g., MIME types, codec parameters, SDP mapping), nor how DC payload formats are negotiated. [Editorial] Field naming is inconsistent (e.g., `media-type` vs typical JSON `mediaType`, `route-to` vs `routeTo`, `pipe-type`), and the document does not state the serialization format (JSON/YAML/XML) or schema conventions, which is necessary for implementable normative text. [Editorial] Several key terms are introduced without definition or reference (e.g., “AI/ML DC application,” “task,” “subtask,” “toolchains/execution environments”), and the editor’s notes suggest reliance on TS/TR 26.927 but no concrete normative references or alignment text is provided. [Editorial] The contribution claims it “merges metadata elements” from prior documents, but does not provide a change log, rationale for conflicts resolved, or explicit mapping to the target clause structure (“All New Text” only), making review and integration into the base CR difficult. 2026-02-09 04:09
S4-260185 (pdf)	[AI_IMS_MED] Call flow for split inferencing loop	InterDigital Finland Oy	Summary of S4-260185: Call Flow for Split Inferencing Loop Document Metadata Source: Interdigital Finland Oy Meeting: TSG-SA4 Meeting #135, Goa, India (9-13 February 2026) Work Item: AIML_IMS-MED Type: Change Request / Text Proposal Main Technical Contribution This contribution proposes a call flow for split inferencing operations between the UE and DCAS (Data Collection and Analytics Server), building upon previous work in TR 26.927 and earlier contributions. Split Inferencing Architecture The proposed call flow describes a collaborative inference execution model where: - The UE and DCAS jointly execute an inference task - The inference workload is split between the two entities - Intermediate inference results are exchanged over the user plane - Communication is facilitated through the MF (Media Function) Proposed Call Flow Steps The text proposal adds the following procedural steps: Configuration Phase UE and DCAS (via MF) configure intermediate data format parameters over ADC (Application Data Collection) Parameters include tensor characteristics and compression profile identifiers UE-Side Processing UE captures input media data UE executes its inference subtask using the selected UE submodel UE generates intermediate data for continuation at DCAS Data Exchange UE transmits intermediate data to DCAS (via MF) according to configured format DCAS-Side Processing DCAS executes its inference task on received intermediate data DCAS uses selected Remote submodel DCAS generates processed media data based on inference results Result Delivery DCAS transmits processed media data to UE (via MF) UE renders the final processed media data Technical Significance This proposal enables distributed AI/ML inference for media processing, allowing workload distribution between device and network based on computational capabilities, latency requirements, and network conditions. The standardization of intermediate data format parameters ensures interoperability in split inference scenarios.	Proposal: Add the following text to the base CR: The UE and the DCAS (via the MF) configure the exchanged intermediate data format parameters over ADC The intermediate data format parameters may include information on tensors characteristics, including the compression profile identifiers. The UE captures input media data and provides it to inference task running on the UE. The UE executes the UE inference subtask on the input media data using selected UE submodel and generates intermediate data for continuation of the inference process at the DCAS. The UE transmits the intermediate data to DCAS (Via MF) according to the selected intermediate data format. DCAS executes inference task on the received intermediate data using selected Remote submodel and generates the processed media data based on the inference results. DCAS transmits the processed media data to UE (via MF). The UE renders the final processed media data.	manager: [Technical] The proposal introduces “intermediate data format parameters” (tensor characteristics, compression profile identifiers) but does not define where these are specified (e.g., codec/format registry, SDP attributes, ADC schema) or how interoperability is ensured across vendors; without normative parameter definitions and negotiation rules, the call flow is not implementable. [Technical] The use of “ADC (Application Data Collection)” for configuring split-inference media/tensor exchange is unclear and likely mis-scoped: ADC is typically about data collection/analytics, not real-time media processing session negotiation; the contribution needs to justify why ADC is the right control plane versus existing IMS/SDP or service-specific signaling. [Technical] The call flow relies on “MF (Media Function)” as a relay for intermediate tensors and processed media, but does not specify whether MF is in the media path (RTP) or an application proxy, nor how it handles non-media tensor payloads (e.g., RTP payload format, framing, congestion control), creating a major gap for user-plane transport. [Technical] No session establishment details are provided (e.g., IMS offer/answer, media descriptions, directionality, ports, security), so it is impossible to map the proposed steps onto actual 3GPP procedures; at minimum, the contribution should indicate which existing procedures are reused and what new signaling is required. [Technical] The proposal assumes a fixed UE→DCAS intermediate data direction and DCAS→UE processed media return, but split inference can be iterative/bi-directional; the “split inferencing loop” term implies multiple exchanges, yet only a single pass is described, risking mismatch with the intended feature. [Technical] Latency, jitter, and synchronization requirements are not addressed (e.g., timestamping of intermediate tensors relative to captured media, buffering, reordering), which is critical for real-time media processing and for any MF-mediated transport. [Technical] There is no error handling or fallback behavior (e.g., DCAS unreachable, intermediate data decode failure, compression profile mismatch, packet loss), which is essential for a normative call flow and impacts service continuity. [Technical] Security and privacy aspects are missing: intermediate tensors may leak sensitive content; the contribution should specify protection (e.g., SRTP/DTLS, keying, authorization of DCAS, integrity) and whether MF can access plaintext or must be end-to-end protected. [Technical] The selection of “UE submodel” and “Remote submodel” is mentioned but not defined (who selects, based on what capabilities/conditions, how model identifiers/versions are negotiated, and how consistency is ensured), which is a core part of split inference interoperability. [Technical] The role and definition of DCAS in the IMS-MED context is not clarified (is it an AF, an application server, a media resource function, or an external analytics server), leading to architectural ambiguity and potential inconsistency with existing SA4 functional entities. [Editorial] The contribution references TR 26.927 but does not cite specific clauses or align terminology; terms like “processed media data,” “intermediate data,” and “tensor characteristics” should be harmonized with existing definitions to avoid introducing parallel vocabularies. [Editorial] The text reads as a high-level narrative rather than a CR-ready specification change: it lacks explicit target spec/clause numbers, exact proposed text, and clear indication of additions/deletions, making it difficult for SA4 to assess normative impact. [Editorial] “Configuration Phase … over ADC” and “via MF” are repeated without clarifying interfaces (reference points) and message names; the call flow would benefit from a diagram or step numbering tied to concrete protocol exchanges rather than abstract actions. 2026-02-09 04:12
S4-260189 (pdf)	[AIML_IMS-MED] AI intermediate data format	InterDigital Finland Oy	Comprehensive Summary of S4-260189: AI Intermediate Data Format 1. Introduction and Scope This contribution proposes defining an intermediate data carriage format for AI/ML split inferencing, derived from TR 26.927. The document introduces: A description of intermediate data Definition of intermediate data structure An example format structure (proposed as an Annex) including: AI Parameter Set (AIPS) specifying AI-related parameters TLV encapsulation for both AIPS and intermediate data 2. Technical Background and Motivation 2.1 Split Inferencing Requirements Split inferencing, approved and mandated in 5G, is a key objective of the work item. The solution must support: Different input data types producing intermediate data Multiple media modalities (video, audio, text) without restriction to one An agnostic transport format for 5G use cases 2.2 Source and Derivation The proposed format is derived from: User-plane data structure in Clause 6.8 of TR 26.927 Addition of a partition identifier (previously "split-point identifier") from Clause 6.6 of TR 26.927 The partition identifier enables selection of pre-configured partitioning negotiated during configuration phase 2.3 Dynamic Nature of Tensor Characteristics Tensor characteristics are not static and may change dynamically based on: Resolution of input inference Content of input inference These characteristics must be conveyed through the user plane for accurate interpretation at the receiving end. 3. Main Technical Contributions 3.1 Intermediate Data Definition (Clause X.X.1) Key Definition: Intermediate data refers to output tensor(s) computed by a sub-model executing an inference subtask up to a defined and negotiated partitioning, transferred between endpoints (device, edge, server) to serve as input to a subsequent sub-model. Characteristics: - May be compressed and/or encoded before transmission - Processing shall not alter semantics required by receiving sub-model - Non-persistent, dynamic, and context-dependent - Characteristics (shape, size, format) vary as function of: - Input data - Selected inference partitioning - Runtime configuration 3.2 Intermediate Data Structure (Clause X.X.2) Configuration Stage: Structure defined and exchanged at configuration stage, referred to as partitioning configuration. Dynamic Factors: - Input media size/resolution changes may alter tensor shape - Selected partitioning identifies active partitioning among pre-configured options - Selected compression profile (algorithm and parameters) optimized for efficiency Required Information in Format: - Tensor identifier - Inferred tensor length (derived from current tensor shape) - Partitioning identifier (referencing negotiated configuration) - Compression profile identifier (indicating compression method) Solution: AI Parameter Set (AIPS) defined to capture information applicable to all tensors and associated data. 3.3 AI Parameter Set (AIPS) Definition (Annex X.X.1-3) Purpose: Carries metadata (tensor metadata) associated with intermediate data payload. AIPS Lifetime: - Starts: When decoder first receives and parses AIPS TLV unit - Ends: When: - New AIPS with same or different `ai_parameter_set_id` is received - New session begins - Decoder is reset - Number of tensors or tensor shape changes AIPS Fields (Table X.X.13-1): \| Field \| Meaning \| \|-------\|---------\| \| `ai_parameter_set_id` \| Unique ID of AIPS \| \| `split_point_id` or `partition_id` \| Key identifier of split point/partition \| \| `num_tensors` \| Number of tensors \| \| For each tensor: \| \| \| - `tensor_id` \| Tensor identifier \| \| - `dtype` \| Data type of tensor data \| \| - `rank` \| Number of dimensions \| \| - For each dimension: `dimension` \| Size of dimension \| \| - `compression_profile_id` \| Compression profile identifier \| 3.4 TLV Encapsulation (Clause X.X.2-4) TLV Message Components: - Type: Indicates payload information - Length: Value of payload - Payload: Data TLV Unit Types (Table X.X.24-1): \| Type Value \| Description \| \|------------\|-------------\| \| 0 \| Reserved \| \| 1 \| AI Parameter Set data (AIPS) \| \| 2 \| Intermediate data \| \| 3-255 \| Undefined \| Encapsulation Scenarios: AIPS Data Encapsulation (X.X.24.2): TLV unit encapsulating AIPS data as defined in clause 1.3 Single Tensor Encapsulation (X.X.24.3): TLV unit value comprises AIPS identifier and tensor data Tensor data includes: tensor identifier, tensor length (optional), tensor payload data Tensor payload contains flattened byte array, possibly compressed per AIPS compression profile ID Multiple Tensors Encapsulation (X.X.24.4): TLV unit encapsulating more than one tensor data 4. Key Changes from Previous Version Terminology Updates: - "Split point" terminology changed to "partitioning" throughout - "Head sub-model" and "Tail sub-model" terminology refined to "sub-model" and "subsequent sub-model" Structural Additions: - Addition of partition identifier (highlighted as new in original document) - Formalization of AIPS lifetime management - Complete TLV encapsulation framework 5. Proposal for Integration The document proposes: Incorporate changes 1 and 2 into a base CR Include change 3 (AIPS and TLV details) in a dedicated annex for illustration purposes	Extracted Proposals Proposal We propose to add the following changes to a base CR. Incorporate change 1 and 2 into a base CR Include change 3 in a dedicated annex for illustration purpose	manager: [Technical] The contribution asserts “split inferencing, approved and mandated in 5G” without citing the specific WI/TS/TR scope and normative home; this is risky because S4 needs a clear mapping to the target spec (e.g., TS 26.xxx vs TR 26.927) and whether the format is normative or informative. [Technical] The proposal introduces an intermediate-data TLV format but does not define how it is carried over 5GS user plane (PDU session type, RTP/UDP/IP, QUIC, etc.) nor how the receiver identifies framing boundaries; without a transport binding and packetization rules, interoperability is not achievable. [Technical] The “AIPS lifetime ends when number of tensors or tensor shape changes” conflicts with the earlier claim that tensor characteristics are dynamic and must be conveyed; if shapes can change frequently, the design needs an explicit per-access-unit signaling mechanism (or delta updates) rather than an AIPS that becomes invalid on common runtime events. [Technical] The document mixes “split_point_id” and “partition_id” and states the partition identifier is “previously split-point identifier” but does not define uniqueness scope, negotiation procedure, or collision handling across sessions/models; this will cause ambiguity when multiple partition configurations exist or when reusing IDs across models. [Technical] The AIPS field set is incomplete for correct tensor interpretation: it lacks explicit endianness, alignment, tensor layout/order (e.g., NCHW/NHWC), quantization parameters (scale/zero-point), and any indication whether `dtype` refers to pre- or post-decompression representation. [Technical] Compression signaling is underspecified: `compression_profile_id` is referenced but no registry, negotiation, parameter carriage, or error behavior is defined, and it is unclear whether compression applies per tensor, per TLV unit, or per byte range (especially for “multiple tensors encapsulation”). [Technical] The “tensor length (optional)” is problematic because length is essential for parsing concatenated tensors and for skipping unknown tensors; if omitted, the receiver must derive it from shape×dtype, which fails when compression is used or when padding/strides exist. [Technical] The TLV Type space (1=AIPS, 2=Intermediate data, 3–255 undefined) lacks versioning and extensibility rules (e.g., how to ignore unknown types, forward compatibility, critical vs non-critical TLVs), which is typically required for long-lived 3GPP formats. [Technical] The “multiple tensors encapsulation” is not fully specified: it does not define ordering, repetition rules, whether tensor IDs may repeat, and how to associate each tensor payload with its metadata when shapes can change dynamically. [Technical] The proposal does not address integrity/confidentiality or robustness aspects (e.g., protection against malformed TLVs, maximum sizes, resource exhaustion), which is important given user-plane parsing of potentially large tensors. [Editorial] Several clause/table references are inconsistent or placeholder-like (e.g., “Clause X.X.24.2”, “Table X.X.13-1”, “Table X.X.24-1”), making it impossible to verify consistency with the rest of the specification or TR 26.927 without a concrete target document structure. [Editorial] The text alternates between “partitioning”, “partition”, and “split point” and between “AIPS identifier” and `ai_parameter_set_id`; terminology should be normalized and aligned with existing 3GPP AI/ML vocabulary to avoid interpretive differences. [Editorial] The contribution claims the format is “derived from Clause 6.8” and adds a field from “Clause 6.6” of TR 26.927, but it does not clearly enumerate deltas versus the TR baseline (field-by-field) nor justify why the TR structure is insufficient, weakening the rationale for standardization. 2026-02-09 04:12
S4-260195 (pdf)	CR on AIML processing in IMS calls	Qualcomm Inc.	3GPP CR 0608 - AI/ML Processing in IMS Calls Change Request Overview Specification: TS 26.114 v19.2.0 Category: B (Addition of feature) Release: Rel-20 Work Item: AIML_IMS-MED This CR introduces normative procedures, formats, and signaling for AI/ML assisted media processing in DCMTSI (Data Channel for Multimedia Telephony Service over IMS). Main Technical Contributions 1. General Framework and Architecture (AD.1, AD.2, AD.3) Key Definitions AI/ML application: Data channel application providing AI/ML assisted media processing during IMS sessions AI/ML processing task: Well-defined AI/ML functions (e.g., speech-to-text, translation, noise suppression, scene description) AI/ML model: Parameters and metadata required for inference execution AI/ML inference engine: Local UE execution environment (e.g., WebNN-aligned runtime) AI/ML metadata: Data derived from media streams with timing and binding information Task manifest: UTF-8 JSON describing supported tasks and candidate models Model card: UTF-8 JSON describing model identity, format, artifacts, I/O conventions, runtime requirements Model artifact: Downloadable model binary and auxiliary files Terminal Architecture Requirements DCMTSI clients must support: - Media engine functions for RTP-based audio/video - Data channel client (bootstrap and application data channels per clauses 6.2.10, 6.2.13) - AI/ML application execution environment (e.g., web runtime) - AI/ML inference engine for local model execution - Capability discovery function (execution devices, operators, data types, resource limits) - Model validation function (integrity/authenticity verification via SHA-256 and digital signatures) - Binding and synchronization function (associates AI/ML tasks/metadata to RTP streams using SDP identifiers and media time anchors) Reference Architecture UE establishes Bootstrap Data Channel (BDC) to MF for retrieving DC application lists, AI/ML applications, and model artifacts via HTTP DCSF and repositories (e.g., DCAR) provide provisioning of AI/ML applications and models Application Data Channel (ADC) may be established to DC AS for task control, policy exchange, and metadata delivery IMS Media Function does not perform inference or process RTP media for AI/ML purposes 2. Call Flows (AD.4) AD.4.1 AI/ML Application and Model Delivery for Device Inferencing 14-step procedure: MMTel service establishment BDC establishment between UE and MF (per TS 23.228, clause AC.7.1) UE requests application list from MF via HTTP over BDC; MF forwards to DCSF DCSF creates user-specific DC application list (JSON/HTML) with: Generic app info (description, ID, URL) AI-specific info (AI feature tag, task descriptions) DCSF provides URL to application list; UE downloads list with metadata User selects app based on AI service description UE requests selected app from MF MF fetches AI application from DCSF AI application downloaded to UE via BDC with AI task metadata (task manifest) User presented with AI task list (with annotations from task metadata, execution endpoint info) Selected tasks/models informed to MF via: BDC: HTTP GET with task/model URLs ADC: AI Model Selection Request with model URNs MF fetches AI models from: 12a: DCAR via DCSF 12b: DC AS (alternative) UE downloads AI models from MF via: BDC: HTTP response with model resources ADC: AI Model Selection Response with model data Tasks executed for inference in UE User/UE may reselect AI tasks during session using received metadata Editor's Note: Clarification needed on whether MF understands AI task nature, application handling types, and large model handling. AD.4.2 On-Device Inferencing and Split Inference Operation User/application selects AI/ML processing task during session AI/ML application performs local capability discovery and selects compatible model artifact Inference engine configured and task bound to RTP media streams using binding rules (clause AD.8) If DC AS coordination required: UE establishes application data channels (clause 6.2.13) Associates with AI/ML application using a=3gpp-req-app SDP attribute Exchanges capability, task, configuration, status via "3gpp-ai" subprotocol (clause AD.9.2) Derived AI/ML metadata used for local rendering and/or transmitted over ADC Metadata includes RTP stream identifier (mid) and media time anchor for alignment with RTP playout Note: Split inference may use on-device inference for one task (e.g., STT) and DC AS for another (e.g., translation) while keeping RTP media unchanged. 3. Capabilities (AD.5) AD.5.1 UE Capabilities DCMTSI clients must determine and expose to AI/ML application: - Supported execution devices (CPU, GPU, NPU, accelerators) - Supported operator sets and data types (per local inference framework) - Resource limits (memory constraints, concurrent task limits) - Availability of audio/video media access points (e.g., decoded media frames) Web runtime capability discovery may align with WebNN. Capability summary may be conveyed to DC AS using capability message type (clause AD.9.2). AD.5.2 Network Capabilities DC AS supporting AI/ML processing may provide: - Repositories and discovery information for AI/ML applications/models - Policy information (restrictions on tasks, model usage, data retention) - Application data channels for coordination with AI/ML application - Note: Network-side inference capabilities are outside Phase 1 scope 4. AI/ML Formats (AD.6) Mandatory Model Format: - ONNX format conforming to ONNX version 1.16.0 - Minimum required opset version: 18 - Encoding: ONNX Protocol Buffers representation 5. Task Manifest and Model Card (AD.7) AD.7.1 Task Manifest UTF-8 JSON object included with AI/ML application delivery, containing: - List of supported tasks and optional subtasks with human-readable descriptions - For each task: candidate model identifiers (model_id, model_version_id) and model card resource reference - Task-specific configuration parameters including RTP stream mid binding requirements AD.7.2 Model Card UTF-8 JSON object provided for each candidate model, including: - Model identifier and version identifier - Model format specification (ONNX version, minimum opset, IR version) - Model I/O description: - Tensor element type and shape - Dynamic axes, layout, normalization conventions - Execution constraints: - Required operator support - Required data types - Quantization convention - Minimum resource requirements - Downloadable model artifacts: - Artifact URI, size, content type - Integrity information (SHA-256 digest) - Optional digital signature and key identifier AD.7.2.1 JSON Schema for Model Card Comprehensive JSON schema provided defining structure for: - model_card_version: Schema version (semver pattern) - identity: model_id, model_version_id, name, description, publisher, license, timestamps, tasks, languages, tags - format: type (const: "onnx"), onnx_version (const: "1.16.0"), min_opset (≥18), onnx_ir_version, encoding (enum: "protobuf") - artifacts: Array of downloadable artifacts with: - artifact_id, uri, content_type, size_bytes, sha256 - Optional compression (none/gzip/zstd) - Optional signature (alg, kid, sig) - variant (precision, quantization, preferred_devices, max_latency_ms) - selection_constraints (requires_webnn, requires_ops, requires_data_types, min_memory_mib, min_peak_scratch_mib) - io: inputs/outputs (tensorSpec arrays), preprocessing (audio/text), postprocessing (stt/tts), output_application_format - runtime: min_memory_mib, min_peak_scratch_mib, max_concurrent_instances, required_operator_sets, required_data_types, webnn preferences, device_preference - selection_policy: strategy (min_latency/min_energy/best_accuracy/balanced/custom), fallback_order tensorSpec definition: - name, element_type (float32/float16/int8/int32/uint8/bool) - shape (array with integers or strings for dynamic axes) - Optional layout and dynamic_axes mapping AD.7.3 Model Artifact Selection and Validation Procedure: 1. UE performs capability discovery (devices, operators, data types, memory limits) 2. UE filters artifacts satisfying selection_constraints against UE capabilities 3. UE selects preferred artifact based on selection_policy and device_preference 4. UE downloads selected artifact URI via HTTP over BDC 5. UE verifies artifact using SHA-256 digest from model card 6. UE should verify digital signature when provided 7. UE instantiates inference engine and binds model I/O per model card (io.preprocessing, io.inputs, io.outputs, io.postprocessing) 6. Negotiation, Signaling, and Media Time Binding (AD.8) AD.8.1 Binding to RTP Streams AI/ML tasks operating on RTP media bound to RTP streams using SDP "mid" identifier Task configuration and AI/ML metadata messages include relevant mid value AD.8.2 Media Time Binding for AI/ML Metadata AI/ML metadata over ADC may experience different delay/jitter vs. RTP media To avoid drift, metadata messages shall include media time anchor derived from RTP media clock of stream identified by mid For audio tasks, media time anchor may use: NTP-based timestamp associated with RTP stream + duration in audio samples, OR RTP timestamp Time anchor representation must be consistent within session for given task When DC AS forwards AI/ML metadata between endpoints, DC AS shall preserve mid binding and media time anchor for receiver alignment with RTP playout 7. Data Channel Transport (AD.9) AD.9.1 Bootstrap Data Channel Transport BDC uses HTTP subprotocol (clause 6.2.10) AI/ML applications, task manifests, model cards, model artifacts retrieved via HTTP GET over BDC DCMTSI client shall not transmit user media over BDC AD.9.2 Application Data Channel Transport Subprotocol: "3gpp-ai" for AI/ML control and metadata Message Format: UTF-8 encoded JSON objects Generic Message Types: - capability: UE inference capability summary - task: AI/ML processing task selection and model identifiers - configuration: Task configuration parameters including media stream mid binding and media time anchor representation - status: Lifecycle state and error reporting - metadata: Derived AI/ML metadata bound to media stream (mid) and media time Detailed schema specified by AI/ML application. For cross-vendor interoperability, schema should be standardized for specific task. Example metadata message: `json { "type": "metadata", "task": "stt", "mid": "audio", "segmentId": 1842, "ntpTs": 381245120, "durSamples": 16000, "text": "...", "conf": 0.87 }` Summary This CR establishes comprehensive normative framework for AI/ML assisted media processing in DCMTSI, covering: - Complete architecture with on-device and split inference support - Detailed call flows for application/model delivery and runtime operation - Capability discovery mechanisms for UE and network - Standardized ONNX model format requirements - Rich metadata structures (task manifests and model cards with JSON schemas) - Deterministic model selection and validation procedures - Media time binding mechanisms for metadata synchronization - Data channel transport protocols for control and metadata exchange The framework enables AI/ML tasks (STT, translation, TTS, noise suppression, scene description) while maintaining compatibility with existing DCMTSI media handling.	Extracted Proposals This document does not contain any explicitly marked proposals. The document is a Change Request (CR) that defines normative procedures, formats, and signaling for AI/ML assisted media processing in DCMTSI, but it does not include sections with text explicitly labeled as "Proposal", "Proposal:", "Proposal X:", etc.	manager: [Technical] The CR introduces extensive new normative behavior (task manifests, model cards, “3gpp-ai” subprotocol, time binding) but does not identify exact TS 26.114 clause numbers to be changed nor provide normative text deltas, making it impossible to verify consistency with existing DCMTSI procedures and requirements in clauses 6.2.10/6.2.13. [Technical] The call flow AD.4.1 is internally inconsistent (it claims a “14-step procedure” but lists 15 steps) and mixes BDC/ADC roles in a way that conflicts with the stated architecture (e.g., step 11 “BDC: HTTP GET with task/model URLs” implies UE informs MF via GET, which is semantically wrong for selection signaling). [Technical] The proposal assumes MF can fetch and relay large AI model artifacts (steps 12–13) while also stating “IMS Media Function does not perform inference or process RTP media”; it still requires MF to understand application/model selection and act as a distribution proxy, but no impact analysis is provided for MF behavior, caching, authorization, charging, or load. [Technical] Introducing a new ADC subprotocol “3gpp-ai” and JSON message types without defining a stable, interoperable schema (it says “detailed schema specified by AI/ML application”) undermines cross-vendor interoperability and contradicts the claim of “normative procedures, formats, and signaling.” [Technical] The time-binding mechanism in AD.8.2 is underspecified/incorrect for synchronization: “NTP-based timestamp associated with RTP stream” is not defined (RTP itself doesn’t carry NTP unless using RTCP SR mapping), and allowing either RTP timestamp or “NTP + durSamples” without a mandated mapping to RTP clock rate/RTCP SR risks ambiguous alignment at the receiver. [Technical] The use of SDP “mid” alone for binding is insufficient in multi-SSRC, simulcast, or stream-restart scenarios; the CR should specify whether SSRC, RID, or other identifiers are needed to uniquely bind metadata to a specific RTP source/encoding within a given mid. [Technical] The CR mandates ONNX 1.16.0 and opset ≥18 as “Mandatory Model Format” without justification or fallback; this is likely unrealistic for UE implementations and conflicts with the stated “WebNN-aligned runtime” (WebNN support is not equivalent to ONNX opset support), risking non-deployable requirements. [Technical] Security/integrity is incomplete: SHA-256 verification is mentioned, signatures are “optional,” but there is no trust model (certificate chain, key provisioning, revocation), no binding between model card and artifact, and no protection against downgrade/mix-and-match attacks across model_id/model_version_id/artifact variants. [Technical] The CR introduces “capability discovery” and “resource limits” exchange but does not define privacy constraints, user consent, or minimization; exposing detailed device accelerator/operator support to a DC AS can materially increase fingerprinting risk and may conflict with 3GPP privacy expectations. [Technical] The “split inference” concept is described but not normatively constrained: it is unclear how tasks are partitioned, how intermediate representations are transported (if any), and how this interacts with the statement “RTP media unchanged,” especially for tasks like noise suppression that typically require modifying the media stream. [Technical] The CR claims “DCMTSI client shall not transmit user media over BDC,” but then allows ADC metadata delivery without defining whether metadata may contain user content (e.g., STT text is user content); policy and confidentiality requirements (encryption end-to-end vs hop-by-hop) are not addressed. [Editorial] Several references appear incorrect or non-actionable: “TS 23.228, clause AC.7.1” and new clauses “AD.1…AD.9” are not existing TS 26.114 clause identifiers, so the contribution cannot be reviewed against the actual spec structure. [Editorial] Terminology is inconsistent and sometimes non-3GPP: “DCMTSI clients must support … web runtime,” “WebNN-aligned runtime,” “model card,” and “task manifest” are introduced as if standardized, but no alignment to existing 3GPP definitions or external normative references (W3C WebNN, ONNX spec) is provided. [Editorial] The example metadata uses fields like `"ntpTs": 381245120` without defining epoch, units, wraparound, or relation to RTP/RTCP, and uses `"mid": "audio"` even though mid values are tokens negotiated in SDP (not semantic labels), which may mislead implementers. 2026-02-09 04:13
S4-260197 (pdf)	[AIML_IMS-MED] NNC web decoder demo	Fraunhofer HHI, Nokia	Summary of S4-260197: NNC Web Decoder Demo 1. Introduction This contribution presents a live demonstration of a web-based Neural Network Codec (NNC) decoder, following up on previous telco discussions where decoding times and end-to-end latency were reported. The demonstration shows substantial latency reductions under realistic download conditions. The document also addresses security concerns regarding WebAssembly (Wasm) that were raised in the previous telco. 2. Decoder Implementation Technical Architecture Base Implementation: Built on NNCodec and MPEG's reference software NCTM Language: Reuses existing C++ entropy coding (CABAC) components with additional functionality ported from Python to C++ Web Deployment: Compiled into WebAssembly (Wasm) library using Emscripten Supported Features Supports NNC edition 2 Limitation: Does not support tools using temporal prediction Performance Optimizations Parallelization: CABAC decoding parallelized across NNR data units Scheduling Strategy: Prioritizes largest available NNR data unit first to reduce tail latency when multiple units are pending 3. Web Application Integration Wasm decoder library embedded into JavaScript web application Executable in standard browsers JavaScript application invokes Wasm decoder and provides user interface for timing measurements User Interface Features Configuration Options: Simulated download rate selection Number of decoding threads selection Execution Modes: Decoding after complete model download Simultaneous download and decoding (progressive decoding of fully received NNR data units) Measurement Capabilities Download Simulation: Delays availability of each tensor/NNR data unit according to selected throughput Metrics Captured: Decoding time Total end-to-end latency (from download start to complete model decoding) 4. Test Conditions Model and Configuration Model: Wav2Vec for automatic speech recognition (evaluated in 3GPP TR 26.847) Encoder Settings: Dependent scalar quantization (`use_dq`) Parameter optimization for DeepCABAC (`param_opt`) Unary binarization length 11 (`cabac_unary_length_minus1`) QP −27 No data-driven tools Compression Performance Original Model: ~377 MB (94.4M float32 parameters) Compressed Size: ~49 MB Compression Ratio: ~13% ASR Performance (LibriSpeech test-clean) Original WER: 3.4% Compressed WER: 3.6% Test Environment Browser: Brave 1.86.142 (64-bit), Chromium 144.0.7559.97 Hardware: Dell Precision 7680 Laptop, Intel Core i9-13950HX, 64 GB RAM OS: Windows 10 Enterprise 5. WebAssembly Security Analysis The contribution addresses security concerns raised in the previous telco with four key arguments: 5.1 Expert Development and Maintenance Developed within W3C by WebAssembly Working Group Participation from major browser vendors and technology companies (Mozilla, Microsoft, Google, Apple, Intel, ByteDance, Red Hat) Browser support since 2017 Actively maintained (latest core draft: 16 June 2025) 5.2 Security Model and Mechanisms Operates under web security model in browsers Key Security Features: Sandboxed execution No implicit privileges Module validation before execution Memory isolation Enforcement of standard browser security policies 5.3 Broad Industry Deployment Examples of widely deployed Wasm applications: - Adobe Photoshop on the web - Google Earth on the web - TensorFlow.js (WebAssembly backend) - ONNX Runtime Web (Microsoft) - AutoCAD Web - ffmpeg.wasm project This broad deployment indicates strong industry confidence in WebAssembly's security model. 5.4 3GPP-Specific Considerations IMS DC applications have different threat model than open web Applications come from trusted sources Authentication and authorization required before execution on UE Applications authorized by DCSF/DC-AR before download/execution Precedent: SA4 already considers WebAssembly in TR 26.858 (Study on APIs for 3GPP Speech and Audio Codecs) in clauses 5.3.3 and 6 6. Conclusion The contribution proposes scheduling a time slot for live demonstration (e.g., during a meeting break) and concludes that WebAssembly is secure for running NNC decoder in web environments based on: 1. Expert-driven standardization and ongoing maintenance 2. Sandboxed execution model and security mechanisms 3. Broad deployment across major browsers and applications 4. Security considerations specific to IMS DC applications	Proposal: Schedule a time slot for the live demonstration of the NNC web decoder, for example during a meeting break.	manager: [Technical] The claim that “WebAssembly is secure” is not substantiated for the IMS DC threat model: the contribution lists generic browser sandbox properties but does not address concrete risks relevant to codec execution (e.g., side channels, Spectre-class leaks, JIT/Wasm engine vulnerabilities, supply-chain integrity of modules, and DoS via CPU/memory exhaustion), nor does it propose mitigations or normative constraints. [Technical] The “3GPP-specific considerations” assume trusted sources and authorization by DCSF/DC-AR, but the document does not map these assumptions to any existing SA4/CT/SA security requirements or specify how integrity/authenticity of the Wasm binary and associated model bitstreams are ensured end-to-end (signing, hashing, secure transport, revocation). [Technical] Performance/latency conclusions are not reproducible: the contribution provides no actual measured numbers (decode time, end-to-end latency, throughput points, thread counts) and no methodology details (number of runs, variance, warm-up, caching disabled, CPU governor), so “substantial latency reductions” cannot be evaluated. [Technical] The demo uses a single high-end laptop CPU and a Chromium-based browser; without results on representative UE-class hardware (mobile SoCs, thermal throttling, limited cores) the conclusions about practical IMS DC deployment latency are weak. [Technical] The decoder “does not support tools using temporal prediction,” which may be a significant functional gap versus NNC Edition 2 usage scenarios; the contribution does not clarify whether the tested bitstream avoids those tools by construction, nor whether this limitation affects interoperability expectations in SA4. [Technical] Parallelizing CABAC “across NNR data units” and scheduling “largest first” may change buffering and memory pressure; the document does not quantify peak memory, queueing behavior, or whether this strategy can starve smaller units and delay first-usable partial model availability (important for progressive use cases). [Technical] The progressive mode description (“decode fully received NNR data units”) omits dependency handling: it is unclear whether any cross-unit dependencies exist in the bitstream/tools used (e.g., shared contexts, parameter optimization side information), and if so how correctness is preserved when decoding out of original order. [Technical] The download simulation (“delays availability of each tensor/NNR data unit”) is not equivalent to real network behavior (TCP slow start, jitter, HOL blocking, retransmissions); without modeling these effects, end-to-end latency claims under “realistic download conditions” are overstated. [Technical] The encoder settings include “QP −27” and specific DeepCABAC options, but the contribution does not state the exact NNC profile/constraints used (and whether they align with TR 26.847 assumptions), making it hard to judge whether the demo reflects typical operating points. [Technical] The security argument based on “broad industry deployment” is not a security proof and is weak for 3GPP decision-making; the contribution should instead reference concrete security analyses, CVE handling expectations, and required sandboxing policies (e.g., no shared memory, no SIMD, no threads) if those are intended constraints. [Editorial] The document references “previous telco” concerns and “reported” decoding times/latency but does not cite the specific meeting, contribution numbers, or clauses, making the context and addressed issues hard to trace. [Editorial] Several statements are absolute or vague (“substantial latency reductions,” “realistic download conditions,” “WebAssembly is secure”) and should be qualified with data, scope, and assumptions to avoid being read as normative conclusions. [Editorial] The “Precedent” reference to TR 26.858 clauses 5.3.3 and 6 is not summarized; without stating what TR 26.858 actually concluded about Wasm (and under what constraints), the precedent claim is difficult to assess. 2026-02-09 04:13
S4-260198 (pdf)	[AIML_IMS-MED] On Compression of AI/ML data in IMS	Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe	Summary of S4-260198: On Compression of AI/ML Data in IMS 1. Introduction and Motivation This contribution proposes the adoption of efficient compression techniques for AI/ML data transport in IMS services, specifically advocating for the specification of MPEG's Neural Network Coding standard ISO/IEC 15938-17 (NNC) as a representation format. 2. Technical Justification 2.1 Use Case Requirements The document identifies critical challenges in AI/ML data exchange based on SA1 and SA4 use cases: Model delivery for local UE inference: Multiple context-dependent downloads (location, time, task) with limited local storage requiring frequent model re-downloads Incremental AI/ML model updates: Both unidirectional (continuous UE updates) and multidirectional (co-learning between UEs and edge nodes) scenarios 2.2 Benefits of Compression The contribution highlights three key advantages: Bandwidth Optimization: Reduced model size minimizes data transfer and operational costs Reduced Latency: Faster transmission to UEs and edge devices for real-time applications Broader Accessibility: Enables AI/ML applications in bandwidth-constrained networks 2.3 NNC Standard Capabilities The document presents NNC (ISO/IEC 15938-17) as the solution, demonstrating: Compression performance: 0.1% to 20% of original size with transparent performance (validated in SA4 and MPEG evaluations) Standardized format: Ensures interoperability for multi-party scenarios (e.g., third-party model providers, application server execution) 2.4 Advanced NNC Features Key technical features beyond compression: Topology Signalling: Generic syntax for AI/ML model architecture encoding Random Access: Independent tensor decoding enabling parallelization Parameter Update Signalling: Metadata for incremental update dependencies and relations Robustness and Error Resilience: Configurable prioritization/error-protection through packetization; missing parameter update detection Performance Indicator: Signals model performance metrics (e.g., accuracy) Encapsulation Flexibility: Integration of existing formats (PyTorch, ONNX, NNEF, TensorFlow) with generic support for others 2.5 Web Application Suitability WASM-based NNC decoder validation demonstrates: - Browser-side decoding feasibility - Reduced end-to-end latency (download + decoding) compared to uncompressed delivery - Multi-fold speed-ups under representative network conditions 3. Proposal The contribution proposes considering NNC-based compression for inclusion in IMS-based AI/ML services. Annex: Detailed NNC Technical Syntax A.1 Data Components A.1.1 Payload Types NNC specifies representation through NNR compressed data units (NNR_NDU) with multiple payload types: \| Payload Type \| Compressed Parameter Type \| Description \| \|--------------\|---------------------------\|-------------\| \| NNR_PT_INT \| - \| Integer parameter tensor \| \| NNR_PT_FLOAT \| - \| Float parameter tensor \| \| NNR_PT_RAW_FLOAT \| - \| Uncompressed float parameter tensor \| \| NNR_PT_BLOCK \| NNR_CPT_DC (0x01) \| Weight tensor decomposition \| \| \| NNR_CPT_LS (0x02) \| Local scaling parameters \| \| \| NNR_CPT_BI (0x04) \| Biases present \| \| \| NNR_CPT_BN (0x08) \| Batch norm parameters \| Context-adaptive entropy coding using DeepCABAC (except NNR_PT_RAW_FLOAT) Support for various bit depths via `nnr_decompressed_data_format` Pre-quantized float parameter tensor representation A.1.2 Topology Data NNR topology units (NNR_TPL) signal AI/ML topology: - Storage format and compression signaled via `topology_storage_format` and `topology_compression_format` - Byte sequence representation (typically null-terminated UTF-8 strings) - Optional deflation per RFC 1950 - Topology element specification in NNR_NDU via `topology_elem_id` or `topology_elem_id_index` A.1.3 Meta Data NNR_NDU meta data syntax elements: - Tensor dimensions: `tensor_dimensions_flag`, `tensor_dimension_list()` - Scan order: Mapping of parameter values to dimensions - Entry points: `bit_offset_delta1`, `bit_offset_delta2` for individual tensor decoding Incremental coding support: - Parameter update tree (PUT) structure with parent-child relationships - Node identification via: - Enumeration: `device_id`, `parameter_id`, `put_node_depth` - Hash-based: `parent_node_payload_sha256`, `parent_node_payload_sha512` - Global NN meta data in NNR_MPS including `base_model_id` for update relationships A.1.4 Performance Data Performance metrics signaled in NNR_MPS and NNR_LPS: - Presence and type specification via `validation_set_performance_present_flag`, `metric_type_performance_map_valid_flag`, `performance_metric_type` - Validation set performance indication - Performance maps for different optimization variants: - `sparsification_performance_map()` - `pruning_performance_map()` - `unification_performance_map()` - `decomposition_performance_map()` A.1.5 Format Encapsulation NNC encapsulates existing formats (NNEF, ONNX, PyTorch, TensorFlow): - Topology data transmission in NNR topology data units - Quantization meta data in NNR quantization data units - Format-specific specifications in Annexes A-D of the standard A.2 Coding Tools A.2.1 Parameter Reduction Methods NNR_PT_BLOCK payload additional parameters: - Local scaling adaptation - Batch norm folding - Tensor decomposition with `decomposition_rank` and `g_number_of_rows` Predictive Residual Encoding (PRE): - Enabled via `nnr_pre_flag` in NNR_MPS - Codes difference between current and previous parameter updates Row-skipping mechanism: - Enabled via `row_skip_enabled_flag` - `row_skip_list` specifies entirely-zero tensor rows A.2.2 Quantization and Codebook Quantization control in `quant_tensor()`: - Method specification: `lps_quantization_method_flags`, `mps_quantization_method_flags`, `codebook_present_flag` - Quantization type: Uniform or dependent (`dq_flag`) - Step size: `qp_value`, `lps_qp_density`, `mps_qp_density` - Dependent quantization state: `dq_state_list` for entry point initialization Codebook mapping: - Integer value remapping via `integer_codebook()` structure A.2.3 Entropy Coding DeepCABAC (context adaptive binary arithmetic coding): - Applied to all payloads except NNR_PT_RAW_FLOAT - Binarization syntax elements: `sig_flag`, `sign_flag`, `abs_level_greater`-flags, `abs_remainder` - Binarization control: `cabac_unary_length` - Probability estimation: Initialization and update via `shift_idx_minus_1` - Random access support: `scan_order`, `bit_offset_delta1`, `cabac_offset_list` for entry points and state signaling Incremental update coding modes: - Temporal context modeling: `temporal_context_modeling_flag` for probability estimation dependency on previous tensors - Histogram-dependent probability: `hist_dep_sig_prob_enabled_flag` for multi-tensor dependency	Proposals Proposal: We propose to take into account the above discussion and consider the inclusion of NNC-based compression in IMS-based AI/ML services.	manager: [Technical] The contribution does not identify any concrete IMS/3GPP specification touchpoints (e.g., which TS 24.229/24.229 annex, TS 26.114/26.247, MSRP/SIP body usage, or media framework) where NNC would be normatively introduced, so the proposal is not actionable as a 3GPP change. [Technical] It is unclear whether the AI/ML “model delivery” is intended as IMS media, IMS file transfer, or HTTP-based download outside IMS; without clarifying the transport and session model, requirements like random access, packetization, and error resilience cannot be mapped to IMS procedures. [Technical] The proposal advocates ISO/IEC 15938-17 “as a representation format” but does not define MIME type(s), SIP/SDP signaling (e.g., `m=` line vs `application/` body), or capability negotiation needed for interoperability in IMS. [Technical] Claims of “robustness and error resilience through packetization” are not tied to any existing IMS bearer (RTP/RTCP, MSRP, or SIP MESSAGE) and no fragmentation/reassembly rules, loss recovery, or integrity mechanisms are specified for NNR_NDU/NNR_MPS carriage. [Technical] The stated compression ratios “0.1% to 20% of original size with transparent performance” are presented without defining baselines (FP32? FP16? already-quantized?), model classes, or acceptable accuracy loss criteria, making the performance claim non-verifiable for 3GPP normative adoption. [Technical] Incremental update signaling (PUT structure, parent hash fields, `base_model_id`) is described, but there is no end-to-end procedure for versioning, dependency resolution, rollback, or mismatch handling between UE/edge/server—key for “multidirectional co-learning” scenarios. [Technical] Security and trust are not addressed: model authenticity, integrity, and provenance (especially for “third-party model providers”) are critical in IMS, yet no mapping is given to IMS security (e.g., SIP identity, TLS, object signing) nor guidance on hash usage beyond parent linkage. [Technical] The “Performance Indicator” feature (accuracy metrics in NNR_MPS/NNR_LPS) raises interoperability and policy concerns (metric definitions, dataset identifiers, comparability, and potential misrepresentation), but the contribution does not define how metrics are standardized or validated in 3GPP context. [Technical] The document lists “encapsulation flexibility” for ONNX/PyTorch/TensorFlow, but does not specify which topology storage formats are permitted in 3GPP profiles, risking fragmentation where different vendors choose different encapsulated formats and still fail interoperability. [Technical] WASM/browser-side decoding is not obviously relevant to IMS normative scope and may conflict with UE implementation assumptions; if the intent is WebRTC/IMS interworking, the contribution should explicitly relate this to existing SA4 web real-time communication specifications. [Technical] Several syntax element names (e.g., `topology_storage_format`, `topology_compression_format`, `nnr_decompressed_data_format`) are presented as if stable normative identifiers, but no profile constraints (allowed values, mandatory/optional tools like PRE, row-skip, DeepCABAC settings) are proposed to ensure decoder interoperability. [Editorial] The contribution repeatedly cites “validated in SA4 and MPEG evaluations” without referencing specific meeting documents, reports, or objective test conditions; add precise references or remove the implication of SA4 endorsement. [Editorial] The summary mixes high-level motivation with deep syntax annex material (NNR_NDU fields, CABAC flags) without explaining why these details are needed for the 3GPP decision; it would be clearer to separate “what 3GPP needs to specify” from “how NNC works internally.” [Editorial] Terminology is inconsistent with 3GPP style: “IMS-based AI/ML services” and “AI/ML data transport in IMS” are vague—define whether this is an IMS multimedia service, an IMS enabler, or an application-layer service using IMS signaling only. 2026-02-09 04:14
S4-260200 (pdf)	[AIML_IMS-MED] Inclusion of NNC to AIML_IMS-MED	Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe, Vodafone Group Plc	Summary of S4-260200: Inclusion of NNC to AIML_IMS-MED 1. Introduction and Context This contribution proposes the addition of Neural Network Coding (NNC) compression capabilities to the AIML_IMS-MED work item. The proposal is motivated by S4-260198, which demonstrates the necessity for compression of AI/ML data in IMS-based transport scenarios. The document presents changes to be incorporated into the common base Change Request for AIML_IMS-MED. 2. Main Technical Contributions 2.1 NNC Decoder Support Requirement The proposal mandates that DCMTSI clients supporting AI/ML model download or incremental model download shall support NNC decoding as specified in ISO/IEC 15938-17. Specifically: NNC Edition 2 support is enabled by setting the `general_profile_idc` syntax element equal to 1 This establishes a baseline compression capability for AI/ML model transport over IMS 2.2 Configuration for Full AI/ML Model Download For DCMTSI clients supporting complete AI/ML model download, the following NNC parameter configuration is specified: Payload type: `nnr_compressed_data_unit_payload_type` = NNC_PT_BLOCK Compressed parameter types: `compressed_parameter_types` = NNR_CPT_LS \| NNR_CPT_BN (enabling local scaling and batch normalization) Quantization options: Either `dq_flag` = 1 (dependent quantization) OR `codebook_present_flag` = 1 (codebook-based quantization) Probability estimation: `shift_idx_minus_1_present_flag` = 1 (optimal initialization) Functionality enabled: This configuration supports local scaling adaptation, batch norm folding, flexible quantization approaches, and optimized probability estimation for entropy coding. 2.3 Configuration for Incremental AI/ML Model Data Exchange For DCMTSI clients supporting incremental model updates, an extended parameter set is defined: Basic parameters: Same payload type (NNC_PT_BLOCK) and compressed parameter types (NNR_CPT_LS \| NNR_CPT_BN) as full model download Update tree support: `mps_parent_signalling_enabled_flag` = 1 and `parent_node_id_present_flag` = 1 Efficiency features: `row_skip_enabled_flag` = 1 (row skipping) `nnr_pre_flag` = 1 (predictive residual coding) `hist_dep_sig_prob_enabled_flag` = 1 (history-dependent significance probability) `temporal_context_modeling_flag` = 1 (temporal context adaptation) `scan_order` > 0 (parallel decoding support) Functionality enabled: This configuration provides comprehensive support for efficient incremental updates through parameter update trees, spatial/temporal prediction, adaptive probability modeling, and parallel processing capabilities. 2.4 Normative Reference Addition The proposal adds ISO/IEC 15938-17:2024 Edition 2 as a normative reference, establishing the technical foundation for NNC compression in the specification. Technical Significance The contribution establishes two distinct NNC profiles optimized for different AI/ML model transport scenarios in IMS networks: 1. A baseline profile for complete model downloads with essential compression features 2. An advanced profile for incremental updates with sophisticated prediction and adaptation mechanisms to minimize update payload sizes	Proposal: Add the following changes to the common base CR for AIML_IMS-MED.	manager: [Technical] Mandating that all DCMTSI clients supporting AI/ML model download “shall support NNC decoding” (ISO/IEC 15938-17) is a major interoperability and implementation-impacting requirement, but the contribution does not justify why decoding is mandatory for all such clients rather than being negotiated/capability-based (e.g., via SDP/IMS signaling) or optional with fallback. [Technical] The proposal hard-codes specific NNC tool flags/parameters (e.g., `compressed_parameter_types = NNR_CPT_LS \| NNR_CPT_BN`, `shift_idx_minus_1_present_flag = 1`, `row_skip_enabled_flag = 1`, etc.) without specifying how these are signaled on the wire and agreed between endpoints; without a negotiation mechanism, “shall support” toolsets can still fail interoperability if the sender chooses different settings. [Technical] The statement “NNC Edition 2 support is enabled by setting `general_profile_idc = 1`” is risky as written: it conflates edition selection with profile signaling and does not clarify whether `general_profile_idc` semantics in ISO/IEC 15938-17 actually map “1” to “Edition 2 baseline” (and whether other constraints/fields must also be set), which could lead to incorrect normative behavior. [Technical] The “either `dq_flag = 1` OR `codebook_present_flag = 1`” requirement is underspecified: it is unclear whether both may be present, whether one is mandatory, and what the receiver must support if the sender uses the other option—this can create non-interoperable subsets unless the spec defines a single mandatory-to-implement mode or explicit capability exchange. [Technical] For incremental updates, requiring `scan_order > 0` for “parallel decoding support” is not technically grounded in the summary and may be incorrect/overconstraining; scan order values are typically normative enumerations with defined meanings, so “> 0” is not a valid constraint unless the referenced standard defines it that way. [Technical] The incremental profile enables multiple advanced tools simultaneously (`nnr_pre_flag`, `hist_dep_sig_prob_enabled_flag`, `temporal_context_modeling_flag`), but the contribution does not address decoder complexity, memory, and latency impacts—critical for IMS endpoints—and provides no rationale for making these mandatory rather than optional. [Technical] The update-tree requirements (`mps_parent_signalling_enabled_flag = 1`, `parent_node_id_present_flag = 1`) assume a specific incremental update structure, but the contribution does not define how this maps to AIML_IMS-MED’s model update semantics (e.g., identification of layers/parameters, consistency checks, rollback), risking a mismatch between NNC syntax and the 3GPP application model. [Technical] The proposal defines two “profiles” but does not specify profile identifiers, compatibility rules, or receiver behavior when encountering unsupported tools; without explicit profile signaling and fallback rules, endpoints cannot reliably interoperate across “baseline” vs “advanced” modes. [Technical] Adding ISO/IEC 15938-17:2024 Ed.2 as a normative reference may be problematic for 3GPP availability and IPR/patent policy considerations, and the contribution does not discuss whether referencing a 2024 edition is acceptable for the targeted release timeline or whether an earlier stable edition should be referenced. [Editorial] The contribution reads like a high-level summary but claims “changes to be incorporated into the common base CR” without providing actual CR-style delta text (affected clauses, exact normative wording, and change marks), making it impossible to verify consistency with existing AIML_IMS-MED specification structure. [Editorial] Several terms are introduced without definition in the AIML_IMS-MED context (e.g., “DCMTSI clients”, “AIML_IMS-MED”, “incremental model download”), and the document does not cite the exact 3GPP spec/section where these entities and procedures are defined. [Editorial] The parameter names are presented as if they are normative fields (`nnr_compressed_data_unit_payload_type`, `NNC_PT_BLOCK`, etc.) but the contribution does not quote the exact ISO/IEC 15938-17 syntax names/casing nor provide a mapping table, increasing the risk of transcription errors and ambiguity in normative text. 2026-02-09 04:14
S4-260286 (pdf)	[AIML_IMS-MED] On Compression of AI/ML data in IMS	Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe, Vodafone Group Plc	Comprehensive Summary: Compression of AI/ML Data in IMS Document Overview This contribution (S4-260286, revision of S4-260198) proposes the adoption of MPEG's Neural Network Coding standard ISO/IEC 15938-17 (NNC) for efficient compression and transport of AI/ML data in IMS services. The document is submitted by Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe, and Vodafone Group Plc. Main Technical Contributions Motivation and Use Case Requirements The contribution identifies critical challenges in AI/ML data exchange for IMS services: Model Delivery Challenges: Use cases require multiple context-dependent model downloads (location, time, task-specific) rather than single downloads. Limited UE storage necessitates frequent model discarding and re-downloading. Incremental Updates: Applications require both unidirectional continuous model updates to UEs and multidirectional updates for co-learning scenarios involving multiple UEs and edge nodes. Key Benefits of Compression: Bandwidth optimization reducing operational costs Reduced latency through faster transmission Broader accessibility in reduced-bandwidth networks Interoperability through standardized data formats NNC Standard Capabilities The contribution highlights NNC's compression performance (0.1% to 20% of original size with transparent performance) and advanced features: Topology Signalling: Generic syntax for encoding AI/ML model architecture Random Access: Independent tensor decoding enabling parallelization Parameter Update Signalling: Metadata for incremental update dependencies and relations Robustness: Configurable prioritization/error-protection through packetization; missing update detection Performance Indicators: Signaling of model performance metrics (e.g., accuracy) Encapsulation Flexibility: Support for PyTorch, ONNX, NNEF, TensorFlow formats The document also references WASM-based NNC decoder feasibility in web applications, demonstrating multi-fold latency reductions under representative network conditions. Technical Details (Annex) NNC Data Components Payload Types (NNR_NDU) NNC specifies multiple payload types via `nnr_compressed_data_unit_payload_type`: NNR_PT_INT: Integer parameter tensors NNR_PT_FLOAT: Float parameter tensors NNR_PT_RAW_FLOAT: Uncompressed float tensors NNR_PT_BLOCK: Block-structured float parameters with sub-types: NNR_CPT_DC (0x01): Decomposed weight tensors NNR_CPT_LS (0x02): Local scaling parameters NNR_CPT_BI (0x04): Biases NNR_CPT_BN (0x08): Batch normalization parameters Non-RAW payloads use context-adaptive entropy coding (DeepCABAC). The `compressed_parameter_types` element uses OR-combination of parameter IDs. Support for various bit depths via `nnr_decompressed_data_format` and pre-quantized float tensors. Topology Data (NNR_TPL) Topology units signal AI/ML architecture via: - `topology_storage_format`: Storage format specification - `topology_compression_format`: Optional compression (RFC 1950 deflate) - `topology_data`: Byte sequence (typically UTF-8 string) - `topology_elem_id` / `topology_elem_id_index`: Topology element references in NNR_NDU Metadata NNR_NDU metadata includes: - Tensor Dimensions: `tensor_dimensions_flag`, `tensor_dimension_list()` - Scan Order: `scan_order` for parameter-to-dimension mapping - Entry Points: `bit_offset_delta1`, `bit_offset_delta2` for parallel decoding Incremental Coding Support: - Parameter Update Tree (PUT) structure via `mps_parent_signalling_enabled_flag`, `parent_node_id_present_flag` - Node identification through: - Enumeration: `device_id`, `parameter_id`, `put_node_depth` - Hash-based: `parent_node_payload_sha256`, `parent_node_payload_sha512` - Global metadata in NNR_MPS including `base_model_id` Performance Data Performance metrics signaled in NNR_MPS and NNR_LPS: - `validation_set_performance_present_flag`, `metric_type_performance_map_valid_flag`, `performance_metric_type` - `validation_set_performance`: Performance on validation set - Performance maps for post-processing operations: - `sparsification_performance_map()` - `pruning_performance_map()` - `unification_performance_map()` - `decomposition_performance_map()` Format Encapsulation Annexes A-D specify encapsulation of NNEF, ONNX, PyTorch, and TensorFlow data through NNR topology and quantization data units. Coding Tools Parameter Reduction Methods NNR_PT_BLOCK Reconstruction: Local scaling adaptation, batch norm folding, tensor decomposition with `decomposition_rank` and `g_number_of_rows` Predictive Residual Encoding (PRE): `nnr_pre_flag` enables differential coding against previous updates Row-Skipping: `row_skip_enabled_flag` and `row_skip_list` for zero-row signaling Quantization and Codebook Quantization control via `lps_quantization_method_flags`, `mps_quantization_method_flags`, `codebook_present_flag` `dq_flag`: Uniform vs. dependent quantization selection Quantization step size: `qp_value`, `lps_qp_density`, `mps_qp_density` Dependent quantization state: `dq_state_list` for entry point initialization Codebook mapping: `integer_codebook()` structure for value remapping Entropy Coding (DeepCABAC) Context-adaptive binary arithmetic coding for non-RAW payloads: Binarization: `sig_flag`, `sign_flag`, `abs_level_greater`-flags, `abs_remainder` with `cabac_unary_length` specification Probability Estimation: - Initialization/update: `shift_idx_minus_1` - Random access: `scan_order`, `bit_offset_delta1`, `cabac_offset_list` Incremental Update Modes: - `temporal_context_modeling_flag`: Probability estimation from previous tensor - `hist_dep_sig_prob_enabled_flag`: Multi-tensor historical dependency Proposal The contribution proposes considering NNC-based compression for inclusion in IMS-based AI/ML services, based on its compression efficiency, standardized format, and advanced features supporting various AI/ML data exchange scenarios.	Extracted Proposals Proposal: We propose to take into account the above discussion and consider the inclusion of NNC-based compression in IMS-based AI/ML services.	manager: [Technical] The contribution proposes adopting ISO/IEC 15938-17 (NNC) “for IMS services” but does not identify any concrete IMS enabler, interface, or 3GPP spec target (e.g., SIP/SDP offer/answer, MSRP, HTTP over Ut, MMTel, RCS, or a new IMS media type), so it is impossible to assess feasibility or normative impact. [Technical] There is no definition of how NNC payloads would be negotiated (capability exchange, codec parameters, versioning, profiles/levels) in IMS; without SDP attributes or equivalent negotiation rules, interoperability and fallback behavior are undefined. [Technical] The document mixes “compression and transport” claims, but NNC is a coding format; it does not specify the IMS transport mapping (RTP payload format, MSRP chunking, HTTP object transfer, SIP MESSAGE, etc.), nor packetization/fragmentation rules needed to realize the cited “robustness” and “random access” benefits. [Technical] Security/privacy implications are not addressed: model updates and co-learning can expose proprietary IP and user data; IMS requires clear handling for integrity, confidentiality, authorization, and potential end-to-end vs hop-by-hop protection, none of which is scoped or mapped to IMS security mechanisms. [Technical] The “incremental updates” and PUT/MPS/LPS mechanisms are described, but there is no 3GPP-level procedure for synchronization, loss recovery, ordering, or consistency (e.g., what happens if a UE misses an update, how base_model_id is managed across sessions, or how to prevent applying incompatible deltas). [Technical] The claim of “transparent performance” at 0.1%–20% size is presented without constraints (model types, tasks, quantization settings, acceptable accuracy loss, compute cost), which is too broad for 3GPP requirements and risks overpromising in normative text. [Technical] UE constraints are discussed (storage/latency), but decoder complexity, memory footprint, and power impact of DeepCABAC/NNC (including random access entry points and dependent quantization state) are not quantified or bounded, which is critical for UE implementability in IMS contexts. [Technical] The contribution lists encapsulation for PyTorch/ONNX/NNEF/TensorFlow, but does not specify how 3GPP would ensure deterministic execution compatibility (operator sets, versions, endianness, numeric formats), so “interoperability through standardized data formats” is not substantiated at the system level. [Technical] Error resilience statements (“configurable prioritization/error-protection through packetization; missing update detection”) are not tied to any IMS bearer/QoS mechanism or media transport behavior, so it is unclear how these features would actually be realized over typical IMS paths. [Technical] No content-type / MIME registration, SIP/SDP media type identification, or IANA/3GPP registry impact is discussed; without a clear identification scheme, IMS entities cannot route, store, or apply policy to NNC objects. [Editorial] The document reads as a technology overview rather than a 3GPP contribution with actionable change requests: it lacks proposed normative text, spec references, and explicit work item scope (which TS/TR to update and what exact additions are requested). [Editorial] Several acronyms and structures (e.g., NNR_NDU, NNR_MPS, NNR_LPS, PUT) are introduced without aligning terminology to existing 3GPP AI/ML service discussions in SA4, making it hard to map to current 3GPP architecture and terminology. [Editorial] The annex-level detail (flags, syntax elements like `bit_offset_delta1`, `dq_state_list`, etc.) is overly granular for SA4 decision-making unless accompanied by a clear mapping to 3GPP signaling/transport; it risks distracting from the missing system integration aspects. [Editorial] The WASM decoder and “multi-fold latency reductions” are mentioned without citing test methodology, baseline, or conditions; as evidence for standardization, the performance claims need references or reproducible parameters. 2026-02-09 04:19
S4-260374 (pdf)	[AIML_IMS-MED] Application manifest metadata	Samsung Electronics Co., Ltd., Qualcomm, Nokia, Interdigital Finland Oy.	No summary available	No proposals available	No comments
S4-260421 (pdf)	Network, QoS and UE Considerations for client side inferencing AIML/IMS	Huawei Tech.(UK) Co.. Ltd	No summary available	No proposals available	No comments
S4-260422 (pdf)	[AI_IMS-MED] AI/ML media processing and task updating	Nokia	No summary available	No proposals available	No comments
S4-260431 (pdf)	[AIML_IMS-MED] Inclusion of NNC to AIML_IMS-MED	Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe, Vodafone Group Plc	No summary available	No proposals available	No comments
S4-260436 (pdf)	[AIML_IMS-MED] Base CR for TR 26.114	Samsung Electronics Iberia SA	No summary available	No proposals available	No comments
S4-260439 (pdf)	[AIML_IMS-MED] Application manifest metadata	Samsung Electronics Co., Ltd., Qualcomm, Nokia, Interdigital Finland Oy.	No summary available	No proposals available	No comments
S4-260449 (pdf)	[AIML_IMS-MED] Call flow for split inferencing	InterDigital Finland Oy; Samsung Electronics Co., Ltd, Qualcomm Inc. , Nokia,	No summary available	No proposals available	No comments
S4-260450 (pdf)	[AIML_IMS-MED] Negotiation messages	InterDigital Finland Oy	No summary available	No proposals available	No comments

Total TDocs: 25 | PDFs: 25 | Comments: 17

Read-only Review: 10.5

Network, QoS and UE Considerations for Client Side Inferencing AIML/IMS

1. Introduction

2. Network Related Issues

2.1 Model Size

2.2 Network QoS Support

2.3 Compression and UE Support

2.4 Protocol Support Issue

2.5 Caching and Bandwidth Wastage

3. Suggested Way Forward

Proposed Actions:

Summary of S4-260112: AI/ML Media Processing and Task Updating

Document Overview

Main Technical Contributions

1. Refinement of AI/ML Task Processing Call Flows

Issues Identified with TR 26.927

Updated Call Flow Structure (Steps 1-23)

2. Task Reselection and Update Mechanisms

Task Reselection (Step 17)

Task Update (Steps 17-23)

3. Task Control Messages

3.1 START Task Message

3.2 UPDATE Task Message

Key Technical Clarifications

Inference Location Flexibility

Message Exchange Protocol

Network Entity Roles

Editorial Notes

Extracted Proposals

3GPP Technical Document Summary: CR 0607 to TS 26.114

Document Information

Purpose and Rationale

Main Technical Contributions

1. References, Terms, and Abbreviations (Clauses 2, 3.1, 3.3)

2. New Annex AC: AI/ML Assisted Media Processing for MTSI

AC.1 Introduction

AC.2 Terminal Architecture

AC.3 End-to-End Reference Architecture

3. AI/ML Call Flows (AC.4)

AC.4.1 AI/ML Model Delivery for Device Inferencing

AC.4.2 Network Inferencing

AC.4.3 Split Inferencing

4. AI/ML Capabilities (AC.5)

5. AI/ML Formats (AC.6)

6. AI/ML Metadata (AC.7)

7. Negotiation and Signaling (AC.8)

8. Data Channel Transport (AC.9)

Key Technical Entities

Implementation Status

Further Details on DC Application List

Introduction

Relevant Specifications Overview

Bootstrap Data Channel Setup Signalling (TS 23.228 Clause AC.7.1)

Media Control Service Operation (TS 23.228 Clause AA.2.4.3.2)

MF Resource Management (TS 29.176 Clause 5.2.2.2)

Data Channel Application Definition (TS 26.114 Clause 6.2.10.1)

Discussion

Proposal

Summary of S4-260129: Call Flow for Split Inferencing

Document Information

Main Technical Contribution

Split Inferencing Call Flow

Session Establishment and Bootstrap (Steps 1-2)

Application Discovery and Selection (Steps 3-6)

Application Download (Steps 7-9)

AI Task Selection and Configuration (Steps 10-13)

Model Distribution and Configuration Response (Steps 14-16)

Inference Execution (Steps 17-22)

Dynamic Task Reselection (Step 23)

Key Technical Features

Metadata Framework

Flexibility in Execution Distribution

Model Distribution Options

Media/Data Flow Management

Extracted Proposals

Comprehensive Summary of S4-260180: Call Flow for Split Inferencing

Document Overview

Main Technical Contributions

1. Split Inferencing Capability Indication

2. Enhanced Application and Task Selection