Meeting: TSGS4_135_India | Agenda Item: 10.5
[AIML_IMS-MED] Call flow for split inferencing
InterDigital Finland Oy
discussion
Agreement
| TDoc | S4-260180 |
| Title | [AIML_IMS-MED] Call flow for split inferencing |
| Source | InterDigital Finland Oy |
| Agenda item | 10.5 |
| Agenda item description | AI_IMS-MED (Media aspects for AI/ML in IMS services) |
| Doc type | discussion |
| For action | Agreement |
| download_url | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_135_India/Docs/S4-260180.zip |
| Contact | Stephane Onno |
| Uploaded | 2026-02-03T19:11:22.463000 |
| Contact ID | 84864 |
| Revised to | S4-260449 |
| TDoc Status | revised |
| Is revision of | S4aR260008 |
| Reservation date | 03/02/2026 16:26:22 |
| Agenda item sort order | 52 |
[Technical] The proposal introduces a “partitioning list/submodel partitioning metadata” but does not define how the UE and DCAS cryptographically bind the selected model version to the partition definition, so mismatched or tampered submodels/partitions cannot be detected. Such binding is essential once execution is split across trust domains.
[Technical] Steps 10–16 add a negotiation/configuration phase but do not specify acceptance/rejection causes or fallback behavior (e.g., if DCAS rejects a partition, whether UE retries with another partition, falls back to device-only, or aborts), leaving the call flow incomplete for interoperable behavior.
[Technical] The flow makes partition selection “user-driven” (Step 13), but split-point selection is typically constrained by latency, uplink bandwidth, privacy policy, and DCAS load; without normative constraints or policy control (operator/AS policy vs. user preference), the procedure risks being non-deployable or inconsistent with operator-controlled IMS service behavior.
[Technical] The “execution endpoints supported by each task and subtask” and “operational characteristics” are introduced without clarifying whether these are static metadata, dynamically updated (e.g., DCAS load), or per-session; if dynamic, the document needs a mechanism for freshness/validity and update triggers.
[Technical] The proposal adds UE capability indication to the MF during BDC application list request, but it is unclear why MF (rather than DCAS or an AIML-specific function) is the correct termination point for split-inference capability negotiation; this risks misplacing AIML-specific logic into a generic media function and creating architectural inconsistency.
[Technical] The FFS “how device capabilities are sent to obtain an accurate list of models” is not a minor detail: capability exchange is foundational to Steps 10–12 (model/partition list derivation) and needs at least a baseline definition (capability categories, granularity, and privacy considerations) to avoid incompatible implementations.
[Technical] The partitioning metadata includes “input/output tensor characteristics,” but the call flow does not address how tensor data is transported over the data channel (format, compression, quantization, framing) nor how interoperability is ensured between UE and DCAS runtimes.
[Technical] Steps 17–18 state “Selected tasks/models and corresponding AI submodels are communicated to DCAS” while also saying the UE downloads device-side submodels. It is ambiguous whether the DCAS also downloads/hosts its submodels, whether they are pre-provisioned, or whether the UE triggers DCAS-side model acquisition; the answer affects feasibility and timing.
[Technical] The proposal does not address session continuity and reconfiguration when the split point needs to change mid-session (e.g., due to UE mobility, radio degradation, or battery drop); without a re-negotiation/update procedure, split inference will be brittle in real networks.
[Technical] No explicit handling is described for privacy/security constraints when tensors or intermediate features are sent to the network (which can leak sensitive information); the partitioning framework should at least indicate how privacy requirements influence allowable partitions.
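As an illustration of the binding concern in the first technical comment, the following is a minimal sketch, not a proposed mechanism. It assumes the partition definition can be serialized as JSON; the field names (`model_id`, `model_version`, `split_after_layer`) and both function names are hypothetical and do not come from the contribution. A bare digest only detects mismatches: to resist tampering it would additionally have to be signed or delivered over an authenticated channel, which is outside this sketch.

```python
import hashlib
import hmac
import json


def partition_binding_digest(model_id: str, model_version: str, partition: dict) -> str:
    """Return a SHA-256 hex digest covering the model identity and the partition definition."""
    # Canonical JSON (sorted keys, fixed separators) so that UE and DCAS hash identical bytes.
    payload = json.dumps(
        {"model_id": model_id, "model_version": model_version, "partition": partition},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def binding_matches(expected_digest: str, model_id: str, model_version: str, partition: dict) -> bool:
    """Check that a received model version and partition definition match the expected digest."""
    actual = partition_binding_digest(model_id, model_version, partition)
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(expected_digest, actual)
```

In this sketch, each endpoint would recompute the digest over the model version and partition definition it actually loaded and compare it with the value agreed during negotiation; a change to either the version or the split point yields a different digest.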
[Editorial] The contribution summary references “revises previously agreed device inferencing call flow (S4aR260014)” but does not clearly identify the exact spec clause(s), figure numbers, or step numbers being changed, making it hard to review consistency and impacts.
[Editorial] Terminology is inconsistent/undefined (MF, BDC, DC AS/DCAS, “application request message,” “task manifest,” “partitioning list”); the contribution should align with existing 3GPP term definitions and use one consistent acronym per entity.
[Editorial] Several steps use non-normative phrasing (“may be based on…”, “potentially expressed as…”) for core interoperability aspects (task manifest, selection criteria), which should be tightened or explicitly scoped as informative to avoid ambiguous requirements.
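To make the fallback question raised for Steps 10–16 concrete, the sketch below shows one possible ordering of outcomes: try each candidate partition in turn, fall back to device-only inference if the DCAS rejects all of them, and abort otherwise. This ordering is an assumption for illustration only; the contribution does not specify it, and the names `Outcome`, `negotiate_partition`, and `dcas_accepts` are hypothetical.

```python
from enum import Enum
from typing import Callable, Iterable, Optional, Tuple


class Outcome(Enum):
    SPLIT = "split"              # DCAS accepted one of the offered partitions
    DEVICE_ONLY = "device_only"  # every partition rejected, UE runs the full model
    ABORT = "abort"              # no partition accepted and no device-only option


def negotiate_partition(
    candidates: Iterable[str],
    dcas_accepts: Callable[[str], bool],
    device_only_possible: bool,
) -> Tuple[Outcome, Optional[str]]:
    """Offer partitions in preference order and apply a fixed fallback policy."""
    for partition_id in candidates:
        # dcas_accepts stands in for the request/response exchange with the DCAS.
        if dcas_accepts(partition_id):
            return Outcome.SPLIT, partition_id
    if device_only_possible:
        return Outcome.DEVICE_ONLY, None
    return Outcome.ABORT, None
```

A specified procedure would also need to carry a rejection cause in each DCAS response so the UE can tell a transient condition (e.g., load) from a permanent one (e.g., unsupported partition) before retrying.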