S4-260183

[AIML_IMS-MED] Negotiation messages for split inferencing

Source: InterDigital Finland Oy
Meeting: TSGS4_135_India
Agenda Item: 10.5

All Metadata

Agenda item description	AI_IMS-MED (Media aspects for AI/ML in IMS services)
Doc type	discussion
For action	Agreement
Release	Rel-20
Contact	Stephane Onno
Uploaded	2026-02-03T19:11:22.680000
Contact ID	84864
TDoc Status	merged
Is revision of	S4aR260010
Reservation date	03/02/2026 16:32:54
Agenda item sort order	52

Review Comments

manager - 2026-02-09 04:09

[Technical] The contribution mixes two transport paradigms without a clear normative mapping: messages are described as HTTP GET/POST in Table A4.2-1 while Section A.4.5 defines a generic “AI metadata exchange over data channels”; the spec needs an explicit statement whether these are alternative transports, layered (HTTP payload carried in data channel), or separate procedures, otherwise interoperability will break.

[Technical] Message taxonomy appears internally inconsistent: Table A.4.2 introduces AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST and SPLIT_INFERENCE_CONFIGURATION_AI_RESPONSE, while Section A.4.5 lists SPLIT_INFERENCE_CONFIGURATION_REQUEST (without “AI_”) and no matching ..._AI_RESPONSE; naming and pairing must be aligned and uniquely defined.

[Technical] There is functional overlap/ambiguity between AI_MODEL_SELECTION_REQUEST/RESPONSE and AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST/RESPONSE (both “carry URN(s) of selected models/submodels” and return binaries/metadata); the procedure needs a clear separation of purpose (e.g., selection vs partitioning vs download) and state machine/ordering constraints.

[Technical] The proposal introduces AI_SERVER_CONFIGURATION_REQUEST/RESPONSE in Section A.4.5 but it is not described in the negotiation summary table nor in the earlier metadata sections; either add the missing procedure/metadata or remove it to avoid undefined behavior.

[Technical] The “type” field is defined as a number (message subtype identifier) but no registry/enum values are provided for the listed message types; without normative numeric assignments and extensibility rules, independent implementations cannot interoperate.
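
For illustration only, a minimal sketch of what a numeric registry with an extensibility rule could look like. The numeric values and the "ignore unknown values" rule are assumptions made for this example, not assignments taken from the contribution; the `_RESPONSE` names follow the pairing used elsewhere in these comments.

```python
from enum import IntEnum
from typing import Optional


class MessageType(IntEnum):
    """Hypothetical registry; the numbers are placeholders, not normative values."""
    AI_MODEL_SELECTION_REQUEST = 1
    AI_MODEL_SELECTION_RESPONSE = 2
    AI_SPLIT_INFERENCE_CONFIGURATION_REQUEST = 3
    AI_SPLIT_INFERENCE_CONFIGURATION_RESPONSE = 4


def parse_type(value: int) -> Optional[MessageType]:
    """Map the numeric "type" field to a registered message type.

    Assumed extensibility rule: an unregistered value yields None so that a
    receiver can ignore the message instead of failing.
    """
    try:
        return MessageType(value)
    except ValueError:
        return None
```

With such a table in the specification, two independent implementations would resolve `"type": 3` to the same message, and a receiver would have defined behaviour for values added in a later release.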

[Technical] sessionId is described as “multimedia session identifier” but there is no definition of which IMS identifier is used (SIP Call-ID, dialog identifiers, MSRP session, etc.) and how it binds to the data channel/HTTP exchange; this is critical for correlating negotiation to the correct media session.

[Technical] The endpoint execution location values (UE, SERVER, EDGE, CLOUD, CUSTOM) are not tied to any 3GPP-defined entity (e.g., UE, IMS AS, MEC, DN) and “CUSTOM” is non-interoperable; the spec should either reference 3GPP architecture terms or define discovery/addressing and trust implications.

[Technical] Partitioning metadata is underspecified for correctness: submodelType enumerates HEAD, INTERMEDIATE1, INTERMEDIATE2, TAIL, which hard-limits the number of partitions and cannot represent arbitrary N-way splits; it should be an ordered list with an index/graph structure rather than fixed labels.
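
One possible shape for the suggested ordered structure is sketched below. The field names `index`, `urn` and `endpoint`, and the example URN strings, are hypothetical and do not come from the contribution; the point is only that an index generalises to any number of partitions while the HEAD/TAIL roles can still be derived.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Submodel:
    index: int      # 0-based position in the inference pipeline (assumed field)
    urn: str        # hypothetical submodel identifier
    endpoint: str   # execution location, e.g. "UE" or "EDGE"


def validate_chain(submodels: List[Submodel]) -> List[Submodel]:
    """Return the submodels in pipeline order, rejecting gaps or duplicates."""
    ordered = sorted(submodels, key=lambda s: s.index)
    if [s.index for s in ordered] != list(range(len(ordered))):
        raise ValueError("indices must be exactly 0..N-1")
    return ordered


def role(submodel: Submodel, total: int) -> str:
    """Derive the legacy label from the position instead of storing it."""
    if submodel.index == 0:
        return "HEAD"
    if submodel.index == total - 1:
        return "TAIL"
    return "INTERMEDIATE"


# A five-way split, which the fixed HEAD/INTERMEDIATE1/INTERMEDIATE2/TAIL labels cannot express.
chain = validate_chain(
    [Submodel(i, f"urn:example:submodel:{i}", "EDGE") for i in (4, 0, 2, 1, 3)]
)
roles = [role(s, len(chain)) for s in chain]
```

A linear index covers chains only; a branching model graph would additionally need explicit input/output references between submodels.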

[Technical] Tensor metadata is inconsistent: subModelDataType uses Uint8, Float32, Float16 while tensorType is described as integer, float32, float16; the data type vocabulary must be unified and should include signedness/bitwidth and quantization parameters if Uint8 is allowed.
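
As an illustration of a unified vocabulary, the sketch below uses a single enumeration for both fields and attaches quantization parameters to integer tensors. The type names, the `scale`/`zero_point` fields and the affine mapping `real = scale * (raw - zero_point)` are assumptions for this example (the mapping is the common affine scheme, not something the contribution defines).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class TensorDataType(Enum):
    """One vocabulary carrying kind (signedness) and bit width together."""
    UINT8 = ("uint", 8)
    INT8 = ("int", 8)
    FLOAT16 = ("float", 16)
    FLOAT32 = ("float", 32)

    @property
    def bitwidth(self) -> int:
        return self.value[1]

    @property
    def is_integer(self) -> bool:
        return self.value[0] in ("int", "uint")


@dataclass(frozen=True)
class Quantization:
    scale: float
    zero_point: int


@dataclass(frozen=True)
class TensorSpec:
    dtype: TensorDataType
    shape: Tuple[int, ...]
    quantization: Optional[Quantization] = None

    def __post_init__(self) -> None:
        # Quantization parameters only make sense for integer tensors.
        if self.quantization is not None and not self.dtype.is_integer:
            raise ValueError("quantization requires an integer data type")

    def dequantize(self, raw: int) -> float:
        """Affine dequantization of one stored value (assumed scheme)."""
        if self.quantization is None:
            raise ValueError("tensor carries no quantization parameters")
        q = self.quantization
        return q.scale * (raw - q.zero_point)


spec = TensorSpec(TensorDataType.UINT8, (1, 224, 224, 3), Quantization(0.5, 128))
```

Without the scale and zero point, a receiver of a `Uint8` intermediate tensor cannot recover the real values that the next submodel expects.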

[Technical] outputAccuracy as a single “trained accuracy percentage” is not meaningful across tasks/datasets and is not comparable between partitionings; if kept, it needs a defined metric, evaluation dataset identifier, and conditions, otherwise it risks misleading selection logic.

[Technical] Capability metadata separation into static/dynamic is reasonable, but the proposal lacks update/refresh rules (e.g., when dynamic capabilities are reported, validity timers, thresholds) and lacks units for key fields (compute capacity, memory, load), making negotiation non-deterministic.
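
A minimal sketch of the missing refresh rules, with units written into the field names. All field names, the units (seconds, MiB, percent) and the 10-point reporting threshold are assumptions chosen for this example, not values from the contribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicCapabilityReport:
    reported_at_s: float       # report time, seconds since an agreed epoch
    validity_s: float          # how long the receiver may rely on the report
    available_memory_mib: int  # free memory in MiB
    load_percent: float        # processing load, 0..100

    def is_valid(self, now_s: float) -> bool:
        """A report expires once its validity timer has elapsed."""
        return now_s - self.reported_at_s <= self.validity_s


def should_report(previous: DynamicCapabilityReport,
                  load_percent: float,
                  threshold_percent: float = 10.0) -> bool:
    """Threshold rule: send a new report only when load moved far enough."""
    return abs(load_percent - previous.load_percent) >= threshold_percent


last = DynamicCapabilityReport(reported_at_s=100.0, validity_s=30.0,
                               available_memory_mib=2048, load_percent=40.0)
```

With a validity timer and a reporting threshold fixed in the specification, both endpoints would agree on when a stored capability value may still be used for partition selection.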

[Technical] The messages that “return selected application binary and metadata” / “return selected models/submodels binary and metadata” do not specify integrity/authenticity mechanisms (hash, signature, provenance) or versioning; for executable model binaries this is a security and lifecycle gap.
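
As an example of the integrity part of this comment, the sketch below checks a downloaded binary against a SHA-256 digest assumed to be carried in the metadata. The function names and the idea of a hex digest field are hypothetical. A digest alone gives integrity only; authenticity and provenance would still need a signature over the metadata, which is not shown.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a model or application binary."""
    return hashlib.sha256(data).hexdigest()


def verify_binary(data: bytes, expected_sha256_hex: str) -> bool:
    """Compare the computed digest with the one announced in the metadata.

    hmac.compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(sha256_hex(data), expected_sha256_hex.lower())


# Usage: the sender computes the digest and the receiver checks it before loading.
model_bytes = b"example submodel payload"
announced = sha256_hex(model_bytes)
ok = verify_binary(model_bytes, announced)
tampered = verify_binary(model_bytes + b"\x00", announced)
```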

[Editorial] Several identifiers are inconsistent in casing and spelling (sendingAtTime vs typical sentAtTime; submodelsPartitioningIdentifier vs submodelPartitioningIdentifier; subModelDataType camel-case mismatch), which will cause implementer confusion in JSON schema.

[Editorial] The document references “Table 5” and “Table 6” in Section A.4.5 while earlier it introduces “Table A4.2-1”; table numbering should be consistent with the annex/section numbering conventions of the target specification.

[Editorial] The contribution repeatedly uses “HTTP RESPONSE” as a message name rather than a defined response message type; if the intent is to define application-layer messages, the response should be named consistently (e.g., ..._RESPONSE) and HTTP status/error handling should be specified separately.
