S4-260286 - AI Summary

[AIML_IMS-MED] On Compression of AI/ML data in IMS

AI-Generated Summary

Comprehensive Summary: Compression of AI/ML Data in IMS

Document Overview

This contribution (S4-260286, revision of S4-260198) proposes the adoption of MPEG's Neural Network Coding standard ISO/IEC 15938-17 (NNC) for efficient compression and transport of AI/ML data in IMS services. The document is submitted by Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe, and Vodafone Group Plc.

Main Technical Contributions

Motivation and Use Case Requirements

The contribution identifies critical challenges in AI/ML data exchange for IMS services:

- Model Delivery Challenges: Use cases require multiple context-dependent model downloads (location, time, task-specific) rather than single downloads. Limited UE storage necessitates frequent model discarding and re-downloading.
- Incremental Updates: Applications require both unidirectional continuous model updates to UEs and multidirectional updates for co-learning scenarios involving multiple UEs and edge nodes.
- Key Benefits of Compression:
  - Bandwidth optimization reducing operational costs
  - Reduced latency through faster transmission
  - Broader accessibility in reduced-bandwidth networks
  - Interoperability through standardized data formats

NNC Standard Capabilities

The contribution highlights NNC's compression performance (compression to between 0.1% and 20% of the original size while remaining transparent, i.e. without degrading model performance) and the following advanced features:

Topology Signalling: Generic syntax for encoding AI/ML model architecture
Random Access: Independent tensor decoding enabling parallelization
Parameter Update Signalling: Metadata for incremental update dependencies and relations
Robustness: Configurable prioritization/error-protection through packetization; missing update detection
Performance Indicators: Signalling of model performance metrics (e.g., accuracy)
Encapsulation Flexibility: Support for PyTorch, ONNX, NNEF, TensorFlow formats

The document also cites a feasibility demonstration of a WASM-based NNC decoder in web applications, which showed multi-fold latency reductions under representative network conditions.
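
To put the quoted compression range in perspective, the following back-of-the-envelope sketch computes ideal transfer times before and after compression. The model size (100 MB) and link rate (10 Mbit/s) are illustrative assumptions and do not come from the contribution; only the 0.1% and 20% bounds are taken from the text above. Protocol overhead and decoding time are ignored.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and decode time."""
    return size_bytes * 8 / link_bits_per_second


MODEL_BYTES = 100e6  # assumption: 100 MB uncompressed model
LINK_BPS = 10e6      # assumption: 10 Mbit/s downlink

# 1.0 = uncompressed; 0.20 and 0.001 are the bounds quoted in the summary.
for ratio in (1.0, 0.20, 0.001):
    seconds = transfer_seconds(MODEL_BYTES * ratio, LINK_BPS)
    print(f"{ratio:7.1%} of original size -> {seconds:6.2f} s")
```

Under these assumed numbers the transfer drops from 80 s uncompressed to 16 s at 20% and 0.08 s at 0.1% of the original size.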

Technical Details (Annex)

NNC Data Components

Payload Types (NNR_NDU)

NNC specifies multiple payload types via nnr_compressed_data_unit_payload_type:

- NNR_PT_INT: Integer parameter tensors
- NNR_PT_FLOAT: Float parameter tensors
- NNR_PT_RAW_FLOAT: Uncompressed float tensors
- NNR_PT_BLOCK: Block-structured float parameters with sub-types:
  - NNR_CPT_DC (0x01): Decomposed weight tensors
  - NNR_CPT_LS (0x02): Local scaling parameters
  - NNR_CPT_BI (0x04): Biases
  - NNR_CPT_BN (0x08): Batch normalization parameters

Non-RAW payloads use context-adaptive entropy coding (DeepCABAC). The compressed_parameter_types element is formed as a bitwise OR-combination of the parameter IDs listed above. NNC also supports various bit depths via nnr_decompressed_data_format, as well as pre-quantized float tensors.
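
The OR-combination of parameter IDs can be illustrated with a small bit-flag sketch. The four flag values are the ones listed above; the class name CompressedParameterType and the helper parse_compressed_parameter_types are hypothetical names chosen for this example and are not part of the standard.

```python
from enum import IntFlag


class CompressedParameterType(IntFlag):
    """Sub-type IDs as listed in the summary (class name is hypothetical)."""
    NNR_CPT_DC = 0x01  # decomposed weight tensors
    NNR_CPT_LS = 0x02  # local scaling parameters
    NNR_CPT_BI = 0x04  # biases
    NNR_CPT_BN = 0x08  # batch normalization parameters


def parse_compressed_parameter_types(value: int) -> list[str]:
    """Return the names of all sub-types whose bit is set in `value`."""
    flags = CompressedParameterType(value)
    return [member.name for member in CompressedParameterType if member in flags]


# A block carrying local scaling parameters and biases: 0x02 | 0x04.
combined = CompressedParameterType.NNR_CPT_LS | CompressedParameterType.NNR_CPT_BI
print(int(combined))
print(parse_compressed_parameter_types(0x0B))
```

Because each ID occupies its own bit, any subset of the four parameter types fits in a single small integer and can be tested with a bitwise AND.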

Topology Data (NNR_TPL)

Topology units signal AI/ML architecture via:
- topology_storage_format: Storage format specification
- topology_compression_format: Optional compression (RFC 1950 deflate)
- topology_data: Byte sequence (typically UTF-8 string)
- topology_elem_id / topology_elem_id_index: Topology element references in NNR_NDU
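
As a rough illustration of the optional topology compression, the sketch below compresses a UTF-8 topology string with Python's zlib module, whose default output is a zlib-wrapped deflate stream as described by RFC 1950. The JSON-like topology text is a made-up placeholder; the real content of topology_data depends on topology_storage_format, and the surrounding NNR_TPL unit syntax is not reproduced here.

```python
import zlib

# Hypothetical topology description used only for illustration.
topology_text = '{"layers": [{"type": "conv", "out": 16}, {"type": "relu"}]}'
topology_data = topology_text.encode("utf-8")

# zlib.compress emits an RFC 1950 (zlib) stream by default; 9 = best compression.
compressed = zlib.compress(topology_data, 9)

# The receiver reverses the step to recover the original byte sequence.
restored = zlib.decompress(compressed).decode("utf-8")

print(len(topology_data), len(compressed), restored == topology_text)
```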

Metadata

NNR_NDU metadata includes:
- Tensor Dimensions: tensor_dimensions_flag, tensor_dimension_list()
- Scan Order: scan_order for parameter-to-dimension mapping
- Entry Points: bit_offset_delta1, bit_offset_delta2 for parallel decoding

Incremental Coding Support:
- Parameter Update Tree (PUT) structure via mps_parent_signalling_enabled_flag, parent_node_id_present_flag
- Node identification through:
  - Enumeration: device_id, parameter_id, put_node_depth
  - Hash-based: parent_node_payload_sha256, parent_node_payload_sha512
- Global metadata in NNR_MPS including base_model_id
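
The hash-based variant of parent identification can be sketched as follows. This is a simplified model, assuming that the digest is taken over an entire parent payload held as bytes; which bytes the standard actually hashes is not stated in this summary. UpdateNode, payload_digest, make_update and can_apply are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpdateNode:
    """One node of a parameter update tree (hypothetical structure)."""
    payload: bytes
    parent_sha256: Optional[bytes] = None  # None marks the base model


def payload_digest(payload: bytes) -> bytes:
    return hashlib.sha256(payload).digest()


def make_update(parent: UpdateNode, payload: bytes) -> UpdateNode:
    """Create an update that references its parent by payload hash."""
    return UpdateNode(payload=payload, parent_sha256=payload_digest(parent.payload))


def can_apply(update: UpdateNode, local_state: UpdateNode) -> bool:
    """True if the update's parent is exactly what the receiver holds."""
    return update.parent_sha256 == payload_digest(local_state.payload)


base = UpdateNode(b"base-model")
u1 = make_update(base, b"delta-1")
u2 = make_update(u1, b"delta-2")

print(can_apply(u1, base))  # the receiver holds the base model
print(can_apply(u2, base))  # u1 was never received, so u2 must not be applied
```

A receiver that holds only the base model can detect that u1 is missing because the parent hash carried with u2 does not match its local state.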

Performance Data

Performance metrics are signalled in NNR_MPS and NNR_LPS via:
- validation_set_performance_present_flag, metric_type_performance_map_valid_flag, performance_metric_type
- validation_set_performance: Performance on validation set
- Performance maps for post-processing operations:
  - sparsification_performance_map()
  - pruning_performance_map()
  - unification_performance_map()
  - decomposition_performance_map()

Format Encapsulation

Annexes A-D specify encapsulation of NNEF, ONNX, PyTorch, and TensorFlow data through NNR topology and quantization data units.

Coding Tools

Parameter Reduction Methods

NNR_PT_BLOCK Reconstruction: Local scaling adaptation, batch norm folding, tensor decomposition with decomposition_rank and g_number_of_rows
Predictive Residual Encoding (PRE): nnr_pre_flag enables differential coding against previous updates
Row-Skipping: row_skip_enabled_flag and row_skip_list for zero-row signalling
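
The ideas behind differential coding and row skipping can be sketched together on a small integer tensor. This is a deliberately simplified illustration, not the normative NNC process: it assumes a 2-D tensor held as nested lists, codes each row as a difference against the previous update, and marks all-zero residual rows in a skip list instead of transmitting them. The function names are hypothetical.

```python
def encode_update(current, previous):
    """Return (row_skip_list, residual_rows) for a 2-D integer tensor."""
    row_skip_list, residual_rows = [], []
    for cur_row, prev_row in zip(current, previous):
        residual = [c - p for c, p in zip(cur_row, prev_row)]
        skipped = all(v == 0 for v in residual)
        row_skip_list.append(1 if skipped else 0)
        if not skipped:
            residual_rows.append(residual)  # only non-zero rows are kept
    return row_skip_list, residual_rows


def decode_update(row_skip_list, residual_rows, previous):
    """Rebuild the current tensor from the previous one plus residuals."""
    rows = iter(residual_rows)
    out = []
    for skip, prev_row in zip(row_skip_list, previous):
        if skip:
            out.append(list(prev_row))  # unchanged row
        else:
            out.append([p + r for p, r in zip(prev_row, next(rows))])
    return out


previous = [[1, 2], [3, 4], [5, 6]]
current = [[1, 2], [3, 5], [5, 6]]

skips, residuals = encode_update(current, previous)
print(skips, residuals)
print(decode_update(skips, residuals, previous) == current)
```

When an update changes only a few rows, most of the tensor collapses into one flag per row, which is why incremental updates tend to compress well.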

Quantization and Codebook

Quantization control via lps_quantization_method_flags, mps_quantization_method_flags, codebook_present_flag
dq_flag: Uniform vs. dependent quantization selection
Quantization step size: qp_value, lps_qp_density, mps_qp_density
Dependent quantization state: dq_state_list for entry point initialization
Codebook mapping: integer_codebook() structure for value remapping
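
A minimal sketch of uniform quantization followed by a codebook-style remapping is given below. It makes several assumptions: the step size is passed in directly rather than derived from qp_value and the qp_density elements (that derivation is not given in this summary), dependent quantization is not shown, and the codebook is modelled simply as the sorted list of distinct integer levels. All function names are hypothetical.

```python
def quantize(values, step_size):
    """Uniform quantization: map each value to the nearest integer level.

    Python's round() resolves exact ties to the nearest even integer.
    """
    return [round(v / step_size) for v in values]


def dequantize(levels, step_size):
    """Reconstruct approximate values from integer levels."""
    return [q * step_size for q in levels]


def build_codebook(levels):
    """Remap levels to indices into a table of the distinct values used."""
    codebook = sorted(set(levels))
    index_of = {value: i for i, value in enumerate(codebook)}
    return codebook, [index_of[v] for v in levels]


weights = [0.12, -0.49, 0.0, 0.51, 0.12]
levels = quantize(weights, 0.25)
codebook, indices = build_codebook(levels)

print(levels)
print(codebook, indices)
print(dequantize(levels, 0.25))
```

The remapping pays off when only a few distinct levels occur: the indices form a dense, small alphabet regardless of how far apart the original levels are.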

Entropy Coding (DeepCABAC)

Context-adaptive binary arithmetic coding for non-RAW payloads:

Binarization: sig_flag, sign_flag, abs_level_greater-flags, abs_remainder with cabac_unary_length specification
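
The binarization listed above can be illustrated with a simplified sketch that splits one quantized integer into a significance flag, a sign flag, a short run of "greater than k" flags, and a remainder. This is an assumption-laden illustration and not the normative DeepCABAC binarization: the element names abs_level_greater_1, abs_level_greater_2, ... and the parameter num_greater_flags (loosely standing in for the role the summary attributes to cabac_unary_length) are hypothetical, and the arithmetic coding of the resulting bins is omitted.

```python
def binarize(level: int, num_greater_flags: int = 3):
    """Split one integer level into (name, value) syntax elements."""
    elems = [("sig_flag", int(level != 0))]
    if level == 0:
        return elems
    elems.append(("sign_flag", int(level < 0)))
    magnitude = abs(level)
    for k in range(1, num_greater_flags + 1):
        greater = int(magnitude > k)
        elems.append((f"abs_level_greater_{k}", greater))
        if not greater:
            return elems  # magnitude is exactly k
    # All flags were 1, so magnitude >= num_greater_flags + 1.
    elems.append(("abs_remainder", magnitude - num_greater_flags - 1))
    return elems


def debinarize(elems):
    """Inverse of binarize, used here to check the round trip."""
    values = dict(elems)
    if not values["sig_flag"]:
        return 0
    greater = [v for name, v in elems if name.startswith("abs_level_greater_")]
    magnitude = 1 + sum(greater) + values.get("abs_remainder", 0)
    return -magnitude if values["sign_flag"] else magnitude


for level in (0, -2, 5):
    print(level, binarize(level))
```

Small magnitudes, which dominate in sparse or finely quantized tensors, are fully described by the first few flags, so the remainder is needed only for rare large values.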

Probability Estimation:
- Initialization/update: shift_idx_minus_1
- Random access: scan_order, bit_offset_delta1, cabac_offset_list

Incremental Update Modes:
- temporal_context_modeling_flag: Probability estimation from previous tensor
- hist_dep_sig_prob_enabled_flag: Multi-tensor historical dependency

Proposal

The contribution proposes considering NNC-based compression for inclusion in IMS-based AI/ML services, based on its compression efficiency, standardized format, and advanced features supporting various AI/ML data exchange scenarios.

Document Information

TDoc:
S4-260286

Source:
Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe, Vodafone Group Plc

Type:
discussion

For:
Discussion

Agenda item: 10.5

Agenda item description: AI_IMS-MED (Media aspects for AI/ML in IMS services)

Contact: Gerhard Tech

Uploaded: 2026-02-04T18:18:10.383000

Contact ID: 91711

TDoc Status: noted

Is revision of: S4-260198

Reservation date: 04/02/2026 18:02:35

Agenda item sort order: 52