[AIML_IMS-MED] On Compression of AI/ML data in IMS
This contribution (S4-260286, revision of S4-260198) proposes the adoption of MPEG's Neural Network Coding standard ISO/IEC 15938-17 (NNC) for efficient compression and transport of AI/ML data in IMS services. The document is submitted by Nokia, Fraunhofer HHI, Deutsche Telekom, InterDigital Europe, and Vodafone Group Plc.
The contribution identifies critical challenges in AI/ML data exchange for IMS services:
Model Delivery Challenges: Use cases require multiple context-dependent model downloads (location, time, task-specific) rather than single downloads. Limited UE storage necessitates frequent model discarding and re-downloading.
Incremental Updates: Applications require both unidirectional continuous model updates to UEs and multidirectional updates for co-learning scenarios involving multiple UEs and edge nodes.
Key Benefits of Compression:
The contribution highlights NNC's compression performance, with models reduced to between 0.1% and 20% of their original size at transparent task performance, together with its advanced coding features.
The document also references WASM-based NNC decoder feasibility in web applications, demonstrating multi-fold latency reductions under representative network conditions.
NNC specifies multiple payload types via nnr_compressed_data_unit_payload_type. Non-RAW payloads use context-adaptive entropy coding (DeepCABAC). The compressed_parameter_types element signals the parameter types present in a payload as an OR-combination of parameter type IDs. Various bit depths are supported via nnr_decompressed_data_format, including pre-quantized float tensors.
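The OR-combination of parameter type IDs can be sketched as follows. Note that the bit values assigned to each parameter type here are hypothetical placeholders for illustration, not the normative IDs defined in ISO/IEC 15938-17.

```python
# Sketch of OR-combining parameter type IDs into a compressed_parameter_types
# bitmask. The bit assignments below are hypothetical placeholders, not the
# normative values of ISO/IEC 15938-17.
NNR_CPT_WEIGHT = 1 << 0   # hypothetical bit for weight tensors
NNR_CPT_BIAS   = 1 << 1   # hypothetical bit for bias tensors
NNR_CPT_BN     = 1 << 2   # hypothetical bit for batch-norm parameters

def combine_parameter_types(*type_ids: int) -> int:
    """OR-combine individual parameter type IDs into one bitmask."""
    mask = 0
    for t in type_ids:
        mask |= t
    return mask

def has_parameter_type(mask: int, type_id: int) -> bool:
    """Check whether a given parameter type is present in the combined mask."""
    return (mask & type_id) != 0
```

A decoder can then test a received mask with `has_parameter_type(mask, NNR_CPT_BIAS)` to decide which parameter types to expect in the payload.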
Topology units signal AI/ML architecture via:
- topology_storage_format: Storage format specification
- topology_compression_format: Optional compression (RFC 1950 deflate)
- topology_data: Byte sequence (typically UTF-8 string)
- topology_elem_id / topology_elem_id_index: Topology element references in NNR_NDU
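The topology fields above can be sketched as a simple container: the topology graph is carried as UTF-8 bytes and optionally deflate-compressed (Python's `zlib` emits RFC 1950 streams). The field names follow the syntax elements listed above, but this container layout is an illustrative assumption, not the normative bitstream format.

```python
import zlib
from dataclasses import dataclass

@dataclass
class TopologyUnit:
    # Field names follow the NNC syntax elements above; the layout is a sketch.
    topology_storage_format: int      # e.g. an enum value selecting NNEF/ONNX/...
    topology_compression_format: int  # 0 = uncompressed, 1 = deflate (assumed coding)
    topology_data: bytes              # byte sequence, typically a UTF-8 string

def make_topology_unit(graph_text: str, compress: bool) -> TopologyUnit:
    raw = graph_text.encode("utf-8")
    if compress:
        # zlib.compress produces an RFC 1950 stream (deflate with zlib wrapper)
        return TopologyUnit(0, 1, zlib.compress(raw))
    return TopologyUnit(0, 0, raw)

def read_topology_text(unit: TopologyUnit) -> str:
    data = unit.topology_data
    if unit.topology_compression_format == 1:
        data = zlib.decompress(data)
    return data.decode("utf-8")
```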
NNR_NDU metadata includes:
- Tensor Dimensions: tensor_dimensions_flag, tensor_dimension_list()
- Scan Order: scan_order for parameter-to-dimension mapping
- Entry Points: bit_offset_delta1, bit_offset_delta2 for parallel decoding
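The entry points above allow decoder threads to start at different positions in the bitstream. A minimal sketch of turning delta-coded entry points back into absolute bit offsets is shown below; the exact delta semantics (what each delta is measured relative to) is an assumption here, not the normative derivation.

```python
# Sketch of reconstructing absolute entry-point bit offsets from
# delta-coded values so decoder threads can start in parallel.
# Assumption: each delta is relative to the previous entry point,
# with the first delta relative to the payload start.

def absolute_entry_points(bit_offset_deltas):
    offsets, pos = [], 0
    for delta in bit_offset_deltas:
        pos += delta
        offsets.append(pos)
    return offsets
```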
Incremental Coding Support:
- Parameter Update Tree (PUT) structure via mps_parent_signalling_enabled_flag, parent_node_id_present_flag
- Node identification through:
  - Enumeration: device_id, parameter_id, put_node_depth
  - Hash-based: parent_node_payload_sha256, parent_node_payload_sha512
- Global metadata in NNR_MPS including base_model_id
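Hash-based parent identification in a Parameter Update Tree can be sketched as follows: each update names its parent by a SHA-256 digest of the parent's payload, so a receiver can locate the model an update applies to without shared enumeration state. The node structure below is an illustrative assumption, not the normative syntax.

```python
import hashlib

class PutNode:
    """Sketch of a Parameter Update Tree node with hash-based identity."""

    def __init__(self, payload: bytes, parent=None):
        self.payload = payload
        self.parent = parent
        # Digest other nodes can carry as parent_node_payload_sha256
        self.payload_sha256 = hashlib.sha256(payload).digest()

def find_parent(nodes, parent_digest: bytes):
    """Resolve a parent reference by matching the SHA-256 digest."""
    for node in nodes:
        if node.payload_sha256 == parent_digest:
            return node
    return None
```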
Performance metrics signaled in NNR_MPS and NNR_LPS:
- validation_set_performance_present_flag, metric_type_performance_map_valid_flag, performance_metric_type
- validation_set_performance: Performance on validation set
- Performance maps for post-processing operations:
  - sparsification_performance_map()
  - pruning_performance_map()
  - unification_performance_map()
  - decomposition_performance_map()
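One way such a performance map can be consumed on the receiver side is sketched below: given a minimum acceptable validation accuracy, pick the most aggressive sparsification working point that still meets it. The map contents and the "higher threshold means sparser model" reading are illustrative assumptions, not values from the specification.

```python
# Sketch of consuming a sparsification performance map on the receiver:
# choose the most aggressive sparsification threshold whose signaled
# validation-set performance still meets a minimum accuracy target.

def pick_sparsification_threshold(performance_map, min_accuracy):
    """performance_map: {sparsification_threshold: validation accuracy}."""
    best = None
    for threshold, accuracy in sorted(performance_map.items()):
        if accuracy >= min_accuracy:
            best = threshold  # higher threshold assumed to mean a sparser model
    return best

# Illustrative map, not data from the contribution
example_map = {0.0: 0.912, 0.1: 0.910, 0.2: 0.901, 0.3: 0.874}
```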
Annexes A-D specify encapsulation of NNEF, ONNX, PyTorch, and TensorFlow data through NNR topology and quantization data units.
Additional coding tools and quantization signaling:
- decomposition_rank and g_number_of_rows for decomposed tensors
- nnr_pre_flag enables differential coding against previous updates
- row_skip_enabled_flag and row_skip_list for zero-row signaling
- lps_quantization_method_flags, mps_quantization_method_flags, codebook_present_flag
- dq_flag: uniform vs. dependent quantization selection
- qp_value, lps_qp_density, mps_qp_density
- dq_state_list for entry point initialization
- integer_codebook() structure for value remapping
Context-adaptive binary arithmetic coding for non-RAW payloads:
Binarization: sig_flag, sign_flag, abs_level_greater flags, and abs_remainder, with the unary prefix length given by cabac_unary_length
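The binarization of one quantized level can be sketched as follows: a significance flag, a sign flag, a unary run of abs_level_greater flags capped by cabac_unary_length, and a fixed remainder for values the unary prefix cannot express. This is an illustrative decomposition of the listed syntax elements, not the normative bin derivation.

```python
# Sketch of binarizing one quantized level into the bins named above:
# sig_flag, sign_flag, a capped unary run of abs_level_greater flags,
# and an abs_remainder when the magnitude exceeds cabac_unary_length.

def binarize_level(level: int, cabac_unary_length: int):
    bins = {"sig_flag": int(level != 0)}
    if level == 0:
        return bins
    bins["sign_flag"] = int(level < 0)
    a = abs(level)
    greater = []
    for k in range(1, cabac_unary_length + 1):
        greater.append(int(a > k))  # unary prefix: "is |level| greater than k?"
        if a <= k:
            break
    bins["abs_level_greater_flags"] = greater
    if a > cabac_unary_length:
        # remainder carries what the capped unary prefix could not express
        bins["abs_remainder"] = a - cabac_unary_length - 1
    return bins
```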
Probability Estimation:
- Initialization/update: shift_idx_minus_1
- Random access: scan_order, bit_offset_delta1, cabac_offset_list
Incremental Update Modes:
- temporal_context_modeling_flag: Probability estimation from previous tensor
- hist_dep_sig_prob_enabled_flag: Multi-tensor historical dependency
The contribution proposes considering NNC-based compression for inclusion in IMS-based AI/ML services, based on its compression efficiency, standardized format, and advanced features supporting various AI/ML data exchange scenarios.