S4-260233 - AI Summary

On reference code and model format


3GPP Change Request Summary: S4-260233

Document Overview

This contribution proposes using ML model formats as intermediate representations (IRs) for the ULBC (Ultra Low Bitrate Codec) reference implementation, rather than a pure C implementation. The document is structured as a pseudo Change Request (pCR) to TR 26.940, introducing a new clause 6.4.2.


Main Technical Contributions

1. Problem Statement and Motivation (Goal Section)

The document identifies a fundamental question for ULBC standardization: whether to provide the entire codec reference implementation in C (including neural network components) or to define specific parts based on ML model formats (e.g., ONNX, PyTorch, TensorFlow).

Key concerns with pure C implementation:
- Limits UE vendors from leveraging custom architectures and optimizations
- UE vendors typically have custom optimization pipelines to port ML models to internal formats
- Pure C approach restricts full utilization of specialized hardware (NPUs, DSPs, TPUs)

2. Limitations of C-Based Reference Implementation (Clause 6.4.2.1)

Issues with the existing WMC (Weighted Million Operations) tool of ITU-T G.191 for complexity measurement:
- Weights in Table 18.3 of G.191 do not account for vectorized implementations of matrix multiplications
- Theoretical complexity estimation does not reflect actual runtime complexity
- Does not account for diversity of target platforms

Additional limitations identified:
- Hardware/platform dependencies: C implementations may rely on platform-specific intrinsics and vectorization pragmas, limiting portability to NPUs
- Unoptimized reference code: May not be optimized for certain platforms
- Compiler dependencies: Intrinsics are compiler-specific
- Maintenance burden: Keeping C implementation updated with new ML operators and architectures is costly and error-prone
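
The gap between a static weighted operation count and actual runtime can be sketched as follows. This is an illustrative stand-in, not the real G.191 WMC tool; the function name and the per-MAC weight are hypothetical:

```python
# Illustrative sketch (NOT the actual G.191 WMC tool): a static weighted
# operation count for a dense layer is identical regardless of how the
# target platform executes it, which is why it can diverge from measured
# runtime on vectorized hardware (NPUs, DSPs, SIMD CPUs).

def theoretical_weighted_ops(rows: int, cols: int, w_mac: float = 1.0) -> float:
    """Weighted op count for y = W @ x with W of shape (rows, cols).

    Each output element needs `cols` multiply-accumulates; `w_mac` is a
    hypothetical per-MAC weight in the spirit of G.191 Table 18.3.
    """
    return rows * cols * w_mac

# A 512x512 layer yields the same static count whether it runs scalar on
# a CPU or fully vectorized on an NPU -- the count ignores the platform.
count = theoretical_weighted_ops(512, 512)
```

The point of the sketch: the count is a pure function of the layer shape, so any platform-dependent speedup from vectorization is invisible to it.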

3. Definitions and Concepts (Clause 6.4.2.1 - Definitions)

The document establishes clear terminology:

- Graph format: Describes a neural network as a computational graph (structure only, no parameters)
- Model format: Combines the graph representation, trained parameters (weights, biases), and metadata; self-contained and directly runnable
- Intermediate Representation (IR): Serves as a bridge between a high-level ML framework and execution runtimes

Note: PyTorch does not provide a standalone graph format; the model definition must be supplied as Torch (Python) code.
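
The graph-format/model-format distinction above can be sketched with two minimal classes. These are hypothetical illustrations, not a real ONNX or TFLite schema:

```python
# Minimal sketch of the distinction drawn above (hypothetical classes,
# not a real ONNX/TFLite schema): a graph format carries structure only,
# while a model format also bundles trained parameters and metadata.
from dataclasses import dataclass, field

@dataclass
class Graph:
    """Computational graph: operators and their wiring, no weights."""
    nodes: list  # e.g. [("matmul", "x", "w1"), ("relu", "h")]

@dataclass
class Model:
    """Self-contained model: graph + parameters + metadata, directly runnable."""
    graph: Graph
    parameters: dict = field(default_factory=dict)  # weights, biases
    metadata: dict = field(default_factory=dict)    # e.g. opset version

g = Graph(nodes=[("matmul", "x", "w1"), ("relu", "h")])
m = Model(graph=g, parameters={"w1": [[0.1, 0.2]]}, metadata={"opset": 17})
```

In these terms, an IR is whichever serialized form of `Model` (or `Graph`) the toolchain agrees to exchange between the training framework and the execution runtime.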

4. Advantages of Model Format Approach (Clause 6.4.3.2)

Platform Portability:
- Specifies what is computed, not how it's executed
- Framework-agnostic: models can be exported from different training frameworks
- Allows vendors to use custom toolchains for hardware-specific optimization

Hardware Evolution:
- Future-proof method to leverage latest AI processor developments
- Maintains compatibility with low maintenance effort

Combination with Standard C-code:
- ULBC can combine ML parts (as model format) with classic signal processing (in ANSI C)
- Backend runtime in C can integrate ML components
- Enables traditional 3GPP codec reference implementation structure
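
The hybrid structure described above can be sketched as a three-stage pipeline. All function names are hypothetical stand-ins; in the real reference, the outer stages would be ANSI C and the middle stage would execute a model file through a C backend runtime:

```python
# Hedged sketch of the hybrid codec structure: classic signal-processing
# stages wrap an ML component that would, in the actual reference, be run
# from a model file by a C backend runtime. Names are illustrative only.

def dsp_preprocess(samples):
    """Classic signal processing (would be ANSI C): peak normalization."""
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]

def neural_decode(features):
    """Stand-in for running the model-format component via a C runtime."""
    return [2.0 * f for f in features]  # placeholder for NN inference

def dsp_postprocess(samples):
    """Classic signal processing (would be ANSI C): hard clipping."""
    return [min(1.0, max(-1.0, s)) for s in samples]

def decode_frame(bitstream_features):
    return dsp_postprocess(neural_decode(dsp_preprocess(bitstream_features)))

out = decode_frame([0.5, -0.25, 0.125])
```

The design point is the interface boundary: only the middle stage is defined by a model format, so a vendor can swap in a hardware-optimized runtime without touching the C stages.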

5. Comprehensive Model Format Analysis (Clause 6.4.3.3)

The document provides detailed comparison of major ML model formats:

| Format | Type | Key Advantages | Key Limitations |
|--------|------|----------------|-----------------|
| ONNX | Framework-agnostic IR | Cross-framework portability, wide runtime/hardware support, native OS support (Windows/Linux), dedicated C/C++ runtime | Operator coverage limitations, limited dynamic graph support |
| TensorFlow Lite (TFLite/LiteRT) | Edge/embedded-focused IR | Mobile/edge optimized, strong Android ecosystem, quantization tools, C/C++ runtime | TensorFlow-centric, partially vendor-specific maintenance |
| PyTorch/Python | torch.nn.Module + checkpoints | Easy prototyping, highly optimized conversion tools | Suboptimal for real-world testing, Python dependencies, no C/C++ runtime without Python |
| TorchScript | PyTorch-specific serialized IR | Static graph without Python dependencies, supports custom ops, LibTorch C++ runtime | PyTorch-specific, deprecated (being replaced by ExportedProgram) |
| ExportedProgram & ExecuTorch | Two IRs: ExportedProgram and .pte | Replaces TorchScript, canonical PyTorch export IR, dedicated C++ runtime | PyTorch-specific, requires compilation to another IR, pipeline not fully mature |
| OpenVINO IR | Intel/CPU-centric IR | Strong Intel CPU/GPU optimization | Not suitable for mobile SoCs, extra conversion step needed |
| Proprietary vendor IRs | Vendor-specific internal IR | Highly hardware-optimized | Not portable, requires conversion from open IR |

Key observations:
- PyTorch format provides maximum flexibility and transparency but may have long-term compatibility concerns due to format evolution
- ONNX and TFLite are designed for inference deployment and cross-platform compatibility, representing stable industry standards
- ULBC ML parts will likely be based on PyTorch format, convertible to stable formats like ONNX or TFLite

6. SoC AI Engine Support Analysis (Clause 6.4.3.4)

Hardware landscape:
- Major smartphone SoCs include NPUs, DSPs, TPUs, GPUs, and CPUs
- Vendors provide specialized runtime environments and SDKs
- Vendors use native/preferred internal model formats optimized for their architecture

Industry pattern:
- All major vendors provide conversion mechanisms from popular open-source formats
- Common supported formats: ONNX, TFLite, PyTorch, TensorFlow
- References provided for major vendors: Qualcomm, Apple, Samsung, MediaTek, Google, Huawei

7. Summary and Recommendations (Clause 6.4.3.4)

Advantages of model-format/IR-based reference implementation:
- Decouples algorithm definition from hardware-specific implementation
- Leverages existing SoC vendor compilers, AI accelerators, and runtimes
- Significantly more portable, maintainable, and future-proof

Recommended approach for the ULBC reference implementation:
1. Base the reference on an ML model format, with auxiliary signal processing in C
2. Include both ONNX and PyTorch as ML model formats
3. Define the neural network model format, including its operator set and version
4. Specify the I/O interfaces of the ML models and of the auxiliary signal-processing steps in C
5. Use the reference implementation to illustrate integration and for verification and testing
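
Recommendation 3, pinning the operator set and version, can be sketched as a metadata check performed before a model is accepted for execution. The field names ("format", "opset") and the pinned range are illustrative assumptions, not taken from the contribution or any real ONNX/TFLite schema:

```python
# Hedged sketch of recommendation 3: pin the model format's operator set
# and version, and validate a model's declared metadata against it before
# running. Field names and the supported range are hypothetical.

SUPPORTED = {"onnx": range(13, 22)}  # hypothetical pinned opset range

def validate_metadata(meta: dict) -> bool:
    """Accept a model only if its format and opset match the pinned spec."""
    fmt = meta.get("format")
    return fmt in SUPPORTED and meta.get("opset") in SUPPORTED[fmt]

ok = validate_metadata({"format": "onnx", "opset": 17})   # accepted
bad = validate_metadata({"format": "onnx", "opset": 25})  # rejected
```

Pinning the opset up front is what makes interoperability testable: a conforming decoder and the reference implementation must agree on exactly which operators (and which semantics) a model may use.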


Proposal

The document proposes:
1. Discussion and agreement on selection of one or more model formats for ULBC reference implementation
2. Agreement on principle of using model format as part of ULBC standardization reference model
3. Documentation of findings in TR 26.940 under new clause 6.4.2


Key Technical Impact

This contribution represents a significant departure from traditional 3GPP codec standardization by advocating ML model formats rather than pure C implementations. The proposal addresses practical deployment considerations for ML-based codecs while maintaining compatibility with 3GPP standardization practice through a hybrid approach that combines model formats with C code for the signal-processing components.

Document Information

Source: Fraunhofer IIS, Apple Inc.
Type: pCR
For: Discussion
Title: On reference code and model format
Agenda item: 7.8 (FS_ULBC, Study on Ultra Low Bitrate Speech Codec)
Release: Rel-20
Specification: TR 26.940, v0.5.1
Related WIs: FS_ULBC
Contact: Markus Schnell (Contact ID: 72605)
Uploaded: 2026-02-03T22:37:07.743000
Reservation date: 03/02/2026 20:49:10
TDoc Status: agreed