On reference code and model format
This contribution proposes the use of ML model formats as intermediate representations (IR) for the ULBC (Ultra Low Bitrate Codec) reference implementation, rather than a pure C implementation. The document is structured as a pseudo Change Request (pCR) to TR 26.940, introducing a new clause 6.4.2.
The document identifies a fundamental question for ULBC standardization: whether to provide the entire codec reference implementation in C (including neural network components) or to define specific parts based on ML model formats (e.g., ONNX, PyTorch, TensorFlow).
Key concerns with pure C implementation:
- Limits UE vendors from leveraging custom architectures and optimizations
- UE vendors typically have custom optimization pipelines to port ML models to internal formats
- Pure C approach restricts full utilization of specialized hardware (NPUs, DSPs, TPUs)
Issues with the existing WMC (weighted MOPS counting) tool of ITU-T G.191 for complexity measurement:
- Weights in Table 18.3 of G.191 do not account for vectorized implementations of matrix multiplications
- Theoretical complexity estimation does not reflect actual runtime complexity
- Does not account for diversity of target platforms
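The gap between counted operations and measured runtime can be illustrated numerically. In the sketch below (NumPy standing in for a vectorized target platform; not code from the contribution), a scalar triple loop and a vectorized call perform the same nominal number of multiply-accumulates, the quantity a WMOPS-style tally would report, yet their measured runtimes differ by orders of magnitude:

```python
import time
import numpy as np

N = 128
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, N))

def matmul_scalar(A, B):
    """Scalar reference: N^3 multiply-accumulates, as an operation
    counter would tally them one by one."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += A[i, k] * B[k, j]
            C[i, j] = acc
    return C

t0 = time.perf_counter()
C_scalar = matmul_scalar(A, B)
t_scalar = time.perf_counter() - t0

t0 = time.perf_counter()
C_vec = A @ B  # same nominal operation count, vectorized/BLAS-backed
t_vec = time.perf_counter() - t0

print(f"scalar: {t_scalar * 1e3:.1f} ms, vectorized: {t_vec * 1e3:.3f} ms")
```

Both paths produce the same result, but a fixed per-operation weight cannot capture the runtime difference between them, which is the document's core objection to the WMC methodology for neural workloads.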
Additional limitations identified:
- Hardware/platform dependencies: C implementations may rely on platform-specific intrinsics and vectorization pragmas, limiting portability to NPUs
- Unoptimized reference code: May not be optimized for certain platforms
- Compiler dependencies: Intrinsics are compiler-specific
- Maintenance burden: Keeping C implementation updated with new ML operators and architectures is costly and error-prone
The document establishes terminology for model formats, intermediate representations, and runtimes.
Note: PyTorch itself does not define a serialized graph format; the model must be defined as Torch (Python) code alongside its checkpoint.
Platform Portability:
- Specifies what is computed, not how it's executed
- Framework-agnostic: models can be exported from different training frameworks
- Allows vendors to use custom toolchains for hardware-specific optimization
Hardware Evolution:
- Future-proof method to leverage latest AI processor developments
- Maintains compatibility with low maintenance effort
Combination with Standard C-code:
- ULBC can combine ML parts (as model format) with classic signal processing (in ANSI C)
- Backend runtime in C can integrate ML components
- Enables traditional 3GPP codec reference implementation structure
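The hybrid structure described above can be sketched as follows, here in Python/NumPy as a stand-in for an ANSI C backend; the function names, the pre-emphasis coefficient, and the frame/code dimensions are illustrative assumptions, not values from the contribution:

```python
import numpy as np

def preemphasis(x, alpha=0.97):
    """Classic signal-processing stage (would be ANSI C in the reference)."""
    y = np.copy(x)
    y[1:] -= alpha * x[:-1]
    return y

def neural_encode(features, weights):
    """Placeholder for the ML part, which would be shipped as a model
    format (e.g. ONNX) and executed by a vendor runtime."""
    return np.tanh(features @ weights)

def encode_frame(x, weights):
    # The backend orchestrates classic DSP and ML components through
    # well-defined I/O interfaces, as the contribution proposes.
    return neural_encode(preemphasis(x), weights)

rng = np.random.default_rng(0)
frame = rng.standard_normal(160)    # e.g. one 10 ms frame at 16 kHz
W = rng.standard_normal((160, 32))  # toy projection to a 32-dim code
code = encode_frame(frame, W)
print(code.shape)
```

The point of the split is that `preemphasis` stays in portable reference C, while `neural_encode` is replaced at deployment time by a vendor-optimized execution of the standardized model format.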
The document provides detailed comparison of major ML model formats:
| Format | Type | Key Advantages | Key Limitations |
|--------|------|----------------|-----------------|
| ONNX | Framework-agnostic IR | Cross-framework portability, wide runtime/hardware support, native OS support (Windows/Linux), dedicated C/C++ runtime | Operator coverage limitations, limited dynamic graph support |
| TensorFlow Lite (TFLite/LiteRT) | Edge/embedded-focused IR | Mobile/edge optimized, strong Android ecosystem, quantization tools, C/C++ runtime | TensorFlow-centric, partially vendor-specific maintenance |
| PyTorch/Python | Torch.nn.Module + checkpoints | Easy prototyping, highly optimized conversion tools | Suboptimal for real-world testing, Python dependencies, no C/C++ runtime without Python |
| TorchScript | PyTorch-specific serialized IR | Static graph without Python dependencies, supports custom ops, LibTorch C++ runtime | PyTorch-specific, deprecated (being replaced by ExportedProgram) |
| ExportedProgram & ExecuTorch | Two IRs: ExportedProgram and .pte | Replaces TorchScript, canonical PyTorch export IR, dedicated C++ runtime | PyTorch-specific, requires compilation to another IR, pipeline not fully mature |
| OpenVINO IR | Intel/CPU-centric IR | Strong Intel CPU/GPU optimization | Not suitable for mobile SoCs, extra conversion step needed |
| Proprietary vendor IRs | Vendor-specific internal IR | Highly hardware-optimized | Not portable, requires conversion from open IR |
Key observations:
- PyTorch format provides maximum flexibility and transparency but may have long-term compatibility concerns due to format evolution
- ONNX and TFLite are designed for inference deployment and cross-platform compatibility, representing stable industry standards
- ULBC ML parts will likely be based on PyTorch format, convertible to stable formats like ONNX or TFLite
Hardware landscape:
- Major smartphone SoCs include NPUs, DSPs, TPUs, GPUs, and CPUs
- Vendors provide specialized runtime environments and SDKs
- Vendors use native/preferred internal model formats optimized for their architecture
Industry pattern:
- All major vendors provide conversion mechanisms from popular open-source formats
- Common supported formats: ONNX, TFLite, PyTorch, TensorFlow
- References provided for major vendors: Qualcomm, Apple, Samsung, MediaTek, Google, Huawei
Advantages of model-format/IR-based reference implementation:
- Decouples algorithm definition from hardware-specific implementation
- Leverages existing SoC vendor compilers, AI accelerators, and runtimes
- Significantly more portable, maintainable, and future-proof
Recommended approach for the ULBC reference implementation:
1. Base the reference on ML model formats, with auxiliary signal processing in C
2. Include both ONNX and PyTorch as ML model formats
3. Define the neural network model format, including operator set and version
4. Specify the I/O interfaces of the ML models and of the auxiliary signal-processing steps in C
5. Use the reference implementation for integration illustration, verification, and testing
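The verification and testing step could be supported by a simple conformance check comparing a candidate implementation's output against the reference within a tolerance, since bit-exactness is generally not achievable across ML runtimes and hardware. The SNR-style criterion and the 60 dB threshold below are illustrative assumptions, not values from the contribution:

```python
import numpy as np

def snr_db(reference, candidate):
    """Signal-to-noise ratio (dB) of the candidate against the reference."""
    err = reference - candidate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(err ** 2))

def check_conformance(reference, candidate, threshold_db=60.0):
    """Pass if the candidate output matches the reference within
    threshold_db; a tolerance replaces classic bit-exactness."""
    return snr_db(reference, candidate) >= threshold_db

rng = np.random.default_rng(0)
ref = rng.standard_normal(1000)
# Simulate a candidate runtime with small numerical deviations.
cand = ref + 1e-5 * rng.standard_normal(1000)
print(check_conformance(ref, cand))  # True
```

A tolerance-based criterion of this kind is what makes a model-format reference testable at all: each vendor runtime may round differently, but all must stay within the specified bound of the reference output.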
The document proposes:
1. Discussion and agreement on selection of one or more model formats for ULBC reference implementation
2. Agreement on principle of using model format as part of ULBC standardization reference model
3. Documentation of findings in TR 26.940 under new clause 6.4.2
This contribution represents a significant departure from traditional 3GPP codec standardization approaches by advocating for ML model formats rather than pure C implementations. The proposal addresses practical deployment considerations for ML-based codecs while maintaining compatibility with 3GPP standardization practices through a hybrid approach that combines model formats with C code for the signal-processing components.