On complexity estimation of ULBC
This contribution addresses the complexity measurement methodology for the Ultra-Low Bitrate Codec (ULBC) under development in 3GPP SA4. The document proposes a hybrid complexity metric that combines traditional DSP-based measurements with ML-specific metrics.
Multiple input documents [1-4] have previously discussed complexity measurement approaches:
- Documents [1] and [3] proposed using WMOPS (Weighted Million Operations Per Second), following conventional speech codec practices
- Document [2] suggested using MACs and a modified version of WMOPS
- Document [4] emphasized model size considerations
The key challenge is that ULBC will operate on heterogeneous, non-fixed target hardware and processors, requiring a platform-agnostic complexity metric.
The document proposes combining two complementary measurement approaches:
For DSP-based components:
- Use traditional WMOPS measurement
For ML-based components:
- Use MAC (Multiply-Accumulate) operations count
- Include parameter count for memory/model size considerations
Combined metric formula:
WMOPS + w · MACs
where w is an ML weighting factor (expected to be < 1) that reflects the vectorization capability of matrix multiplications.
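The combined metric can be illustrated with a minimal sketch. The function name, the example numbers, and the choice to express MACs in millions per second (so that both terms share the scale of WMOPS) are assumptions made for illustration; the source does not fix units or a value for w.

```python
def hybrid_complexity(wmops: float, macs_per_second: float, w: float) -> float:
    """Return WMOPS + w * MACs for a hybrid DSP+ML codec.

    wmops            -- complexity of the DSP-based components, in WMOPS
    macs_per_second  -- MAC operations per second of the ML-based components
    w                -- ML weighting factor (the source expects w < 1)

    Assumption: MACs are converted to millions per second before weighting.
    """
    if w < 0.0:
        raise ValueError("w must be non-negative")
    mega_macs_per_second = macs_per_second / 1e6
    return wmops + w * mega_macs_per_second


# Hypothetical codec: 10 WMOPS of DSP processing, 200 million MACs/s of ML inference.
total = hybrid_complexity(wmops=10.0, macs_per_second=200e6, w=0.25)
print(total)
```

With these made-up inputs the ML part contributes 0.25 × 200 = 50 to the total, so a smaller w directly rewards designs that shift work into well-vectorized matrix multiplications.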
Limitations of WMOPS-only approach:
- WMOPS reflects complexity primarily for DSP operations
- WMOPS does not account for the vectorization capabilities available even on current DSPs
- Less relevant for non-DSP processor types
- The WMOPS toolbox does not reflect the computational capabilities of current processors
ML-specific considerations:
- ML component complexity is dominated by matrix multiplications
- Inference time and energy consumption are highly platform-dependent
- MAC count provides architecture-agnostic computational load measurement
- Parameter count relates directly to model size, memory usage, and energy consumption
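As a rough sketch of why MAC and parameter counts are platform-independent, both can be derived from layer shapes alone. The helper names and the layer dimensions below are hypothetical and only cover dense and 1-D convolution layers; a real ULBC candidate may use other layer types.

```python
def dense_layer_cost(in_features: int, out_features: int, bias: bool = True):
    """Return (MACs per forward pass, parameter count) of a dense layer."""
    macs = in_features * out_features
    params = in_features * out_features + (out_features if bias else 0)
    return macs, params


def conv1d_layer_cost(in_channels: int, out_channels: int, kernel_size: int,
                      out_length: int, bias: bool = True):
    """Return (MACs per forward pass, parameter count) of a 1-D convolution."""
    weights = in_channels * out_channels * kernel_size
    macs = weights * out_length  # every weight is applied at each output position
    params = weights + (out_channels if bias else 0)
    return macs, params


def macs_per_second(macs_per_frame: int, frames_per_second: float) -> float:
    """Scale a per-frame MAC count by the codec frame rate."""
    return macs_per_frame * frames_per_second


dense_macs, dense_params = dense_layer_cost(256, 128)
conv_macs, conv_params = conv1d_layer_cost(16, 32, kernel_size=3, out_length=100)
print(dense_macs, dense_params, conv_macs, conv_params)
```

Neither count depends on the target processor, which is what makes them usable as architecture-agnostic inputs; inference time and energy still have to be measured per platform.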
The hybrid approach:
1. Provides an overall complexity estimate for hybrid DSP+ML codec designs
2. Avoids over-constraining the codec design toward specific platforms (see S4-260233)
3. Allows UE vendors to leverage custom architectures and optimizations
4. Accounts for efficient vectorization of ML components
5. Enables flexible balancing of computational cost between DSP-based and ML-based components
6. Maintains continuity with established practice while accommodating emerging ML-based designs
The document provides example processing units and their vectorization capabilities to inform the ML weighting factor w:
| Chip | Type | Vectorization Capabilities |
|------|------|---------------------------|
| HiFi 5s | DSP | 32×(8×8 bit MAC); 16×(32×16 bit MAC); 8×(32×32 bit MAC) |
| ARM Cortex A55 | CPU | 16×(8×8 MAC); 8×(16×16 MAC FP) |
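One naive way to turn such lane counts into a candidate value for w is to take the reciprocal of the number of parallel MACs. This rule is an assumption made here purely for illustration; the source only states that the table informs w and does not prescribe how w is derived.

```python
# Parallel MAC lanes per precision, copied from the table above.
MAC_LANES = {
    ("HiFi 5s", "8x8"): 32,
    ("HiFi 5s", "32x16"): 16,
    ("HiFi 5s", "32x32"): 8,
    ("ARM Cortex A55", "8x8"): 16,
    ("ARM Cortex A55", "16x16 FP"): 8,
}


def naive_weight(lanes: int) -> float:
    """Hypothetical rule: one vector MAC instruction costs as much as one scalar op."""
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    return 1.0 / lanes


weights = {key: naive_weight(lanes) for key, lanes in MAC_LANES.items()}
for (chip, precision), w in weights.items():
    print(f"{chip:15s} {precision:9s} w = {w:.5f}")
```

Under this rule every entry yields w < 1, consistent with the expectation stated for the weighting factor, but the values differ by a factor of four across chips and precisions, so a single agreed w would be a compromise rather than a property of any one platform.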
The source proposes to:
- Combine the DSP-based and ML-based complexity measures according to WMOPS + w · MACs, where w is an ML weighting factor
- Define a maximum value of the combined metric as the computational complexity limit in the design constraints
- Apply similar principles to memory counting metrics
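A design-constraint check against such a maximum could look like the following sketch. The limit, the weighting factor, and the candidate figures are invented for illustration, and MACs are again assumed to be counted in millions per second; none of these values come from the source.

```python
def within_complexity_limit(wmops: float, macs_per_second: float,
                            w: float, limit: float):
    """Return (passes, total), where total = WMOPS + w * (million MACs per second)."""
    total = wmops + w * macs_per_second / 1e6
    return total <= limit, total


# Two hypothetical candidates checked against a hypothetical limit with w = 0.25.
ok_a, total_a = within_complexity_limit(10.0, 200e6, w=0.25, limit=80.0)
ok_b, total_b = within_complexity_limit(30.0, 400e6, w=0.25, limit=100.0)
print(ok_a, total_a)
print(ok_b, total_b)
```

Because only the weighted sum is constrained, a candidate can trade DSP-based processing against ML-based processing freely as long as the total stays under the limit.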
The document references four previous contributions [1-4] and two external technical specifications [5-6] for processor capabilities.