On ULBC complexity and RTF analysis
This contribution addresses the need to finalize the complexity and memory design constraints for the ULBC (Ultra-Low Bitrate Codec) study. Previous discussions at SA4 #133-e and the ULBC ADHOC meeting explored various complexity metrics and RTF (real-time factor) performance data for existing AI codecs (DAC, Lyra v2, HIL). However, the data available so far are insufficient to draw definitive conclusions on complexity constraints for ULBC.
The document builds upon previous contribution S4-251844 with the following modifications:
- Added CPU core information for experiments
- Aligned RTF definition with TR 26.940 clause 7.5.3
- Focused on model sizes 3-20M parameters (more relevant to ULBC use cases)
- Provided pCR for TR 26.940
- Removed large chunk-based processing experiments (not relevant for real-time voice communication)
The experiments use a modified DAC architecture with a reduced parameter count that keeps the general DAC structure:
- Model sizes: 20M, 15M, 9M, and 3M parameters (float32 precision)
- Training: Optimized for ~1 kbps bitrate at 32 kHz sampling rate
- Encoder rates: 4,4,8,10 for all models
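The configuration above implies a frame rate and a per-frame bit budget, which can be sketched as follows. This is a minimal illustration under two assumptions that the text does not state: that the listed encoder rates are per-stage downsampling strides (as the `encoder_rates` parameter is in the public DAC implementation), and that "~1 kbps" is taken as exactly 1000 bit/s. The function and constant names are hypothetical.

```python
from math import prod

SAMPLE_RATE_HZ = 32_000        # sampling rate given in the text
ENCODER_RATES = (4, 4, 8, 10)  # given in the text; ASSUMED to be per-stage strides
TARGET_BITRATE_BPS = 1_000     # "~1 kbps" in the text; ASSUMED exactly 1000 bit/s


def frame_stats(sample_rate_hz, rates, bitrate_bps):
    """Derive hop size, frame rate, frame duration and bits per frame."""
    hop = prod(rates)                          # input samples per latent frame
    frames_per_second = sample_rate_hz / hop   # latent frames per second
    frame_ms = 1000.0 * hop / sample_rate_hz   # duration of one frame in ms
    bits_per_frame = bitrate_bps / frames_per_second
    return hop, frames_per_second, frame_ms, bits_per_frame


hop, fps, frame_ms, bits = frame_stats(SAMPLE_RATE_HZ, ENCODER_RATES, TARGET_BITRATE_BPS)
print(f"hop={hop} samples, {fps} frames/s, {frame_ms} ms/frame, {bits} bits/frame")
```

Under these assumptions the total downsampling factor is 1280 samples, which corresponds to 25 frames per second (40 ms per frame) and a budget of about 40 bits per frame.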
Theoretical Complexity (GMACS):
- Computed using ptflops library
- Results show an approximately linear relationship between model size and GMACS:
  - 20M: 5.14 GMACS
  - 15M: 4.03 GMACS
  - 9M: 2.39 GMACS
  - 3M: 0.79 GMACS
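The linearity claim can be checked directly from the four reported points. The sketch below is only an illustration, not the method used in the contribution: it computes GMACS per million parameters for each model and a least-squares slope for a line through the origin. The helper names are hypothetical.

```python
# (parameters in millions, GMACS) as reported in the list above
POINTS = [(20, 5.14), (15, 4.03), (9, 2.39), (3, 0.79)]


def gmacs_per_million(points):
    """GMACS divided by model size, one ratio per model."""
    return [gmacs / params for params, gmacs in points]


def slope_through_origin(points):
    """Least-squares slope k for the model gmacs = k * params."""
    num = sum(params * gmacs for params, gmacs in points)
    den = sum(params * params for params, _ in points)
    return num / den


ratios = gmacs_per_million(POINTS)
slope = slope_through_origin(POINTS)

for (params, gmacs), ratio in zip(POINTS, ratios):
    predicted = slope * params
    rel_err = (gmacs - predicted) / gmacs
    print(f"{params:>2}M: {ratio:.3f} GMACS per M params, "
          f"fit predicts {predicted:.2f} (relative error {rel_err:+.1%})")
```

The per-model ratios all fall between roughly 0.257 and 0.269 GMACS per million parameters, so a single proportionality constant of about 0.26 reproduces each reported value to within a few percent.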
Device 1 (2023):
- Hexa-core CPU: 2×3.46 GHz (P core) + 4×2.02 GHz (E core)
- Dynamic core switching observed between P and E cores
Device 2 (2022):
- Octa-core CPU: 1×3.00 GHz Cortex-X2 + 3×2.50 GHz Cortex-A710 + 4×1.80 GHz Cortex-A510
- Processing on Cortex-X2 with frequency switching between 2.4 GHz and 1.8 GHz
| Model Size | Max RTF (High Performance) | Max RTF (Power Efficient) |
|------------|---------------------------|---------------------------|
| 20M | 0.39-0.63 | 0.81-0.90 |
| 15M | 0.29-0.43 | 0.66-0.74 |
| 9M | 0.19-0.29 | 0.44-0.57 |
| 3M | 0.09-0.13 | 0.18-0.31 |
The results show that the maximum RTF increases approximately linearly with model size in both performance modes, and that it stays below 1 for every tested model size.
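To make the trend concrete, the sketch below fits a straight line to the upper end of each RTF range in the table and checks the worst case against the real-time limit. It assumes the usual definition of RTF as processing time divided by the duration of the audio processed; the exact wording of TR 26.940 clause 7.5.3 is not reproduced here. The fit is an illustration added for this summary, not part of the reported experiments, and all names are hypothetical.

```python
# Upper ends of the RTF ranges from the table, keyed by model size (millions of parameters).
MAX_RTF = {
    "high_performance": {20: 0.63, 15: 0.43, 9: 0.29, 3: 0.13},
    "power_efficient": {20: 0.90, 15: 0.74, 9: 0.57, 3: 0.31},
}


def rtf(processing_seconds, audio_seconds):
    """ASSUMED definition: processing time / duration of audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


fits = {}
for mode, table in MAX_RTF.items():
    sizes = sorted(table)
    a, b = linear_fit(sizes, [table[s] for s in sizes])
    worst = max(table.values())
    fits[mode] = (a, b, worst)
    print(f"{mode}: slope={a:.4f} RTF per M params, intercept={b:.3f}, "
          f"worst case={worst:.2f}, faster than real time={worst < 1.0}")
```

In this fit the slope is about 0.03 RTF per million parameters in both modes, while the power-efficient mode has a noticeably larger intercept. This suggests the relationship is linear with an offset rather than strictly proportional. As a usage example for the helper, `rtf(0.025, 0.040)` gives 0.625, i.e. 25 ms of processing for 40 ms of audio.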
The contribution provides a comprehensive pCR adding new clause 6.2.1.7 "RTF and MACS analysis for AI based codecs" with detailed experimental results. Key additions to TR 26.940 include:
It is proposed to document the experimental methodology, results, and observations in clause 6.2.1 of TR 26.940, as shown in the provided pCR.