[FS_ULBC] Analysis of AI Codec Complexity Scaling
This contribution addresses the need for establishing relevant complexity evaluation methods for the new ULBC codec standardization. Previous contributions (e.g., S4aA250264) highlighted potential gaps between theoretical complexity metrics (FLOPs) and practical on-device performance (Real-Time Factor).
This document provides a complementary analysis of how complexity metrics scale with the AI model architecture itself. It investigates the relationship between model architecture, theoretical complexity (FLOPs), and traditional metrics (WMOPS), using the publicly available DAC codec as a test case.
The analysis created seven "dummy" model variants based on the open-source DAC codec's 16kHz configuration. The approach:
Decoder rates: [8, 5, 4, 2]
Scaling Approach:
encoder_dim and decoder_dim were modified
Frame size: 20 ms (320 samples at 16 kHz)
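The frame size and decoder rates quoted above are mutually consistent, which a short arithmetic check makes visible. The sketch below is illustrative only; the constant names are chosen here and are not taken from the DAC code base.

```python
from math import prod

# Values quoted in the text above; names are illustrative.
SAMPLE_RATE_HZ = 16_000
FRAME_MS = 20
DECODER_RATES = [8, 5, 4, 2]

# Samples covered by one 20 ms frame at 16 kHz.
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000

# Total upsampling factor applied by the decoder stages.
total_upsampling = prod(DECODER_RATES)

# Number of frames processed per second of audio.
frames_per_second = 1000 // FRAME_MS

print(samples_per_frame, total_upsampling, frames_per_second)
```

The product of the decoder rates equals the number of samples in one frame, so the decoder turns one latent step into one 20 ms frame of audio.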
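Variant Configurations: seven encoder_dim/decoder_dim pairs, from enc8dec144 to enc64dec1536 (listed in the table under Observations on DAC Model).
Complexity Metrics Measured: parameter count (millions), GFLOP count, MFLOP/s, and WMOPS for the encoder and the decoder.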
Implementation Notes:
- Each AI operation implemented in pure C
- Source files annotated and compiled using wmc_tool
- WMOPS highly sensitive to C implementation efficiency
- Naive implementations can yield significantly higher counts than optimized versions
Key Finding: There is a clear relationship between AI model size (in millions of parameters) and traditional WMOPS complexity.
Observations on DAC Model:
Complete complexity metrics for all seven DAC variants (16kHz, 20ms frame):
| Variant | Enc Dim | Dec Dim | Params (M) | GFLOPs per frame | MFLOP/s | WMOPS Enc | WMOPS Dec |
|---------|---------|---------|------------|--------------|---------|-----------|-----------|
| enc8dec144 | 8 | 144 | 1.09 | 0.009 | 437.09 | 333.92 | 760.53 |
| enc12dec288 | 12 | 288 | 2.89 | 0.028 | 1397.63 | 648.23 | 2732.96 |
| enc16dec384 | 16 | 384 | 4.94 | 0.050 | 2481.98 | 1060.79 | 4724.38 |
| enc24dec576 | 24 | 576 | 10.76 | 0.112 | 5578.38 | 2228.92 | 10399.00 |
| enc32dec768 | 32 | 768 | 18.90 | 0.198 | 9911.72 | 3693.56 | 18093.30 |
| enc40dec960 | 40 | 960 | 29.34 | 0.310 | 15482.00 | 5599.48 | 28019.70 |
| enc64dec1536 | 64 | 1536 | 74.50 | 0.792 | 39614.50 | 13675.30 | 70766.69 |
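The data show that all metrics scale rapidly as the encoder and decoder dimensions increase: from the smallest to the largest variant, the parameter count grows about 68-fold (1.09 M to 74.50 M) and MFLOP/s about 91-fold (437.09 to 39614.50).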
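The GFLOP and MFLOP/s columns of the table appear to be linked by the frame rate. The sketch below assumes (this is not stated in the contribution) that the GFLOP figure is counted per 20 ms frame, so that MFLOP/s = GFLOP x 1000 x 50 frames/s, and compares that estimate with the reported MFLOP/s. The function name is hypothetical.

```python
FRAMES_PER_SECOND = 50  # 1000 ms / 20 ms frame

# (variant, GFLOP count, reported MFLOP/s), copied from the table.
ROWS = [
    ("enc8dec144", 0.009, 437.09),
    ("enc12dec288", 0.028, 1397.63),
    ("enc16dec384", 0.050, 2481.98),
    ("enc24dec576", 0.112, 5578.38),
    ("enc32dec768", 0.198, 9911.72),
    ("enc40dec960", 0.310, 15482.00),
    ("enc64dec1536", 0.792, 39614.50),
]

def mflops_from_frame_gflop(gflop_per_frame, fps=FRAMES_PER_SECOND):
    """Convert a per-frame GFLOP count to MFLOP/s (assumes per-frame counting)."""
    return gflop_per_frame * 1000.0 * fps

rel_errors = []
for name, gflop, reported in ROWS:
    estimate = mflops_from_frame_gflop(gflop)
    rel_err = abs(estimate - reported) / reported
    rel_errors.append(rel_err)
    print(f"{name:14s} estimate={estimate:9.1f} reported={reported:9.2f} "
          f"rel_err={rel_err:.2%}")
```

Under this assumption the estimates agree with the reported values to within the rounding of the three-decimal GFLOP column; the largest deviation is on the smallest variant, where 0.009 carries only one significant digit.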
Based on the DAC model variant analysis:
Linear Relationship: For the DAC model, there is a clear linear relationship between theoretical complexity (MFLOP/s), model parameter count, and measured WMOPS. As MFLOP/s or parameter count increases, WMOPS increases linearly, provided the C coding style remains consistent.
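The linearity claim can be checked directly against the table. The following minimal sketch computes the Pearson correlation and a least-squares slope through the origin between MFLOP/s and total WMOPS. Summing encoder and decoder WMOPS into one total is a choice made here for illustration, not a step described in the contribution.

```python
from math import sqrt

# Values copied from the table of DAC variants (16 kHz, 20 ms frame).
MFLOPS = [437.09, 1397.63, 2481.98, 5578.38, 9911.72, 15482.00, 39614.50]
WMOPS_ENC = [333.92, 648.23, 1060.79, 2228.92, 3693.56, 5599.48, 13675.30]
WMOPS_DEC = [760.53, 2732.96, 4724.38, 10399.00, 18093.30, 28019.70, 70766.69]

# Total WMOPS per variant (encoder + decoder); an illustrative aggregation.
WMOPS_TOTAL = [e + d for e, d in zip(WMOPS_ENC, WMOPS_DEC)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def slope_through_origin(xs, ys):
    """Least-squares slope k of the model y = k * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

r = pearson(MFLOPS, WMOPS_TOTAL)
k = slope_through_origin(MFLOPS, WMOPS_TOTAL)
ratios = [w / m for w, m in zip(WMOPS_TOTAL, MFLOPS)]

print(f"pearson r = {r:.5f}, slope = {k:.3f} WMOPS per MFLOP/s")
print("per-variant ratio:", [round(x, 2) for x in ratios])
```

For the table values, the per-variant ratio of total WMOPS to MFLOP/s drifts from about 2.5 for the smallest variant to about 2.1 for the largest, so the relationship looks closer to a straight line with a small offset than to strict proportionality.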
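Quadratic Growth: Increasing the model's internal dimensions causes complexity to grow approximately quadratically. For example, doubling both dimensions from enc32dec768 to enc64dec1536 roughly quadruples the parameter count (18.90 M to 74.50 M) and MFLOP/s (9911.72 to 39614.50). Even small dimension increases therefore lead to disproportionately large jumps in MFLOP/s and WMOPS.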
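One way to test the quadratic-growth statement is to estimate a local scaling exponent between consecutive variants, log(value ratio) / log(dimension ratio); an exponent near 2 indicates quadratic growth. The sketch below uses the decoder dimension as the scale variable, which is an assumption made here because the decoder dominates the table's complexity figures. Note that the first step (enc8dec144 to enc12dec288) does not scale the encoder and decoder by the same factor.

```python
from math import log

# Values copied from the table of DAC variants.
DEC_DIM = [144, 288, 384, 576, 768, 960, 1536]
PARAMS_M = [1.09, 2.89, 4.94, 10.76, 18.90, 29.34, 74.50]
MFLOPS = [437.09, 1397.63, 2481.98, 5578.38, 9911.72, 15482.00, 39614.50]

def local_exponents(dims, values):
    """Scaling exponent between each pair of consecutive variants."""
    return [
        log(v1 / v0) / log(d1 / d0)
        for (d0, v0), (d1, v1) in zip(zip(dims, values),
                                      zip(dims[1:], values[1:]))
    ]

param_exp = local_exponents(DEC_DIM, PARAMS_M)
mflops_exp = local_exponents(DEC_DIM, MFLOPS)

for i, (pe, fe) in enumerate(zip(param_exp, mflops_exp)):
    print(f"{DEC_DIM[i]:5d} -> {DEC_DIM[i + 1]:5d}: "
          f"params exponent={pe:.2f}, MFLOP/s exponent={fe:.2f}")
```

On the table data the exponents start below 2 for the smallest variants and approach 2 as the dimensions grow, which is what one would expect if fixed-size parts of the model matter less at larger widths.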
Implementation Dependency: The WMOPS score depends heavily on the efficiency of the C source code.
It is proposed to capture the above analysis in 3GPP TR 26.940.