[FS_ULBC] Influence of code optimization on WMOPS
This contribution investigates the impact of C code implementation choices on WMOPS (Weighted Million Operations Per Second) measurements for neural audio codecs, specifically in the context of the ULBC (Ultra-Low Bitrate Codec) study. The source examines whether WMOPS, traditionally used for 3GPP speech codec complexity evaluation, is suitable for neural audio codecs given that actual C implementation can significantly affect WMOPS measurements.
The source conducted experiments on Conv1D and Conv1DTranspose operators, which are extensively used in DAC (Discrete Audio Codec) for audio feature dimension manipulation:
Key Results:
- Conv1D: 441 WOPS (non-optimized) → 301 WOPS (optimized) = 68.25% ratio
- Conv1DTranspose: 554 WOPS (non-optimized) → 260 WOPS (optimized) = 46.93% ratio
Finding: The same optimization strategy yields significantly different optimization ratios for different operators.
Using the optimized and non-optimized operator implementations, the source measured WMOPS for two DAC configurations (enc16dec384 and enc64dec1536) and compared against previously reported results [4]:
Total WMOPS:
- enc16dec384: 13,320.35 (non-opt) → 8,152.01 (opt) → 5,785.17 (reported in [4])
- enc64dec1536: 201,552.55 (non-opt) → 123,966.49 (opt) → 84,441.99 (reported in [4])
Encoder WMOPS:
- enc16dec384: 3,411.08 (non-opt) → 2,621.98 (opt) → 1,060.79 (reported in [4])
- enc64dec1536: 50,089.70 (non-opt) → 39,604.59 (opt) → 13,675.30 (reported in [4])
Decoder WMOPS:
- enc16dec384: 9,847.22 (non-opt) → 5,484.21 (opt) → 4,724.38 (reported in [4])
- enc64dec1536: 151,291.59 (non-opt) → 84,255.25 (opt) → 70,766.69 (reported in [4])
The source draws two critical observations:
Main Conclusion: If WMOPS is adopted as a complexity metric for ULBC, results will be highly influenced not only by model design but also by the actual C code implementation, potentially making comparisons between different codec proposals inconsistent.
The source proposes to document the experimental findings and observations as a new clause 7.6.5 "WMOPS analysis on DAC" in TR 26.940, with three sub-clauses:
- 7.6.5.1: On operator level
- 7.6.5.2: On full-model level
- 7.6.5.3: Observation
This would capture the implementation-dependency issues of WMOPS measurements for neural audio codecs in the technical report.