Meeting: TSGS4_135_India | Agenda Item: 7.8
[FS_ULBC] Analysis of AI Codec Real-Time Performance (RTF) and Complexity Scaling
vivo Mobile Communication Co., Xiaomi Technology, Spreadtrum, Bytedance
pCR
Agreement
As part of the study on the new Ultra Low Bitrate Speech Codec (ULBC) [1], it is necessary to establish complexity constraints that reflect real-world device capabilities. Previous contributions have analyzed theoretical complexity using static metrics such as FLOPs and WMOPS [2], [5]. However, static metrics often fail to capture system-level bottlenecks, such as memory bandwidth pressure and thermal constraints on mobile Systems-on-Chip (SoCs). This contribution presents a performance analysis of a neural audio codec (based on the Descript Audio Codec architecture) running on a representative mid-range mobile platform. By sweeping across model sizes (1M to 74M parameters) and sample rates (8, 16, and 32 kHz), we evaluate the correlation between theoretical complexity and the measured Real-Time Factor (RTF).
| TDoc | S4-260445 |
| Title | [FS_ULBC] Analysis of AI Codec Real-Time Performance (RTF) and Complexity Scaling |
| Source | vivo Mobile Communication Co., Xiaomi Technology, Spreadtrum, Bytedance |
| Agenda item | 7.8 |
| Agenda item description | FS_ULBC (Study on Ultra Low Bitrate Speech Codec) |
| Doc type | pCR |
| For action | Agreement |
| Release | Rel-20 |
| Specification | 26.940 |
| Version | 0.4.0 |
| Related WIs | FS_ULBC |
| download_url | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_135_India/Docs/S4-260445.zip |
| Contact | Wang Dong |
| Uploaded | 2026-02-12T13:52:15.543000 |
| Contact ID | 107237 |
| TDoc Status | agreed |
| Is revision of | S4-260155 |
| Reservation date | 2026-02-12T08:42:42 |
| Agenda item sort order | 20 |
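
The abstract compares theoretical complexity against the Real-Time Factor but this record does not spell out how RTF is computed. The following is a minimal sketch, assuming the common convention that RTF is processing time divided by the duration of the audio processed (so RTF below 1 means faster than real time). The function names, the 20 ms frame size, and the `dummy_codec` workload are illustrative assumptions and are not taken from the contribution.

```python
import time
from typing import Callable, Sequence


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF under the assumed convention: processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(process_frame: Callable[[Sequence[float]], object],
                samples: Sequence[float],
                sample_rate_hz: int,
                frame_ms: int = 20) -> float:
    """Time frame-by-frame processing of `samples` and return the RTF.

    The 20 ms default frame size is an assumption for illustration only.
    """
    frame_len = sample_rate_hz * frame_ms // 1000
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        raise ValueError("input is shorter than one frame")

    start = time.perf_counter()
    for i in range(n_frames):
        process_frame(samples[i * frame_len:(i + 1) * frame_len])
    elapsed = time.perf_counter() - start

    # Only whole frames were processed, so only they count as audio time.
    audio_seconds = n_frames * frame_len / sample_rate_hz
    return real_time_factor(elapsed, audio_seconds)


def dummy_codec(frame: Sequence[float]) -> float:
    """Hypothetical stand-in workload; NOT the codec from the contribution."""
    return sum(x * x for x in frame)


if __name__ == "__main__":
    # Sweep the three sample rates named in the abstract on 1 s of silence.
    for rate in (8000, 16000, 32000):
        audio = [0.0] * rate
        rtf = measure_rtf(dummy_codec, audio, rate)
        print(f"{rate} Hz: RTF = {rtf:.4f}")
```

On a real device, `dummy_codec` would be replaced by the encoder or decoder call under test, and the measurement would typically be repeated over long inputs so that effects such as thermal throttling have time to appear.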