Meeting: TSGS4_135_India | Agenda Item: 7.8
[FS_ULBC] Analysis of AI Codec Complexity Scaling
vivo Mobile Communication Co.,
pCR
Agreement
For the standardization of the new ULBC codec [1], a relevant method for evaluating complexity is essential. Previous contributions (e.g., S4aA250264 [2]) have highlighted the potential gap between theoretical complexity metrics (e.g., FLOPs) and practical on-device performance (e.g., Real-Time Factor). A complementary question is how these complexity metrics scale, not only with frame size but also with the architecture of the AI model itself. Because AI-based codecs may be proposed with different model sizes or "operating points" (e.g., trading quality against complexity), it is important to understand the relationship between model architecture, theoretical complexity, and traditional metrics. To investigate this, this contribution analyses the complexity of a publicly available AI codec (DAC [3]); "dummy" variants of the model were created by scaling its internal latent dimensions (DAC.encoder_dim and DAC.decoder_dim). The analysis maps the relationship between model parameter count, theoretical FLOPs, and traditional WMOPS, providing data to help inform a reasonable complexity constraint framework.
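As a rough illustration of why scaling a model's internal width matters for both parameter count and theoretical FLOPs, the following minimal sketch counts parameters and operations for a single 1-D convolution layer whose input and output channel counts are scaled together. It is not the DAC architecture and not the method used in this contribution: the function names, the kernel size of 7, the 16 kHz output rate, and the "1 MAC = 2 FLOPs" convention are all assumptions chosen for illustration. WMOPS is not modelled here, since it depends on weighted basic-operation counts rather than on layer shapes alone.

```python
def conv1d_params(c_in, c_out, kernel, bias=True):
    """Trainable parameters of a standard (non-grouped) 1-D convolution."""
    return c_in * c_out * kernel + (c_out if bias else 0)


def conv1d_macs_per_second(c_in, c_out, kernel, frame_rate_hz):
    """Multiply-accumulates per second, given the layer's output frame rate."""
    return c_in * c_out * kernel * frame_rate_hz


def scaling_table(widths, kernel=7, frame_rate_hz=16000):
    """Return (width, params, FLOPs/s) rows for a width-to-width conv layer.

    Hypothetical helper: kernel and frame_rate_hz are illustrative defaults,
    and FLOPs are taken as 2 x MACs (an assumed, commonly used convention).
    """
    rows = []
    for w in widths:
        params = conv1d_params(w, w, kernel)
        flops = 2 * conv1d_macs_per_second(w, w, kernel, frame_rate_hz)
        rows.append((w, params, flops))
    return rows


if __name__ == "__main__":
    for width, params, flops in scaling_table([16, 32, 64]):
        print(f"width={width:3d}  params={params:7d}  FLOPs/s={flops / 1e6:8.1f} M")
```

Under these assumptions, doubling the channel width roughly quadruples both the weight count and the FLOPs of such a layer, which is one reason variants that differ only in width can have very different complexity.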
| TDoc | S4-260158 |
| Title | [FS_ULBC] Analysis of AI Codec Complexity Scaling |
| Source | vivo Mobile Communication Co., |
| Agenda item | 7.8 |
| Agenda item description | FS_ULBC (Study on Ultra Low Bitrate Speech Codec) |
| Doc type | pCR |
| For action | Agreement |
| Release | Rel-20 |
| Specification | 26.94 |
| Version | 0.4.0 |
| Related WIs | FS_ULBC |
| download_url | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_135_India/Docs/S4-260158.zip |
| Contact | Wang Dong |
| Uploaded | 2026-02-03T13:43:09.967000 |
| Contact ID | 107237 |
| Revised to | S4-260444 |
| TDoc Status | revised |
| Is revision of | S4-251793 |
| Reservation date | 03/02/2026 12:42:27 |
| Agenda item sort order | 20 |