TDoc: S4-260445

Meeting: TSGS4_135_India | Agenda Item: 7.8

Document Information
Title: [FS_ULBC] Analysis of AI Codec Real-Time Performance (RTF) and Complexity Scaling
Source: vivo Mobile Communication Co., Xiaomi Technology, Spreadtrum, Bytedance
Type: pCR
For: Agreement
Release: Rel-20
Specification: 26.94

Abstract

As part of the study on the new Ultra Low Bitrate Speech Codec (ULBC) [1], it is necessary to establish complexity constraints that reflect real-world device capabilities. Previous contributions have analyzed theoretical complexity using static metrics such as FLOPs and WMOPS [2], [5]. However, static metrics often fail to capture system-level bottlenecks, such as memory bandwidth pressure and thermal constraints on mobile System-on-Chips (SoCs). This contribution presents a performance analysis of a neural audio codec (based on the Descript Audio Codec architecture) running on a representative mid-range mobile platform. By sweeping across model sizes (1M to 74M parameters) and sample rates (8, 16, and 32 kHz), we evaluate the correlation between theoretical complexity and the measured Real-Time Factor (RTF).
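
For readers unfamiliar with the metric, the following is a minimal sketch of how an RTF figure might be obtained. It assumes the common definition, RTF = processing time / duration of the audio processed, so that values below 1.0 mean faster than real time; the abstract itself does not state the definition or the measurement procedure used. The names `real_time_factor`, `measure_rtf`, and `codec_fn` are hypothetical and do not come from the contribution.

```python
import time


def real_time_factor(process_seconds, num_samples, sample_rate_hz):
    """Return processing time divided by audio duration (assumed RTF definition)."""
    if num_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("num_samples and sample_rate_hz must be positive")
    audio_seconds = num_samples / sample_rate_hz
    return process_seconds / audio_seconds


def measure_rtf(codec_fn, frames, frame_samples, sample_rate_hz,
                clock=time.perf_counter):
    """Time codec_fn over all frames and return the resulting RTF.

    codec_fn is a hypothetical callable standing in for one encode/decode
    pass over a single frame; clock is injectable so the timing can be
    replaced in tests.
    """
    start = clock()
    for frame in frames:
        codec_fn(frame)
    elapsed = clock() - start
    return real_time_factor(elapsed, len(frames) * frame_samples, sample_rate_hz)
```

Under this definition, a codec that needs 10 ms to process a 20 ms frame has an RTF of 0.5. A wall-clock measurement of this kind reflects memory and scheduling effects that a static operation count does not, which is the distinction the abstract draws.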

Agenda item 7.8
Agenda item description FS_ULBC (Study on Ultra Low Bitrate Speech Codec)
Version 0.4.0
Related WIs FS_ULBC
download_url https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_135_India/Docs/S4-260445.zip
Contact Wang Dong
Uploaded 2026-02-12T13:52:15.543000
Contact ID 107237
TDoc Status agreed
Is revision of S4-260155
Reservation date 12/02/2026 08:42:42
Agenda item sort order 20