[FS_ULBC] pCR on Existing codec technologies
This pCR proposes updates to Clause 7.1 of TR 26.940, which documents existing codec technologies, both as evidence that the design criteria can be met and for comparison and evaluation purposes. The document adds information about recently emerged ultra-low bit-rate voice codecs (below 1 kbps) as a reference for further work.
The pCR significantly expands Table 7.1.1-1 "List of existing codec technologies" by adding multiple categories of codecs beyond the existing 3GPP IMS codecs. For each codec, the table records parameters such as bit rate, audio bandwidth, frame length and delay.
The following codecs support real-time operation:
- LPCNet: 1.6 kbps, WB, 40ms frames, 25ms delay
- LyraV2 (SoundStream): 3.2-9.2 kbps, WB, 20ms frames
- EnCodec: 1.5-24 kbps, 24kHz/FB, 0-1000ms delay, 13.3ms frames
- Mimi-Codec: 0.55-1.1 kbps, 24kHz, 80ms frames, 0ms delay
- TS3: 0.64-0.8 kbps, WB, 20ms frames, 0ms delay
- TAAE: 0.4-0.7 kbps, WB, 20-40ms frames, 0ms delay
- LMCodec2: Parameters TBD
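The bit rate and frame length in each entry together determine how many bits one frame may carry, which is a convenient way to compare the entries above. The following is a minimal sketch of that arithmetic. The function name `bits_per_frame` and the `examples` list are illustrative and not part of the pCR; the values are copied from the list above.

```python
def bits_per_frame(bitrate_kbps: float, frame_ms: float) -> float:
    """Return the bit budget of one frame.

    kbps * ms = (1000 bit/s) * (0.001 s) = bits, so no scaling is needed.
    """
    return bitrate_kbps * frame_ms


# (codec, bit rate in kbps, frame length in ms), taken from the list above.
# Codecs with a bit-rate range appear once per end of the range.
examples = [
    ("LPCNet", 1.6, 40),
    ("Mimi-Codec", 0.55, 80),
    ("Mimi-Codec", 1.1, 80),
    ("TS3", 0.64, 20),
    ("TS3", 0.8, 20),
]

for name, kbps, ms in examples:
    print(f"{name}: {kbps} kbps, {ms} ms -> {bits_per_frame(kbps, ms):.1f} bits per frame")
```

For instance, LPCNet at 1.6 kbps with 40ms frames has a budget of 64 bits per frame. Entries that give a range of frame lengths as well as a range of bit rates (such as TAAE) are left out because the list does not say which frame length pairs with which bit rate.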
The following codecs are designed for offline/non-real-time applications:
- DAC: 0.5-3 kbps, WB/24kHz, 244-366ms delay
- DAC-IBM: 0.75-3 kbps, 24kHz, 366ms delay
- SNAC: 0.98 kbps, 24kHz, 1000ms delay, 80ms frames
- SpeechTokenizer: 0.5-1.0 kbps, WB, full-signal delay
- SemantiCodec: 0.31-1.4 kbps, WB, 10-40ms frames, full-signal delay
- FunCodec: 0.25-1.0+ kbps, WB, 20-40ms frames
- WavTokenizer: 0.25-0.9 kbps, 24kHz, 25-40ms frames
- BigCodec: 1.04 kbps, WB, 12.5ms frames
- FocalCodec: 0.16-0.65 kbps, WB, 20-80ms frames
- ALMTokenizer: 0.41 kbps, WB, 13.3ms frames
- XY-Tokenizer: 1 kbps, WB, 20ms frames
- LongCat-Audio-Codec: 0.43-0.87 kbps, WB, 60ms frames
- AcademiCodec: Parameters TBD
- MuCodec: 0.35-1.35 kbps, FB
The pCR includes several notes. Among them, an editor's note indicates that more codecs may be added to the table in future revisions.
The pCR demonstrates significant industry progress in ultra-low bitrate speech coding, particularly:
- Multiple AI-based solutions achieving sub-1 kbps bitrates
- Wide range of delay characteristics (0ms to 1000ms)
- Various bandwidth support (NB to FB)
- Different availability levels for specifications and software implementations
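The sub-1 kbps observation can be checked mechanically against the listed bit-rate ranges. The sketch below assumes a hypothetical `CodecEntry` record and a helper `sub_1kbps`, neither of which comes from the pCR, and covers only a subset of the codecs listed in this summary, with values copied from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodecEntry:
    """Hypothetical record for one table row (not a structure defined in the pCR)."""
    name: str
    min_kbps: float
    max_kbps: float
    real_time: bool


# Subset of the codecs listed in this summary.
ENTRIES = [
    CodecEntry("LPCNet", 1.6, 1.6, True),
    CodecEntry("Mimi-Codec", 0.55, 1.1, True),
    CodecEntry("TS3", 0.64, 0.8, True),
    CodecEntry("TAAE", 0.4, 0.7, True),
    CodecEntry("SNAC", 0.98, 0.98, False),
    CodecEntry("BigCodec", 1.04, 1.04, False),
    CodecEntry("FocalCodec", 0.16, 0.65, False),
]


def sub_1kbps(entries, real_time_only=False):
    """Names of codecs whose lowest listed bit rate is below 1 kbps."""
    return [
        e.name
        for e in entries
        if e.min_kbps < 1.0 and (e.real_time or not real_time_only)
    ]


print(sub_1kbps(ENTRIES))
print(sub_1kbps(ENTRIES, real_time_only=True))
```

Filtering on the lowest listed bit rate is one possible reading of "sub-1 kbps"; a stricter check could require the whole range to stay below 1 kbps by comparing `max_kbps` instead.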