Proposed design constraints for noise suppression, DTX, and non-speech inputs
This contribution addresses design constraints for the ULBC (Ultra Low Bitrate Codec) solution for GEO (geostationary Earth orbit) channels, building on earlier discussions in S4-251881 and S4-251786. The document focuses on three key areas:
- Noise suppression handling
- Discontinuous transmission (DTX) framework
- Robustness to non-speech inputs
The contribution emphasizes that emergency calls represent a critical use case for ULBC over GEO, particularly when terrestrial network (TN) service coverage is unavailable. Key considerations include:
- Background signals may contain critical contextual information (e.g., voices, environmental sounds indicating danger)
- Post-call analysis requirements (ASR transcripts, emergency response evaluation, criminal investigations)
- Need for full situational awareness rather than aggressive noise suppression
The document identifies several technical challenges:
The contribution updates the original proposal from S4-251881 by:
- Maintaining the requirement that any noise suppression within the codec can be disabled
- Adding specific SNR ranges for stationary (5-15 dB) and non-stationary (10-25 dB) noise
- Deferring specific noise type definitions for future discussion
- Linking noise suppression behavior primarily to performance requirements
The document proposes updates to Table 6.2-1 in draft TR 26.940 with three new/modified constraint parameters:
Requirement: If noise suppression is supported as part of the candidate codec, it must be possible to disable it to preserve background signals.
Editor's Notes:
- EN1: The requirement to disable noise suppression may be considered in connection with specific operating bit rate(s)
- EN2: Solution behavior with respect to potential noise suppression is primarily enforced via performance requirements; the default test condition is with noise suppression disabled
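As a minimal sketch of what "possible to disable" could look like at the interface level, the following assumes a hypothetical encoder front end with an optional noise-suppression stage. The names `EncoderConfig`, `noise_suppression`, `suppress_noise`, and `preprocess` are illustrative only and do not come from the contribution or from TR 26.940; the suppressor itself is a placeholder noise gate, not a real algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EncoderConfig:
    # Hypothetical flag; off by default, mirroring the default test condition in EN2.
    noise_suppression: bool = False


def suppress_noise(frame: List[float], floor: float = 0.05) -> List[float]:
    """Placeholder suppressor: zero every sample below an assumed amplitude floor."""
    return [s if abs(s) >= floor else 0.0 for s in frame]


def preprocess(frame: List[float], cfg: EncoderConfig) -> List[float]:
    """Apply suppression only when enabled; otherwise pass the frame through unchanged."""
    if cfg.noise_suppression:
        return suppress_noise(frame)
    return list(frame)
```

With the flag off, low-level background content reaches the encoder untouched, which is the behavior the requirement aims to guarantee.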
Requirement: The candidate codec shall provide a framework for:
- Voice Activity Detection (VAD)
- Discontinuous Transmission (DTX)
- Comfort Noise Generation (CNG)
- Operation with DTX on or DTX off
Editor's Note: The interaction between DTX-on operation and enabling/disabling a potential noise suppressor may need clarification
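The VAD/DTX/CNG framework above can be sketched as follows. This is an illustration under stated assumptions, not the behavior of any candidate codec: the energy-threshold VAD, the `-40 dB` threshold, the `SID_INTERVAL` of 8 frames, and the packet labels `"SPEECH"` and `"SID"` are all hypothetical choices made for the example.

```python
import math
import random
from typing import List, Optional, Tuple

Packet = Optional[Tuple[str, object]]

SID_INTERVAL = 8  # assumed: one noise-level update every 8th inactive frame


def frame_energy_db(frame: List[float]) -> float:
    """Mean-square energy of a frame in dB, floored to avoid log10(0)."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


def is_active(frame: List[float], threshold_db: float = -40.0) -> bool:
    """Toy VAD: a frame counts as active when its energy exceeds the threshold."""
    return frame_energy_db(frame) > threshold_db


def encode_stream(frames: List[List[float]], dtx_on: bool) -> List[Packet]:
    """Return one entry per frame: ("SPEECH", frame), ("SID", level_db), or None."""
    packets: List[Packet] = []
    inactive_run = 0
    for frame in frames:
        if not dtx_on or is_active(frame):
            # DTX off, or active speech: every frame is transmitted.
            packets.append(("SPEECH", frame))
            inactive_run = 0
        else:
            # DTX on and inactive: send a periodic noise description, else nothing.
            if inactive_run % SID_INTERVAL == 0:
                packets.append(("SID", frame_energy_db(frame)))
            else:
                packets.append(None)
            inactive_run += 1
    return packets


def comfort_noise(level_db: float, length: int, rng: random.Random) -> List[float]:
    """Decoder-side CNG: Gaussian noise whose RMS matches the signalled level."""
    rms = 10.0 ** (level_db / 20.0)
    return [rng.gauss(0.0, rms) for _ in range(length)]
```

Calling `encode_stream(frames, dtx_on=False)` transmits every frame, while `dtx_on=True` replaces inactive frames with sparse `"SID"` updates from which the decoder can synthesize comfort noise. The sketch leaves open how DTX should behave when background signals carry relevant information.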
Requirement: The candidate codec shall be robust to:
- Noisy speech with stationary noise (5-15 dB SNR)
- Noisy speech with non-stationary noise (10-25 dB SNR)
- Background signals during and between speech segments
- Other non-speech input signals
Editor's Notes:
- EN1: This constraint may need to be placed in the performance requirements instead
- EN2: Relevant background signals to be further defined as part of performance requirements, including both stationary and non-stationary types
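To make the SNR ranges concrete, the sketch below scales a noise signal so that the speech-to-noise power ratio hits a target value before mixing. It is an illustration only: the helper names are hypothetical, and the SNR is computed from plain mean power over the whole signal, whereas a formal test plan would more likely use an active speech level measurement.

```python
import math
from typing import Dict, List, Tuple

# SNR ranges from the proposed constraint, in dB.
SNR_RANGES_DB: Dict[str, Tuple[float, float]] = {
    "stationary": (5.0, 15.0),
    "non_stationary": (10.0, 25.0),
}


def mean_power(x: List[float]) -> float:
    """Average of the squared samples."""
    return sum(s * s for s in x) / len(x)


def snr_db(speech: List[float], noise: List[float]) -> float:
    """Speech-to-noise power ratio in dB (simplifying assumption: plain mean power)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def mix_at_snr(
    speech: List[float], noise: List[float], target_snr_db: float
) -> Tuple[List[float], List[float]]:
    """Scale the noise to reach the target SNR; return (mixture, scaled noise)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    target_ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * target_ratio))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def in_range(noise_type: str, snr: float) -> bool:
    """Check whether an SNR value lies inside the range proposed for a noise type."""
    low, high = SNR_RANGES_DB[noise_type]
    return low <= snr <= high
```

For example, `mix_at_snr(speech, noise, 10.0)` yields a condition that falls inside both proposed ranges, while a 20 dB mixture would only qualify as a non-stationary condition.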
The main features of the proposal are:
- Balanced approach to noise suppression: recognizes both the need for flexibility in noise suppression (for speech quality) and the critical requirement to preserve background signals (for emergency use cases)
- Mandatory DTX framework: establishes VAD/DTX/CNG as a required rather than optional feature, with explicit on/off control
- Quantified robustness requirements: provides specific SNR ranges for the noise conditions that the codec must handle
- Testing methodology guidance: proposes default testing with noise suppression disabled, while allowing performance requirements to govern overall behavior
Several editor's notes indicate areas requiring further work:
- Specific operating bit rates at which the requirement to disable noise suppression applies
- Clarification of DTX and noise suppression interaction
- Final placement of robustness requirements (design constraints vs. performance requirements)
- Definition of specific background signal types for testing
- Speech quality requirements (to be addressed separately in performance requirements)