DaCAS-2: Test methodologies and requirements v0.6
This document defines test methodologies and requirements for Device-Assisted Capture Audio Systems (DaCAS) as part of the DaCAS Work Item. It is structured around four main objectives: defining minimum performance requirements for raw microphone signals, evaluating immersive audio capture example solutions, verifying/revising requirements based on example solution performance, and potential alignment with TS 26.260 and TS 26.261.
The document proposes requirements and recommendations for raw microphone signals (Table 1), covering:
Editor's Note: Decision needed on normative vs. informative requirements.
Compensation is considered optional for example solution development. If proponents provide compensated signals, they shall provide:
- Compensation filter specifications with relevant data
- Instructions on filter application to raw microphone signals
Table 2 defines requirements for compensated signals:
- Compensated Frequency Response: Must be within required/recommended masks (Tables 3-4) when applied to signals used for filter design
- Phase Properties: Should compensate phase differences that are independent of the sound source direction
- Masks defined for frequencies 100 Hz to 16 kHz with tighter tolerances in recommended vs. required masks
Integrated from S4aA250054, this method includes:
Recording Environment Requirements:
- Quiet room with low reverberation
- 7.1.4 surround loudspeaker layout (or known layout with front/rear/side/height differentiation)
- Diffuse uncorrelated pink noise signals (omitting subwoofer)
- Device positioned in landscape orientation, camera facing the center speaker
- 10-second noise recording plus 15-second silence recording for noise floor estimation
Processing Steps:
1. Gain Matching: Derive broadband relative gain G(ch) per channel to align outlier microphones
2. Equalization Estimation:
- Model theoretical impulse responses from device geometry
- Create simulated device microphone signals
- Compare simulated vs. real recordings to estimate port resonance equalization
- Calculate: port(ch,f) = sim(ch,f) – dev(ch,f)
- Convert to banded spectrum and model resonances parametrically
3. Noise Floor Estimation: Generate banded per-channel noise floor estimate NF(ch,b) from silence recording
Compensation Processing (DFT domain):
- Convert input DFT to banded spectrum spec(ch,b)
- Calculate noise floor compensation gains: g(ch,b) = (spec(ch,b) – NF(ch,b))/(spec(ch,b) + eps)
- Apply combined gain: G(ch) * EQ(ch,f) * g(ch,f)
Editor's Note: Clarification needed on linear vs. log scale for applied gain and method status.
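The compensation-gain computation above can be sketched as follows. The zero-flooring of the gains, the band edges, and all numerical values are illustrative assumptions; the mapping of banded gains g(ch,b) back to per-bin gains g(ch,f) is the open point noted in the Editor's Note and is left out here.

```python
import numpy as np

def banded_spectrum(dft_mag, band_edges):
    """Average DFT magnitude bins into coarser bands spec(ch, b)."""
    return np.array([dft_mag[lo:hi].mean() for lo, hi in band_edges])

def noise_comp_gains(spec, nf, eps=1e-9):
    """g(ch, b) = (spec(ch, b) - NF(ch, b)) / (spec(ch, b) + eps),
    floored at zero so the gain never goes negative (assumption)."""
    return np.maximum(spec - nf, 0.0) / (spec + eps)

# Convert an input DFT magnitude to a banded spectrum (hypothetical bands)
dft_mag = np.abs(np.fft.rfft(np.sin(2 * np.pi * np.arange(256) / 16)))
bands = [(0, 32), (32, 64), (64, 129)]
spec_1ch = banded_spectrum(dft_mag, bands)

# Hypothetical values: 2 channels x 4 bands, linear magnitude
spec = np.array([[1.0, 0.5, 0.2, 0.1],
                 [0.9, 0.6, 0.3, 0.05]])
nf = np.full_like(spec, 0.05)   # banded noise floor estimate NF(ch, b)
g = noise_comp_gains(spec, nf)  # noise floor compensation gains
```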
Integrated from S4aA260008, based on the Integrated Microphone Pressure frequency response (IMPro) measurement:
Integrated microphone pressure frequency response calculated by:
Equation (1): The integrated microphone pressure frequency response is calculated by dividing the measured integrated microphone output signal response by the probe signal response at a reference point at the sound port inlet, with probe microphone calibration applied
For M microphones, the compensated output is defined as the convolution of the raw signal with an equalization filter. The target equalization filter compensates the integrated microphone response, within the target frequency response mask, so that the compensated signal corresponds to a delayed pressure signal at the sound inlet.
Equation (2): Target frequency response in frequency domain within target mask range
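Under assumed notation (H for transfer functions, m a microphone index, tau a delay, f_min/f_max the mask limits; none of these symbols come from the source), Equations (1) and (2) might take a form such as:

```latex
% Eq. (1): integrated microphone pressure frequency response,
% with probe microphone calibration applied
H_{\mathrm{IMPro},m}(f) = \frac{H_{\mathrm{mic},m}(f)}{H_{\mathrm{probe}}(f)}

% Eq. (2): within the target mask, the equalized response approximates
% a pure delay of the pressure signal at the sound inlet
H_{\mathrm{IMPro},m}(f)\,H_{\mathrm{EQ},m}(f) \approx e^{-j 2\pi f \tau},
\qquad f \in [f_{\min},\, f_{\max}]
```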
Steps:
1. Perform IMPro measurements for all device microphones
2. Prepare UE software for raw microphone recording
3. Setup loudspeaker and device (0.5-2m distance)
4. Prepare sine sweep stimulus (~30 dB above background noise)
5. Calibrate probe microphone(s)
6. Perform IMPro measurement (pressure at sound inlet + DUT recording)
7. Time-align signals
8. Calculate integrated microphone pressure frequency response per microphone
9. Design linear equalization filters to align responses within masks
10. Implement equalization filters in UE software
11. Process raw signals with equalization filters
12. Verify compensated signals satisfy frequency response mask requirements
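Steps 9 to 11 (filter design and application) could be prototyped as below. The frequency-sampling design, tap count, and mask band are illustrative assumptions, not the normative method.

```python
import numpy as np

def design_eq_fir(measured_mag, fs, n_taps=255, f_lo=100.0, f_hi=16000.0):
    """Linear-phase FIR that inverts the measured magnitude response
    inside the mask band [f_lo, f_hi] (frequency-sampling sketch).
    measured_mag: magnitude on a uniform rfft grid from 0 to fs/2."""
    n_bins = len(measured_mag)
    freqs = np.linspace(0.0, fs / 2, n_bins)
    target = np.ones(n_bins)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    target[band] = 1.0 / np.maximum(measured_mag[band], 1e-6)
    delay = (n_taps - 1) // 2                    # group delay in samples
    phase = np.exp(-2j * np.pi * freqs / fs * delay)
    h = np.fft.irfft(target * phase, n=2 * (n_bins - 1))[:n_taps]
    return h * np.hanning(n_taps)

# A flat measured response yields (approximately) a pure delay
h = design_eq_fir(np.ones(513), fs=48000)
raw = np.zeros(64)
raw[0] = 1.0
compensated = np.convolve(raw, h)                # step 11: apply the EQ filter
```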
Integrated from S4aA260008, the evaluation procedure includes:
Editor's Note: Clarification needed on cross-evaluation scope, documentation in specification, and minimum dataset.
Evaluation reports shall include:
- Target device(s) used
- Description of example solution input signals
- Details on output signals including IVAS input format(s)
- Example solution output signals (provided)
- Evaluation considering IVAS input format characteristics
- Evaluation results and observations
For self-evaluation, additional tests for realistic scenarios not covered in TS 26.260 may be included with full documentation and recording availability.
Single Source Scenario (Table 1):
- Sound Source: High-quality loudspeaker compliant with TS 26.260 clause 4.0.2
- Source Signal: British English single talk (ITU-T P.501), Male/Female, 20 Hz-20 kHz, 35.4 s, -27 dB RMS, 48 kHz, 16-bit
- Calibration: 75 dB SPL playback, equalized spectrum within ±1 dB (100-200 Hz) and ±0.5 dB (200 Hz-20 kHz)
- Acoustic Environment: Anechoic chamber OR acoustically treated room (ETSI TS 103 224 or ITU-T BS.1116 compliant)
- Positioning:
- Hand-held/Headset: 1-1.5m distance, elevation 0°
- Table-mounted: per TS 26.260 clause 5.4.2.5, elevation 26.6°
- Azimuth angles: 0°, ±30°, ±60°, ±90°
Multi-Source Scenario for ISM Evaluation (Table 2):
Recording procedure:
1. Record sound sources individually (reference signals)
2. Sum individual recordings to obtain final input signals
Scenario X-1 (Table-mounted):
- UE lying flat on table, screen up
- Source distance: 0.5-1m (equal for both sources)
- Source height: 0.4m relative to UE
- Azimuth angle combinations: [-90°, 90°], [-110°, 70°], [-110°, 90°]
- Overlap pattern: source 1 only (25%) → source 2 only (25%) → both sources (50%)
- Applicable only for smartphone-type devices
Scenario X-2 (Hand-held):
- UE in hand-held landscape orientation, screen toward sources
- Source distance: 0.3-0.5m (equal for both sources)
- Source height: 0m relative to UE
- Azimuth angle combinations: [-30°, 30°], [-45°, 45°], [-30°, 45°]
- Same overlap pattern as X-1
- Applicable only for smartphone-type devices
Delay:
- Assess algorithmic delay only (input to example solution → output signal)
- Mitigates testing inaccuracies and the impact of the acoustic path
- Dependencies on the platform where the example solution runs are recognized
Loudness:
- Use recordings from single source scenario (azimuth=0°, elevation=0°)
- Process with example solution
- Analyze according to TS 26.260 clause 5.6.2
- Editor's Note: ITU-R BS.1770/P.700 via binaural rendering under consideration (pending ATIAS discussion)
Frequency Response:
For Stereo, SBA, MASA capture:
- Use recordings from single source scenario (azimuth=0°, elevation=0°)
- Process with example solution
- Analyze according to TS 26.260 clause 5.6.3
For ISM capture:
- Use recordings from scenarios X-1 and X-2
- Process with example solution
- For each object, calculate frequency response per TS 26.260 clause 5.6.3.2 using corresponding individual sound source recording as reference
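For the ISM case, the per-object relative frequency response might be computed roughly as follows. The 1/12-octave band edges and the FFT-based level estimate are simplifying assumptions, not the TS 26.260 clause 5.6.3.2 procedure itself.

```python
import numpy as np

def band_centers(f_lo=100.0, f_hi=12000.0):
    """1/12-octave band center frequencies referenced to 1 kHz."""
    n_lo = int(np.ceil(12 * np.log2(f_lo / 1000.0)))
    n_hi = int(np.floor(12 * np.log2(f_hi / 1000.0)))
    return 1000.0 * 2.0 ** (np.arange(n_lo, n_hi + 1) / 12.0)

def band_levels(x, fs, centers):
    """Per-band power level in dB from the rfft power spectrum."""
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    lo = centers * 2.0 ** (-1.0 / 24.0)
    hi = centers * 2.0 ** (1.0 / 24.0)
    p = np.array([spec[(freqs >= a) & (freqs < b)].sum()
                  for a, b in zip(lo, hi)])
    return 10.0 * np.log10(p + 1e-20)

def relative_response(output, reference, fs):
    """Response of an object output relative to the corresponding
    individual sound source recording used as reference."""
    c = band_centers()
    return c, band_levels(output, fs, c) - band_levels(reference, fs, c)

rng = np.random.default_rng(0)
ref = rng.standard_normal(48000)
c, rel = relative_response(2.0 * ref, ref, fs=48000)
```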
Directional Information:
- Based on TS 26.260 clause 5.6.4 for Stereo, SBA, MASA formats
- Assessment directly on example solution output (excluding transmission assumptions)
- Use recordings from single source scenario for all defined sound source directions
- Process with example solution
- Compute directional measurement and metric per TS 26.260 clause 5.6.4
- Editor's Note: For other formats, intermediate rendering to supported format via IVAS reference renderer could be considered
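For the stereo case, a minimal broadband panorama estimate based on an energy ratio is sketched below; the actual TS 26.260 clause 5.6.4 measurement and metric may differ, so treat this only as an illustration of the idea.

```python
import numpy as np

def stereo_panorama(left, right, eps=1e-12):
    """Broadband panorama estimate in [-1, 1] from channel energies
    (-1 = fully left, 0 = center, +1 = fully right). Assumed definition."""
    el = float(np.sum(left ** 2))
    er = float(np.sum(right ** 2))
    return (er - el) / (er + el + eps)

sig = np.ones(100)
pan_left = stereo_panorama(sig, np.zeros(100))    # energy only in left
pan_center = stereo_panorama(sig, sig)            # equal energies
pan_right_bias = stereo_panorama(sig, 2.0 * sig)  # right 6 dB louder
```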
General:
- Programming language: Python
- Available at: forge.3gpp.org/rep/sa4/audio/dacas
- Editor's Notes:
- Licensing, requirements, environment to be added
- Missing components (database reading, loudness test rendering, final report generation, reference signal handling, format support) to be added after DaCAS-2 details finalized
- Updated version to be uploaded to Audio subgroup repo
Core Functions:
- read_wav_file: Read WAV files (16-, 24-, or 32-bit depth); PCM support may be added
- estimate_delay_whole: Calculate delay between channels across whole signal (based on TS 26.260 Annex C)
- p56_active_level: Estimate speech active level (ITU-T P.56)
- compute_panorama: Estimate stereo panorama (TS 26.260 methodology)
- frequency_response: Compute 1/12-octave bandwidth spectrum (ISO 3 R40 series, 100 Hz-12 kHz)
- p79_slr: Compute Send Loudness Rating (TS 26.260)
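The idea behind estimate_delay_whole (whole-signal delay estimation, cf. TS 26.260 Annex C) can be illustrated with a plain cross-correlation sketch; this is not the repository function's actual signature.

```python
import numpy as np

def estimate_delay(x, y):
    """Estimate the delay of y relative to x (in samples) from the peak
    of the full cross-correlation over the whole signals."""
    corr = np.correlate(y, x, mode="full")
    return int(np.argmax(corr)) - (len(x) - 1)

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
y = np.concatenate((np.zeros(7), x))  # y is x delayed by 7 samples
d = estimate_delay(x, y)
```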
Editor's Notes:
- Format support to be added
- IVAS reference renderer to be added
- Decision needed on ITU-R BS.1770/P.700 via binaural rendering
Processing Approach:
- Example solutions process recording database into IVAS input format files with optional metadata
- Scripts read entire audio signal, convert to floating-point, perform offline evaluations
- Current version includes stereo directional information analysis support
Section placeholder - content TBD for:
- Test conditions
- Recording setups and scenarios
- Recording database
- Test methods
Placeholders for requirements on:
- Delay
- Loudness
- Frequency response
- Directional information
Content TBD