# VIDEO SWG Ad-Hoc Report (Post SA4#134)

## Meeting Overview

This report covers two VIDEO SWG ad-hoc sessions held between SA4#134 and SA4#135:
- **Session 1**: December 16, 2025 (15:00-18:00 CET)
- **Session 2**: January 27, 2026 (15:00-18:00 CET)

The sessions were conducted online, with 22 and 18 participants, respectively.

---

## 1. Maintenance Work - VOPS

### S4aV250070: Annotation Schema for VOPS and FS_AVFOPS_MED

**Status**: Agreed

**Key Contributions**:
- Proposed annotation schema (JSON format) and animation example for video signals used in conformance testing
- Addresses the intersection of VOPS and Advanced Video Formats (FS_AVFOPS_MED) conformance activities
- Proposes moving conformance activities into the study phase to align completion with normative work

**Technical Details**:
- Introduced annotation schema for documenting test bitstreams
- Covers requirements for bitstream generation and references to existing testing specifications
- Reference encoders are informative only, not normative
- Big Buck Bunny identified as initial test sequence candidate
- JSON files to be added to 3GPP Forge as baseline

**Discussion Points**:
- Clarified that encoder specifications remain outside normative scope
- Focus on providing conformant bitstreams with documentation of generation methods
- Encoder references needed for reproducibility but not as normative elements
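
To make the idea of a JSON annotation for a test bitstream concrete, the following is a minimal sketch of what a record and a basic check might look like. The field names (`bitstream`, `source_sequence`, `encoder`, `test_spec_refs`, `normative`) and the example values are hypothetical and are not taken from S4aV250070; only the points that encoder references are informative and that Big Buck Bunny is a candidate sequence come from the discussion above.

```python
import json

# Hypothetical required fields; the actual schema in S4aV250070 may differ.
REQUIRED_FIELDS = {
    "bitstream": str,        # file name of the conformance bitstream
    "source_sequence": str,  # e.g. "Big Buck Bunny"
    "encoder": dict,         # informative only: how the bitstream was generated
    "test_spec_refs": list,  # references to existing testing specifications
}


def validate_annotation(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    # Encoder references are informative, so reject a record that marks
    # them as normative (the "normative" flag is an illustrative assumption).
    encoder = record.get("encoder")
    if isinstance(encoder, dict) and encoder.get("normative", False):
        problems.append("encoder reference must not be normative")
    return problems


# Placeholder values throughout; none of these names come from the report.
example = json.loads("""
{
  "bitstream": "example_bitstream.bin",
  "source_sequence": "Big Buck Bunny",
  "encoder": {"name": "example-encoder", "version": "0.0", "normative": false},
  "test_spec_refs": ["placeholder reference"]
}
""")

print(validate_annotation(example))
```

Keeping the encoder entry as a plain descriptive object, rather than a required conformance point, mirrors the agreement that encoders are documented for reproducibility only.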

---

## 2. FS_AVFOPS_MED (Study on Advanced Video Formats and Operation Points)

### 2.1 S4aV250069: Video with Changeable Background

**Status**: Endorsed

**Scenario Description**:
- Enables users to change recorded video backgrounds using an auxiliary alpha channel
- Uses MV-HEVC for alpha channel signaling

**Technical Aspects**:
- Reviews previous 3GPP work (none found in existing specs)
- References external developments:
  - MV-HEVC for multi-view video coding
  - CICP draft for alpha channel signaling
  - ISO base media file format (ISOBMFF) amendment for alpha maps

**Open Issues**:
- **Alpha signal definition**: Lack of clear, codec-independent specification for alpha channels
- Alpha channel represents transparency levels, but conventions vary by system
- CICP work ongoing but not yet providing complete definition
- Action identified: Need precise, codec-agnostic definition of alpha signals and interpretation
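
One reason a codec-independent definition matters is that the same alpha plane gives different pictures under different conventions. The sketch below contrasts the two common compositing conventions, straight and premultiplied alpha. It is an illustration only: the function names are hypothetical, and the full-range normalization in `decode_alpha` is an assumption, not something defined by the contribution or by CICP.

```python
def decode_alpha(sample, bit_depth=8):
    """Map an integer alpha sample to [0, 1].

    Assumes full-range coding (0 = transparent, max = opaque); a real
    definition would have to state the range and polarity explicitly.
    """
    return sample / ((1 << bit_depth) - 1)


def composite_straight(fg, alpha, bg):
    """Straight alpha: out = alpha * fg + (1 - alpha) * bg."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def composite_premultiplied(fg_premult, alpha, bg):
    """Premultiplied alpha: fg is already scaled, out = fg + (1 - alpha) * bg."""
    return tuple(f + (1.0 - alpha) * b for f, b in zip(fg_premult, bg))


fg = (1.0, 0.5, 0.0)   # foreground colour, straight (not premultiplied)
bg = (0.0, 0.0, 1.0)   # replacement background
alpha = 0.5

correct = composite_straight(fg, alpha, bg)
# Same foreground samples wrongly treated as premultiplied:
misread = composite_premultiplied(fg, alpha, bg)

print(correct)  # (0.5, 0.25, 0.5)
print(misread)  # (1.0, 0.5, 0.5)
```

The two results differ even though the alpha plane is identical, which is the kind of ambiguity a precise signal definition would need to remove.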

### 2.2 S4aV250071: Refocusable Video

**Status**: Endorsed

**Scenario Description**:
- Recording video with the ability to change the focus plane after capture
- Uses three tracks: blurred video, sharp video, and depth map

**Technical Aspects**:
- References Android and Apple ecosystem implementations
- Ongoing work in CICP for depth map code points
- ISO file format amendments in progress

**Open Issues**:
- **Signal mapping**: Need clarity on mapping between signals in specifications and Android/Apple APIs
- **Depth interpretation**: Complexity of depth normalization and application across scenarios
- **Interoperability**: Need for clear definitions of how depth maps are defined and used
- More documentation required on APIs and signal mapping
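
The depth interpretation issue can be illustrated with a small sketch. It shows two plausible ways a normalized depth sample could map to metric depth (linear in depth versus linear in inverse depth) and a simple per-sample blend of the sharp and blurred tracks. All function names, the parameters `z_near`, `z_far`, and `depth_of_field`, and the blending rule itself are assumptions made for illustration; neither the contribution nor the Android and Apple APIs are being described here.

```python
def depth_linear(d, z_near, z_far):
    """Normalized sample d in [0, 1] maps linearly to metric depth."""
    return z_near + d * (z_far - z_near)


def depth_inverse(d, z_near, z_far):
    """Normalized sample d in [0, 1] maps linearly to inverse depth.

    Under this convention d = 1 is the nearest point and d = 0 the farthest.
    """
    inv = d * (1.0 / z_near) + (1.0 - d) * (1.0 / z_far)
    return 1.0 / inv


def refocus_sample(sharp, blurred, depth, focus_depth, depth_of_field):
    """Blend sharp and blurred samples by distance from the focus plane.

    The weight grows linearly with distance and saturates at 1 (fully
    blurred) once the distance reaches depth_of_field.
    """
    w = min(abs(depth - focus_depth) / depth_of_field, 1.0)
    return (1.0 - w) * sharp + w * blurred


d = 0.5
print(depth_linear(d, z_near=1.0, z_far=9.0))   # 5.0
print(depth_inverse(d, z_near=1.0, z_far=9.0))  # approximately 1.8

# A sample exactly on the focus plane stays sharp.
print(refocus_sample(1.0, 0.0, depth=5.0, focus_depth=5.0, depth_of_field=2.0))
```

The same sample value lands at 5.0 under one convention and roughly 1.8 under the other, so a receiver that guesses the wrong convention would place the focus plane incorrectly.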

### 2.3 S4aV250072: Video with Semantic Segmentation Map

**Status**: Endorsed

**Scenario Description**:
- Combines video signal with semantic segmentation map
- Segmentation map sample values carry class identifiers (e.g., face, hair, clothes) for each pixel
- Applications: AR filtering, video indexing

**Technical Aspects**:
- JVET work extending MV-HEVC auxiliary layers to support segmentation planes
- Codec-agnostic approach for broader applicability
- Each segmentation model has its own category set
- Application must know which model was used to interpret IDs correctly

**Key Technical Challenges**:
- **Semantic vocabulary**: Need defined vocabulary or model reference for segmentation map values
- **Coding artifacts**: Robustness achieved by mapping value ranges to class IDs (number of classes typically much lower than possible sample values)
- **Model reference**: Essential to specify which segmentation model was used for interoperability
- **Signal definition**: Need to define signals and semantics independently of coding solutions, not relying on SEI messages or codec-specific mechanisms

**Discussion Outcomes**:
- Focus on defining signals independently of coding solutions
- Further work needed on requirements, model references, and semantic definitions
- Use case valuable despite implementation concerns
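
The value-range mapping mentioned under coding artifacts can be sketched as follows. Each class is given its own band of sample values and is written at the band centre, so moderate coding noise still decodes to the right class. The band layout, the function names, and the `SEGMENTATION_INFO` structure with its model name are hypothetical; the report only states that ranges map to class IDs and that the model must be identified.

```python
# Hypothetical model reference; the receiver needs something like this
# to know what each class ID means.
SEGMENTATION_INFO = {
    "model": "example-model-v0",  # placeholder, not a real model name
    "classes": ["background", "face", "hair", "clothes"],
}


def class_to_sample(class_id, num_classes, bit_depth=8):
    """Write a class ID at the centre of its band of sample values."""
    band = (1 << bit_depth) / num_classes
    return int(class_id * band + band / 2)


def sample_to_class(sample, num_classes, bit_depth=8):
    """Map any sample inside a band back to that band's class ID."""
    band = (1 << bit_depth) / num_classes
    return min(int(sample / band), num_classes - 1)


num_classes = len(SEGMENTATION_INFO["classes"])
for class_id, name in enumerate(SEGMENTATION_INFO["classes"]):
    coded = class_to_sample(class_id, num_classes)
    # Simulate a coding error of +20 on the sample value.
    decoded = sample_to_class(coded + 20, num_classes)
    print(name, coded, SEGMENTATION_INFO["classes"][decoded])
```

With four classes and 8-bit samples each band is 64 values wide, so this layout tolerates errors of a few tens of code values. The tolerance shrinks as the number of classes grows.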

---

## 3. Avatar Phase 2 Work (Avatar_Ph2_MED)

### 3.1 S4aV260005: Corrective CR for TR 26.813

**Status**: Endorsed

**Details**:
- Content agreeable but needs revision for SA4#135 in Goa
- Template from portal to be used
- Corrections and clarifications to existing Avatar specification

### 3.2 S4aV260004: Work Plan for Avatar_Ph2_MED

**Status**: Agreed

**Work Plan Structure**:
- Objectives organized by dependencies and complexity
- Some evaluations can only be done after earlier steps complete
- Prioritization based on interest and technical dependencies

**Process Agreements**:
- **CR submission approach**: Flexibility is maintained for early CRs if the solutions are complete and the group agrees
- **Parallel work**: Normative work should not run in parallel with the study (discouraged practice)
- **Early implementation**: CRs can be issued early if consistent, but this does not automatically start normative work
- **FFS items**: Can be addressed early with intermediate CRs if appropriate
- **Corrections vs. features**: Clarifications/corrections can be made anytime if justified; new features should wait for study completion

**Key Issues Clarification**:
- Listed "key issues" are actually objectives from the work item, not newly identified problems
- Study planned to run until September 2026
- Normative work could start earlier if the editors' notes are addressed

### 3.3 S4aV260006: Consolidated CR for FS_avatar_ph2_MED

**Status**: Endorsed as basis for further work

**Approach**:
- Base CR to track changes across study
- Contributions as candidate text against new or existing sections
- Consolidation at each meeting
- CRs sent for approval when ready

**Editorial Requirements**:
- Full draft TRs cannot be attached to CRs
- Correct 3GPP styles must be used
- References must be up to date
- Only modified clauses are included in the final CR
- The TR version is not referenced in the document
- "Avatar Phase 1" terminology is not used in the TR
- The TR change history is not used to integrate co-CRs

**Open Question**:
- Security and SA3 objective (possibly objective 7) listed in time plan but corresponding section not visible in CR document

---

## 4. Process and Administrative Matters

### Document Handling Decisions

- **Terminology**: Use "inputs" and "discussion papers" rather than "pCR" (pseudo-CR)
- **Structure**: Discussion papers with candidate text preferred over single large CRs
- **Consolidation**: Multiple contributors can submit text against base CR, consolidated at meetings
- **Tracking**: Changes tracked in document and through section changes

### IPR and Antitrust

The standard IPR and antitrust clauses were read at the opening of each session, with reminders about:
- Obligation to inform Organizational Partners of Essential IPRs
- Compliance with competition laws
- Consensus-based decision making

---

## Summary of Agreed/Endorsed Documents

| Document | Title | Status | Target |
|----------|-------|--------|--------|
| S4aV250070 | Annotation schema for VOPS/FS_AVFOPS_MED | Agreed | Baseline for conformance work |
| S4aV250069 | Video with changeable background | Endorsed | CR to TR 26.966 |
| S4aV250071 | Refocusable video | Endorsed | CR to TR 26.966 |
| S4aV250072 | Video with semantic segmentation map | Endorsed | CR to TR 26.966 |
| S4aV260004 | Avatar Phase 2 work plan | Agreed | Study organization |
| S4aV260005 | Corrective CR for 26.813 | Endorsed | Revision for Goa |
| S4aV260006 | Consolidated CR for Avatar Phase 2 | Endorsed | Basis for further work |

---

## Key Technical Gaps Identified

1. **Alpha channel definition**: Need codec-independent specification for alpha signal interpretation
2. **Depth map interoperability**: Mapping between specifications and platform APIs unclear
3. **Semantic segmentation**: Model reference and vocabulary definition required for interoperability
4. **Signal definitions**: Need codec-agnostic definitions independent of SEI or codec-specific mechanisms