Meeting: TSGS4_135_India | Agenda Item: 9.5
6 documents found
| TDoc Number | Source | Title | Summary |
|---|---|---|---|
| S4-260130 | Xiaomi Communications |
[FS_AVFOPS_MED] Work Plan for Advanced Video Formats and Operation Points (FS_AVFOPS_MED)
|
**Work Plan for Advanced Video Formats and Operation Points (FS_AVFOPS_MED)**

**Document Overview**

This document (S4-260130) from Xiaomi proposes a work plan for the Study Item on Advanced Video Formats and Operation Points (FS_AVFOPS_MED). The study focuses on identifying and documenting new video representation formats and their integration into 3GPP specifications, particularly TS 26.265.

**Study Objectives**

The study item encompasses eight main objectives:

Objective 1: New Representation Formats
Objective 2: Video Signal Generation Feasibility

Study the feasibility of generating video signals corresponding to the hypothetical optical system (interoperability point 2) by:
- Sub-objective 2a: Document the requirements of real-time capturing systems to meet the video signal characteristics (interface points 1a and 1b), including:
  - Temporal alignment of captured pictures
  - Frame rate
  - Bit depth
- Sub-objective 2b: Identify the different types of video processing functions applied to video source signals from typical UE capturing systems, including the possibility of AI-based image generation (interoperability point 2)

Objective 3: Compression Options
Objective 4: Transport System Integration
Objective 5: Traffic Characteristics
Objective 6: Bitstream Conformance Environment
Objective 7: Gap Identification

Identify gaps in specifications and provide guidance:
- In 3GPP specifications: especially on operation points, provide guidance for potential normative work
- In other SDO specifications (e.g., MPEG): coordinate possible actions with the relevant SDOs

Objective 8: Coordination with External Organizations
**Study Working Methods**

**Contribution Process**

The study adopts the following working methods:
- Individual CRs are created for specific aspects of the work (e.g., one CR per new scenario)
- A merged CR will be created at the end of the study to compile all changes to TR 26.966
- Company proposals to progress CRs are submitted as "discussion" TDocs
- Upon agreement, the responsible person for the CR implements the agreement

**Current CR Status**

Three CRs are currently in progress:

| CR | Title | Latest TDoc | Clause | Responsible |
|---|---|---|---|---|
| 0001r12 | New scenario: Video with changeable background | S4-252016 / S4aV250069 | 5.6 (new) | Emmanuel Thomas |
| 0002r12 | New scenario: Refocusable video | S4aV250071 / S4-252018 | 5.7 (new) | Emmanuel Thomas |
| 0003 | New scenario: Video with semantic segmentation map | S4aV250072 | 5.8 (new) | Emmanuel Thomas |

**Detailed Work Plan Timeline**

SA4#134 (November 17-21, 2025, Dallas, US)
Video SWG AHG Telco (December 16, 2025)
SA4#135 (February 9-13, 2026, Goa, IN) - Current Meeting
Video SWG AHG Telcos (TBD 2026)

Three telcos are planned (dates TBD):
- First two telcos: continue work on Objectives 1, 2, and 3
- Third telco:
  - Continue work on Objectives 1, 2, and 3
  - Start work on Objectives 4, 5, 6, and 7
- Host: Qualcomm

SA#111 (March 10-13, 2026, Fukuoka, JP)
SA4#135-bis-e (April 13-17, 2026)
SA4#136 (May 11-15, 2026, Montreal, CA)
SA#112 (June 9-12, 2026, Singapore, SG)
**Proposal**

The document proposes to agree on the work plan provided in the timeline above. |
|
| S4-260131 | Xiaomi Communications |
[FS_AVFOPS_MED] New scenario: Video with changeable background
|
**Summary of 3GPP Technical Document S4-260131**

**Document Information**
**Main Purpose**

This CR proposes adding a new scenario to TR 26.966 addressing video with changeable background functionality, progressing Objective 1 on identifying relevant new representation formats not yet documented in TS 26.265.

**Technical Contributions**

**New Scenario #5: Video with Changeable Background (Clause 5.6)**

**Overview (5.6.1)**

The CR introduces a scenario addressing the growing use case of mobile video editing, where users:
- Record and edit videos directly on their devices
- Upload source video to cloud services for editing, or
- Use local applications to generate the final video

The key technical requirement is video compositing, where:
- One video is overlaid on other visual content
- Alpha blending is performed using an alpha channel
- Pixel-accurate transparency information is required
- The alpha channel is typically carried in a lossless manner

The CR includes four illustrative figures showing:
1. Original video frame
2. Associated alpha plane
3. Video frame with alpha plane applied
4. Alpha-blended video frame with background image

**Review of Previous Work (5.6.2)**

The CR notes that the coded representation of alpha auxiliary channels as part of a video bitstream has not been addressed in 3GPP specifications until now.

**Review of Related Work (5.6.3)**

**MV-HEVC Support (5.6.3.1)**
ISO/IEC 23091-2:2025 CICP Video
MIAF and HEIF Support (5.6.3.2)
Alpha Plane Interpretation Rules:
- Minimum sample value: transparency (co-located pixel is transparent)
- Maximum sample value: opacity (co-located pixel is opaque)
- Alpha value: normalized between 0.0 and 1.0
- Sample values divided by the maximum value (e.g., 255 for 8-bit) provide the multiplier for the master image intensity
- Requirement: encoded alpha planes must use the full sample range (0-255 for 8-bit)

**SMPTE RP 157:2012 (5.6.3.3)**

Defines signal properties for key/alpha/matte signals (terms used interchangeably):

Properties:
- Black level: complete transparency
- White level: complete opacity
- Transfer function: out of scope, but assumed linear; black/white levels conform to image format specifications
- Alpha value: normalized 0.0 (fully transparent) to 1.0 (fully opaque)
- Sample mapping: co-located with corresponding fill luminance or RGB samples (zero pixel offset)
- Timing: timed coincident with the associated fill video signal

**ISO/IEC 23008-2 HEVC / ITU-T H.265 (5.6.3.4)**
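The normalization rule in the alpha-plane interpretation sections above — dividing alpha samples by the maximum sample value to obtain a blending multiplier in [0.0, 1.0] — can be sketched per sample as follows. This is a minimal illustration, not material from the CR or the cited standards:

```python
def alpha_blend_pixel(fg, bg, alpha, bit_depth=8):
    """Blend one foreground sample over a background sample.

    Per the interpretation rules summarized above: the minimum alpha
    sample means fully transparent, the maximum fully opaque, and
    dividing by the maximum sample value (255 for 8-bit) yields the
    multiplier applied to the master image intensity.
    """
    max_val = (1 << bit_depth) - 1
    a = alpha / max_val                     # normalize to [0.0, 1.0]
    return round(a * fg + (1.0 - a) * bg)   # linear blend, rounded back

# A fully opaque alpha sample selects the foreground, a fully
# transparent one the background, and mid-range values mix the two.
```

In a real compositor this runs per pixel over whole frames (and, per RP 157, assumes the alpha signal is linear and co-located with its fill samples).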
ISO/IEC 14496-12 ISOBMFF (5.6.3.5)
SMPTE ST 268-1:2014 DPX Format (5.6.3.6)
Code Points Supporting Alpha:
- Value 4: Alpha (matte)
- Value 51: R, G, B, Alpha (A)
- Value 52: A, B, G, R
- Value 101: CB, Y, A, CR, Y, A (4:2:2:4)
- Value 103: CB, Y, CR, A (4:4:4:4)

**Functional Requirements (5.6.4)**

The CR establishes a functional analysis framework based on:

**Hardware Impact Assessment**

Two possibilities:
1. Existing hardware support: reference to example hardware products
2. No existing hardware support: discussion/description with justification of the expected hardware implementation impact, or reference to existing demos

**Codec Capabilities**
Impact
Revision History
|
|
| S4-260143 | Xiaomi Communications |
[FS_AVFOPS_MED] New scenario: Refocusable video
|
**Summary of 3GPP Technical Document S4-260143**

**Document Information**
**Purpose and Scope**

This CR proposes adding a new scenario (Scenario #6) on refocusable video to TR 26.966, addressing Objective 1 of identifying relevant new representation formats not yet documented in TS 26.265.

**Main Technical Contributions**

**5.7.1 Overview - Use Case Description**

The CR introduces the concept of refocusable video, which enables post-capture modification of depth-of-field effects (bokeh). Key points:
**5.7.2 Previous Work in 3GPP**

Identifies a gap: the coded representation of depth maps as part of a video bitstream has not been addressed in 3GPP specifications.

**5.7.3 Review of Related Work**

Comprehensive survey of depth map representation across multiple standards bodies:

**5.7.3.1 ISO/IEC 23091-2:2025 CICP Video**
5.7.3.2 ISO/IEC 23000-22 MIAF and ISO/IEC 23008-12 HEIF
**5.7.3.3 SMPTE ST 2087:2016 Depth Map Representation**

Defines a comprehensive depth map data representation with key definitions:

Terminology:
- Reference Camera: camera corresponding to the viewpoint (can be virtual)
- Depth Map: array of depth values corresponding to image pixels
- Depth Value: distance in meters from the reference camera to the object surface, measured parallel to the optical axis
- Relative Depth Value: offset and scaled representation of a depth value

Two representations are specified:
5.7.3.4 ISO/IEC 23008-2 HEVC / ITU-T H.265
5.7.3.5 ISO/IEC 14496-12 ISOBMFF
**5.7.3.6 SMPTE ST 268-1:2014 DPX Format**

Digital Picture Exchange Format v2.0 for moving pictures:

Depth component support:
- Code value 8: Depth (Z) component

Transfer characteristics:
- Code 11: Z (depth) - linear
- Code 12: Z (depth) - homogeneous (requires distance to screen and angle of view in the user-defined section)

**5.7.4 Functional Requirements**

Outlines an analysis framework based on:
**References Added**

The CR adds 9 new normative/informative references covering:
- Android AOSP camera bokeh documentation
- JVET documents on CICP extensions
- ISO/IEC standards (MIAF, ISOBMFF amendments)
- SMPTE standards (RP 157, ST 268-1, ST 2087)
- Google Dynamic Depth specification
- Android MP4-AT file format

**Impact Assessment**
|
|
| S4-260174 | Xiaomi Communications |
[FS_AVFOPS_MED] New scenario: Video with semantic segmentation map
|
**Summary of 3GPP Technical Document S4-260174**

**Document Information**
**Purpose**

This CR proposes adding a new scenario to TR 26.966 addressing video with semantic segmentation maps, progressing Objective 1 on identifying relevant new representation formats not yet documented in TS 26.265.

**Main Technical Contributions**

**5.8 Scenario #7: Video with Semantic Segmentation Maps**

**5.8.1 Overview and Use Case Description**

Semantic Segmentation Fundamentals
- A technique where every pixel in an image is classified into one or more semantic classes
- Example classes from the Android ARCore Scene Semantics API: sky, building, tree, road, vehicle, sidewalk, terrain, structure, water, object, person
- Enables AR applications with advanced video processing (sky replacement, realistic lighting effects)

Mobile Implementation Context
- Real-time capture of segmentation maps alongside the camera view is commonly available on recent mobile devices
- Leverages high-capacity camera/video pipelines and AI frameworks with hardware optimizations
- Specialized models exist for specific content types (e.g., multi-class selfie segmentation)

Multi-class Selfie Segmentation Model
- Provides 6 classes for selfie shots:
  - Background
  - Hair
  - Body-skin
  - Face-skin
  - Clothes
  - Others (accessories)

Use Cases
- Video effects (hair replacement, face filtering)
- Video indexing
- AI search

**5.8.1.2 Example Image Segmentation Method on Mobile Platform**

Processing Pipeline

Three main steps are identified:
1. Frame acquisition
2. AI inference
3. Generation of the segmentation map

Implementation Details
- Uses the Google MediaPipe framework API for image segmentation
- The AI model performs inference on camera frames
- Output format: 2D array of unsigned 8-bit integers
- Each value represents the estimated category of the corresponding input pixel

Class Identifier Mapping

For multi-class selfie segmentation:
- 0: background
- 1: hair
- 2: body-skin
- 3: face-skin
- 4: clothes
- 5: others (accessories)

Efficiency Considerations
- Direct class identifier representation is inefficient (only 6 of the 256 possible values are used)
- Mapping class identifiers to sample value ranges improves:
  - Transport efficiency
  - Robustness to encoding artifacts

Example Mapping Table

| Class ID | Assigned Value | Sample Range |
|----------|----------------|--------------|
| 0 | 21 | 0-42 |
| 1 | 64 | 43-85 |
| 2 | 107 | 86-128 |
| 3 | 150 | 129-171 |
| 4 | 193 | 172-214 |
| 5 | 235 | 215-255 |

**5.8.2 Review of Previous Work**
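The mapping idea in the example table above — spreading six class identifiers across the full 8-bit sample range so that encoding noise within a band does not change the class — can be sketched as follows. The helper names are hypothetical, and this even-banding formula only approximates the assigned values in the CR's table (it is an illustration of the technique, not the CR's exact mapping):

```python
NUM_CLASSES = 6   # multi-class selfie segmentation (IDs 0-5)
MAX_SAMPLE = 255  # 8-bit samples

def class_to_sample(class_id):
    """Map a class ID to (roughly) the centre of its sample band."""
    band = (MAX_SAMPLE + 1) / NUM_CLASSES      # ~42.7 sample values per class
    return round(class_id * band + band / 2)

def sample_to_class(sample):
    """Recover a class ID from a possibly distorted decoded sample."""
    band = (MAX_SAMPLE + 1) / NUM_CLASSES
    return min(int(sample // band), NUM_CLASSES - 1)

# Encoding artifacts that stay within a band leave the class intact:
# sample_to_class(class_to_sample(1) + 10) still yields class 1.
```

The design choice is the one the CR motivates: the decoder only needs the band a decoded sample falls into, so moderate quantization error in the compressed segmentation plane is harmless.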
**5.8.3 Review of Related Work**

**5.8.3.1 In ISO/IEC 23008-2 HEVC / ITU-T H.265**

Current Status in JVET
- Encoding of semantic maps is not currently enabled by the MV-HEVC standard
- JVET is developing a possible MV-HEVC extension with:
  - A new auxiliary layer type called "segmentation plane"
  - A picture segmentation information SEI message for interpreting decoded samples as class identifiers
- Reference: JVET-AN2032 (40th Meeting, Geneva, October 2025)

**5.8.4 Functional Requirements**

Assessment Framework

Two aspects for functional analysis:
References
|
|
| Xiaomi Communications |
[FS_AVFOPS_MED] Updates to possible solutions and mapping to scenarios
|
**Change Request Summary: FS_AVFOPS_MED Solutions and Mapping Updates**

**Document Information**
**Purpose**

This CR updates the possible solutions related to new use cases, specifically adding solutions for Scenario #5: Video with changeable background.

**Main Technical Contributions**

**1. Updated Solution-to-Scenario Mapping (Clause 6.0)**

The CR extends Table 6.0-1 to include two new solutions for Scenario #5:
These solutions address video with changeable background use cases.

**2. Solution #5.1: Multi-layer HEVC with Auxiliary Alpha Layer (New Clause 6.10)**

**High-level Description**

This solution leverages HEVC multi-layer extensions to carry alpha planes as auxiliary channels:
**Technical Approach**

Auxiliary Picture Signalling:
- Uses

**Profile Considerations**

Two possible approaches are identified for further study:
1. Multiview profiles (though only one view is present)
2. Combination of a non-Multiview profile for the base layer with a monochrome profile for the auxiliary layer

Open Issues:
- Different chroma subsampling between layers
- Different encoding configurations
- Spatial resolution differences
- Bit depth variations

**3. Solution #5.2: Multi-HEVC Bitstreams with Alpha Signalling Using CICP (New Clause 6.11)**

**High-level Description**

This solution uses two independent HEVC bitstreams:
1. First bitstream: video content
2. Second bitstream: alpha plane sequence

**Technical Approach**

Alpha Plane Signalling:
- Alpha plane carried as a single-layer HEVC bitstream
- Current HEVC specification lacks explicit signalling for alpha plane sequences
- Proposed solution: Use specific code points in VUI information
- Reference to a potential CICP extension [x2] for signalling via VUI

Parameters:
- Signalling through

**Profile Considerations**

Since the bitstreams are independent:
- Video content bitstream: any HEVC profile
- Alpha plane bitstream:
  - Monochrome profiles (Monochrome, Monochrome 10, Monochrome 12, Monochrome 16)
  - 4:2:0 profiles

**References Added**
**Items for Further Study**

Both solutions (#5.1 and #5.2) have evaluation sections marked "for further study", indicating:
- Performance evaluation pending
- Profile compatibility analysis needed
- Implementation considerations to be determined |
|
| Xiaomi Communications |
[FS_AVFOPS_MED] Permanent document on conformance v1.1.0
|
**3GPP TSG-SA WG4 Meeting #135 - AVFOPS Permanent Document v1.1.0**

**Document Information**

Source: Xiaomi (PD editor)

**Main Technical Contributions**

**1. Introduction and Scope**

This permanent document consolidates all conformance-related material for video operation points (VOPS), gathering requirements, frameworks, and test content submitted to SA4 meetings. The document has evolved from the VOPS work item to the FS_AVFOPS study item.

**2. Conformance Framework (Clause 4)**

**2.1 Sample Bitstream Platform Overview (4.1)**

The platform architecture consists of:
- Database: contains descriptions of the available sample bitstreams
- Hosting server(s): store submitted bitstreams
- Public portal: enables external users to search and download bitstreams
- Bitstream validator: validates compliance with TS 26.265 constraints prior to upload

The database is proposed as a git repository on a web-based platform (GitHub/GitLab) using JSON/markup files. Each bitstream links to TS numbers and profiles via URNs.

**2.2 Bitstream Validator (4.2)**

Repository location: https://forge.3gpp.org/rep/sa4/ts-26.265/conformance/bitstream-validator

Key capabilities:
- Validates bitstream compliance with video coding specifications and profiles
- Validates compliance with TS 26.265 bitstream constraints
- Uses the reference decoder (JVET) for codec conformance checking
- Implements programmatic constraint validation via XML schema

Technical approach:
1. Parse the input bitstream and generate an XML dump of its syntax elements
2. Express the VOPS constraints as an XML schema (XSD 1.1)
3. Validate the XML bitstream description against the constraint schemas

Usage workflow:

```
# Generate XML description
python -m sa4_bitstream_validator dump bitstream_path description.xml

# Validate against an operation point schema
python -m sa4_bitstream_validator validate description.xml bitstream_rules/operation_point.xsd
```

Advantages:
- Codec-agnostic constraint expression
- No programming knowledge required for constraint definition
- Reusable bitstream descriptions for the database

**2.3 VOPS Operation Points as XML Schema (4.2.4)**

Constraints are defined using XSD 1.1.

**3. Framework Development Status (4.6)**

A comprehensive status-tracking table is provided, covering:

**3.1 3GPP Video Representation Formats (4.6.2)**
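Step 3 of the technical approach above — checking syntax-element values in the XML dump against operation-point constraints — is expressed as XSD 1.1 schemas in the actual tool. The same idea can be mimicked in plain Python to illustrate it; the element names, dump structure, and constraint values below are hypothetical, not taken from the validator:

```python
import xml.etree.ElementTree as ET

# Toy XML dump of bitstream syntax elements (hypothetical structure;
# the real dump format is defined by the sa4_bitstream_validator tool).
DUMP = """<bitstream>
  <sps>
    <bit_depth_luma_minus8>2</bit_depth_luma_minus8>
    <chroma_format_idc>1</chroma_format_idc>
  </sps>
</bitstream>"""

# Constraints as (XPath, allowed values); in the real workflow these
# live in XSD 1.1 schema files such as bitstream_rules/operation_point.xsd.
CONSTRAINTS = [
    (".//bit_depth_luma_minus8", {"0", "2"}),  # 8- or 10-bit luma
    (".//chroma_format_idc", {"1"}),           # 4:2:0 only
]

def validate(xml_text, constraints):
    """Return a list of violated constraints (empty means conformant)."""
    root = ET.fromstring(xml_text)
    errors = []
    for xpath, allowed in constraints:
        for elem in root.findall(xpath):
            if elem.text not in allowed:
                errors.append(f"{xpath} = {elem.text}, expected one of {sorted(allowed)}")
    return errors
```

Expressing the rules as data rather than code is what gives the real validator its stated advantages: constraints stay codec-agnostic, and defining a new operation point needs no programming, only a new schema file.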
**3.2 Common Bitstream Constraints (4.6.3)**

AVC Bitstreams:
- Motion-vector constraints: none implemented
- Rate constraints: none implemented

HEVC Bitstreams:
- Progressive constraints: done
- VUI constraints: work in progress (done but not tested with bitstreams)
- Frame-packing constraints: none implemented

Specific VUI constraint validations include:
- Note: timing information constraints are proposed for removal (marked as issues)

**3.3 Decoding Capabilities (4.6.4)**

Status of various decoder profiles:
- AVC decoders (FullHD, UHD, 8K): none implemented
- HEVC decoders (HD, FullHD, 8K): none implemented
- MV-HEVC-Main-Dual-layers-UHD420-Dec: work in progress
- MV-HEVC-Ext-Dual-layers-UHD420-Dec: none implemented
- HEVC-Frame-Packed-Stereo-Dec: none implemented

**3.4 Video Operation Points (4.6.5)**
**4. Key Changes in Version 1.1.0 (SA4#135)**

Alignment with TS 26.265 V19.1.0:
- Validation of multi-layer parameters in the VPS
- Validation of the ScalabilityId constraint added
- Validation of VUI-specific constraints for MV-HEVC operation points
- Validation of the three_dimensional_reference_displays_info SEI message

MV-HEVC Stereo Common Bitstream Requirements (6.3.6.2):

Implemented validations:
**5. Conformance Material (Clause 5)**

**5.1 HEVC Conformance Material**

Source Content:
- Polytech Nantes database: 31 sequences, 1920x1080, 10-bit 4:2:2 YUV at 25 fps (availability issues noted)

Compressed Bitstreams:
Reference Software:
- HM reference software for HEVC
- HTM reference software for MV-HEVC and 3D-HEVC extensions

**6. External References and Background**

Annex A provides background on the DASH-IF conformance suite approach, noting that existing tools (DASH-IF, GPAC/MP4Box) can parse NAL units and generate XML dumps but do not implement comprehensive video bitstream validation against 3GPP operation point constraints.

**Summary**

This document represents significant progress in establishing a comprehensive conformance framework for 3GPP video operation points. The main achievements include:
The work-in-progress items focus primarily on MV-HEVC stereo operation points, with most AVC and single-layer HEVC operation points awaiting implementation. |
Total Summaries: 6 | PDFs Available: 6