Summary of 3GPP Technical Document S4-260239
Document Overview
This is a pseudo-CR to TR 26.958 v0.1.1 addressing viewport-adaptive delivery workflows for large-scale 3D Gaussian Splatting (3DGS) scenes in the context of the FS_3DGS_MED study. The contribution focuses on enabling delivery of massive 3DGS environments (e.g., city-scale digital twins) to mobile devices with constrained resources.
Problem Statement
Large-scale 3DGS scenes (as defined in clause 5.4) cannot be fully loaded into mobile device memory due to:
- Bandwidth limitations
- Memory constraints
- Rendering capacity restrictions
Static delivery workflows would result in:
- Excessive latency
- Immediate resource saturation
- Inability to deliver complete scenes
Simple capability negotiation alone is insufficient for these use cases.
Main Technical Contributions
Viewport-Adaptive Workflow (Clause 9.2.3)
The document proposes a new clause 9.2.3 introducing a viewport-adaptive workflow that extends existing capability negotiation mechanisms by incorporating continuous spatial feedback.
Core Mechanism
- Dynamic Spatial Context: UE continuously transmits 6DoF pose and Field of View (FoV) to server
- Metadata Format: Adheres to formats defined in TR 26.928 (XR services)
- Rendering Budget Management: Server optimizes 3DGS stream relative to user's perspective while staying within negotiated rendering budget
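As an illustration of the spatial feedback described above, the following minimal sketch shows what a UE-to-server pose and FoV message could look like. The class name, field names, units, and JSON encoding are all assumptions made for this example; it does not reproduce the metadata formats of TR 26.928.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ViewportFeedback:
    """Hypothetical UE-to-server spatial feedback message (all fields are assumptions)."""
    position: tuple      # (x, y, z) in scene units
    orientation: tuple   # unit quaternion (x, y, z, w)
    fov_h_deg: float     # horizontal field of view in degrees
    fov_v_deg: float     # vertical field of view in degrees
    timestamp_ms: int    # UE clock time at which the pose was sampled

    def to_json(self) -> str:
        # Tuples are serialised as JSON arrays.
        return json.dumps(asdict(self))


msg = ViewportFeedback(
    position=(1.0, 1.6, -3.0),
    orientation=(0.0, 0.0, 0.0, 1.0),
    fov_h_deg=90.0,
    fov_v_deg=60.0,
    timestamp_ms=1000,
)
payload = msg.to_json()
```

In a real session such a message would be sent repeatedly as the user moves; the update rate and transport are outside the scope of this sketch.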
Spatial Optimization Strategies (Clause 9.2.3.2)
Two approaches are defined:
Tiled Environments with LOD
- Environment partitioned into spatial tiles
- Multiple levels of detail (LOD) per tile
- Server selects appropriate LOD based on:
  - Proximity to user
  - Visibility within frustum
- LOD Distribution:
  - High-density tiles (e.g., LOD 4) for viewport center
  - Lower-density tiles (e.g., LOD 1-3) for peripheral/distant areas
  - Concentrates point budget where user is looking
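The tile-based selection above can be sketched as a small function that maps each tile to an LOD. This is only an illustration: the view-cone visibility test (a simplification of a true frustum test), the distance thresholds, and the convention that LOD 0 means "do not send" are assumptions, not values taken from the contribution.

```python
import math


def in_view_cone(cam_pos, cam_dir, fov_deg, tile_center):
    """Approximate visibility: is the tile centre within half the FoV of the view direction?"""
    v = [t - c for t, c in zip(tile_center, cam_pos)]
    dist = math.sqrt(sum(x * x for x in v))
    if dist == 0.0:
        return True  # camera is at the tile centre
    dir_norm = math.sqrt(sum(d * d for d in cam_dir))
    cos_angle = sum(a * b for a, b in zip(v, cam_dir)) / (dist * dir_norm)
    return cos_angle >= math.cos(math.radians(fov_deg / 2))


def select_lod(cam_pos, cam_dir, fov_deg, tile_center, thresholds=(10.0, 30.0, 60.0)):
    """Return LOD 4 (densest) down to 1 by distance, or 0 if the tile is not visible.

    The thresholds (in scene units) are illustrative assumptions.
    """
    if not in_view_cone(cam_pos, cam_dir, fov_deg, tile_center):
        return 0
    dist = math.dist(cam_pos, tile_center)
    for i, limit in enumerate(thresholds):
        if dist <= limit:
            return 4 - i
    return 1


camera = (0.0, 0.0, 0.0)
forward = (0.0, 0.0, 1.0)
lods = [select_lod(camera, forward, 90.0, (0.0, 0.0, z)) for z in (5.0, 20.0, 50.0, 100.0, -5.0)]
```

A fuller version would also lower the LOD with angular distance from the viewport center, which this sketch omits.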
Unstructured Scenes
- Real-time frustum culling, pruning, and merging
- High point density in center of FoV
- Aggressive simplification in peripheral zones
- Dynamic primitive removal/merging for non-visible areas
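For unstructured scenes, culling and pruning to a point budget could look like the sketch below. It is a deliberately simplified illustration: the view-cone test stands in for a real frustum test, the importance score (opacity, weighted by centrality and distance) is an assumption, and merging of primitives is not shown.

```python
import math


def cull_and_prune(points, cam_pos, cam_dir, fov_deg, budget):
    """Keep at most `budget` visible points, favouring central, near, opaque ones.

    points: list of (position, opacity) tuples. The scoring rule is an
    illustrative assumption, not the method of the contribution.
    """
    cos_half = math.cos(math.radians(fov_deg / 2))
    dir_norm = math.sqrt(sum(d * d for d in cam_dir))
    scored = []
    for pos, opacity in points:
        v = [p - c for p, c in zip(pos, cam_pos)]
        dist = math.sqrt(sum(x * x for x in v))
        if dist == 0.0:
            cos_angle = 1.0
        else:
            cos_angle = sum(a * b for a, b in zip(v, cam_dir)) / (dist * dir_norm)
        if cos_angle < cos_half:
            continue  # culled: outside the view cone
        score = opacity * cos_angle / (1.0 + dist)
        scored.append((score, (pos, opacity)))
    # Prune: keep only the highest-scoring points within the budget.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [point for _, point in scored[:budget]]


scene = [
    ((0.0, 0.0, 2.0), 0.9),   # near, central
    ((0.0, 0.0, -2.0), 0.9),  # behind the camera
    ((0.0, 0.0, 10.0), 0.9),  # far, central
    ((0.5, 0.0, 2.0), 0.1),   # near, nearly transparent
]
kept = cull_and_prune(scene, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 90.0, budget=2)
```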
Server-Centric Decision Workflow (Clause 9.2.3.3)
Two-Phase Approach:
Static Initialization Phase
- Hardware Capabilities Assessment: UE evaluates resources via system APIs or OpenXR
- Capability Reporting: UE transmits comprehensive capability report to server
- Server-Side Capability Decision: Server defines global rendering budget (max point count, SH degree) for session
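The server-side capability decision can be illustrated with a rough budget calculation. The sketch below assumes an uncompressed float32 layout per Gaussian (3 position, 3 scale, 4 rotation, 1 opacity, plus 3 x (d+1)^2 spherical-harmonic coefficients for SH degree d); the report fields, the fraction of memory reserved for 3DGS data, and the server cap on SH degree are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CapabilityReport:
    """Hypothetical subset of a UE capability report."""
    gpu_memory_mb: int
    max_sh_degree: int


@dataclass
class RenderingBudget:
    max_points: int
    sh_degree: int


def bytes_per_gaussian(sh_degree: int) -> int:
    # 11 float32 geometry/opacity values plus 3 colour channels of (d+1)^2 SH coefficients.
    return 4 * (11 + 3 * (sh_degree + 1) ** 2)


def decide_budget(report: CapabilityReport, memory_fraction: float = 0.25,
                  server_max_sh: int = 3) -> RenderingBudget:
    """Derive a session-wide budget; memory_fraction and server_max_sh are assumptions."""
    sh = min(report.max_sh_degree, server_max_sh)
    usable_bytes = int(report.gpu_memory_mb * 1024 * 1024 * memory_fraction)
    return RenderingBudget(max_points=usable_bytes // bytes_per_gaussian(sh), sh_degree=sh)


budget = decide_budget(CapabilityReport(gpu_memory_mb=1024, max_sh_degree=3))
```

A real decision would likely also account for rendering throughput and bandwidth, not memory alone.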
Dynamic Delivery Phase
- Viewpoint and FoV Determination: UE calculates current 6DoF pose and camera frustum
- Viewpoint and FoV Information: UE sends spatial metadata to server
- Content Adaptation Based on FoV: Server selects visible spatial tiles and adapts content (pruning, merging, LOD selection, quantization) to fit budget and user's view
- Optimized 3DGS Data: Server streams adapted content payload (N points) to UE
- Local Adaptation: UE performs final on-device adjustments if necessary
- 3DGS Rendering: UE renders the scene
Key Characteristic: Server maintains control over rendering budget throughout session based on initial capability assessment.
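One way the server could keep the adapted payload within the session budget during the dynamic delivery phase is a greedy pass that degrades the farthest visible tiles first. The tile structure and the farthest-first policy in this sketch are assumptions chosen for illustration; the contribution does not prescribe a particular algorithm.

```python
def fit_tiles_to_budget(tiles, budget):
    """Pick one LOD per visible tile so the total point count fits the budget.

    tiles: list of dicts {'id', 'distance', 'points_per_lod': {lod: point_count}}.
    Returns {tile_id: lod}; tiles dropped entirely are omitted.
    """
    by_id = {t["id"]: t for t in tiles}
    # Start from the highest available LOD everywhere.
    choice = {t["id"]: max(t["points_per_lod"]) for t in tiles}
    farthest_first = sorted(tiles, key=lambda t: t["distance"], reverse=True)

    def total():
        return sum(by_id[tid]["points_per_lod"][lod] for tid, lod in choice.items())

    while choice and total() > budget:
        for t in farthest_first:
            tid = t["id"]
            if tid not in choice:
                continue
            lower = [lod for lod in t["points_per_lod"] if lod < choice[tid]]
            if lower:
                choice[tid] = max(lower)  # step this tile down one level
                break
        else:
            # Every remaining tile is at its lowest LOD: drop the farthest one.
            for t in farthest_first:
                if t["id"] in choice:
                    del choice[t["id"]]
                    break
    return choice


lod_sizes = {1: 100, 2: 200, 3: 400, 4: 800}
visible = [
    {"id": "A", "distance": 5.0, "points_per_lod": dict(lod_sizes)},
    {"id": "B", "distance": 50.0, "points_per_lod": dict(lod_sizes)},
]
selection = fit_tiles_to_budget(visible, budget=1000)
```

Here the near tile keeps its densest LOD while the distant tile is stepped down until the total (800 + 200 points) meets the budget.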
Client-Centric Decision Workflow (Clause 9.2.3.4)
UE-Driven Approach:
Initialization Phase
- Hardware Assessment Analysis: UE performs internal audit of hardware capabilities
- Decision of Best Representation Format: UE selects optimal configuration (max point count, SH degree)
- 3DGS Format Request: UE requests content from server, specifying desired format parameters (point budget, SH degrees, quantization)
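The client-side initialization steps above can be sketched as a profile lookup followed by a format request. The profile table, memory thresholds, and request fields are hypothetical; the contribution names the parameters (point budget, SH degree, quantization) but this summary gives no concrete values or message syntax.

```python
import json

# Hypothetical device profiles, ordered from most to least demanding.
PROFILES = [
    {"name": "high", "min_memory_mb": 8192, "max_points": 2_000_000,
     "sh_degree": 3, "quantization_bits": 16},
    {"name": "medium", "min_memory_mb": 4096, "max_points": 800_000,
     "sh_degree": 2, "quantization_bits": 12},
    {"name": "low", "min_memory_mb": 0, "max_points": 250_000,
     "sh_degree": 0, "quantization_bits": 8},
]


def choose_profile(memory_mb):
    """Return the most demanding profile the device memory can support."""
    for profile in PROFILES:
        if memory_mb >= profile["min_memory_mb"]:
            return profile
    return PROFILES[-1]


def build_format_request(scene_id, memory_mb):
    """Build the UE's format request as JSON (message layout is an assumption)."""
    profile = choose_profile(memory_mb)
    return json.dumps({
        "scene_id": scene_id,
        "max_points": profile["max_points"],
        "sh_degree": profile["sh_degree"],
        "quantization_bits": profile["quantization_bits"],
    })


request = build_format_request("city-block-7", memory_mb=6000)
```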
Delivery Phase
- Viewpoint and FoV Determination: UE calculates current spatial position and FoV
- Viewpoint and FoV Information: UE sends spatial metadata to server
- Content Adaptation Based on FoV: Server filters scene spatially (frustum culling/tile selection) and adapts data to match the format requested by the UE during initialization (3DGS Format Request, step 3)
- Optimized 3DGS Data: Server delivers visible content conforming to requested parameters
- Local Adaptation: UE applies final local refinements for runtime stability
- 3DGS Rendering: UE renders received content
Key Characteristic: UE explicitly requests specific representation format during initialization; server's role restricted to spatial operations while adhering to UE-imposed format constraints.
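To illustrate the server's restricted role in this workflow, the sketch below takes points that have already been spatially filtered and only conforms them to the UE's requested format. It assumes SH coefficients are stored per colour channel in order of increasing band, so that lowering the degree amounts to truncation; the data layout and field names are hypothetical.

```python
def truncate_sh(coeffs_per_channel, degree):
    """Keep the first (degree + 1)^2 SH coefficients of each colour channel."""
    keep = (degree + 1) ** 2
    return [channel[:keep] for channel in coeffs_per_channel]


def conform_to_request(visible_points, request):
    """Apply the UE-requested point cap and SH degree.

    visible_points: list of dicts with an 'sh' entry (3 channel lists), assumed
    to be already culled and sorted by priority. The server does not choose the
    budget here; it only enforces the values in `request`.
    """
    conformed = []
    for point in visible_points[: request["max_points"]]:
        adapted = dict(point)
        adapted["sh"] = truncate_sh(point["sh"], request["sh_degree"])
        conformed.append(adapted)
    return conformed


# Two visible points carrying degree-3 SH data (16 coefficients per channel).
points = [
    {"id": i, "sh": [[0.1] * 16, [0.2] * 16, [0.3] * 16]}
    for i in range(2)
]
result = conform_to_request(points, {"max_points": 1, "sh_degree": 1})
```

The input points are left unmodified, so the same source data can serve UEs that request different formats.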
Alignment with Existing Specifications
- Builds upon capability negotiation described in clause 9.2.2
- Aligns with viewport-dependent streaming principles from TR 26.928 (XR services)
- Addresses use case defined in clause 5.4 (Large 3DGS scenes)
Proposal
The document proposes to agree the changes introducing clause 9.2.3 and its subclauses (9.2.3.1-9.2.3.4) to TR 26.958, including two workflow diagrams (Figures 5 and 6) and one illustration of tile/LOD selection (Figure 4).