S4-260168 - AI Summary

[FS_3DGS_MED] Pseudo-CR on 3DGS renderer and performance benchmarking

Summary of 3GPP Change Request S4-260168

Document Information

  • Source: Tencent
  • Title: Pseudo-CR on 3DGS renderer and performance benchmarking
  • Specification: 3GPP TR 26.958 v0.1.1
  • Study: FS_3DGS_MED (3D Gaussian Splats for mobile)

Main Objective

This change request proposes adding technical content to TR 26.958 regarding a reference implementation of a 3DGS player for mobile platforms, including mobile renderer features and preliminary experimental benchmark results obtained on commercial mobile devices.

Technical Contributions

1. Mobile Renderer Architecture (Section 12.4.1)

The document proposes a hybrid architecture for the 3DGS mobile player:

  • Native Layer (C++):
      • Implements core rendering using OpenGL ES 3.2
      • Tile-based rasterizer inspired by original 3DGS method
      • CPU sorting or Compute Shaders for parallel sorting (e.g., Radix sort)
      • Vertex and Fragment shaders for rendering
  • Application Layer (Java/Kotlin):
      • UI management
      • AR runtime lifecycle for camera tracking
      • Resource management
  • Capabilities:
      • Supports standard .ply file loading
      • Real-time interaction (rotation, translation, scaling)
      • Benchmarking mode with dynamic parameter variation
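
The .ply loading capability can be illustrated with a minimal header-reading sketch, written in Python for brevity rather than the player's C++. The function name `read_ply_header` and the sample property list are assumptions for illustration only; the pCR summary does not describe the loader's internals.

```python
import io

def read_ply_header(stream):
    """Parse a PLY header; return (format, vertex_count, property_names)."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    fmt, vertex_count, props = None, 0, []
    in_vertex = False
    for raw in iter(stream.readline, b""):
        parts = raw.decode("ascii").split()
        if not parts or parts[0] == "comment":
            continue
        if parts[0] == "end_header":
            return fmt, vertex_count, props
        if parts[0] == "format":
            fmt = parts[1]
        elif parts[0] == "element":
            # Only properties of the "vertex" element describe Gaussians.
            in_vertex = parts[1] == "vertex"
            if in_vertex:
                vertex_count = int(parts[2])
        elif parts[0] == "property" and in_vertex:
            props.append(parts[-1])  # last token is the property name
    raise ValueError("missing end_header")

# Hypothetical, shortened header (a real 3DGS file lists many more properties).
sample = io.BytesIO(
    b"ply\nformat binary_little_endian 1.0\n"
    b"element vertex 485436\n"
    b"property float x\nproperty float y\nproperty float z\n"
    b"property float opacity\nend_header\n"
)
header = read_ply_header(sample)
```

After the header is consumed, the stream is positioned at the start of the vertex payload, which a loader would then read according to the declared format and property list.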

2. Rendering Process Details (Section 12.4.1 - second subsection)

Key technical aspects of the mobile rendering pipeline:

  • Depth Sorting: Critical back-to-front sorting performed by CPU each frame for proper alpha blending (unlike Z-buffer-based mesh rendering)
  • Sorting Implementation: CPU-based Radix Sort preferred over GPU Compute Shaders on mobile for thermal balance and driver compatibility
  • Data Management:
      • Gaussian attributes loaded into VRAM at startup
      • FP32 textures/buffers for precision in covariance and color calculations
      • Only sorted indices transferred CPU→GPU per frame
      • Vertex shader uses texelFetch for direct reads from persistent buffers
      • Minimizes CPU-GPU bandwidth while maintaining visual fidelity
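
The per-frame CPU sorting step described above can be sketched as follows. This is a minimal illustration in Python, not the player's C++ code: the function name `back_to_front_order`, the 16-bit depth quantization and the 8-bit digit size are all assumptions, since the summary only states that a CPU radix sort produces back-to-front indices.

```python
def back_to_front_order(depths, bits=16):
    """Return Gaussian indices sorted farthest-first with an LSD radix sort.

    depths: view-space distances from the camera (larger means farther).
    bits:   key width; assumed to be a multiple of 8 (one pass per byte).
    """
    n = len(depths)
    if n == 0:
        return []
    lo, hi = min(depths), max(depths)
    max_key = (1 << bits) - 1
    scale = max_key / (hi - lo) if hi > lo else 0.0
    # Invert the key so the farthest Gaussian gets the smallest key.
    keys = [min(int((hi - d) * scale), max_key) for d in depths]
    order = list(range(n))
    for shift in range(0, bits, 8):
        counts = [0] * 257
        for i in order:
            counts[((keys[i] >> shift) & 0xFF) + 1] += 1
        for b in range(256):
            counts[b + 1] += counts[b]      # prefix sum gives start offsets
        out = [0] * n
        for i in order:                     # stable scatter keeps earlier passes
            digit = (keys[i] >> shift) & 0xFF
            out[counts[digit]] = i
            counts[digit] += 1
        order = out
    return order

depths = [2.5, 0.7, 9.1, 4.0]
indices = back_to_front_order(depths)   # farthest (9.1) first, nearest (0.7) last
```

Only the returned index list would need to be uploaded each frame; the Gaussian attributes themselves stay in GPU memory and are read through those indices.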

3. Benchmark Methodology (Section 12.4.2)

Proposed benchmarking approach:

  • Dynamic parameter modification during runtime
  • Thermal management API usage for consistent clock speeds
  • AR runtime disabled during benchmarking for fair comparison
  • Variable parameters:
      • Number of Gaussians: 5,000 to 485,436 points
      • Spherical Harmonics degree: 0 (diffuse only) to 3 (full view-dependence)
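
To give a sense of what the two swept parameters mean for data volume, the sketch below counts spherical harmonics coefficients per degree and estimates the FP32 attribute footprint. A degree-d expansion has (d+1)² coefficients per color channel. The per-Gaussian layout used here (3 position, 3 scale, 4 rotation, 1 opacity, plus RGB SH coefficients) is an assumption based on the common 3DGS attribute set; the summary does not state how the player stores attributes or whether unused coefficients are dropped at lower degrees.

```python
def sh_coeffs_per_channel(degree):
    """Number of SH basis functions up to and including `degree`."""
    return (degree + 1) ** 2

def floats_per_gaussian(sh_degree):
    # Assumed layout: position(3) + scale(3) + rotation quaternion(4) + opacity(1)
    geometry = 3 + 3 + 4 + 1
    return geometry + 3 * sh_coeffs_per_channel(sh_degree)  # 3 color channels

def fp32_megabytes(num_gaussians, sh_degree):
    """Estimated attribute storage in MB (1 MB = 1e6 bytes, 4 bytes per float)."""
    return num_gaussians * floats_per_gaussian(sh_degree) * 4 / 1e6

for degree in range(4):
    print(degree, sh_coeffs_per_channel(degree),
          round(fp32_megabytes(485_436, degree), 1))
```

Under these assumptions, the full 485,436-point model needs roughly 115 MB of FP32 attributes at degree 3 and about 27 MB at degree 0.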

4. Experimental Results (Section 12.4.3)

Test Configuration

  • Device: Google Pixel 9a (Tensor G4, mid-range, March 2025)
  • Application: Tencent 3DGS mobile player
  • Build: Release mode with optimizations
  • Test duration: 30 seconds per configuration for thermal stability
  • Model: bicycle.ply (485,436 points)
  • Power measurement: Android Battery Manager API

Impact of Number of Points (SH degree=3)

Key findings from Table 1 and Figure 2:

  • 5,000 points: 355 FPS, 24% CPU, 6% GPU, 1.45W
  • 150,000 points: 56 FPS, 47% CPU, 88% GPU, 1.47W (approaching GPU saturation)
  • 200,000 points: 45 FPS, 48% CPU, 99% GPU, 1.33W (GPU saturated)
  • 485,436 points: 19 FPS, 55% CPU, 100% GPU, 1.22W

Conclusion: The GPU approaches saturation at ~150k points (87% load) and is fully saturated at 200k points. Beyond saturation, frame rate decreases linearly with point count.
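
The saturation behavior can be checked with simple arithmetic on the figures quoted above. The sketch below derives frame time and rendered Gaussians per second from each (points, FPS) pair; the helper names are illustrative, and the inputs are only the four rows listed in this summary, not the full Table 1.

```python
# (points, fps) pairs quoted above for SH degree 3
samples = [(5_000, 355), (150_000, 56), (200_000, 45), (485_436, 19)]

def frame_time_ms(fps):
    """Average time per frame in milliseconds."""
    return 1000.0 / fps

def throughput_mgauss_per_s(points, fps):
    """Millions of Gaussians rendered per second."""
    return points * fps / 1e6

for points, fps in samples:
    print(f"{points:>7} pts  {frame_time_ms(fps):6.1f} ms/frame  "
          f"{throughput_mgauss_per_s(points, fps):5.2f} M Gaussians/s")
```

On these four rows, throughput rises to about 8.4 million Gaussians per second at 150k points and then levels off near 9 million at both 200k and 485k points. That plateau is what a saturated GPU would be expected to produce: once it is reached, frame time grows roughly in proportion to the point count.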

Impact of Spherical Harmonics Degree (485k points)

Key findings from Table 2 and Figure 3:

  • SH Degree 0: 20.41 FPS, 55% CPU, 100% GPU, 1.45W
  • SH Degree 3: 18.05 FPS, 55% CPU, 100% GPU, 0.99W
  • Performance impact: ~11.6% FPS reduction from degree 0 to 3 (20.41 → 18.05 FPS)

Conclusion: Moderate frame rate impact when increasing SH degree from 0 to 3.

5. Overall Analysis (Section 12.4.2.3)

Key conclusions:

  • Real-time rendering of complex 3DGS scenes is feasible on current-generation mobile hardware
  • Scene complexity management required (< 200k visible points recommended)
  • Performance variations observed between identical experiments due to:
      • Background processes
      • Dynamic power management
  • Results should be considered as trends rather than fixed values

Editor's note: Additional benchmarks planned to evaluate impact of other improvements (memory optimization, quantization, sorting algorithms, etc.)

Rationale for Change

  • Provides concrete data to validate real-time 3DGS feasibility on mobile hardware
  • Identifies performance bottlenecks (CPU sorting, memory transfer, GPU rasterization, power consumption)
  • Supports study objectives for reference implementations and performance characteristics
  • Guides future specification work with empirical evidence

Document Information

  • Source: Tencent Cloud
  • Title: [FS_3DGS_MED] Pseudo-CR on 3DGS renderer and performance benchmarking
  • Agenda item: 9.6
  • Agenda item description: FS_3DGS_MED (Study on 3D Gaussian splats)
  • Doc type: pCR
  • For action: Agreement
  • Release: Rel-20
  • Specification: 26.958
  • Version: 0.1.1
  • Related WIs: FS_3DGS_MED
  • Contact: Julien Ricard
  • Contact ID: 109076
  • Uploaded: 2026-02-03T21:41:18.937000
  • Reservation date: 03/02/2026 15:13:57
  • Revised to: S4-260385
  • TDoc Status: revised
  • Agenda item sort order: 41