S4-260168 Metadata - 3GPP Contribution Reviewer

Document Information

TDoc	S4-260168
Title	[FS_3DGS_MED] Pseudo-CR on 3DGS renderer and performance benchmarking
Source	Tencent Cloud
Agenda item	9.6
Agenda item description	FS_3DGS_MED (Study on 3D Gaussian splats)
Doc type	pCR
For action	Agreement
Release	Rel-20
Specification	26.958
Version	0.1.1
Related WIs	FS_3DGS_MED
download_url	https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_135_India/Docs/S4-260168.zip
Contact	Julien Ricard
Uploaded	2026-02-03T21:41:18.937000
Contact ID	109076
Revised to	S4-260385
TDoc Status	revised
Reservation date	03/02/2026 15:13:57
Agenda item sort order	41

Comments

Previous Comments:

manager

2026-02-09 04:38:15

[Technical] The proposal mandates/assumes per-frame full back-to-front sorting “for proper alpha blending” without discussing established alternatives (e.g., approximate OIT, depth peeling limits, per-tile sorting, or k-buffer) or the correctness impact; TR text should not imply a single required method if multiple are viable for 3DGS.
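For reference, the per-frame full sort that this comment refers to can be sketched as follows. This is a minimal illustration and not the contribution's implementation: the function names, the 16-bit key width, and the linear depth quantization are all assumptions made for the example. It does not cover the alternatives listed above (approximate OIT, per-tile sorting, k-buffer).

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def view_depths(centers: Sequence[Vec3], cam_pos: Vec3, cam_forward: Vec3) -> List[float]:
    """Depth of each splat center along the camera's forward axis."""
    cx, cy, cz = cam_pos
    fx, fy, fz = cam_forward
    return [(x - cx) * fx + (y - cy) * fy + (z - cz) * fz for x, y, z in centers]


def radix_sort_back_to_front(depths: Sequence[float], key_bits: int = 16) -> List[int]:
    """Return splat indices ordered farthest-first (hypothetical helper).

    Depths are quantized to integer keys, then sorted with a stable
    least-significant-digit radix sort, one byte per pass.
    """
    if not depths:
        return []
    lo, hi = min(depths), max(depths)
    span = (hi - lo) or 1.0  # avoid division by zero when all depths are equal
    max_key = (1 << key_bits) - 1
    # Farthest splat maps to key 0, nearest to max_key, so an ascending
    # sort on the keys yields back-to-front order.
    keys = [int((hi - d) / span * max_key) for d in depths]
    order = list(range(len(depths)))
    for shift in range(0, key_bits, 8):
        buckets: List[List[int]] = [[] for _ in range(256)]
        for i in order:
            buckets[(keys[i] >> shift) & 0xFF].append(i)
        order = [i for bucket in buckets for i in bucket]
    return order
```

Because the keys are quantized, splats whose depths fall into the same bucket keep their input order, so the result is only approximately sorted at fine depth differences. That is one concrete place where the "correctness impact" raised in the comment would need to be quantified.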

[Technical] Claiming “CPU-based Radix Sort preferred over GPU Compute Shaders on mobile for thermal balance and driver compatibility” is not substantiated with comparative data or conditions (SoC/GPU family, driver versions, dataset sizes), and risks being misleading guidance in a TR.

[Technical] The renderer description mixes “tile-based rasterizer inspired by original 3DGS” with OpenGL ES vertex/fragment pipeline but does not specify where tile binning occurs (CPU vs GPU), how tiles map to screen space, or how overdraw/blending is handled; as written it is incomplete and hard to reproduce.

[Technical] Stating “Gaussian attributes loaded into VRAM at startup” and “only sorted indices transferred CPU→GPU per frame” omits the memory footprint and bandwidth feasibility on mobile (e.g., FP32 covariance/color/SH for 485k points), and does not address what happens under memory pressure or when models exceed GPU memory.
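The footprint question can be made concrete with a back-of-the-envelope calculation. The sketch below assumes the per-splat layout of the original 3DGS reference format (3 position, 3 scale, 4 rotation, 1 opacity, and 3 × (degree + 1)² SH values); the contribution's actual layout is not stated, so the layout, the 32-bit index size, and the 60 fps figure used in the example are all assumptions.

```python
def floats_per_splat(sh_degree: int) -> int:
    """Values stored per splat under the assumed layout."""
    sh_coeffs = 3 * (sh_degree + 1) ** 2  # RGB coefficients for all SH bands
    return 3 + 3 + 4 + 1 + sh_coeffs      # position, scale, rotation, opacity, SH


def attribute_bytes(num_splats: int, sh_degree: int, bytes_per_value: int = 4) -> int:
    """Resident attribute size; bytes_per_value=4 is FP32, 2 would be FP16."""
    return num_splats * floats_per_splat(sh_degree) * bytes_per_value


def index_upload_bytes_per_second(num_splats: int, fps: int, bytes_per_index: int = 4) -> int:
    """CPU-to-GPU traffic if one sorted index per splat is uploaded every frame."""
    return num_splats * bytes_per_index * fps


if __name__ == "__main__":
    n = 485_000  # point count quoted in the comment
    print(attribute_bytes(n, sh_degree=3) / 1e6, "MB resident at FP32")
    print(index_upload_bytes_per_second(n, fps=60) / 1e6, "MB/s of index uploads at 60 fps")
```

Under these assumptions, 485k points at SH degree 3 occupy roughly 114 MB at FP32, and the sorted-index upload alone is roughly 116 MB/s at 60 fps.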

[Technical] Use of “FP32 textures/buffers for precision” is presented as a design choice but the TR should discuss precision/performance trade-offs (FP16/packed formats/quantization) since FP32 is often a major bottleneck on mobile and may contradict the study’s “mobile feasibility” narrative.
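One way to ground the precision side of this trade-off is to measure the round-trip error of storing attribute values as IEEE 754 half precision. The sketch below uses only the Python standard library (`struct` format code `e`); the helper names are hypothetical, and which attributes tolerate this error is a question for the study itself, not something the example settles.

```python
import struct
from typing import Iterable


def to_fp16_and_back(value: float) -> float:
    """Round a value to IEEE 754 binary16 and return it as a Python float.

    struct.pack raises OverflowError for magnitudes beyond the FP16 range
    (about 65504), which is itself relevant for unnormalized positions.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


def max_relative_error(values: Iterable[float]) -> float:
    """Worst-case relative error introduced by the FP16 round trip."""
    worst = 0.0
    for v in values:
        if v != 0.0:
            worst = max(worst, abs(to_fp16_and_back(v) - v) / abs(v))
    return worst
```

For values in the normal FP16 range the relative error is bounded by 2⁻¹¹, which halves the storage of every attribute compared with FP32. Whether that is acceptable for covariance or position data is exactly what the requested trade-off discussion would need to show.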

[Technical] Benchmark methodology is under-specified: resolution, FOV, render target format, MSAA, vsync, fixed camera path, and whether AR pose updates are disabled are not clearly defined, making the FPS/power numbers non-comparable across implementations.
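As an illustration of what a fully specified setup could look like, the parameters named in this comment can be captured in a single record that is published with every result. The class and field names below are hypothetical, not taken from the contribution or from TR 26.958.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class BenchmarkConfig:
    """Parameters the comment asks to be stated; None means 'not reported'."""
    width_px: Optional[int] = None
    height_px: Optional[int] = None
    vertical_fov_deg: Optional[float] = None
    render_target_format: Optional[str] = None
    msaa_samples: Optional[int] = None
    vsync_enabled: Optional[bool] = None
    camera_path_id: Optional[str] = None
    ar_pose_updates_enabled: Optional[bool] = None


def missing_fields(cfg: BenchmarkConfig) -> List[str]:
    """Names of parameters that were left unspecified."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name) is None]
```

A result would only be treated as comparable across implementations when `missing_fields` returns an empty list for its configuration.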

[Technical] “Thermal management API usage for consistent clock speeds” is vague and potentially incorrect for Android devices (many controls are advisory and OEM-specific); the TR should specify exact APIs, permissions, and how effectiveness was verified, otherwise results may not be reproducible.

[Technical] Power measurement via “Android Battery Manager API” is not sufficiently accurate/consistent across devices and often reports averaged or estimated values; the TR should either qualify the limitations strongly or recommend external power measurement for normative comparisons.
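To show how such readings are typically turned into a power figure, and why their spread should be reported, here is a minimal sketch. It assumes each sample is a pair of battery current in microamperes and battery voltage in millivolts, as Android's battery interfaces commonly report them; units and sign conventions vary by device, and the function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Iterable, Tuple


def power_watts(current_ua: int, voltage_mv: int) -> float:
    """Instantaneous power from one sample (assumed units: uA and mV).

    The absolute value is taken because the sign convention for
    discharge current differs between devices.
    """
    return abs(current_ua) * 1e-6 * voltage_mv * 1e-3


def summarize_power(samples: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Mean and population standard deviation of power over a run, in watts."""
    watts = [power_watts(i, v) for i, v in samples]
    return mean(watts), pstdev(watts)
```

Reporting the standard deviation alongside the mean makes it visible when readings are coarse or averaged by the platform. It does not replace external power measurement.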

[Technical] The conclusion “GPU saturation occurs at ~150k points (87% load)” relies on “GPU %” metrics that are not defined (source tool, sampling interval, what 100% means); without a standardized measurement method, the saturation point is not defensible.

[Technical] The reported behavior “Beyond saturation, frame rate decreases linearly with point count” is an overreach given only a few data points and no confidence intervals; the TR should present it as an observed trend for this setup, not a general property.

[Technical] SH degree impact results show lower power at higher SH degree (e.g., 1.45W at degree 0 vs 0.99W at degree 3) while GPU remains “100%”; this is internally inconsistent and suggests measurement artifacts or uncontrolled variables that must be explained before drawing conclusions.

[Technical] Disabling “AR runtime” for benchmarking may invalidate the stated mobile player architecture use case (AR camera tracking + rendering); the TR should either benchmark both modes or clearly separate “renderer-only” performance from “end-to-end AR” performance.

[Editorial] The contribution is described as a “Pseudo-CR” against TR 26.958 v0.1.1 but does not provide actual CR-style change markup, exact clause text, or proposed insertions/deletions; reviewers cannot verify consistency with the existing wording of clause 12.4.x.

[Editorial] Several statements read like requirements (“critical,” “preferred,” “recommended <200k visible points”) but TRs should keep such guidance clearly non-normative and scoped (device class, resolution, quality targets), otherwise it may be misinterpreted as specification direction.

