[FS_3DGS_MED] Pseudo-CR on Enhanced Scenario for Avatar Communication Use Case
Source: Pengcheng Laboratory, China Mobile Com. Corporation
Meeting: TSGS4_135_India
Agenda Item: 9.6
| Field | Value |
|---|---|
| Agenda item description | FS_3DGS_MED (Study on 3D Gaussian splats) |
| Doc type | pCR |
| For action | Agreement |
| Release | Rel-20 |
| Specification | 26.958 |
| Version | 0.1.1 |
| Related WIs | FS_3DGS_MED |
| Contact | Chaofan He |
| Uploaded | 2026-02-03T12:25:40.917000 |
| Contact ID | 107635 |
| TDoc Status | noted |
| Reservation date | 03/02/2026 12:20:56 |
| Agenda item sort order | 41 |
Review Comments
[Technical] The proposal introduces a “static 3DGS representation” that “follows mesh deformation” at the receiver, but it does not specify a deformation model for Gaussians (e.g., per-Gaussian skinning weights, attachment to mesh surface, or a learned deformation field), making interoperability and feasibility unclear.
[Technical] “Spatial alignment” between the deformable mesh and 3DGS is asserted without defining the coordinate frames, calibration requirements, and how alignment is maintained under pose/expression changes; this is a core missing element for a normative scenario description.
[Technical] The transmission strategy lacks a concrete definition of what constitutes the “base avatar” payload versus “animation parameters” (parameter sets, units, ranges, timing model), so the claimed bandwidth/latency benefits cannot be evaluated or compared to other TR 26.958 scenarios.
[Technical] The document assumes SMPL-X/FLAME parameter extraction in real time but does not address model licensing/IP, standardization suitability, or whether the scenario is intended to be model-agnostic; referencing specific proprietary/de facto models may conflict with 3GPP’s technology-neutral TR positioning.
[Technical] “3DGS updated at lower frequency than animation parameters” is underspecified: no triggers (appearance change, lighting change, topology change), update granularity (full set vs patches), or drift/consistency handling are described, which is critical for interactive bidirectional use.
[Technical] The receiver rendering is described as “composite” (mesh shading + 3DGS appearance) but no compositing rules are given (occlusion, depth ordering, alpha blending, shadowing), risking ambiguous visual results and undermining the scenario’s reproducibility.
[Technical] “Viewpoint adaptation supported within application-defined constraints” is too vague for a TR scenario; it should at least state whether free-viewpoint is expected, what baseline view range is assumed, and how artifacts are handled when extrapolating beyond capture coverage.
[Technical] The capture assumptions (“one or more cameras”) omit key constraints that drive feasibility (mono vs multi-view, depth availability, required resolution/frame rate, lighting), which are necessary to justify real-time parameter extraction and 3DGS generation.
[Technical] The proposal does not discuss error resilience and synchronization between the low-latency animation stream and the lower-rate 3DGS updates (e.g., timestamping, buffering, late/early update handling), which is essential for interactive communication scenarios.
[Technical] There is no discussion of how identity personalization is handled (e.g., per-user mesh/3DGS creation, enrollment time, update cadence), yet “base avatar transmitted once” implies a prior creation pipeline that should be captured in the scenario.
[Editorial] As a “Pseudo-CR,” the contribution summary does not indicate the exact TR 26.958 clause(s) to be updated, nor does it provide proposed text; without clause-level changes, SA4 cannot efficiently assess consistency with existing scenarios and terminology.
[Editorial] Several terms are introduced without definition or alignment to TR terminology (e.g., “deformation propagation,” “appearance contributions,” “application-defined constraints”), which should be tightened to avoid multiple interpretations across implementers.
[Editorial] The summary claims “efficient bandwidth utilization” but provides no qualitative comparison point (e.g., versus full 3DGS streaming or mesh+texture video), making the motivation read as aspirational rather than supported by scenario requirements.
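[Illustration] To make concrete one of the deformation options named in the comment on the missing deformation model (per-Gaussian skinning weights), the following is a minimal sketch of linear blend skinning applied to a single Gaussian center. It is not taken from the contribution or from TR 26.958. The names `apply_transform` and `skin_gaussian_center`, the bone representation (a 3x3 rotation plus a translation), and the sample values are assumptions chosen for illustration only. A complete scheme would also have to transform each Gaussian's covariance (orientation and scale), which this sketch leaves out.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
# Hypothetical bone transform: a row-major 3x3 rotation plus a translation.
Transform = Tuple[Sequence[Sequence[float]], Vec3]


def apply_transform(t: Transform, p: Vec3) -> Vec3:
    """Apply a rotation-plus-translation transform to a 3D point."""
    rot, trans = t
    return tuple(
        sum(rot[r][c] * p[c] for c in range(3)) + trans[r] for r in range(3)
    )


def skin_gaussian_center(
    center: Vec3, weights: Sequence[float], bones: Sequence[Transform]
) -> Vec3:
    """Linear blend skinning: weighted sum of the center moved by each bone."""
    if len(weights) != len(bones):
        raise ValueError("one weight is required per bone")
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("skinning weights must sum to 1")
    out = [0.0, 0.0, 0.0]
    for w, bone in zip(weights, bones):
        moved = apply_transform(bone, center)
        for i in range(3):
            out[i] += w * moved[i]
    return tuple(out)


# Illustrative bones (assumed values): identity, and a 90 degree rotation about z.
IDENTITY: Transform = (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0, 0, 0))
ROT_Z_90: Transform = (((0, -1, 0), (1, 0, 0), (0, 0, 1)), (0, 0, 0))

# A Gaussian centered at (1, 0, 0), influenced equally by both bones.
deformed = skin_gaussian_center((1.0, 0.0, 0.0), [0.5, 0.5], [IDENTITY, ROT_Z_90])
print(deformed)
```

With equal weights the deformed center is the midpoint between the unmoved position (1, 0, 0) and the rotated position (0, 1, 0). Under such a scheme, the per-Gaussian weights would belong to the one-time base avatar payload while the bone transforms would be carried in the animation parameters, which is the kind of split the comment on payload definition asks the proposal to make explicit.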