S4-260120

[FS_Avatar_Ph2_MED] 3D Gaussian Splatting Avatar Methods for Real-Time Communication

Source: Qualcomm Atheros, Inc.
Meeting: TSGS4_135_India
Agenda Item: 9.8

All Metadata
Agenda item description: FS_Avatar_Ph2_MED (Study on Avatar communication Phase 2)
Doc type: discussion
Contact: Imed Bouazizi (Contact ID 84417)
Uploaded: 2026-02-03T21:49:01
Reservation date: 2026-02-03 05:29:47
Revised to: S4-260353
TDoc Status: revised
Agenda item sort order: 43
Review Comments
manager - 2026-02-09 04:55

  1. [Technical] The claim “No changes to animation stream required” (Proposed Architecture Step 2) is not substantiated for all cited methods: mesh-embedded Gaussians may need additional per-Gaussian binding metadata (triangle ID, barycentric coords, local frame/covariance transport rules) and potentially additional animation parameters for non-mesh components (hair/teeth/tongue/eyes), which are not clearly covered by the existing ARF Animation Stream Format.
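
To make the gap concrete, a minimal sketch (field names are illustrative, not taken from ARF or the cited papers) of the per-Gaussian binding record that mesh-embedded methods imply, beyond what the existing Animation Stream Format carries:

```python
from dataclasses import dataclass

# Hypothetical per-Gaussian binding record; all field names are
# illustrative, not defined by ARF or any cited method.
@dataclass
class GaussianBinding:
    triangle_id: int    # index into the base mesh's triangle list
    barycentric: tuple  # (u, v, w) coordinates within that triangle
    local_rotation: tuple  # quaternion: splat frame relative to the
                           # triangle's tangent frame
    local_scale: tuple     # anisotropic scale in that local frame

def bind_position(vertices, binding):
    """Recover a splat's animated position from its mesh binding:
    the barycentric combination of the deformed triangle's vertices."""
    u, v, w = binding.barycentric
    a, b, c = vertices  # the three deformed vertices, (x, y, z) each
    return tuple(u * a[i] + v * b[i] + w * c[i] for i in range(3))
```

None of these fields exist in the current animation stream, and non-mesh components (hair, eyes) would additionally need their own drivers.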

  2. [Technical] Backward compatibility via “store mesh-embedded Gaussians as auxiliary data within glTF/ARF containers” (Step 1) is underspecified: ARF/glTF needs a normative extension mechanism (schema, MIME/box, or glTF extension) defining attribute semantics, coordinate frames, units, and default behaviors; otherwise different decoders will interpret the same auxiliary data differently.
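
The kind of normative detail such an extension would have to pin down can be sketched as follows; the extension name `ARF_gaussian_splats` and all fields are hypothetical, not a registered glTF extension:

```python
import json

# Hypothetical glTF node-extension payload. Everything a conforming
# decoder needs is spelled out; today none of this is specified.
node_extension = {
    "ARF_gaussian_splats": {
        "buffer": 3,                      # buffer view holding splat data
        "count": 100000,                  # number of Gaussians
        "coordinateFrame": "mesh-local",  # must be normatively defined
        "units": "meters",
        "attributes": {                   # layout and semantics per attribute
            "POSITION": "VEC3_FLOAT",
            "ROTATION": "VEC4_FLOAT",     # quaternion convention must be fixed
            "SCALE": "VEC3_FLOAT",
            "OPACITY": "SCALAR_FLOAT",
            "SH_COEFFS": "FLOAT[48]"      # SH degree and channel order too
        },
        "fallbackBehavior": "ignore"      # what mesh-only decoders must do
    }
}

print(json.dumps(node_extension, indent=2))
```

Without a schema of this kind, two decoders reading the same auxiliary data can legitimately disagree on frames, units, and defaults.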

  3. [Technical] Determinism is overstated: “Explicit methods naturally deterministic given fixed floating-point rules” ignores that GPU raster/compute pipelines, floating-point contraction, sorting ties in depth-ordered alpha compositing, and parallel reduction order can yield non-bit-exact results across vendors; conformance would need explicit ordering rules and error tolerances, not just “fixed floating-point rules.”
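
The non-associativity underlying this point is demonstrable in three lines: the same per-fragment contributions, accumulated in two groupings (as a serial loop vs. a parallel reduction might), disagree even under fully IEEE-754-conformant arithmetic.

```python
# Accumulation order alone changes the result; "fixed floating-point
# rules" do not give bit-exactness across reduction schedules.
contributions = [1e16, -1e16, 1.0]

left_to_right = (contributions[0] + contributions[1]) + contributions[2]
regrouped = contributions[0] + (contributions[1] + contributions[2])

print(left_to_right)  # 1.0
print(regrouped)      # 0.0 -- the 1.0 is absorbed by -1e16 and lost
```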

  4. [Technical] The proposed “depth-ordered alpha compositing” rendering model is central to 3DGS but no interoperability-critical details are given (sorting key definition, handling of equal depths, tile-based sorting, prefiltering, blending equation, color space), making it hard to assess whether ARF can standardize a decoder-independent rendering outcome.
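
A minimal sketch of the front-to-back "over" operator shows why tie handling alone is interoperability-critical (scalar colors for brevity; all values illustrative):

```python
# Front-to-back "over" compositing of depth-sorted splats. For equal
# depths the outcome depends entirely on the sort's tie-break -- exactly
# the detail an interoperable rendering model would have to specify.
def composite(splats):
    """splats: list of (depth, color, alpha); returns composited color."""
    color, transmittance = 0.0, 1.0
    for _, c, a in sorted(splats, key=lambda s: s[0]):  # tie order unspecified
        color += transmittance * a * c
        transmittance *= (1.0 - a)
    return color

# Two splats at identical depth: swapping their input order (which a
# non-stable or parallel GPU sort may do) changes the pixel.
front_first = [(1.0, 1.0, 0.5), (1.0, 0.0, 0.5)]
back_first = list(reversed(front_first))
print(composite(front_first), composite(back_first))  # 0.5 vs. 0.25
```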

  5. [Technical] The document asserts “direct ARF compatibility” for GaussianBlendshape/SplattingAvatar, but does not map their control parameters to specific ARF constructs (e.g., which blendshape set, naming/ID mapping, ranges, neutral definition, coordinate conventions), risking a mismatch between research model parameters (FLAME/SMPL-X) and ARF-defined animation semantics.
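
What "mapping control parameters to ARF constructs" would entail can be sketched as a table; every identifier and range below is invented for illustration, and defining such a table normatively is precisely what the contribution leaves undone:

```python
# Hypothetical mapping from research-model parameters (FLAME/SMPL-X
# style) to ARF animation constructs; all names and ranges illustrative.
flame_to_arf = {
    # research parameter    (ARF construct ID,    source value range)
    "FLAME:jaw_open":       ("arf.face.jawOpen",  (0.0, 1.0)),
    "FLAME:expr_001":       ("arf.face.browUp",   (-3.0, 3.0)),
    "SMPLX:left_elbow":     ("arf.body.elbow_l",  None),  # joint, not blendshape
}

def remap(value, src_range):
    """Normalize a research-model parameter into an assumed ARF [0, 1]
    blendshape range; joints (src_range=None) pass through unchanged."""
    if src_range is None:
        return value
    lo, hi = src_range
    return (value - lo) / (hi - lo)
```

Without this mapping plus a normatively defined neutral pose and coordinate convention, "direct ARF compatibility" cannot be verified.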

  6. [Technical] The “40 KB/s for real-time animation” streaming figure (Step 4) is presented without assumptions (number of joints, blendshape count, sampling rate, quantization, overhead, RTP/transport framing), and may be misleading given typical face blendshape streams can exceed this depending on rate and precision.
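
A back-of-envelope check illustrates the sensitivity; every parameter below is an assumption chosen for illustration, not a figure from the contribution:

```python
# Assumed animation payload: 52 face blendshape weights plus 25 body
# joints as quaternions (4 values each), float32, 60 Hz, no transport
# overhead. All of these choices materially move the resulting rate.
blendshapes = 52
joint_values = 25 * 4
bytes_per_value = 4
rate_hz = 60

payload_kbps = (blendshapes + joint_values) * bytes_per_value * rate_hz / 1000
print(f"{payload_kbps:.2f} KB/s")  # 36.48 KB/s before RTP/framing overhead
```

At 90 Hz the same stream exceeds 54 KB/s; with 16-bit quantization at 30 Hz it drops below 10 KB/s. The "40 KB/s" figure is meaningful only once such assumptions are stated.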

  7. [Technical] Compression proposals (SPZ, L-GSC, HAC++, Compact3D) are listed without clarifying whether they are geometry-only, attribute-aware (SH coefficients, opacity), support random access/partial decode, or preserve required precision for stable splat rendering; Objective 7 evaluation needs criteria tied to ARF use cases (latency, progressive LOD, error metrics).
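
The attribute-awareness point has quantitative teeth: under a commonly used (here assumed) 3DGS attribute layout, spherical-harmonic color coefficients dominate the uncompressed payload, so a geometry-only codec leaves most of the bytes untouched:

```python
# Assumed per-splat layout: position (3) + scale (3) + rotation
# quaternion (4) + opacity (1) + degree-3 SH color (16 x 3 = 48)
# = 59 float32 values; SH alone is 48/59 of the payload.
floats_per_splat = 3 + 3 + 4 + 1 + 48
bytes_per_splat = floats_per_splat * 4          # float32
raw_mb = 100_000 * bytes_per_splat / 1_000_000  # a 100k-splat avatar

print(floats_per_splat, bytes_per_splat, f"{raw_mb:.1f} MB")
```

Evaluation criteria should therefore separate geometry compression from attribute compression, and state random-access and precision requirements per attribute.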

  8. [Technical] The “graceful fallback: mesh-only renderers can ignore Gaussian extension and still animate” is only valid if the base avatar always includes a complete mesh representation; several 3DGS approaches are not mesh-complete (e.g., hair volumes), so the fallback behavior and minimum mesh requirements should be stated.

  9. [Technical] Non-rigid elements (hair/clothing/accessories) are acknowledged as a challenge, but the proposed ARF integration does not define how “secondary Gaussians” are driven (extra bones, physics, per-frame deltas, or optional streams), which is likely the dominant interoperability gap for full-body avatars.

  10. [Technical] The classification “Hybrid methods… can still be driven by blendshape parameters with MLP weights distributed as part of base avatar” glosses over runtime dependencies: even small MLPs require a standardized inference graph, activation functions, quantization, and tensor layouts; without aligning to an existing standardized neural model format/profile, “portable” decoding is not ensured.
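
The runtime-dependency point is easy to demonstrate: with an under-specified activation function, the same distributed weights decode to different outputs. A toy sketch (shapes and values illustrative):

```python
# A one-layer "MLP": identical weights, two plausible activation
# choices, different outputs. A decoder guessing the unspecified
# activation reconstructs a different avatar.
def mlp(x, w, activation):
    h = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    return [activation(v) for v in h]

x = [0.5, -1.0]
w = [[1.0, 0.5], [-0.5, 1.0]]

relu = lambda v: max(0.0, v)
leaky = lambda v: v if v > 0 else 0.01 * v

print(mlp(x, w, relu))   # [0.0, 0.0]
print(mlp(x, w, leaky))  # second element is negative, not clamped
```

Tensor layout, quantization, and operator semantics raise the same issue at every layer, which is why a standardized neural model profile is a prerequisite for "portable" decoding.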

  11. [Editorial] Several performance/quality numbers (FPS, PSNR, training time, storage like “~3.5 MB”) are presented as facts but lack citations, test conditions, and hardware baselines; SA4 contributions typically need references or at least a consistent evaluation setup to avoid cherry-picked comparisons.

  12. [Editorial] Terminology is inconsistent and sometimes ambiguous (e.g., “Gaussian Head Avatar” vs “GaussianHead”; “3DGS-Avatar” appears once without definition; “AAUs” is used without expansion in this document), which will hinder readers trying to relate items to known papers/spec terms.

  13. [Editorial] The document repeatedly states “ARF compatibility” but does not reference specific clauses of ISO/IEC 23090-39 or the corresponding 3GPP study text (FS_Avatar_Ph2_MED Objective 3/7) where gaps exist; adding explicit clause-level mapping would make the contribution actionable for SA4.
