TDoc: S4-260177

Meeting: TSGS4_135_India | Agenda Item: 9.8

Document Information
Title: [FS_Avatar_Ph2_MED] Interoperability guidance for ARF
Source: Qualcomm Atheros, Inc.
Type: discussion

3GPP Document
Agenda item 9.8
Agenda item description FS_Avatar_Ph2_MED (Study on Avatar communication Phase 2)
Doc type discussion
download_url https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_135_India/Docs/S4-260177.zip
Contact Imed Bouazizi
Uploaded 2026-02-03T21:49:01.107000
Contact ID 84417
TDoc Status agreed
Reservation date 03/02/2026 15:42:46
Agenda item sort order 43
Comments
Previous Comments:
manager
2026-02-11 07:04:44


  1. [Technical] The proposal asserts that the “ARF document is the normative description for interpreting an animation stream” and then proposes adding normative “shall” text to TS 26.264, but it does not clarify the normative split between TS 26.264 and ISO/IEC 23090-39 (ARF). Without explicit referencing and wording, this risks contradicting 3GPP’s usual approach, in which TS 26.264 normatively references external specifications rather than redefining their behavior.




  2. [Technical] The receiver procedure relies on “preamble.supportedAnimations” and “SupportedAnimations list indices.” TS 26.264 clause 5.6.1 and the referenced ARF structures should be checked to confirm that index-based addressing is defined and stable; if ARF uses IDs or URNs rather than positional indices, this guidance could lead to non-interoperable implementations.




  3. [Technical] The statement “Mapping indices refer to parameter identifiers in the animation stream (ShapeKey.id … target joint index … target landmark index)” mixes identifier spaces (IDs vs. indices) across animation types. The guidance should explicitly define, per animation profile, what the parameter identifier is (string ID, ordinal, or semantic name) and how it is carried in the bitstream; otherwise mapping tables cannot be applied deterministically.




  4. [Technical] The proposal introduces LinearAssociation/NonLinearAssociation behavior (weighted sums, LUTs, interpolation modes, combination modes, clamping) but does not state the exact normative computation rules: evaluation ordering, clamping ranges, handling of NaN and out-of-range values, coordinate units for landmarks, and joint rotation representation. These rules are essential for interoperability if the TS/TR text is to be actionable.




  5. [Technical] Defaulting rules (“0.0 for blendshape weights, bind pose for joints, neutral position for landmarks”) are underspecified and may be wrong for some profiles (e.g., joint animation may require hold-last-sample, or bind pose may cause visible popping); if included in TS 26.264, it should align with existing decoder behavior expectations for missing parameters.




  6. [Technical] The “Subset” scenario says “Unmapped target parameters default to neutral values,” but the more common interoperability issue is the inverse (stream has parameters not present in target); guidance should also specify receiver behavior for unknown/extra incoming parameters (ignore vs error) to avoid divergent implementations.




  7. [Technical] The NonLinearAssociation examples reference INTERPOLATION_CUBICSPLINE and a pow(input,0.5) approximation, but it is unclear whether ARF actually defines these interpolation modes and LUT semantics for all animation types; if they are not already defined in ISO/IEC 23090-39, this guidance risks inventing capabilities not supported by the container.




  8. [Technical] The “Blink” example combines left/right with COMBINATION_SUM and then applies “clamp to [0,1]”. Clamping is not stated as part of the mapping object semantics, and different rigs may expect additive behavior above 1.0; this needs explicit alignment with the ARF-defined value domain of each parameter.




  9. [Technical] The “MouthOpenSmile” example depends on “Smile (12 after linear mapping)”, implying chained mappings (linear, then non-linear). The proposal does not specify whether mapping stages can be composed, how dependencies are resolved, or whether cycles are allowed; this is critical for deterministic receiver processing.




  10. [Technical] Landmark mapping is treated similarly to scalar blendshape weights, but landmarks are vectors (2D/3D) and may require per-component mapping, coordinate space definition, and temporal filtering; the current text (“apply LUT … before writing 2D or 3D coordinate”) is too vague to ensure interoperable landmark animation.




  11. [Technical] The proposal places “Sender responsibility” on the “avatar owner,” but TS 26.264 needs precise role terminology (sender, encoder, content provider, ARF author) and must consider cases where the stream sender is not the ARF author (e.g., third-party capture streaming to a known avatar).




  12. [Editorial] The contribution proposes “Document the content of sections 2 and 3 in TR 26.813” but does not provide exact draft text, clause numbers, or change-marked edits; as written it is not directly actionable for rapporteurs and makes it hard to assess consistency with existing TR structure.




  13. [Editorial] “Remove the corresponding note from TS 26.264 and declare it as resolved” is ambiguous because the exact note text is not quoted; the contribution should cite the precise existing note in clause 5.6.1 and propose replacement wording to avoid accidental removal of related guidance.




  14. [Editorial] Several terms are used without definition or consistent capitalization (e.g., “SupportedAnimations” vs “supportedAnimations,” “AnimationInfo” vs “animationInfo,” “Mapping Objects”); if this is to become spec text, it should match the exact field names and terminology used in ISO/IEC 23090-39 and TS 26.264.
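
To make comments 4, 8 and 9 concrete, the following is a minimal Python sketch of the decisions a receiver would have to make when applying chained linear mappings. It is not taken from ISO/IEC 23090-39 or TS 26.264. The mapping table layout, the function names, the input names (eyeBlinkLeft, jawOpen, and so on), the 0.0 default for missing or NaN inputs, and the [0,1] clamp are all assumptions chosen for illustration, and only weighted sums are modelled (no LUTs or interpolation modes).

```python
import math
from graphlib import TopologicalSorter, CycleError

# Hypothetical mapping table: target name -> list of (source name, weight).
# A source may itself be the target of another mapping (a chained mapping).


def evaluation_order(mappings):
    """Order targets so that every mapping runs after the mappings it depends on."""
    graph = {
        target: {src for src, _ in terms if src in mappings}
        for target, terms in mappings.items()
    }
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # Assumed policy: a cyclic set of mappings is rejected outright.
        raise ValueError(f"cyclic mapping: {err.args[1]}") from err


def apply_mappings(mappings, stream_values, lo=0.0, hi=1.0):
    """Evaluate weighted-sum mappings, then clamp each result to [lo, hi]."""
    values = dict(stream_values)
    for target in evaluation_order(mappings):
        total = 0.0
        for src, weight in mappings[target]:
            v = values.get(src, 0.0)  # assumed default for a missing input
            if math.isnan(v):
                v = 0.0  # assumed policy: NaN is treated as neutral
            total += weight * v
        values[target] = min(hi, max(lo, total))  # assumed clamp range
    return {target: values[target] for target in mappings}


if __name__ == "__main__":
    mappings = {
        "Blink": [("eyeBlinkLeft", 1.0), ("eyeBlinkRight", 1.0)],
        "Smile": [("mouthSmile", 0.5)],
        "MouthOpenSmile": [("Smile", 1.0), ("jawOpen", 0.5)],
    }
    stream = {"eyeBlinkLeft": 0.7, "eyeBlinkRight": 0.6,
              "mouthSmile": 0.8, "jawOpen": 0.4}
    print(apply_mappings(mappings, stream))
```

Each line marked as an assumed policy is a point where two receivers could legitimately diverge unless the specification fixes the rule: whether the summed Blink value is clamped, what a missing or NaN input becomes, and whether a dependency cycle is an error.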


