Avatar-update to section 6.3.4
Source: InterDigital New York
Meeting: TSGS4_135_India
Agenda Item: 9.8
| Agenda item description | FS_Avatar_Ph2_MED (Study on Avatar communication Phase 2) |
|---|---|
| Doc type | discussion |
| For action | Agreement |
| Release | Rel-20 |
| Specification | 26.813 |
| Contact | Gaelle Martin-Cocher |
| Uploaded | 2026-02-03T21:28:19.900000 |
| Contact ID | 91571 |
| Revised to | S4-260285 |
| TDoc Status | revised |
| Reservation date | 2026-02-03 21:15:23 |
| Agenda item sort order | 43 |
Review Comments
[Technical] The contribution claims a “comprehensive update to section 6.3.4” but provides no proposed spec text, clause numbering, or delta against the current TS/TR, making it impossible to assess normative impact, backward compatibility, or whether the update is editorial vs. technical.
[Technical] It states ARF is “standardized as ISO/IEC 23090-39” while also saying it has reached CDIS stage; these are inconsistent maturity claims and should be aligned (e.g., CDIS vs DIS/FDIS/IS) to avoid incorrect referencing in 3GPP specs.
[Technical] The AAU header definition (“AAU type (7-bit code)”) is underspecified: no bit/byte packing, no reserved values, no extension mechanism, and no endianness/bit order, which is critical for interoperable bitstream parsing.
[Technical] The timestamp is described as “32-bit timestamp in ticks” but wraparound behavior, epoch/reference (relative vs absolute), monotonicity, and reordering/jitter handling are not defined; these are essential for real-time delivery and ISOBMFF track mapping.
[Technical] “Timescale value (32-bit float, ticks per second)” is a poor choice for a timescale (precision/rounding/non-integer ambiguity) and conflicts with common media container practice (integer timescale); this will create interoperability issues when mapping to ISOBMFF timescales.
[Technical] Joint animation samples transmit a full 4×4 matrix as “16 floats” per joint (and optional velocity matrix), which is extremely bandwidth-heavy and not aligned with typical skeletal animation representations (TRS/quaternion + translation); if this is intended, constraints and compression/quantization profiles must be specified.
[Technical] The LBS formula uses “global transformation matrix Mⱼ” but does not clarify whether Mⱼ already includes inverse bind matrices, whether matrices are in joint local vs model space, and how skinning relates to the “inverse bind matrices” mentioned in Skeleton; ambiguity here will yield different rendered results across implementations.
[Technical] Blendshape deformation is given only for vertex positions; normals/tangents handling (recompute vs blendshape-provided deltas) is not addressed despite earlier listing normals/textures/maps as part of the model, risking inconsistent shading across clients.
[Technical] The “optional authenticationFeatures (encrypted facial and voice feature vectors with public key URI)” raises privacy/security and key management questions (trust model, rotation, revocation, algorithm identifiers) and is not tied to any 3GPP security framework; as written it is not implementable or reviewable for compliance.
[Technical] The container description (“ARF document in ISOBMFF item in top-level MetaBox” and “additional items for each component”) lacks the required ISOBMFF signaling details (item types/brands, item references, protection signaling, track vs item mapping), so “partial access” is asserted without a defined mechanism.
[Technical] The “stored format” description mentions “sample grouping” for sequences like “smile/wave/dance” but does not define how sequences are indexed, timed, or referenced (e.g., sample groups, edit lists, metadata keys), which is necessary for interoperable retrieval and playback.
[Editorial] Multiple references to “Figure 12/13/14” are included but the figures are not provided; if this is intended as a spec update, the contribution should either include the figures or remove figure-dependent statements.
[Editorial] Terminology is inconsistent and sometimes non-standard (“workgroup” vs WG, “Committee Draft International Standard (CDIS)” vs ISO stage naming, “ISOBMFF” vs “ISO Base Media File Format”), which should be normalized to avoid confusion in 3GPP text.
[Editorial] The document reads like a tutorial/overview (requirements lists, “ongoing exploration experiments”) rather than a precise update to a specific 3GPP section; it should clearly separate informative background from the exact proposed replacement/addition text for Section 6.3.4.
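The numeric concerns raised in the comments on the 32-bit timestamp, the float timescale, and the per-joint 4×4 matrices can be made concrete with a short sketch. Everything specific here is an assumption for illustration only: the 90 kHz and 1 kHz timescales, the 50-joint skeleton, the 30 Hz update rate, and the helper names are hypothetical and are not taken from the contribution or from ISO/IEC 23090-39.

```python
import struct


def wrap_seconds(timescale_hz: int, bits: int = 32) -> float:
    """Seconds until an unsigned tick counter of `bits` width wraps around."""
    return (1 << bits) / timescale_hz


def as_float32(x: float) -> float:
    """Round-trip a Python float through IEEE 754 binary32."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def joint_payload_bps(joints: int, fps: int, floats_per_joint: int = 16) -> int:
    """Uncompressed bits per second when each joint carries 32-bit floats."""
    return joints * floats_per_joint * 32 * fps


if __name__ == "__main__":
    # Wraparound: a 32-bit counter at an assumed 90 kHz timescale wraps
    # in roughly 13 hours; at an assumed 1 kHz it lasts about 49.7 days.
    print(wrap_seconds(90_000) / 3600, "hours at 90 kHz")
    print(wrap_seconds(1_000) / 86_400, "days at 1 kHz")

    # Float timescale: binary32 has a 24-bit significand, so integers above
    # 2**24 and rational rates such as 30000/1001 are not stored exactly.
    print(as_float32(16_777_217.0))        # nearest binary32 value
    print(as_float32(30000 / 1001) == 30000 / 1001)

    # Bandwidth: assumed 50 joints at 30 Hz, before any compression.
    print(joint_payload_bps(50, 30, 16))   # one 4x4 matrix per joint
    print(joint_payload_bps(50, 30, 32))   # matrix plus velocity matrix
    print(joint_payload_bps(50, 30, 7))    # quaternion + translation
```

Under these assumed parameters, a full matrix per joint costs 768 kbit/s uncompressed, and twice that with the optional velocity matrix, against 336 kbit/s for a quaternion-plus-translation form. Numbers like these are what the comments ask the contribution to justify or constrain.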