Review: 9.8

FS_Avatar_Ph2_MED (Study on Avatar communication Phase 2)

Meeting: TSGS4_135_India
[FS_Avatar_Ph2_MED] 3D Gaussian Splatting Avatar Methods for Real-Time Communication Qualcomm Atheros, Inc.
Previous Reviews:
manager
2026-02-09 04:55:00
  1. [Technical] The claim “No changes to animation stream required” (Proposed Architecture Step 2) is not substantiated for all cited methods: mesh-embedded Gaussians may need additional per-Gaussian binding metadata (triangle ID, barycentric coords, local frame/covariance transport rules) and potentially additional animation parameters for non-mesh components (hair/teeth/tongue/eyes), which are not clearly covered by the existing ARF Animation Stream Format.

  2. [Technical] Backward compatibility via “store mesh-embedded Gaussians as auxiliary data within glTF/ARF containers” (Step 1) is underspecified: ARF/glTF needs a normative extension mechanism (schema, MIME/box, or glTF extension) defining attribute semantics, coordinate frames, units, and default behaviors; otherwise different decoders will interpret the same auxiliary data differently.

  3. [Technical] Determinism is overstated: “Explicit methods naturally deterministic given fixed floating-point rules” ignores that GPU raster/compute pipelines, floating-point contraction, sorting ties in depth-ordered alpha compositing, and parallel reduction order can yield non-bit-exact results across vendors; conformance would need explicit ordering rules and error tolerances, not just “fixed floating-point rules.”
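
The non-associativity point in item 3 can be shown in a few lines: reordering a floating-point reduction (as a parallel GPU pipeline will) changes the result even under fixed IEEE-754 rules. The values below are chosen purely for illustration.

```python
# IEEE-754 addition is not associative: the same four values summed
# left-to-right vs. as a pairwise (tree) reduction give different results.
vals = [1e16, 1.0, -1e16, 1.0]

serial = ((vals[0] + vals[1]) + vals[2]) + vals[3]    # 1.0 is absorbed by 1e16
pairwise = (vals[0] + vals[2]) + (vals[1] + vals[3])  # large terms cancel first

print(serial, pairwise)  # 1.0 2.0
```

A conformance regime therefore needs either a mandated reduction order or explicit error tolerances, exactly as the comment argues.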

  4. [Technical] The proposed “depth-ordered alpha compositing” rendering model is central to 3DGS but no interoperability-critical details are given (sorting key definition, handling of equal depths, tile-based sorting, prefiltering, blending equation, color space), making it hard to assess whether ARF can standardize a decoder-independent rendering outcome.
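
For reference, the usual 3DGS compositing model is a back-to-front "over" blend, and item 4's tie-handling concern is easy to reproduce: with equal depth keys, the output depends on which splat the sort happens to place first. A minimal single-channel sketch (the splat tuples and blend rule here are illustrative, not ARF-defined):

```python
def composite(splats):
    # splats: (depth, color, alpha); back-to-front "over" operator
    out = 0.0
    for _, color, alpha in sorted(splats, key=lambda s: -s[0]):
        out = color * alpha + out * (1.0 - alpha)
    return out

# Two splats at identical depth: Python's stable sort resolves the tie
# by input order, so the two orderings composite to different pixels.
a = composite([(1.0, 1.0, 0.5), (1.0, 0.0, 0.5)])  # -> 0.25
b = composite([(1.0, 0.0, 0.5), (1.0, 1.0, 0.5)])  # -> 0.5
```

Without a defined sorting key and tie rule, two conformant decoders can legitimately produce both answers.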

  5. [Technical] The document asserts “direct ARF compatibility” for GaussianBlendshape/SplattingAvatar, but does not map their control parameters to specific ARF constructs (e.g., which blendshape set, naming/ID mapping, ranges, neutral definition, coordinate conventions), risking a mismatch between research model parameters (FLAME/SMPL-X) and ARF-defined animation semantics.

  6. [Technical] The “40 KB/s for real-time animation” streaming figure (Step 4) is presented without assumptions (number of joints, blendshape count, sampling rate, quantization, overhead, RTP/transport framing), and may be misleading given typical face blendshape streams can exceed this depending on rate and precision.
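
To illustrate item 6, a back-of-envelope model shows how strongly the figure depends on the unstated assumptions; the parameter counts, value layout, and framing overhead below are hypothetical, not taken from the contribution.

```python
def animation_rate_bytes_per_s(blendshapes, joints, fps, bytes_per_value,
                               framing_overhead=1.1):
    # hypothetical layout: each joint as quaternion + translation (7 values);
    # the overhead factor is a placeholder for RTP/transport framing
    values_per_frame = blendshapes + joints * 7
    return values_per_frame * bytes_per_value * fps * framing_overhead

lo = animation_rate_bytes_per_s(52, 25, 60, 2)  # float16-style quantization
hi = animation_rate_bytes_per_s(52, 25, 60, 4)  # float32
print(lo, hi)  # ~30 KB/s vs ~60 KB/s, straddling the claimed 40 KB/s
```

Merely changing precision moves the same stream from below to well above 40 KB/s, so the assumptions must be stated.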

  7. [Technical] Compression proposals (SPZ, L-GSC, HAC++, Compact3D) are listed without clarifying whether they are geometry-only, attribute-aware (SH coefficients, opacity), support random access/partial decode, or preserve required precision for stable splat rendering; Objective 7 evaluation needs criteria tied to ARF use cases (latency, progressive LOD, error metrics).

  8. [Technical] The “graceful fallback: mesh-only renderers can ignore Gaussian extension and still animate” is only valid if the base avatar always includes a complete mesh representation; several 3DGS approaches are not mesh-complete (e.g., hair volumes), so the fallback behavior and minimum mesh requirements should be stated.

  9. [Technical] Non-rigid elements (hair/clothing/accessories) are acknowledged as a challenge, but the proposed ARF integration does not define how “secondary Gaussians” are driven (extra bones, physics, per-frame deltas, or optional streams), which is likely the dominant interoperability gap for full-body avatars.

  10. [Technical] The classification “Hybrid methods… can still be driven by blendshape parameters with MLP weights distributed as part of base avatar” glosses over runtime dependencies: even small MLPs require a standardized inference graph, activation functions, quantization, and tensor layouts; without aligning to an existing standardized neural model format/profile, “portable” decoding is not ensured.

  11. [Editorial] Several performance/quality numbers (FPS, PSNR, training time, storage like “~3.5 MB”) are presented as facts but lack citations, test conditions, and hardware baselines; SA4 contributions typically need references or at least a consistent evaluation setup to avoid cherry-picked comparisons.

  12. [Editorial] Terminology is inconsistent and sometimes ambiguous (e.g., “Gaussian Head Avatar” vs “GaussianHead”; “3DGS-Avatar” appears once without definition; “AAUs” is used without expansion in this document), which will hinder readers trying to relate items to known papers/spec terms.

  13. [Editorial] The document repeatedly states “ARF compatibility” but does not reference specific clauses of ISO/IEC 23090-39 or the corresponding 3GPP study text (FS_Avatar_Ph2_MED Objective 3/7) where gaps exist; adding explicit clause-level mapping would make the contribution actionable for SA4.

[FS_Avatar_Ph2_MED] Avatar Evaluation Framework and Objective Metrics Qualcomm Atheros, Inc.
Previous Reviews:
manager
2026-02-11 07:04:07
  1. [Technical] The “black-box evaluation from rendered video” principle is not sufficient for cross-vendor comparability unless the contribution also standardizes the avatar asset (mesh/topology, textures, rig), camera pose, lighting model, tone mapping, and renderer settings; otherwise PSNR/SSIM and landmark-based errors will mostly measure rendering/asset differences rather than animation/transport performance.

  2. [Technical] The proposal to “include PSNR/SSIM” as visual quality metrics is weak for avatar content because these metrics are highly sensitive to small viewpoint/lighting differences and do not correlate well with perceptual quality for faces; the document should justify their relevance or add more appropriate perceptual metrics (e.g., VMAF/LPIPS or face-region weighted metrics) and define ROI handling.
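
Item 2's sensitivity claim is easy to demonstrate: PSNR collapses under a one-pixel shift of otherwise identical content. A pure-Python sketch on a 1-D "scanline":

```python
import math

def psnr(ref, test, peak=255.0):
    # peak signal-to-noise ratio over flat sample lists
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)

edge = [0] * 8 + [255] * 8       # a hard edge
shifted = [0] * 9 + [255] * 7    # the same edge, displaced by one pixel

print(psnr(edge, edge))     # inf (identical)
print(psnr(edge, shifted))  # ~12 dB: a catastrophic score for a 1 px shift
```

A sub-pixel viewpoint difference between two renderers would trigger the same effect, which is why face-region weighting or perceptual metrics need justification.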

  3. [Technical] The animation metrics (LVE/FDD/MVE) rely on extracting landmarks/skeletons from rendered video, but no standardized detector/model, confidence handling, occlusion policy, or failure mode is defined—making results non-repeatable and vendor-dependent despite the stated reproducibility goal.

  4. [Technical] Units for LVE/FDD/MVE are given as “pixels/mm” without defining the pixel-to-mm mapping, camera calibration, depth assumptions, or whether errors are computed in 2D image space vs 3D space; this ambiguity will lead to incomparable results across resolutions/FOVs.
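
As a concrete illustration of item 4: under a pinhole model the mm-equivalent of a pixel error scales with subject depth and focal length, so the same 2 px landmark error means different things at different capture distances. The numbers below are illustrative.

```python
def px_error_to_mm(error_px, depth_mm, focal_px):
    # pinhole model: an image displacement of 1 px at depth Z spans Z / f mm
    return error_px * depth_mm / focal_px

near = px_error_to_mm(2.0, 600.0, 1000.0)   # 1.2 mm at 0.6 m
far = px_error_to_mm(2.0, 1800.0, 1000.0)   # 3.6 mm at 1.8 m
print(near, far)
```

Without the calibration (focal length in pixels) and the depth assumption being standardized, "pixels/mm" results are incomparable across setups.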

  5. [Technical] “Reference rendered video from both high-quality reference pipeline and source capture” is internally inconsistent: if the reference is a rendered video, it bakes in a specific renderer/asset; if the reference is source capture, it is not directly comparable to a stylized avatar render—this needs a clear definition of the reference signal per metric.

  6. [Technical] The framework mixes evaluation of animation technique quality with transport/network impairment testing, but does not specify how to isolate network effects from renderer/animation effects (e.g., fixed decoder, fixed renderer, controlled impairment injection points), risking confounded conclusions.

  7. [Technical] The “normative capture workflow using lossless/visually lossless compression” is underspecified: “visually lossless” is subjective, and without mandated codec/settings, color format, bit depth, and color management (RGB/YUV, transfer function), objective metrics will vary significantly.

  8. [Technical] Temporal metrics are deferred to a “second phase,” yet the proposals ask to include the metric set in TR 26.813; this is incomplete for Phase 2 objectives where latency/sync are central, and at minimum measurement definitions and required instrumentation should be provided now.

  9. [Technical] Motion-to-photon and end-to-end latency measurement cannot be derived reliably from frame timestamps alone; the contribution needs a concrete method (e.g., LED/photodiode, timecode watermarking, event markers) and a definition of clock synchronization across stimulus, renderer, and capture.
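
A sketch of the kind of method item 9 asks for: inject an impulse marker into the stimulus (e.g., an LED flash), record it off the display (e.g., with a photodiode), and read the latency off the cross-correlation peak. The signals here are toy impulse trains on a shared sample clock, which is itself an assumption that the contribution would need to pin down.

```python
def estimate_lag(stimulus, capture):
    # brute-force cross-correlation; returns the lag maximizing overlap
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(capture) - len(stimulus) + 1):
        score = sum(s * capture[i + lag] for i, s in enumerate(stimulus))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

stimulus = [1, 0, 0, 1, 0]
capture = [0, 0, 0, 1, 0, 0, 1, 0]      # same pattern delayed by 3 samples
print(estimate_lag(stimulus, capture))  # 3
```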

  10. [Technical] Dropped frame ratio based on “missing or repeated frame indices” assumes access to frame indices or deterministic numbering; in a pure black-box capture, repeated frames must be detected via content analysis, which is not specified and is error-prone for low-motion scenes.
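
One way to make item 10 concrete: in a black-box capture, repeats have to be inferred from content, e.g., by thresholding the mean absolute difference between consecutive frames. The threshold and frames below are illustrative, and the example also shows the failure mode the comment warns about.

```python
def likely_repeats(frames, threshold=1.0):
    # flag frame i when it is (near-)identical to frame i-1
    repeats = []
    for i in range(1, len(frames)):
        mad = sum(abs(a - b) for a, b in zip(frames[i], frames[i - 1])) / len(frames[i])
        if mad < threshold:
            repeats.append(i)
    return repeats

frames = [[0, 0, 0, 0], [0, 0, 0, 0], [10, 10, 10, 10], [10, 10, 11, 10]]
print(likely_repeats(frames))  # [1, 3] -- but frame 3 is low motion, not a repeat
```

Frame 3 is a false positive caused by genuine low motion, which is exactly why the detection procedure and threshold need to be specified.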

  11. [Technical] Audio-visual sync via cross-correlation between mouth motion and audio is not robust across phonemes, silence, or expressive motion; the document should define the feature extraction (viseme classifier vs lip aperture signal), windowing, and acceptable error bounds.

  12. [Editorial] The “four key principles” section only lists three principles; this undermines clarity and should be corrected or the missing principle added.

  13. [Editorial] Several terms are introduced without alignment to existing 3GPP/3GPP2/ITU terminology (e.g., “stimulus player,” “render configuration,” “metrics engine”); mapping to TR 26.813 structure and definitions is needed to avoid creating parallel vocabulary.

  14. [Editorial] The proposal to “define normative capture workflow” is potentially inappropriate for a TR (typically informative); the document should clarify whether it intends normative text in a TS or provide informative guidance with clearly stated assumptions and limitations.

[FS_Avatar_Ph2_MED] Interoperability guidance for ARF Qualcomm Atheros, Inc.
Previous Reviews:
manager
2026-02-11 07:04:44
  1. [Technical] The proposal asserts “ARF document is the normative description for interpreting an animation stream” and then proposes adding normative “shall” text to TS 26.264, but it does not clarify the normative split between TS 26.264 and ISO/IEC 23090-39 (ARF); without explicit referencing/wording, this risks contradicting 3GPP’s usual approach where TS 26.264 normatively references external specs rather than redefining their behavior.

  2. [Technical] The receiver procedure relies on “preamble.supportedAnimations” and “SupportedAnimations list indices,” but TS 26.264 clause 5.6.1 (and the referenced ARF structures) need to be checked for whether index-based addressing is stable/defined; if ARF uses IDs/URNs rather than positional indices, this guidance could create non-interoperable implementations.

  3. [Technical] “Mapping indices refer to parameter identifiers in the animation stream (ShapeKey.id … target joint index … target landmark index)” mixes identifier spaces (IDs vs indices) and animation types; the guidance should explicitly define, per animation profile, what the parameter identifier is (string ID, ordinal, semantic name) and how it is carried in the bitstream, otherwise mapping tables cannot be applied deterministically.

  4. [Technical] The proposal introduces LinearAssociation/NonLinearAssociation behavior (weighted sums, LUTs, interpolation modes, combination modes, clamping) but does not state the exact normative computation rules (ordering, clamping ranges, handling of NaN/out-of-range, coordinate units for landmarks, joint rotation representation), which is essential for interoperability if TS/TR text is to be actionable.
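
Item 4's point can be made concrete with a sketch of what "exact normative computation rules" would have to pin down for a weighted-sum LinearAssociation: accumulation order, the NaN/missing-value policy, and when clamping applies. The rules below are hypothetical, not taken from ARF or TS 26.264.

```python
import math

def apply_linear_association(weights, inputs, lo=0.0, hi=1.0):
    # hypothetical rules: accumulate in ascending input-index order,
    # substitute the default 0.0 for NaN inputs, clamp once after summation
    acc = 0.0
    for w, x in zip(weights, inputs):
        acc += w * (0.0 if math.isnan(x) else x)
    return min(hi, max(lo, acc))

print(apply_linear_association([1.0, 1.0], [0.8, 0.8]))  # 1.0 (clamped from 1.6)
print(apply_linear_association([1.0], [float("nan")]))   # 0.0 (default applied)
```

A receiver that instead clamped each term before summing, or propagated NaN, would produce a different pose from the same mapping object, which is why these choices must be normative.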

  5. [Technical] Defaulting rules (“0.0 for blendshape weights, bind pose for joints, neutral position for landmarks”) are underspecified and may be wrong for some profiles (e.g., joint animation may require hold-last-sample, or bind pose may cause visible popping); if included in TS 26.264, it should align with existing decoder behavior expectations for missing parameters.

  6. [Technical] The “Subset” scenario says “Unmapped target parameters default to neutral values,” but the more common interoperability issue is the inverse (stream has parameters not present in target); guidance should also specify receiver behavior for unknown/extra incoming parameters (ignore vs error) to avoid divergent implementations.

  7. [Technical] NonLinearAssociation examples reference INTERPOLATION_CUBICSPLINE and pow(input,0.5) approximation, but it’s unclear whether ARF actually defines these interpolation modes and LUT semantics for all animation types; if not already in ISO/IEC 23090-39, this guidance risks inventing capabilities not supported by the container.
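
For item 7, even the "simple" piecewise-linear LUT case needs key placement, boundary behavior, and interpolation mode spelled out before a pow(input, 0.5) approximation is reproducible across receivers. A minimal sketch; the LUT layout and clamping behavior are illustrative assumptions.

```python
def lut_eval(keys, values, x):
    # piecewise-linear lookup with endpoint clamping
    if x <= keys[0]:
        return values[0]
    if x >= keys[-1]:
        return values[-1]
    for i in range(1, len(keys)):
        if x <= keys[i]:
            t = (x - keys[i - 1]) / (keys[i] - keys[i - 1])
            return values[i - 1] + t * (values[i] - values[i - 1])

keys = [i / 8 for i in range(9)]    # 9 uniform keys on [0, 1]
values = [k ** 0.5 for k in keys]   # tabulated pow(x, 0.5)

print(lut_eval(keys, values, 0.3))  # ~0.545 vs sqrt(0.3) ~ 0.548
```

A cubic-spline mode (as in INTERPOLATION_CUBICSPLINE) would additionally require tangent semantics, and the approximation error depends on key count, none of which the current text fixes.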

  8. [Technical] The “Blink” example combines left/right with COMBINATION_SUM and then “clamp to [0,1]”; clamping is not stated as part of the mapping object semantics, and different rigs may expect additive >1.0 behavior—this needs explicit alignment with ARF-defined value domains per parameter.

  9. [Technical] The “MouthOpenSmile” example depends on “Smile (12 after linear mapping)” implying chained mappings (linear then non-linear); the proposal does not specify whether mapping stages can be composed, how dependencies are resolved, or whether cycles are allowed—this is critical for deterministic receiver processing.
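
Item 9's composition question is essentially dependency resolution: if mapping stages can read the outputs of other mappings, deterministic receiver processing needs a defined evaluation order and a cycle policy. A sketch of one such rule; the graph representation and the reject-on-cycle policy are assumptions for illustration.

```python
def mapping_order(deps):
    # deps: {target parameter: [parameters its mapping reads]}
    # depth-first topological sort; cycles are rejected as invalid content
    order, state = [], {}  # state: 1 = in progress, 2 = done
    def visit(p):
        if state.get(p) == 1:
            raise ValueError("mapping cycle involving " + p)
        if state.get(p) != 2:
            state[p] = 1
            for d in deps.get(p, []):
                visit(d)
            state[p] = 2
            order.append(p)
    for p in deps:
        visit(p)
    return order

print(mapping_order({"MouthOpenSmile": ["Smile"], "Smile": []}))
# ['Smile', 'MouthOpenSmile'] -- the linear stage runs before the non-linear one
```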

  10. [Technical] Landmark mapping is treated similarly to scalar blendshape weights, but landmarks are vectors (2D/3D) and may require per-component mapping, coordinate space definition, and temporal filtering; the current text (“apply LUT … before writing 2D or 3D coordinate”) is too vague to ensure interoperable landmark animation.

  11. [Technical] The proposal places “Sender responsibility” on the “avatar owner,” but TS 26.264 needs precise role terminology (sender, encoder, content provider, ARF author) and must consider cases where the stream sender is not the ARF author (e.g., third-party capture streaming to a known avatar).

  12. [Editorial] The contribution proposes “Document the content of sections 2 and 3 in TR 26.813” but does not provide exact draft text, clause numbers, or change-marked edits; as written it is not directly actionable for rapporteurs and makes it hard to assess consistency with existing TR structure.

  13. [Editorial] “Remove the corresponding note from TS 26.264 and declare it as resolved” is ambiguous because the exact note text is not quoted; the contribution should cite the precise existing note in clause 5.6.1 and propose replacement wording to avoid accidental removal of related guidance.

  14. [Editorial] Several terms are used without definition or consistent capitalization (e.g., “SupportedAnimations” vs “supportedAnimations,” “AnimationInfo” vs “animationInfo,” “Mapping Objects”); if this is to become spec text, it should match the exact field names and terminology used in ISO/IEC 23090-39 and TS 26.264.

[FS_Avatar_Ph2_MED] Draft LS on MPEG I ARF compression aspects Nokia
Previous Reviews:
manager
2026-02-09 04:55:40
  1. [Technical] The LS states that “ISO/IEC DIS 23090-39 does not yet specify compression mechanisms,” but does not distinguish between normative bitstream compression vs transport-level compression; this ambiguity weakens the question and may lead to an unhelpful response from ISO/IEC.

  2. [Technical] The request asks generally for “existing MPEG technologies” for mesh and animation compression, but does not constrain the scope to technologies compatible with ARF’s data model (e.g., topology changes, blendshape representation, skeleton hierarchy), so ISO/IEC cannot assess feasibility of integration into 23090-39.

  3. [Technical] The LS does not clarify whether SA4 needs a standardized codec/bitstream (e.g., normative decoding) or merely recommended encodings; this is critical because MPEG technologies differ substantially in standardization maturity and integration burden.

  4. [Technical] “Avatar animation data including blend shape sets, skeletal animation, other animation-related information” is underspecified: it omits key elements such as skinning weights, joint constraints, time sampling, quantization requirements, and coordinate systems, which directly affect compressibility and choice of technology.

  5. [Technical] The LS does not mention whether compression must support random access, streaming, incremental updates, and partial avatar updates (common in IMS conversational scenarios); without these requirements, ISO/IEC cannot map to appropriate MPEG tools.

  6. [Technical] The timeline presented (study conclusion Aug 2026; normative work Mar 2027) is inconsistent with typical 3GPP Release 20 planning and may be interpreted as non-actionable by ISO/IEC; the LS should align to the actual Rel-20/Rel-21 milestone framework or explicitly state it is SA4 internal planning.

  7. [Technical] The document claims adoption of “ISO/IEC DIS 23090-39” in Rel-19 TS 26.264; referencing a DIS (draft) rather than the final ISO/IEC publication is risky and should be clarified (which edition/stage is normatively referenced in TS 26.264).

  8. [Technical] The LS does not identify the exact clauses in TS 26.264 impacted by adding compression (e.g., payload formats, SDP signaling, file/stream encapsulation), so ISO/IEC cannot gauge what “integration into 23090-39” would need to cover for 3GPP interoperability.

  9. [Technical] Asking WG7 and WG3 jointly is plausible, but the LS does not justify which group is expected to answer which aspect (ARF vs compression tools), increasing the chance of diffusion of responsibility and delayed response.

  10. [Editorial] The LS reads like an internal summary rather than a formal liaison: it lacks typical LS elements (explicit addressee, LS purpose, contact points, meeting reference, and a clear “Action requested by” date tied to SA4 milestones).

  11. [Editorial] The term “avatar communication Phase 2” is used without referencing the corresponding 3GPP work item/study item identifier, making it harder for ISO/IEC to trace scope and urgency.

  12. [Editorial] The wording “considering the Release 20 timeline constraints” is vague; the LS should state the latest date by which SA4 needs an answer (e.g., before a specific SA4 meeting) to be actionable.

[FS_Avatar_Ph2_MED] Considerations on security aspects Nokia
Previous Reviews:
manager
2026-02-09 04:56:23
  1. [Technical] The proposal is too high-level: adding a new TR 26.813 subclause “8.3.4 Security aspects” without concrete security requirements, threat model, or normative implications risks duplicating Clause 9 and not producing actionable outputs for SA3 or any follow-on TS work.

  2. [Technical] The claimed gap in TS 33.328 (“does not cover controls to prevent sending UE from using fake avatar representations not belonging to the user”) is asserted but not substantiated with a specific procedure/step analysis of Annex R; without pinpointing where identity binding fails, it’s unclear what new mechanism is needed (e.g., credential-bound avatar token, signing, attestation).

  3. [Technical] The scope “Avatar calls via generalized IMS DC architecture” is ambiguous and may be misaligned with existing security ownership: IMS/DC security is largely SA3/CT3 territory, so TR 26.813 text must clearly separate media/application security considerations from core/IMS authentication and key management already covered by 33-series specs.

  4. [Technical] “Authentication, encryption, and content protection mechanisms” are listed without clarifying which interfaces are in scope (UE–IMS, UE–BAR, UE–Avatar service, network–BAR, inter-operator), which is essential because the applicable mechanisms differ (AKA-based, OAuth2/API security, TLS, SRTP, DRM).

  5. [Technical] The document references TR 26.813 Clause 8 “Access Protection mechanisms for BAR API” and Clause 9 “security and privacy aspects,” yet proposes adding security under Clause 8; this risks inconsistent structure and overlapping content unless the new subclause explicitly defines what is new vs. what is already covered in Clauses 8 and 9.

  6. [Technical] No consideration is given to end-to-end media security for avatar streams (e.g., SRTP keying, E2EE implications, IMS media plane constraints), despite “encryption” being called out; without stating whether encryption is hop-by-hop or end-to-end, the study may produce incompatible assumptions.

  7. [Technical] The proposal does not address authorization and policy control for avatar usage (who can use which avatar, per-call consent, enterprise policy, parental control), which is central to preventing “fake avatar representations” and is distinct from mere authentication.

  8. [Technical] Privacy preservation is mentioned in the SID objective, but the proposed subclause focus is “primarily” on mechanisms for avatar calls; it should explicitly cover privacy threats (linkability of Avatar ID, inference from avatar assets, metadata exposure) and map them to mitigations, otherwise Objective 6 is only partially met.

  9. [Technical] Content protection is cited (watermarking/DRM), but there is no linkage to concrete avatar asset lifecycle (creation, upload, storage, download, rendering, redistribution) and where watermarking/DRM would be applied and enforced; without lifecycle mapping, the study risks being non-implementable.

  10. [Editorial] The contribution mixes TS/TR references and conclusions but does not provide exact clause numbers for several key claims (e.g., which parts of TR 26.813 Clause 9 are insufficient for avatar calls), making it hard for the group to verify gaps and avoid redundant text.

  11. [Editorial] The suggested numbering “8.3.4” is premature without showing the existing Clause 8 substructure in the base CR; renumbering churn is likely, so the proposal should describe insertion location by title/anchor rather than a fixed number.

  12. [Editorial] The document frames the issue as a “gap” in TS 26.264 and TS 33.328 but does not clearly state the intended deliverable impact (TR-only study text vs. triggering a new WI/CR in TS 33.328/33.203/33.210), which weakens the contribution’s actionability for SA4/SA3 coordination.

[FS_Avatar_Ph2_MED] Authentication for avatar data Nokia
Previous Reviews:
manager
2026-02-09 04:56:52
  1. [Technical] The proposal introduces a “Digital Credential / Base Avatar Assertion (BAA)” but does not define its normative format, mandatory fields, cryptographic algorithms, or validation rules (Figure Y is referenced but not provided), making interoperability and conformance impossible.

  2. [Technical] Trust model is underspecified: UE2 “verifies BAA validity” via issuer signature, but there is no definition of how UE2 obtains/validates issuer trust anchors (operator PKI, WebPKI, cross-operator federation, roaming), nor how revocation/expiry is handled.

  3. [Technical] The procedure says UE2 may obtain the BAA “from UE1 or Issuer” without specifying the transport, integrity protection, and binding to the IMS session (e.g., SIP/SDP conveyance, MSRP, HTTP), leaving clear MITM/replay risks and no linkage to the authenticated IMS identity.

  4. [Technical] The key binding is unclear: UE1 generates a key pair “associated with selected avatar presentation,” but the contribution does not specify what identifier of the avatar representation is signed into the BAA, how collisions/updates are handled, or how the avatar data hash/reference is bound to prevent substitution.

  5. [Technical] There is no explicit binding between the BAA subject and the user’s IMS/3GPP identity (IMPU/IMPI/SUPI) or the access authentication (AKA), so a valid BAA could be presented in a different IMS context unless additional binding is specified.

  6. [Technical] The issuer “verifies that a Base Avatar represents its owner” is a major security claim but no verification method is described (biometrics, KYC, operator registration, device attestation), making the assurance level undefined and potentially inconsistent across deployments.

  7. [Technical] The “proof of possession of private key” is mentioned but no protocol is defined (challenge-response, signature over nonce, channel binding), and it’s unclear when UE2 verifies PoP versus only verifying issuer signature.

  8. [Technical] Lifecycle management is missing: no procedures for BAA renewal, revocation (lost device, compromised key), avatar updates, key rotation, or multiple devices per user—critical for any credential-based scheme.

  9. [Technical] Privacy implications are not addressed: a persistent BAA could enable cross-service/user tracking; there is no discussion of minimizing identifiers, using pairwise pseudonyms, selective disclosure, or limiting disclosure to UE2 vs network.

  10. [Technical] The scope is inconsistent with the stated gap: the contribution targets “avatar-related APIs” but the described mechanism is UE-to-UE credential presentation; it does not cover API authentication/authorization semantics (OAuth2/3GPP NEF-style, tokens, scopes) or network-side enforcement.

  11. [Technical] The proposal does not clarify whether authentication is end-to-end (UE1↔UE2) or network-mediated (IMS core involvement), and therefore does not specify where security termination points are and what entities are in the trust boundary.

  12. [Editorial] The document proposes adding sub-clause “8.3.4” but does not indicate which specification/TR this clause belongs to, nor how it aligns with existing clause numbering and terminology in TS/TR 26.264/26.813/33.328.

  13. [Editorial] Several key references are vague or missing (e.g., “Figure Z example implementation,” “Figure Y structure”), and terms like “Base Avatar Representation” vs “avatar presentation” are used inconsistently, which will cause ambiguity in normative text.

  14. [Editorial] The contribution states “TR 26.813 and TS 33.328 do not address these security aspects” but does not cite specific clauses or gaps; adding targeted gap statements and requirements would strengthen the justification and help SA3 alignment.

[FS_Avatar_Ph2_MED] Media Configuration for Avatar Calls InterDigital Pennsylvania
Previous Reviews:
manager
2026-02-09 04:57:18
  1. [Technical] The paper asserts a “critical gap” that TS 26.264 lacks media configuration details for an AR‑MTSI client in an avatar call, but it does not pinpoint which procedures are missing (e.g., SDP offer/answer attributes, codec/format negotiation, RTP payloads, B2BUA/MF insertion rules), making the gap statement too vague to justify an SA2 liaison.

  2. [Technical] The claimed issue that “IMS network element behavior is unspecified” when receiving +sip.3gpp-ar-support / +sip.3gpp-avatar-support in REGISTER is not substantiated by citing the relevant IMS specs (e.g., TS 24.229 handling of unknown Contact header parameters, registration storage, and feature tag processing), so it’s unclear whether this is truly an architectural gap or already covered by generic SIP/IMS behavior.

  3. [Technical] The document treats +sip.3gpp-*-support as “parameters in SIP REGISTER Contact header,” but does not clarify whether these are intended as SIP feature tags (RFC 3840/3841) and how they are used for routing/matching (e.g., in INVITE Accept-Contact/Reject-Contact), which is central to whether IMS needs new behavior at all.

  4. [Technical] The “network-assisted avatar rendering” description (AR AS allocates MF, modifies SDP, inserts MF in media path) implies B2BUA/SDP mangling behavior, but the paper does not identify which network function is authorized to do this in IMS (P-CSCF/S-CSCF/AS/MRF/MF) nor the normative call flows, so SA2 cannot act on a concrete requirement.

  5. [Technical] The proposal conflates two separate topics—registration-time capability indication and session-time media configuration/MF selection—without stating the decision logic (e.g., when to select MF based on avatar-assisted vs remote capabilities), risking an LS that is too broad and non-actionable.

  6. [Technical] The definitions of “ar-assisted” and “avatar-assisted” (“requires network rendering/animation support”) omit how the UE signals request vs capability (i.e., whether “assisted” is a preference, a hard requirement, or simply a limitation), which affects IMS behavior and service continuity.

  7. [Technical] The paper does not address interworking/roaming implications: if visited/home IMS elements ignore these tags, what is the expected fallback behavior and does the service still work (e.g., direct UE rendering, downgrade to video), which is important before requesting architectural changes.

  8. [Technical] No security/privacy considerations are raised for exposing avatar/AR capabilities in REGISTER (e.g., profiling, feature disclosure), which may be a concern for SA2/SA3 and should be at least acknowledged in the LS request.

  9. [Editorial] References are imprecise: “TS 26.264 Clause 7” and “Clause 7.3.1/7.3.2” are cited, but the paper does not quote the exact normative text being interpreted, making it hard to verify whether the described behavior is accurate or already specified.

  10. [Editorial] The document states “media configuration requirements” but provides no concrete media configuration elements (SDP attributes, payload formats, bitrate/resolution constraints, RTP/RTCP usage), so the title and scope read mismatched to the actual content (which is primarily about IMS behavior and liaisoning).

  11. [Editorial] The narrative references S4-251845 and “feedback indicated…” but does not summarize the specific objections or decisions from SA4-134, weakening traceability and making it difficult to judge whether an LS is the right next step.

[FS_Avatar_Ph2_MED] LS on IMS network behaviour for new Contact header parameters InterDigital Pennsylvania
Previous Reviews:
manager
2026-02-09 04:57:41
  1. [Technical] The LS proposes new SIP Contact header parameters (+sip.3gpp-ar-support, +sip.3gpp-avatar-support) without identifying any normative SIP/3GPP registration for these feature tags (e.g., in TS 24.229 or its annexes, or an IANA/3GPP registry), so SA2 cannot assess interoperability or standard compliance as written.

  2. [Technical] The statement that these parameters “signal that the terminal requires network-based rendering support” conflates capability indication with service request; in IMS, REGISTER Contact parameters are typically used for capability discovery/routing, not to mandate network media processing, so the intended semantics and enforcement point are unclear.

  3. [Technical] The LS assigns network behavior to “an AR Application Server” (allocate MF, modify SDP, insert MF) but does not map this to existing IMS architectural roles (AS, MRFC/MRFP, SCC AS, IMS-ALG, B2BUA behavior) nor specify whether this is within IMS centralized services, SIP AS acting as B2BUA, or via MRF control—this is a major architectural ambiguity.

  4. [Technical] “Modify SDP to insert the MF into the media path” is not generally achievable by a pure SIP proxy and implies B2BUA/SDP offer-answer manipulation; the LS should clarify the required SIP role and the impact on end-to-end integrity (e.g., SIP/SDP transparency, security, and interop).

  5. [Technical] The LS focuses on REGISTER handling, but the described MF insertion is a session-time behavior (INVITE/UPDATE/PRACK/SDP negotiation); it is unclear why registration-time parameters alone are sufficient, and what happens if the UE’s needs vary per session or per media stream.

  6. [Technical] No interaction is described with existing IMS capability mechanisms (e.g., SIP feature tags, Accept-Contact/Require/Supported, service profiles/IFCs, or 3GPP-defined media feature tags), risking duplication or inconsistent behavior across networks.

  7. [Technical] The proposed values for +sip.3gpp-avatar-support (“avatar-capable”, “avatar-assisted”) are underspecified: there is no indication of whether multiple values are allowed, how they are encoded (token vs quoted-string), and how they relate to actual media/codec requirements in SDP (which ultimately drive rendering feasibility).

  8. [Technical] The LS implies the network selects/configures MF “based on receiving UE’s video capabilities,” but does not define how the network learns those capabilities (REGISTER vs SDP in INVITE), nor how it handles mismatches between registered capabilities and actual session SDP.

  9. [Technical] There is no consideration of roaming/interconnect: if these Contact parameters traverse visited/home networks or inter-IMS boundaries, behavior is undefined (stripping, privacy, policy), which is critical for IMS feature tags and service invocation.

  10. [Technical] The term “Media Function (MF)” is not aligned with common IMS terminology (MRF/MRFP, IMS media resource function, or specific media processing functions); without anchoring to existing functional entities, SA2 cannot determine which specs need updates.

  11. [Editorial] The LS cites “Clause 7” and “Clause 7.3/7.3.2 of TS 26.264” but provides no exact text excerpts or version/date of TS 26.264, making it hard for SA2 to verify the requirements and whether the parameters are already normatively defined.

  12. [Editorial] The document mixes Rel-19/Rel-20 scope without stating which release introduces the parameters and expected IMS behavior, which may affect whether SA2 should treat this as a new WI, a CR to existing IMS specs, or a study item.
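The encoding question in point 7 can be made concrete with a small sketch. Assuming the proposed +sip.3gpp-avatar-support parameter were encoded as an RFC 3840 string-list inside a quoted string (the LS does not say this — it is one of several plausible encodings), a receiver might parse it as follows; the function name and the helper are hypothetical:

```python
import re

def avatar_capabilities(contact: str) -> set[str]:
    """Extract values of the (proposed) +sip.3gpp-avatar-support parameter,
    ASSUMING it is encoded as an RFC 3840 string-list in a quoted string.
    The LS leaves this encoding unspecified, which is the point of comment 7."""
    m = re.search(r'\+sip\.3gpp-avatar-support="([^"]*)"', contact)
    if not m:
        return set()
    return {v.strip() for v in m.group(1).split(",") if v.strip()}

contact = '<sip:ue@example.com>;+sip.3gpp-avatar-support="avatar-capable,avatar-assisted"'
print(avatar_capabilities(contact))
```

A receiver assuming a bare token (no quotes, single value) instead would reject or misread the same header, which is exactly the interoperability gap the comment flags.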

Avatar-udpate to section 6.3.4 InterDigital New York
Previous Reviews:
manager
2026-02-09 04:58:04
  1. [Technical] The contribution claims a “comprehensive update to section 6.3.4” but provides no proposed spec text, clause numbering, or delta against the current TS/TR, making it impossible to assess normative impact, backward compatibility, or whether the update is editorial vs. technical.

  2. [Technical] It states ARF is “standardized as ISO/IEC 23090-39” while also saying it has reached CDIS stage; these are inconsistent maturity claims and should be aligned (e.g., CDIS vs DIS/FDIS/IS) to avoid incorrect referencing in 3GPP specs.

  3. [Technical] The AAU header definition (“AAU type (7-bit code)”) is underspecified: no bit/byte packing, no reserved values, no extension mechanism, and no endianness/bit order, which is critical for interoperable bitstream parsing.

  4. [Technical] The timestamp is described as “32-bit timestamp in ticks” but wraparound behavior, epoch/reference (relative vs absolute), monotonicity, and reordering/jitter handling are not defined; these are essential for real-time delivery and ISOBMFF track mapping.

  5. [Technical] “Timescale value (32-bit float, ticks per second)” is a poor choice for a timescale (precision/rounding/non-integer ambiguity) and conflicts with common media container practice (integer timescale); this will create interoperability issues when mapping to ISOBMFF timescales.

  6. [Technical] Joint animation samples transmit a full 4×4 matrix as “16 floats” per joint (and optional velocity matrix), which is extremely bandwidth-heavy and not aligned with typical skeletal animation representations (TRS/quaternion + translation); if this is intended, constraints and compression/quantization profiles must be specified.

  7. [Technical] The LBS formula uses “global transformation matrix Mⱼ” but does not clarify whether Mⱼ already includes inverse bind matrices, whether matrices are in joint local vs model space, and how skinning relates to the “inverse bind matrices” mentioned in Skeleton; ambiguity here will yield different rendered results across implementations.

  8. [Technical] Blendshape deformation is given only for vertex positions; normals/tangents handling (recompute vs blendshape-provided deltas) is not addressed despite earlier listing normals/textures/maps as part of the model, risking inconsistent shading across clients.

  9. [Technical] The “optional authenticationFeatures (encrypted facial and voice feature vectors with public key URI)” raises privacy/security and key management questions (trust model, rotation, revocation, algorithm identifiers) and is not tied to any 3GPP security framework; as written it is not implementable or reviewable for compliance.

  10. [Technical] The container description (“ARF document in ISOBMFF item in top-level MetaBox” and “additional items for each component”) lacks the required ISOBMFF signaling details (item types/brands, item references, protection signaling, track vs item mapping), so “partial access” is asserted without a defined mechanism.

  11. [Technical] The “stored format” description mentions “sample grouping” for sequences like “smile/wave/dance” but does not define how sequences are indexed, timed, or referenced (e.g., sample groups, edit lists, metadata keys), which is necessary for interoperable retrieval and playback.

  12. [Editorial] Multiple references to “Figure 12/13/14” are included but the figures are not provided; if this is intended as a spec update, the contribution should either include the figures or remove figure-dependent statements.

  13. [Editorial] Terminology is inconsistent and sometimes non-standard (“workgroup” vs WG, “Committee Draft International Standard (CDIS)” vs ISO stage naming, “ISOBMFF” vs “ISO Base Media File Format”), which should be normalized to avoid confusion in 3GPP text.

  14. [Editorial] The document reads like a tutorial/overview (requirements lists, “ongoing exploration experiments”) rather than a precise update to a specific 3GPP section; it should clearly separate informative background from the exact proposed replacement/addition text for Section 6.3.4.
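The float-timescale problem in point 5 is easy to demonstrate numerically. The sketch below stores a broadcast-style rational clock rate (60000/1001 ticks per second, i.e. 59.94 Hz) in a 32-bit float, as the proposed "timescale" field would, and compares tick-to-seconds conversion against the exact rational; the choice of rate and tick count is illustrative only:

```python
import struct
from fractions import Fraction

# 59.94 Hz expressed exactly as a rational: 60000/1001 ticks per second.
NUM, DEN = 60000, 1001
exact_rate = Fraction(NUM, DEN)

# Round-trip the rate through a 32-bit float, as the proposed float
# "timescale" field would store it.
f32_rate = struct.unpack('<f', struct.pack('<f', NUM / DEN))[0]

# 1001 is not a power of two, so the stored rate is already inexact,
# and tick-to-seconds conversion drifts relative to the exact rational.
ticks = 10_000_000  # roughly 46 hours of media at this rate
seconds_exact = float(Fraction(ticks) / exact_rate)
seconds_f32 = ticks / f32_rate
drift = seconds_f32 - seconds_exact
print(f"rate error: {f32_rate - float(exact_rate):.3e}, drift: {drift:.3e} s")
```

An integer timescale (or a numerator/denominator pair, as rationals are exact) avoids both the representation error and any cross-implementation divergence in rounding behavior.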

[FS_Avatar_Ph2_MED] Procedures for BAR API Operations InterDigital Canada
Previous Reviews:
manager
2026-02-09 04:58:31
  1. [Technical] The proposal targets TR 26.813 clause 8.3.3.x, but the described procedures are for BAR APIs “integrated into TS 26.264 Annex B”; it is unclear why normative-looking API procedures are being added to a TR rather than aligned with (or referenced from) TS 26.264, risking duplication and inconsistency across specs.

  2. [Technical] Several API names are inconsistent across the document (e.g., Mbar_Management_BaseAvatarModels_GetBaseAvatarModel vs Mbar_Management_Avatars_CreateBaseAvatarModel), suggesting mismatches with the actual OpenAPI operationIds in TS 26.264 Annex B and making the procedures potentially non-actionable/incorrect.

  3. [Technical] The “Update Base Avatar Model” PATCH mechanism (“multipart/mixed … asset identifiers list and binary assets”) is underspecified: no definition of the patch semantics, part naming, content-types, ordering, or how assetIds map to parts, and no reference to a standard patch format (e.g., JSON Patch / Merge Patch), which will lead to non-interoperable implementations.

  4. [Technical] The “Update Asset” PATCH description (“multipart/mixed message with LoDs/components to replace”) introduces new concepts (LoDs/components) without defining them in BAR/ARF context or referencing ARF container structure, so the server-side update behavior cannot be consistently implemented.

  5. [Technical] Response payloads are inconsistent: “Create Asset” says response is “201 Created with assetId” but later “Response Information Elements” says “Avatar resource entity (M)”; similarly “Create Base Avatar Model” returns “Avatar resource entity” but “Get Base Avatar Model” returns both “Avatar resource entity and binary ARF container”—the resource model vs binary container handling needs a consistent pattern.

  6. [Technical] Error handling is entirely missing (401/403/404/409/413/415/422 etc.); given authorization checks, missing avatarId/assetId, and container validation, the procedures should at least identify key failure cases and expected HTTP status codes to avoid divergent behavior.

  7. [Technical] The document repeatedly states “globally unique identifier” creation for avatar and representation IDs but does not specify scope/format (UUID/URI), nor how it relates to the resource path variables {avatarId}, {assetId}, {avatarRepresentationId}—this is critical for API interoperability.

  8. [Technical] Authorization/ownership rules are inconsistent and incomplete: only Avatar Representation update notes “only owner allowed,” while create/update/delete of avatars/assets also imply restrictions; the contribution should define a coherent authorization model (who may create assets under an avatar, who may delete, delegation to MF/DC AS, etc.).

  9. [Technical] The “Get Avatar Representation” procedure says {avatarId} is replaced in the resource path, but retrieval should typically be keyed by {avatarRepresentationId} (and possibly {avatarId}); as written it is ambiguous which representation is retrieved when multiple representations exist for one avatar.

  10. [Technical] “Destroy Avatar Representation” allows “204 No Content (or 200 OK if response body needed)”; this optionality without specifying when/what body is returned undermines interoperability and should be fixed to a single behavior aligned with TS 26.264.

  11. [Technical] The procedures mention “BAR stores ARF container locally” and “may repackage container” but do not address versioning/ETags/If-Match concurrency control; without this, simultaneous updates (especially PATCH) can corrupt container state.

  12. [Editorial] There are multiple typos in operation names and paths that would be problematic if copied into a spec: Mbar_Managment_Avatars_DeleteBaseAvatarModel (Management misspelled), GetAssoicatedInformation (Associated misspelled), and inconsistent underscores in Mbar_Management_Avatar_Representations_UpdateAvatarRepresentation.

  13. [Editorial] The “Request Information Elements” tables mix “Security credentials (M)” with “Requestor identifier (M)” only for Update Asset; terminology and presence conditions (M/CM) are not defined in this contribution, and “CM” is used without explanation (conditional mandatory on what condition).

  14. [Editorial] The “Note: DC AS or BAR apply restrictions…” is vague and not written in 3GPP normative style (no “shall/should/may” with clear conditions), and it’s unclear whether it is informative guidance or intended normative behavior.

  15. [Editorial] The proposal text (“Document section 2 contents as new clause 8.3.3.4… Add editor’s note…”) does not include the actual CR-style change text, clause numbering context, or exact insertion points, making it hard for SA4 to assess impact and consistency with existing TR 26.813 structure.
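To illustrate why point 3 asks for a reference to a standard patch format: RFC 7386 (JSON Merge Patch) defines one unambiguous semantics in about ten lines — dicts merge recursively, null deletes a member, anything else replaces the target. The sketch below implements that algorithm on a hypothetical avatar resource; the field names are invented for illustration and are not from TS 26.264:

```python
def merge_patch(target, patch):
    """RFC 7386 JSON Merge Patch: dicts merge recursively, None (JSON null)
    deletes a member, any non-dict patch value replaces the target outright."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result

# Hypothetical avatar resource: bump one asset, delete another.
avatar = {"name": "alice", "assets": {"head": "v1", "hands": "v1"}}
patch = {"assets": {"head": "v2", "hands": None}}
print(merge_patch(avatar, patch))
# {'name': 'alice', 'assets': {'head': 'v2'}}
```

Without a normative choice like this (or JSON Patch, RFC 6902, for the structured parts plus defined multipart conventions for the binaries), each implementation will invent its own merge/replace/delete rules for the multipart PATCH body.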

Avatar-udpate to section 6.3.4 InterDigital New York
Previous Reviews:
manager
2026-02-09 04:58:56
  1. [Technical] The contribution appears to introduce a fully specified AAU animation bitstream format (types, headers, payload syntax, formulas) into AVATAR §6.3.4, but it is unclear whether AVATAR is intended to normatively define MPEG ARF bitstream syntax versus only referencing ISO/IEC 23090-39/43; this risks duplicating or conflicting with MPEG normative text and creating maintenance divergence.

  2. [Technical] The AAU header definition (“7-bit AAU type code” + “payload length in bytes”) is underspecified for interoperability: no endianness, alignment/padding rules, length field size, and no explicit framing/escape mechanism are stated, making parsing ambiguous across transports and files.

  3. [Technical] The timestamp/timescale design is inconsistent and potentially incorrect: payload uses a 32-bit timestamp in “ticks” while configuration uses a 32-bit float “timescale” for ticks-per-second; using float for a clock rate is unusual and can cause rounding/non-determinism—an integer timescale (and/or rational) is typically required for exact sync.

  4. [Technical] The AAU_CONFIG “animation profile string” is too open-ended to ensure interoperability; without a registry, versioning rules, and normative constraints tied to each profile, receivers cannot reliably validate or negotiate capabilities.

  5. [Technical] The joint animation sample uses a per-joint 4×4 matrix (and optional velocity matrix) but does not specify coordinate system conventions (right/left-handed, units), matrix layout (row/column-major), composition order, or whether transforms are local-to-parent vs model-space—these are critical to avoid mismatched animation playback.

  6. [Technical] The LBS formula provided (vᵢ = Σⱼ wᵢⱼ · Mⱼ · vᵢ⁰) omits the standard bind-pose correction (inverse bind matrices) and does not clarify whether Mⱼ already includes that; given the earlier mention of inverse bind matrices, the text should align or it will mislead implementers.

  7. [Technical] Blendshape sample semantics are incomplete: no constraints on weight ranges, normalization, additive vs absolute interpretation, and no definition of how “confidence” affects rendering/estimation; this can lead to incompatible behavior across clients.

  8. [Technical] Landmark AAU dimensionality (2D vs 3D) is defined but not the reference frame (image plane vs normalized device coords vs mesh local/world), nor units and origin; without this, landmarks cannot be used consistently for overlays/registration.

  9. [Technical] Texture animation samples are described as “parametric texture weights” controlling “micro-geometry patterns, makeup, dynamic material variations,” but there is no normative mapping to material models (e.g., glTF PBR parameters) or to ARF TextureSet targets, so different implementations will interpret the same stream differently.

  10. [Technical] The container statements (“ARF document in MetaBox item, may include animation tracks”; zip-based with relative references) lack the key identifiers needed for interoperability (item types/brands, track handler/sample entry, MIME/URN mapping, and security considerations for zip path traversal), and may conflict with existing ISOBMFF conventions if not aligned.

  11. [Technical] The “preamble” addition includes “authenticationFeatures with encrypted facial/voice feature vectors and public key URI,” which raises privacy/security and regulatory implications; without specifying threat model, key management, encryption scheme, and consent/usage constraints, this is risky to introduce even as descriptive text.

  12. [Technical] The data model details (e.g., tensors “Nx16”, “NxM”, GLB references for blendshapes) read like normative schema requirements but do not define encoding (binary layout, precision, indexing, limits) or how these map to glTF/23090-14 constructs, risking inconsistent implementations.

  13. [Editorial] The update of the MPEG ARF reference from WG03N1316 to WG03N1693 and the claim that 23090-39 is at CDIS stage should be verified against the exact cited document and date; AVATAR text should avoid hard-coding maturity statements that will quickly become outdated.

  14. [Editorial] Terminology is inconsistent between “ARF document,” “preamble,” “metadata object,” “components section,” and “MetaBox item”; if these are intended as formal objects/boxes, they should be capitalized/defined consistently, otherwise phrasing should be clearly informative-only.

  15. [Editorial] Several parts read like implementation notes (“available through Python language,” “inverse kinematics system for missing joint information”) rather than specification text; AVATAR §6.3.4 should separate normative requirements from informative examples to avoid implying mandatory behavior.
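The bind-pose ambiguity in point 6 can be pinned down with a short sketch. Assuming Mⱼ is meant to be the composition of a model-space global joint transform Gⱼ with the inverse bind matrix Bⱼ⁻¹ (one common convention, e.g. glTF skinning — the reviewed text does not say), linear blend skinning reads as follows; the function and array names are hypothetical:

```python
import numpy as np

def skin_vertices(v0, weights, joint_globals, inv_bind):
    """Linear blend skinning with the bind-pose correction written out:
    v_i = sum_j w_ij * (G_j @ B_j^-1) @ v_i^0   (homogeneous coordinates),
    i.e. M_j in the reviewed formula must equal G_j @ B_j^-1."""
    v0_h = np.concatenate([v0, np.ones((v0.shape[0], 1))], axis=1)  # (V, 4)
    skin_mats = joint_globals @ inv_bind                            # (J, 4, 4)
    per_joint = np.einsum('jab,vb->jva', skin_mats, v0_h)           # (J, V, 4)
    blended = np.einsum('vj,jva->va', weights, per_joint)           # (V, 4)
    return blended[:, :3]

# Sanity check: posing the skeleton exactly at its bind pose must reproduce
# the rest geometry — which only holds if B_j^-1 is applied somewhere.
bind = np.stack([np.eye(4), np.eye(4)])
bind[0, :3, 3] = [1.0, 0.0, 0.0]   # joint 0 bind pose: translate +x
bind[1, :3, 3] = [0.0, 2.0, 0.0]   # joint 1 bind pose: translate +y
v0 = np.array([[0.5, 0.5, 0.5], [1.0, -1.0, 2.0]])
weights = np.array([[0.3, 0.7], [1.0, 0.0]])   # rows sum to 1
posed = skin_vertices(v0, weights, bind, np.linalg.inv(bind))
print(np.allclose(posed, v0))  # True
```

If Mⱼ instead already folds in Bⱼ⁻¹ (the other common convention), applying the correction a second time deforms the rest pose — which is exactly the cross-implementation mismatch the comment warns about; the spec text must pick one convention explicitly.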

Work Plan for Avatar_Ph2_MED InterDigital Europe
[FS_Avatar_Ph2_MED] pCR on 3D Gaussian Splatting Avatar Methods for Real-Time Communication Qualcomm Atheros, Inc.
[FS_Avatar_Ph2_MED] pCR on Avatar Evaluation Framework and Objective Metrics Qualcomm Atheros, Inc.
[FS_avatar_ph2_MED] Consolidated CR InterDigital
[FS_Avatar_Ph2_MED] LS on MPEG I ARF compression aspects Nokia
[FS_Avatar_Ph2_MED] Authentication for avatar data Nokia
[FS_Avatar_Ph2_MED] LS on IMS network behaviour for new Contact header parameters InterDigital Pennsylvania

Total TDocs: 18 | PDFs: 16 | AI Summaries: 11 | AI Proposals: 11

Comments saved: 11 / 18