Meeting: TSGS4_135_India | Agenda Item: 11.1
Neural Network Based Video Codec Architecture and Support for Error Resilience
Huawei Tech.(UK) Co.. Ltd
pCR
Agreement
| TDoc | S4-260095 |
| Title | Neural Network Based Video Codec Architecture and Support for Error Resilience |
| Source | Huawei Tech.(UK) Co.. Ltd |
| Agenda item | 11.1 |
| Agenda item description | FS_6G_MED (Study on Media aspects for 6G System) |
| Doc type | pCR |
| For action | Agreement |
| Release | Rel-20 |
| Specification | 26.87 |
| Version | 0.0.1 |
| Related WIs | FS_6G_MED |
| download_url | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_135_India/Docs/S4-260095.zip |
| Contact | Rufail Mekuria |
| Uploaded | 2026-02-03T08:50:05.980000 |
| Contact ID | 104180 |
| TDoc Status | noted |
| Reservation date | 02/02/2026 13:42:53 |
| Agenda item sort order | 60 |
[Technical] The proposal places DVC/GRACE under “AI Traffic Characteristics” (Work topic #2d), but most of the added material concerns codec architecture and performance benchmarking. It would fit better under a media codec/format or “AI-based media processing” clause; as placed, the TR risks mixing traffic characterization with implementation details.
[Technical] Claims like “competitive with H.264/H.265” and “MOS up to 38% better than H.264/H.265 with AL-FEC and error concealment” are not framed with the necessary test conditions (bitrate points, resolution, latency budget, encoder presets, FEC overhead, concealment method, GOP structure), making the conclusions non-reproducible and potentially misleading for 3GPP documentation.
[Technical] The GRACE “channel-aware training” description lacks clarity on the assumed loss model (random vs burst, packetization unit, reordering, RTT/jitter) and how it maps to 3GPP radio behaviors (e.g., RLC AM/UM, PDCP reordering, HARQ), so the stated resilience benefits may not translate to 3GPP deployments as written.
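To illustrate why the loss model matters, the following minimal sketch compares independent (random) loss with a simplified two-state Gilbert burst model at the same average loss rate. All parameter values (`p_gb`, `p_bg`, the 10% target rate, the packet count) are assumptions chosen for illustration; they are not taken from the GRACE paper or from any 3GPP channel model.

```python
import random

def iid_losses(n, p, rng):
    """Independent loss: each packet is lost with probability p."""
    return [rng.random() < p for _ in range(n)]

def gilbert_losses(n, p_gb, p_bg, rng):
    """Simplified two-state Gilbert model: every packet sent in the 'bad'
    state is lost. p_gb = P(good -> bad), p_bg = P(bad -> good).
    The stationary loss rate is p_gb / (p_gb + p_bg)."""
    bad = False
    out = []
    for _ in range(n):
        if bad:
            bad = rng.random() >= p_bg   # remain bad unless we recover
        else:
            bad = rng.random() < p_gb    # enter a loss burst
        out.append(bad)
    return out

def mean_burst_length(losses):
    """Average length of runs of consecutive lost packets."""
    bursts, run = [], 0
    for lost in losses:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0

rng = random.Random(0)
n = 200_000
iid = iid_losses(n, 0.10, rng)
# 0.025 / (0.025 + 0.225) = 0.10, i.e. the same average loss rate as above.
burst = gilbert_losses(n, p_gb=0.025, p_bg=0.225, rng=rng)

iid_rate, burst_rate = sum(iid) / n, sum(burst) / n
iid_burst_len, burst_burst_len = mean_burst_length(iid), mean_burst_length(burst)
print(f"iid:   rate={iid_rate:.3f} mean burst={iid_burst_len:.2f}")
print(f"burst: rate={burst_rate:.3f} mean burst={burst_burst_len:.2f}")
```

Both traces lose roughly the same fraction of packets, but the burst model clusters losses into runs several packets long. A codec trained or evaluated only against independent loss may therefore behave differently under bursty loss, which is why the contribution should state which model was assumed.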
[Technical] The text implies “arithmetic coding mapped to packets” and “independently decodable sub-tensors,” but does not specify the resynchronization strategy (start codes, partition headers, state reset frequency) needed to make arithmetic-coded partitions independently decodable after loss; without this, the error-resilience mechanism is underspecified.
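The kind of detail that is missing can be sketched as follows. This is a minimal illustration under stated assumptions, not the GRACE mechanism: `zlib` stands in for the arithmetic coder, the 4-byte header (partition index, payload length) is hypothetical, and one partition is carried per packet. The point is only that each partition starts from a fresh coder state and carries enough header information to be placed and decoded without its neighbours.

```python
import struct
import zlib

# Hypothetical partition header: 16-bit partition index, 16-bit payload length.
HEADER = struct.Struct("!HH")

def encode_partitions(data, n_parts):
    """Split data into n_parts chunks and entropy-code each one independently."""
    size = -(-len(data) // n_parts)  # ceiling division
    packets = []
    for i in range(n_parts):
        chunk = data[i * size:(i + 1) * size]
        payload = zlib.compress(chunk)  # fresh coder state for every partition
        packets.append(HEADER.pack(i, len(payload)) + payload)
    return packets

def decode_partitions(packets, n_parts):
    """Decode whichever partitions arrived; missing ones stay None."""
    out = [None] * n_parts
    for pkt in packets:
        idx, length = HEADER.unpack_from(pkt)
        payload = pkt[HEADER.size:HEADER.size + length]
        out[idx] = zlib.decompress(payload)
    return out

data = bytes(range(256)) * 8                 # 2048 bytes of stand-in tensor data
packets = encode_partitions(data, 4)
received = [p for i, p in enumerate(packets) if i != 1]   # packet 1 is lost
decoded = decode_partitions(received, 4)
print([None if d is None else len(d) for d in decoded])
```

With a single continuous entropy-coded stream, everything after the lost packet would be undecodable. The per-partition state reset and header are what make the remaining partitions recoverable, at the cost of some compression efficiency, and the contribution should state how often such resets occur and what the header contains.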
[Technical] The proposal does not discuss how NNC bitstreams would be carried in 3GPP systems (RTP payload format, ISO BMFF sample entry, OMAF, DASH/CMAF signaling, SDP/offer-answer), which is essential if the TR is to document applicability to 6G media rather than just summarize papers.
[Technical] “Exceptional reduction in tail latency” is asserted, but the causal mechanism is not tied to system-level latency contributors (encoder lookahead, buffering, retransmissions, playout delay). In addition, Google GCC/WebRTC traces are not equivalent to 3GPP QoS flows and scheduler behavior, so the latency claim needs careful qualification.
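One plausible mechanism, which the contribution would need to confirm, is that a loss-tolerant decoder avoids waiting for retransmissions. The toy model below shows how retransmission alone can inflate the tail percentile while leaving the median unchanged. Every number here (5% loss, 80 ms round trip, 40 ms base latency) is an assumption for illustration, and the model ignores the quality penalty of decoding with missing data.

```python
import random

def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    s = sorted(values)
    rank = max(1, -(-len(s) * q // 100))  # ceil(n * q / 100)
    return s[rank - 1]

def frame_latency_ms(rng, loss_rate, rtt_ms, base_ms, retransmit):
    """Latency of one frame. With retransmission, each loss adds one round trip;
    without it, the decoder proceeds immediately with whatever arrived."""
    latency = base_ms
    if retransmit:
        while rng.random() < loss_rate:
            latency += rtt_ms
    return latency

rng = random.Random(1)
n = 50_000
with_rtx = [frame_latency_ms(rng, 0.05, 80, 40, True) for _ in range(n)]
no_rtx = [frame_latency_ms(rng, 0.05, 80, 40, False) for _ in range(n)]

for name, lat in (("retransmit", with_rtx), ("loss-tolerant", no_rtx)):
    print(name, "p50 =", percentile(lat, 50), "ms, p99 =", percentile(lat, 99), "ms")
```

In a 3GPP deployment the corresponding contributors would include HARQ and RLC AM retransmissions as well as application-layer recovery, so the claimed gain depends on which of these are active for the QoS flow in question.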
[Technical] The “reconstruction failures due to non-bit-exact arithmetic operations in GPU frameworks” point is important but incomplete: it should explicitly distinguish training-time nondeterminism from inference-time decoder determinism requirements, and identify what must be standardized (fixed-point ops, deterministic kernels, rounding modes) to ensure interoperable decoding.
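The underlying issue can be shown in a few lines. Floating-point addition is not associative, so two implementations that accumulate the same values in a different order (as parallel GPU kernels may do) can differ in the last bit, whereas integer fixed-point arithmetic gives the same result in any order. The Q16 scale factor below is a hypothetical choice for illustration, not a proposed format.

```python
# Floating point: the result depends on the order of accumulation.
a, b, c = 0.1, 0.2, 0.3
float_left = (a + b) + c
float_right = a + (b + c)
print(float_left == float_right, float_left, float_right)

# Fixed point (hypothetical Q16 format): integer addition is associative,
# so every accumulation order yields the identical result.
SCALE = 1 << 16

def to_fixed(x):
    """Quantize a real value to a Q16 integer."""
    return round(x * SCALE)

fa, fb, fc = (to_fixed(v) for v in (a, b, c))
fixed_left = (fa + fb) + fc
fixed_right = fa + (fb + fc)
print(fixed_left == fixed_right, fixed_left)
```

If the encoder and decoder derive entropy-coding probabilities from network outputs computed this way, a last-bit mismatch can in principle desynchronize the entropy decoder. This is why the comment asks for the decoder-side arithmetic requirements to be identified explicitly.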
[Technical] The document mentions “Deep Render codec in FFMPEG and VLC” as evidence of industry adoption, but it is unclear whether this is the same DVC lineage, whether it is interoperable, and whether it is actually deployed; this risks overstating maturity without verifiable references.
[Technical] The proposal does not address complexity/power trade-offs in a 3GPP-relevant way (device classes, thermal limits, uplink vs downlink split, encoder/decoder placement), and the hardware statements (A40 GPU fps, “real-time on mobile”) are too vague to inform 6G feasibility.
[Editorial] Adding the DVC and GRACE papers to “normative references” is likely incorrect for a TR-style descriptive clause. They should be informative references unless the text normatively depends on them; otherwise the listing creates an unintended compliance implication.
[Editorial] The new clause numbering “6.2.4.X” is a placeholder and should be resolved to an actual number consistent with the TR structure; leaving “X” is not acceptable in a contribution proposing spec text.
[Editorial] Several statements are promotional or absolute (“exceptional,” “key enabler,” “realistic conditions”) and should be rewritten in neutral 3GPP style with quantified qualifiers and explicit assumptions.
[Editorial] The summary references “clauses 2 and 3” as the basis for the documentation request, but the proposed insertion is in clause 6.2.4; the contribution should align the rationale with the actual target clause(s) and ensure cross-references are consistent.
[Technical] The “content-specific due to training data dependencies” limitation is noted but not connected to operational mitigations relevant to 3GPP (model update cadence, on-device adaptation, signaling of model/version, backward compatibility), leaving a major deployment issue unaddressed.
[Editorial] If architecture diagrams are included, the contribution should ensure they are either original or properly licensed/cited and that figure captions and terminology match 3GPP conventions (e.g., avoid paper-specific module names without definitions).