S4-260161

[FS_6G_MED]pCR on Embodied Video for 6G Media

Source: China Mobile Com. Corporation
Meeting: TSGS4_135_India
Agenda Item: 11.1

All Metadata
Agenda item description FS_6G_MED (Study on Media aspects for 6G System)
Doc type pCR
For action Agreement
Release Rel-20
Specification 26.870
Version 0.0.1
Related WIs FS_6G_MED
For Agreement
Spec 26.870
Type pCR
Contact Jiayi Xu
Uploaded 2026-02-03T12:59:10.947000
Contact ID 89460
TDoc Status noted
Reservation date 03/02/2026 12:50:08
Agenda item sort order 60
Review Comments
manager - 2026-02-10 04:17


  1. [Technical] The proposed new use case “Embodied Video Internet (EVI)” is not clearly mapped to the existing TR 26.870 study objectives and terminology; it reads like a new umbrella concept rather than a media-centric use case, and the pCR should explicitly justify why it belongs in TR 26.870 (SA4 media study) versus remaining in the SA1/SA2 domain.

  2. [Technical] Several KPI values appear internally inconsistent or insufficiently specified for media work: e.g., “6x 1080p @ 15Hz → 20 Mbps” and “compression ratio 240:1” are asserted without stating codec, chroma format, bit depth, target quality, or whether “Hz” means fps, making the derived bitrates non-reproducible and potentially misleading for SA4 conclusions.
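For illustration, the implied compression ratio swings widely with the unstated assumptions; a back-of-envelope check (pixel formats below are assumed, none are given in the pCR):

```python
# Back-of-envelope check of the "6x 1080p @ 15 fps -> 20 Mbps" KPI under
# different (unstated) pixel-format assumptions; figures are illustrative only.
def raw_bitrate_mbps(width, height, fps, bits_per_pixel, streams=1):
    """Uncompressed aggregate bitrate in Mbps."""
    return width * height * fps * bits_per_pixel * streams / 1e6

target_mbps = 20.0  # aggregate compressed bitrate asserted in the contribution
for label, bpp in [("YUV 4:2:0, 8-bit", 12),
                   ("YUV 4:2:2, 10-bit", 20),
                   ("RGB, 8-bit", 24)]:
    raw = raw_bitrate_mbps(1920, 1080, 15, bpp, streams=6)
    print(f"{label}: raw {raw:,.0f} Mbps -> compression {raw / target_mbps:.0f}:1")
```

None of these assumption sets reproduces the asserted 240:1 (they give roughly 112:1, 187:1 and 224:1), which is exactly why the codec, chroma format, and bit depth need to be stated.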

  3. [Technical] The latency requirement “E2E RTT 100–300 ms” for “real-time” embodied control/offloading is not reconciled with the tighter control-loop needs implied elsewhere (e.g., 10 ms sensor intervals, motion control), and the text should distinguish clearly between (a) media transport latency for perception streams and (b) closed-loop control latency/reliability requirements.
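The mismatch can be made concrete (the arithmetic below uses only the figures quoted in the comment; the framing is illustrative):

```python
# A control loop fed by 10 ms sensor intervals cannot tolerate the stated
# 100-300 ms "E2E RTT"; a perception/media stream can. Illustrative only.
sensor_interval_ms = 10.0       # sensor reporting interval cited in the text
stated_rtt_ms = (100.0, 300.0)  # "E2E RTT" range as stated
for rtt in stated_rtt_ms:
    spanned = rtt / sensor_interval_ms
    print(f"RTT {rtt:.0f} ms spans {spanned:.0f} sensor intervals; "
          f"control input would be {spanned:.0f} samples stale")
```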

  4. [Technical] The contribution mixes “video” requirements with non-media payloads (LiDAR, point clouds, sensor data) but does not define the scope boundary for TR 26.870 (media codecs/protocols/QoE); without scoping, the clause risks driving requirements that are more appropriate for generic data transport or edge computing studies.

  5. [Technical] “AI codec with error-tolerant capabilities (Grace method)” is introduced as a requirement but is not defined, referenced, or aligned with ongoing 3GPP/MPEG terminology (e.g., neural codecs, feature/latent compression, ROI coding); as written it is not actionable and could conflict with existing codec evaluation frameworks.

  6. [Technical] “AI-native Video Protocol” is proposed as a key requirement without identifying what is missing in existing protocol stacks (RTP/RTCP, QUIC, DASH/CMAF, WebRTC, 5G media streaming) or what protocol functions are uniquely required (e.g., semantic prioritization, multi-stream synchronization, in-network adaptation), so it reads as a vague solution statement rather than a requirement.

  7. [Technical] Reliability targets such as “>99.99%” are stated for UAV inspection and robot sensor/LiDAR traffic without defining the reliability metric (packet success probability, frame delivery, application-level inference success, within what time bound), which is critical for translating to media-layer mechanisms.
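The gap is easy to illustrate: under an independent-loss assumption (introduced here for illustration), a 99.99% per-packet success probability does not yield 99.99% frame delivery:

```python
# Application-level frame reliability implied by a per-packet success
# probability, assuming independent losses and no FEC/retransmission.
def frame_success(p_packet: float, packets_per_frame: int) -> float:
    return p_packet ** packets_per_frame

p = 0.9999             # ">99.99%" read as per-packet success probability
for n in (1, 10, 30):  # assumed packets per video/LiDAR frame
    print(f"{n:2d} packets/frame -> frame delivery {frame_success(p, n):.4%}")
```

At 30 packets per frame, frame-level delivery drops to roughly 99.70%, so the metric (packet vs frame vs inference success, and the time bound) materially changes the requirement.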

  8. [Technical] The UAV “event security” latency “≤10 ms” for 1K/4K video at ≥5/≥25 Mbps is extremely stringent and likely infeasible end-to-end for typical video pipelines (capture/encode/packetize/decode/render), unless it refers to one-way transport only; the clause should clarify the latency definition and include processing components or explicitly exclude them.
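A rough pipeline sum (per-stage delays below are assumed for illustration, none are from the contribution) shows why “≤10 ms” can only plausibly refer to one-way transport:

```python
# Assumed per-stage delays for a 4K camera-to-display pipeline; illustrative
# figures only, not values taken from the contribution.
stages_ms = {
    "capture (one frame interval @ 30 fps)": 33.3,
    "encode": 15.0,
    "packetize + send": 2.0,
    "network one-way": 10.0,
    "receiver jitter buffer": 20.0,
    "decode + render": 10.0,
}
total = sum(stages_ms.values())
print(f"pipeline total = {total:.1f} ms vs. stated <= 10 ms")
```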

  9. [Technical] Multi-camera scenarios (6–8 cameras, mixed 1080p/4K, 15/30/60 fps) are listed, but there is no requirement discussion on synchronization (inter-camera time alignment), multi-stream correlation, or joint encoding/transport, which are central media issues for embodied perception and 3D reconstruction.
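As a toy illustration of the missing synchronization requirement (camera timestamps and the tolerance value are invented):

```python
# Group frames from multiple cameras by capture index and flag groups whose
# timestamp spread exceeds a sync tolerance. All values are invented.
tolerance_ms = 5.0
frames = {  # capture timestamps (ms) for three hypothetical cameras
    "cam0": [0.0, 33.3, 66.7],
    "cam1": [1.2, 34.0, 72.5],
    "cam2": [0.4, 35.1, 66.9],
}
for i in range(3):
    ts = [frames[cam][i] for cam in frames]
    spread = max(ts) - min(ts)
    verdict = "OK" if spread <= tolerance_ms else "OUT OF SYNC"
    print(f"frame group {i}: spread {spread:.1f} ms -> {verdict}")
```

Without a stated inter-camera alignment bound of this kind, joint 3D reconstruction requirements cannot be derived from the listed camera configurations.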

  10. [Technical] The “QoE model” section is too generic and user-centric for a machine-consumer scenario; embodied AI often optimizes task success (e.g., mAP, tracking stability, control error) rather than human QoE, so the clause should introduce task-oriented QoS/QoE (QoTask) metrics and how they relate to media impairments.

  11. [Technical] The contribution cites SA1 TR 22.870 use cases but does not ensure consistent numbering/traceability (e.g., “Use Case 6.28/6.19/6.48/6.11”) to the exact clauses/tables in TR 22.870; without precise references, the extracted KPIs risk being challenged as non-authoritative.

  12. [Editorial] Clause/table numbering appears inconsistent in the summary (“Table 2.1.3-1/2.1.3-2” under “Clause 6.1.3”), suggesting the pCR may introduce numbering conflicts or incorrect cross-references in TR 26.870; numbering should follow the target document’s clause structure.

  13. [Editorial] Terms are used inconsistently or non-standardly (“15Hz” for frame rate, “E2E RTT” vs “E2E latency”, “1K” resolution), and should be normalized to 3GPP style (fps, one-way latency vs RTT, explicit pixel dimensions).

  14. [Editorial] The text frequently shifts from requirements to solution proposals (“new protocol design”, “AI codec technology”) without using normative/requirements language appropriate for a TR study clause (e.g., “may need”, “is expected to”), which could be seen as over-prescriptive for a study item.
