Proposal 1: Add an explicit note in the text that this can only work for very simple cases, excluding complex VLMs/LLMs, and is limited to a certain model size; clarify which use cases can rely on such smaller models.

Proposal 3: Clarify the end-to-end latency requirements and derive the required bit-rate, latency, and loss profiles.
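To illustrate the kind of derivation this proposal asks for, the sketch below computes the bit-rate needed to deliver a model within a latency budget. The model size (10 MB), latency budget (200 ms), and overhead factor are hypothetical example values, not figures from this contribution.

```python
def required_bitrate_mbps(model_size_mb: float,
                          latency_budget_s: float,
                          protocol_overhead: float = 1.1) -> float:
    """Bit-rate (Mbit/s) needed to deliver a model of the given size
    within the latency budget, inflated by a protocol-overhead factor.
    All inputs are illustrative assumptions, not normative values."""
    bits = model_size_mb * 8e6          # payload size in bits
    return bits * protocol_overhead / latency_budget_s / 1e6

# Example: a 10 MB model with a 200 ms delivery budget and no overhead
# requires 400 Mbit/s of sustained throughput.
print(required_bitrate_mbps(10, 0.2, protocol_overhead=1.0))
```

Even this toy calculation shows that small models with tight delivery deadlines imply very high burst rates, which motivates the QoS question raised toward SA2 in a later proposal.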

Proposal 3: Clarify the correct protocol to support this use case and its latency requirements; HTTP over TCP is typically not suitable.

Proposal 4: Ask SA2 how such bursts can be supported and whether a new QoS profile is needed or an existing one suffices.

Proposal 5: Clarify the neural network codec support, if any, required of the UE.

Proposal 6: Consider adding caching and model updates to the call flow to avoid re-downloading a model for each task.
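A minimal sketch of the caching behaviour this proposal suggests: the UE keeps downloaded models keyed by identifier and version, and only fetches from the network on a cache miss or a version change. The `ModelCache` class and the `fetch` callback are hypothetical names for illustration, not part of any specified call flow.

```python
class ModelCache:
    """Toy UE-side model cache: re-download only on miss or version change."""

    def __init__(self):
        # model_id -> (version, model_bytes)
        self._store = {}

    def get(self, model_id: str, version: str, fetch) -> bytes:
        entry = self._store.get(model_id)
        if entry is not None and entry[0] == version:
            return entry[1]                 # cache hit: no network transfer
        blob = fetch(model_id, version)     # cache miss or stale version
        self._store[model_id] = (version, blob)
        return blob


# Example: two tasks using the same model trigger only one download;
# a version bump (a model update) triggers exactly one more.
downloads = []

def fetch(model_id, version):
    downloads.append((model_id, version))
    return b"model-weights"

cache = ModelCache()
cache.get("detector", "v1", fetch)   # download
cache.get("detector", "v1", fetch)   # served from cache
cache.get("detector", "v2", fetch)   # model update: download again
print(len(downloads))                # number of network transfers
```

The point of the sketch is only that the call flow needs a versioned identity for each model so the UE can decide locally whether a transfer is needed at all.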