# Summary of S4-260114: Testbed for AI Media Services Traffic Characterization

## Introduction and Motivation

This contribution from Qualcomm proposes a comprehensive testbed framework for characterizing traffic patterns and QoE metrics of generative AI services in the context of the FS_6G_MED study. The testbed addresses the need for quantitative characterization of AI-native media services under diverse network conditions, which is a key requirement for the Study on Media Aspects for 6G System.

The testbed provides end-to-end measurement capabilities for multiple AI service types:
- Chat services
- Streaming services
- Agentic tool use
- Image generation
- Multimodal analysis
- Real-time conversational AI

## Key Technical Capabilities

### Supported Metrics

The testbed captures comprehensive performance metrics including:
- **Latency metrics**: TTFT (Time To First Token), TTLT (Time To Last Token), latency percentiles
- **Traffic metrics**: UL/DL bytes and ratios, burstiness
- **Performance metrics**: Success rate, token rate, tool-call latency, streaming stall statistics
- **Protocol analysis**: All pcap-enabled analysis capabilities

### Trace Logging

Deep visibility into protocol and payload behavior is provided through trace logging functionality, which can be enabled via `TRACE_PAYLOADS=1`. This enables generation of:
- WebRTC SDP samples
- Exact computer-use request/response payloads

## Architecture and Implementation

### Modular Design

The testbed follows an orchestrator-centric architecture with clear separation of concerns:

- **orchestrator.py**: Coordinates scenario runs, applies network profiles, handles retries, and generates reports
- **scenarios/***: Implements traffic patterns for different AI service types (chat, agent, direct search, realtime, multimodal, image, video, computer use)
- **clients/***: Provides provider adapters for OpenAI®, Gemini®, DeepSeek® (OpenAI-compatible), and vLLM for self-hosted models
- **netem**: External dependency on the proposed common network emulator module [1]
- **capture/***: Provides L3/L4 pcap capture and L7 capture via mitmproxy
- **analysis/***: Logs to SQLite, computes 3GPP-aligned metrics, and generates plots

### Extensibility

The framework is designed for easy extension:
- **New scenarios**: Create a class extending `BaseScenario`, register in `scenarios/__init__.py`, and add YAML entry in `configs/scenarios.yaml`
- **New providers**: Implement a client subclassing `LLMClient` and register in the orchestrator client factory

### Self-Hosted Model Support

The testbed includes vLLM client support (`clients/vllm_client.py`) enabling evaluation of self-hosted models via OpenAI-compatible API, with the same metrics and logging pipeline as hosted providers.

## Usage and Configuration

### Configuration Management

- Scenarios and models configured in `configs/scenarios.yaml`
- Network profiles configured in `configs/profiles.yaml`

### Execution Options

- Single scenario: `python orchestrator.py --scenario chat_basic --profile 5g_urban --runs 10`
- Full matrix: `python orchestrator.py --scenario all --runs 5`
- Enable L3/L4 capture: `--capture-pcap`
- Enable L7 capture: `--capture-l7`

## Initial Results

The contribution includes preliminary evaluation results showing:
- TTFT (Time To First Token) measurements across different scenarios
- Average throughput measurements by scenario

Note: These initial results are presented as examples and are not intended for TR documentation.

## Proposal

The contribution proposes that SA4:
- **Agrees** to adopt the proposed testbed as the baseline for AI traffic characterization evaluation
- **Documents** the testbed in TR 26.870 (Study on Media Aspects for 6G System)

## References

The contribution references:
- [1] S4-260xxx: Generic Network Interface Emulator for Media Delivery Evaluation
- [2] SP-251652: New SID on Media Aspects for 6G System (FS_6G_MED)
- [3] 3GPP TR 22.870: Study on 6G Use Cases and Service Requirements
- [4] 3GPP TR 26.998: Support of XR Services