[AIML_IMS-MED] AI intermediate data format
This contribution proposes defining an intermediate data carriage format for AI/ML split inferencing, derived from TR 26.927. The document introduces:
Split inferencing, approved and mandated in 5G, is a key objective of the work item. The solution must support:
The proposed format is derived from:
Tensor characteristics are not static and may change dynamically based on:
These characteristics must be conveyed through the user plane for accurate interpretation at the receiving end.
Key Definition: Intermediate data refers to output tensor(s) computed by a sub-model executing an inference subtask up to a defined and negotiated partitioning, transferred between endpoints (device, edge, server) to serve as input to a subsequent sub-model.
Characteristics:
- May be compressed and/or encoded before transmission
- Processing shall not alter semantics required by receiving sub-model
- Non-persistent, dynamic, and context-dependent
- Characteristics (shape, size, format) vary as a function of:
  - Input data
  - Selected inference partitioning
  - Runtime configuration
Configuration Stage: The structure of the intermediate data is defined and exchanged at the configuration stage; this is referred to as the partitioning configuration.
Dynamic Factors:
- Input media size/resolution changes may alter tensor shape
- Selected partitioning identifies active partitioning among pre-configured options
- Selected compression profile (algorithm and parameters) optimized for efficiency
Required Information in Format:
- Tensor identifier
- Inferred tensor length (derived from current tensor shape)
- Partitioning identifier (referencing negotiated configuration)
- Compression profile identifier (indicating compression method)
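
The inferred tensor length listed above can be illustrated with a minimal sketch. It assumes the length is the uncompressed size in bytes, obtained as the product of the dimensions times the element size; the `DTYPE_SIZES` table and the function name are hypothetical and not taken from the contribution.

```python
from math import prod

# Hypothetical dtype table: element sizes in bytes (an assumption for illustration).
DTYPE_SIZES = {"uint8": 1, "float16": 2, "float32": 4, "float64": 8}


def inferred_tensor_length(shape, dtype):
    """Return the uncompressed byte length of a tensor with the given shape."""
    return prod(shape) * DTYPE_SIZES[dtype]


# Doubling the input resolution changes the tensor shape, and hence the length.
small = inferred_tensor_length((1, 64, 56, 56), "float32")
large = inferred_tensor_length((1, 64, 112, 112), "float32")
```

Because the length follows from the shape and data type, a receiver that knows the current shape would not need a separate per-tensor length field for uncompressed data.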
Solution: AI Parameter Set (AIPS) defined to capture information applicable to all tensors and associated data.
Purpose: Carries metadata (tensor metadata) associated with intermediate data payload.
AIPS Lifetime:
- Starts: When decoder first receives and parses AIPS TLV unit
- Ends when any of the following occurs:
  - New AIPS with same or different ai_parameter_set_id is received
  - New session begins
  - Decoder is reset
  - Number of tensors or tensor shape changes
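
The lifetime rules above can be sketched as a small receiver-side state holder. This is an illustration only: the class and method names are hypothetical, the AIPS is represented as a plain dictionary, and a change in the number of tensors or in a tensor shape is assumed to be signalled by the sender transmitting a new AIPS.

```python
class AipsTracker:
    """Tracks which AIPS is currently active at the receiver (illustrative)."""

    def __init__(self):
        self.active = None  # no AIPS is active until the first one is parsed

    def on_aips(self, aips):
        # Any previously active AIPS ends here, whether or not the new one
        # carries the same ai_parameter_set_id; the new AIPS becomes active.
        self.active = aips

    def reset(self):
        # Covers both a decoder reset and the start of a new session.
        self.active = None

    def require_active(self):
        # Intermediate data cannot be interpreted without an active AIPS.
        if self.active is None:
            raise RuntimeError("intermediate data received with no active AIPS")
        return self.active
```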
AIPS Fields (Table X.X.13-1):
| Field | Meaning |
|-------|---------|
| ai_parameter_set_id | Unique ID of AIPS |
| split_point_id or partition_id | Key identifier of split point/partition |
| num_tensors | Number of tensors |
| For each tensor: | |
| - tensor_id | Tensor identifier |
| - dtype | Data type of tensor data |
| - rank | Number of dimensions |
| - For each dimension: dimension | Size of dimension |
| - compression_profile_id | Compression profile identifier |
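
As an illustration of the field list above, the following sketch models the AIPS as a data structure with a simple binary serialization. The field order follows the table, but the field widths (one byte per identifier and count, four bytes per dimension), the network byte order, and the numeric `dtype` code are assumptions, since the table does not state them.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class TensorInfo:
    tensor_id: int
    dtype: int                   # hypothetical numeric code for the data type
    dims: list                   # one size per dimension; rank == len(dims)
    compression_profile_id: int


@dataclass
class Aips:
    ai_parameter_set_id: int
    partition_id: int
    tensors: list = field(default_factory=list)

    def pack(self) -> bytes:
        # Assumed widths: 1 byte per identifier/count, 4 bytes per dimension.
        out = struct.pack("!BBB", self.ai_parameter_set_id,
                          self.partition_id, len(self.tensors))
        for t in self.tensors:
            out += struct.pack("!BBB", t.tensor_id, t.dtype, len(t.dims))
            out += struct.pack(f"!{len(t.dims)}I", *t.dims)
            out += struct.pack("!B", t.compression_profile_id)
        return out

    @classmethod
    def unpack(cls, data: bytes) -> "Aips":
        ps_id, part_id, num_tensors = struct.unpack_from("!BBB", data, 0)
        off, tensors = 3, []
        for _ in range(num_tensors):
            tid, dtype, rank = struct.unpack_from("!BBB", data, off)
            off += 3
            dims = list(struct.unpack_from(f"!{rank}I", data, off))
            off += 4 * rank
            (profile,) = struct.unpack_from("!B", data, off)
            off += 1
            tensors.append(TensorInfo(tid, dtype, dims, profile))
        return cls(ps_id, part_id, tensors)
```
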
TLV Message Components:
- Type: Identifies the kind of information carried in the payload
- Length: Size of the payload
- Payload: Data
TLV Unit Types (Table X.X.24-1):
| Type Value | Description |
|------------|-------------|
| 0 | Reserved |
| 1 | AI Parameter Set data (AIPS) |
| 2 | Intermediate data |
| 3-255 | Undefined |
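
A minimal sketch of TLV framing with the type values above is given below. The one-byte type follows from the 0-255 range in the table; the four-byte big-endian length field is an assumption, and the function names are hypothetical.

```python
import struct

TLV_AIPS = 1               # AI Parameter Set data (AIPS)
TLV_INTERMEDIATE_DATA = 2  # Intermediate data


def encode_tlv(unit_type: int, payload: bytes) -> bytes:
    """Prefix the payload with its type and length."""
    if not 1 <= unit_type <= 255:
        raise ValueError("type 0 is reserved; type must fit in one byte")
    # Assumed header layout: 1-byte type, 4-byte big-endian length.
    return struct.pack("!BI", unit_type, len(payload)) + payload


def decode_tlvs(buf: bytes):
    """Yield (type, payload) for each TLV unit found in buf."""
    off = 0
    while off < len(buf):
        unit_type, length = struct.unpack_from("!BI", buf, off)
        off += 5
        if off + length > len(buf):
            raise ValueError("truncated TLV unit")
        yield unit_type, buf[off:off + length]
        off += length
```

A receiver would typically dispatch on the returned type, parsing type 1 units as AIPS and interpreting type 2 units against the currently active AIPS.
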
Encapsulation Scenarios:
AIPS Data Encapsulation (X.X.24.2): TLV unit encapsulating AIPS data as defined in clause 1.3
Single Tensor Encapsulation (X.X.24.3):
The tensor payload contains the flattened tensor as a byte array, possibly compressed according to the compression profile ID signalled in the AIPS.
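
The single-tensor payload can be sketched as follows. The example assumes float32 elements in network byte order and a hypothetical profile table in which profile 0 means no compression and profile 1 means zlib (DEFLATE); the contribution does not define these values.

```python
import struct
import zlib


def build_tensor_payload(values, compression_profile_id: int) -> bytes:
    """Flatten float32 values to bytes and apply the selected profile."""
    raw = struct.pack(f"!{len(values)}f", *values)
    if compression_profile_id == 0:      # hypothetical: no compression
        return raw
    if compression_profile_id == 1:      # hypothetical: zlib
        return zlib.compress(raw)
    raise ValueError("unknown compression profile")


def parse_tensor_payload(payload: bytes, compression_profile_id: int):
    """Invert build_tensor_payload and return the flat list of values."""
    if compression_profile_id == 0:
        raw = payload
    elif compression_profile_id == 1:
        raw = zlib.decompress(payload)
    else:
        raise ValueError("unknown compression profile")
    return list(struct.unpack(f"!{len(raw) // 4}f", raw))
```

The receiver reshapes the flat values using the dimensions signalled in the AIPS, so the payload itself carries no shape information.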
Multiple Tensors Encapsulation (X.X.24.4): TLV unit encapsulating the data of more than one tensor
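
One possible way to recover individual tensors from such a unit is sketched below. It assumes, purely for illustration, that uncompressed tensors are concatenated in the order listed in the AIPS, so that each tensor's extent can be inferred from its shape and element size; the actual layout is defined by the referenced clause, and the function name is hypothetical.

```python
from math import prod


def split_tensors(payload: bytes, tensor_infos):
    """Split a concatenated payload into {tensor_id: bytes}.

    tensor_infos: list of (tensor_id, element_size_bytes, shape) tuples,
    in the order the tensors are assumed to appear in the payload.
    """
    out, off = {}, 0
    for tensor_id, elem_size, shape in tensor_infos:
        length = prod(shape) * elem_size  # inferred tensor length in bytes
        if off + length > len(payload):
            raise ValueError("payload shorter than the signalled shapes imply")
        out[tensor_id] = payload[off:off + length]
        off += length
    if off != len(payload):
        raise ValueError("payload longer than the signalled shapes imply")
    return out
```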
Terminology Updates:
- "Split point" terminology changed to "partitioning" throughout
- "Head sub-model" and "Tail sub-model" terminology refined to "sub-model" and "subsequent sub-model"
Structural Additions:
- Addition of partition identifier (highlighted as new in original document)
- Formalization of AIPS lifetime management
- Complete TLV encapsulation framework
The document proposes: