# Summary of S4-260285: AVATAR-Update to section 6.3.4

## Document Overview
This contribution updates section 6.3.4 of the AVATAR specification to align with the current status of the MPEG Avatar Representation Format (ARF) work. The document reflects the progression of ISO/IEC 23090-39 from early development to the Committee Draft International Standard (CDIS) stage.

## Main Technical Changes

### MPEG ARF Specification Status Update
- **Reference document updated**: From WG03N1316 to WG03N1693
- **Specification maturity**: ISO/IEC 23090-39 has reached Committee Draft International Standard (CDIS) stage
- **Scope refinement**: Editorial improvements to the description of the avatar representation format's scope, covering the interchange format for computer-generated avatars, containers, and animation stream formats

### Avatar Data Model and Representation Format

#### Restructured Data Model Description
The document significantly restructures how the ARF data model components are described:

**High-level Avatar Information (Metadata Object)**:
- Name, Identifier, Age, Gender
- Holds avatar-level descriptive information for system adaptation and policy control

**Preamble Section** (new addition):
- Signature string for unique document identification
- Version string tied to specific ARF revision
- Optional authenticationFeatures with encrypted facial/voice feature vectors and public key URI
- supportedAnimations object specifying compatible facial, body, hand, landmark, and texture animation frameworks using URNs
- Optional proprietaryAnimations for vendor-specific schemes (e.g., ML-based reconstruction models)
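The preamble fields above can be sketched as a plain Python dict; every field name and URN value below is an illustrative assumption, not normative ARF syntax:

```python
# Illustrative sketch of an ARF preamble; field names and URN
# values are assumptions for illustration, not normative syntax.
preamble = {
    "signature": "ARF",                  # unique document identification
    "version": "1.0",                    # tied to a specific ARF revision
    "authenticationFeatures": {          # optional
        "faceFeatures": "<encrypted feature vector>",
        "voiceFeatures": "<encrypted feature vector>",
        "publicKeyUri": "https://example.com/keys/avatar.pub",
    },
    "supportedAnimations": {             # URNs of compatible frameworks
        "facial": ["urn:example:anim:blendshape:v1"],
        "body": ["urn:example:anim:joint:v1"],
        "hand": [],
        "landmark": [],
        "texture": [],
    },
    "proprietaryAnimations": [           # optional vendor-specific schemes
        "urn:vendor:ml-reconstruction:v2",
    ],
}
```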

**Components Section** (detailed expansion):
- **Skeleton**: Defines joints as a subset of scene graph nodes, references an inverse bind matrices data item (N×16 tensor), optional animationInfo
- **Node**: Scene graph objects with names, IDs, parent/child relations, semantic mappings, and TRS or 4×4 matrix transformations
- **Skin**: Links a mesh to a skeleton, optional blendshape/landmark/texture sets, per-vertex joint weights tensor (N×M)
- **Mesh**: Geometric primitives with name, ID, optional path, data items containing geometry
- **BlendshapeSets**: Shape targets for base mesh, references geometry-only shapes (GLB files), optional animationInfo
- **LandmarkSets**: Vertex/face indices with barycentric weights for landmark positioning
- **TextureSets**: Material resources linked to texture targets and animation frameworks

### Container Format
- Supports partial access to avatar components
- Two container formats: ISOBMFF (ISO/IEC 14496-12) and Zip-based (ISO/IEC 21320-1)
- ISOBMFF containers: ARF document in MetaBox item, may include animation tracks
- Zip-based containers: Top-level ARF document with relative component file references
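Partial access to a Zip-based container can be sketched with Python's standard `zipfile` module; the entry names used here are hypothetical, not mandated by ISO/IEC 21320-1 or the ARF specification:

```python
import io
import zipfile

# Build a toy Zip-based container in memory; the entry names
# ("avatar.arf", "components/...") are illustrative assumptions.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("avatar.arf", '{"signature": "ARF", "version": "1.0"}')
    zf.writestr("components/mesh/head.glb", b"\x00" * 16)
    zf.writestr("components/blendshapes/smile.glb", b"\x00" * 16)

# Partial access: read only the top-level ARF document and one
# component, without extracting the whole archive.
with zipfile.ZipFile(buf) as zf:
    arf_doc = zf.read("avatar.arf").decode("utf-8")
    head_mesh = zf.read("components/mesh/head.glb")
```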

### Scene Description Integration
- Designed to work with MPEG Scene Description (ISO/IEC 23090-14) based on glTF
- Not limited to MPEG Scene Description
- ISO/IEC 23090-14 defines MPEG_node_avatar extension
- ISO/IEC 23090-39 extends MPEG_node_avatar for better ARF integration

### Reference Software (ISO/IEC 23090-43)

**Major update from "under development" to a defined implementation**:

**arfref Module** (C++ and Python):
- Parsing of ARF containers
- Helper functions for asset decoding
- Partial glTF 2.0 encoding/decoding support for meshes
- Animation mapping (AnimationLink objects)
- Animation stream decoding
- Accessible from Python as well as C++

**arfviewer Module**:
- Avatar Animation Units (AAUs) support
- Time-sequence blendshape weights with optional confidence metrics
- Joint transformations for skeletal animation
- AAU format with chronological data blocks
- Inverse kinematics system for missing joint information
- Blendshape animator managing neutral mesh vertices and deltas with weighted summation

### Reference Client Architecture
- Based on ISO/IEC 23090-14 concepts
- Avatar pipeline as part of Media Access Function (MAF)
- Fetches ARF container and animation streams
- Reconstructs avatar and provides to Presentation Engine through 3D mesh component buffers

### Animation Bitstream Format

**Comprehensive new section detailing AAU-based animation stream format**:

#### Avatar Animation Units (AAUs) Structure
- Sequence of AAUs with header, payload, and optional padding
- **Header**: 7-bit AAU type code, AAU payload length in bytes
- **Payload**: 32-bit timestamp in "ticks" plus type-specific data
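A minimal parser for this AAU framing might look as follows. Only the 7-bit type code and the 32-bit tick timestamp come from the summary above; the remaining field widths (a 1-byte header with bit 7 reserved, a 24-bit big-endian length) are illustrative assumptions, as the normative bitstream syntax lives in ISO/IEC 23090-39:

```python
import struct
from typing import NamedTuple

class AAU(NamedTuple):
    aau_type: int    # 7-bit AAU type code
    timestamp: int   # 32-bit timestamp in "ticks"
    data: bytes      # type-specific payload data

def parse_aaus(stream: bytes) -> list[AAU]:
    """Parse a sequence of AAUs. Layout is an illustrative assumption:
    1 header byte (bits 0-6 = type code, bit 7 reserved), a 24-bit
    big-endian payload length, then the payload starting with a
    32-bit big-endian timestamp."""
    units, pos = [], 0
    while pos < len(stream):
        aau_type = stream[pos] & 0x7F
        length = int.from_bytes(stream[pos + 1:pos + 4], "big")
        payload = stream[pos + 4:pos + 4 + length]
        (timestamp,) = struct.unpack(">I", payload[:4])
        units.append(AAU(aau_type, timestamp, payload[4:]))
        pos += 4 + length
    return units

# Toy stream with one AAU: type 2, timestamp 1000, payload data b"abc"
stream = bytes([0x02]) + (4 + 3).to_bytes(3, "big") \
    + struct.pack(">I", 1000) + b"abc"
units = parse_aaus(stream)
```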

#### AAU Types Defined
- **AAU_CONFIG**: Configuration unit
- **AAU_BLENDSHAPE**: Facial animation sample
- **AAU_JOINT**: Body/hand joint animation sample
- **AAU_LANDMARK**: Landmark animation sample
- **AAU_TEXTURE**: Texture animation sample
- Reserved ranges for future extensions

#### Configuration Units
- Animation profile string (UTF-8 encoded)
- Timescale value (32-bit float) for ticks-per-second conversion
- Profile string identifies constraints and options
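Assuming the timescale is expressed in ticks per second, converting an AAU timestamp to seconds is a single division:

```python
def ticks_to_seconds(ticks: int, timescale: float) -> float:
    """Convert an AAU timestamp in ticks to seconds, using the
    timescale (ticks per second) carried in the AAU_CONFIG unit."""
    return ticks / timescale

# e.g., with an illustrative timescale of 1000.0 ticks per second:
t = ticks_to_seconds(30_000, 1000.0)  # 30.0 seconds
```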

#### Facial Animation Samples (AAU_BLENDSHAPE)
- Target blendshape set identifier
- Per-blendshape confidence flag
- Number of blendshape entries
- Per entry: index, weight (32-bit float), optional confidence
- Deformation formula: v = v₀ + Σₖ wₖ · Δvₖ
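The deformation formula translates directly to NumPy; the array shapes below are assumptions for illustration, not an API defined by the specification:

```python
import numpy as np

def apply_blendshapes(v0: np.ndarray, deltas: np.ndarray,
                      weights: np.ndarray) -> np.ndarray:
    """Deform a neutral mesh by the weighted sum of blendshape deltas:
        v = v0 + sum_k w_k * delta_v_k
    v0:      (N, 3) neutral vertex positions
    deltas:  (K, N, 3) per-blendshape vertex offsets
    weights: (K,) weights, e.g. from an AAU_BLENDSHAPE sample"""
    return v0 + np.tensordot(weights, deltas, axes=1)

# Toy example: neutral mesh at the origin, one blendshape, half weight
v = apply_blendshapes(np.zeros((2, 3)), np.ones((1, 2, 3)),
                      np.array([0.5]))
```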

#### Joint Animation Samples (AAU_JOINT)
- Target joint set identifier
- Per-joint velocity flag
- Number of joint entries
- Per entry: joint index, 4×4 transformation matrix, optional velocity matrix
- Linear Blend Skinning (LBS) formula: vᵢ = Σⱼ wᵢⱼ · Mⱼ · vᵢ⁰
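The LBS formula can likewise be sketched in NumPy; the shapes, and the assumption that each joint matrix M_j has already been composed with its inverse bind matrix from the ARF skeleton, are illustrative:

```python
import numpy as np

def linear_blend_skinning(v0: np.ndarray, weights: np.ndarray,
                          joints: np.ndarray) -> np.ndarray:
    """Skin vertices with LBS: v_i = sum_j w_ij * M_j * v_i0.
    v0:      (N, 3) rest-pose vertex positions
    weights: (N, M) per-vertex joint weights (each row sums to 1)
    joints:  (M, 4, 4) joint transforms, assumed pre-multiplied by
             the inverse bind matrices"""
    n = v0.shape[0]
    v0_h = np.concatenate([v0, np.ones((n, 1))], axis=1)  # homogeneous
    blended = np.einsum("nm,mij->nij", weights, joints)   # (N, 4, 4)
    skinned = np.einsum("nij,nj->ni", blended, v0_h)      # (N, 4)
    return skinned[:, :3]

# Sanity check: identity joint transforms leave vertices unchanged
v = linear_blend_skinning(np.array([[1.0, 2.0, 3.0]]),
                          np.array([[0.5, 0.5]]),
                          np.tile(np.eye(4), (2, 1, 1)))
```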

#### Landmark Animation Samples (AAU_LANDMARK)
- Landmark set ID
- Velocity and confidence flags
- Dimensionality flag (2D vs 3D)
- Number of landmarks
- Per landmark: index, coordinates, optional velocity and confidence
- Use cases: facial tracking overlays, sensor-mesh registration, calibration
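Given a face's three vertices and the barycentric weights stored in a landmark set, the landmark position is their weighted sum; a small NumPy sketch (shapes assumed for illustration):

```python
import numpy as np

def landmark_position(face_vertices: np.ndarray,
                      bary: np.ndarray) -> np.ndarray:
    """Locate a landmark on a mesh face from barycentric weights:
        p = b0*v0 + b1*v1 + b2*v2,  with b0 + b1 + b2 = 1
    face_vertices: (3, 3) the face's three vertex positions
    bary:          (3,) barycentric weights"""
    return bary @ face_vertices

# Equal weights place the landmark at the centroid of a toy triangle
tri = np.array([[0.0, 0.0, 0.0],
                [3.0, 0.0, 0.0],
                [0.0, 3.0, 0.0]])
p = landmark_position(tri, np.array([1 / 3, 1 / 3, 1 / 3]))
```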

#### Texture Animation Samples (AAU_TEXTURE)
- Parametric texture weights for TextureSet targets
- Similar structure to blendshape samples
- Controls micro-geometry patterns, makeup, dynamic material variations

#### Animation Stream Delivery
- Live transmission as AAU sequences
- Storage as avatar animation tracks in ISOBMFF-based ARF containers
- Sample grouping for pre-recorded sequences (e.g., "smile," "wave," "dance")
- Dual use for real-time communication and offline authoring/replay

### Exploration Experiments

**Status changed from "initiated" to "continues"**:

- **Compression for Animation Streams**: Evaluate compression methods for facial and body animations
- **Integrating Geometry Data Components**: Specify integration into interoperable container format
- **Animation Sample Formats**: Develop structures for blend shapes, facial landmarks, animation controllers, joint transforms
- **Content Discovery and Partial Access**: Evaluate solutions
- **Animation Controllers**: Study combination of blend shape and joint animation

## Editorial Corrections
- Revision 1 corrects the reference software status
- Various grammatical and formatting improvements throughout
- Consistent terminology usage (e.g., "with" instead of "to with")