Content for TR 26.998 Word version: 18.0.0



A.7  Use Case 22: Shared AR Conferencing Experience

Use Case Description: Shared AR Conferencing experience
This clause describes an AR conferencing use case in which participants share a virtual conference room experience. For each participant, the other participants' video objects are registered in that participant's AR scene, creating the sense that the conference is being held in the participant's physical location. The arrangement of participants (i.e. their locations relative to each other) in the virtual room is the same for everyone, which creates a consistent sense of the conference room layout for all participants.
Bob, Alice, and Tom are participating in a virtual meeting. Bob takes the call in his kitchen at home and sees 3D volumetric representations of Alice and Tom on his AR glasses. In Bob's view, Alice and Tom are sitting at his kitchen table, with Alice on the left and Tom on the right-hand side. Alice is in her office conference room. She sees 3D volumetric representations of Bob and Tom on her AR glasses. For Alice, Tom is sitting on the left and Bob on the right-hand side of her conference table. Finally, Tom takes the call in an airport lounge. For him, Bob is sitting on a chair on the right and Alice is sitting on a couch on the left. While the real-world surroundings of Bob, Alice, and Tom differ, in all scenes the participants are seated in the same arrangement relative to each other. Therefore, when Alice turns to Tom, both Tom and Bob see consistent views of Alice, as if they were all in the same physical room.
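The consistency property described above can be sketched in a few lines: participants are placed in one shared virtual arrangement, and each client derives its local view from the offsets relative to its own seat. This is a minimal illustration only; the function names, circular layout, and anchoring model are assumptions, not anything defined in this report.

```python
import math

# Illustrative sketch: one shared arrangement, rendered locally by each viewer.
# Each viewer anchors the arrangement to a different physical place (kitchen
# table, conference table, lounge), so only relative offsets matter.
PARTICIPANTS = ["Bob", "Alice", "Tom"]

def shared_positions(participants, radius=1.0):
    """Place participants evenly on a shared virtual circle."""
    n = len(participants)
    return {
        name: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, name in enumerate(participants)
    }

def local_view(viewer, positions):
    """Positions of the other participants relative to the viewer's seat."""
    vx, vy = positions[viewer]
    return {name: (x - vx, y - vy)
            for name, (x, y) in positions.items() if name != viewer}

positions = shared_positions(PARTICIPANTS)
for viewer in PARTICIPANTS:
    print(viewer, local_view(viewer, positions))
```

Because every local view is derived from the same shared positions, the relative arrangement (who sits to whose left) is identical for all participants.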
Degrees of Freedom:
3DoF+ or 6DoF
AR glasses
  • The participants wear AR glasses capable of displaying 3D volumetric representations of the other participants.
Requirements and QoS/QoE Considerations
The network is required to support the delivery of 3D volumetric streams for real-time conversational services:
  • Support for creating a composed scene in the network.
  • Support of different volumetric user representation formats.
  • Bitrates and latencies that are sufficient to stream volumetric user representations under conversational real-time constraints.
The bandwidth and latency requirements of AR conferencing with 3D volumetric representations present a challenge to mobile networks. The complexity of the 3D volumetric representations is also challenging for the endpoints and introduces additional delay in the processing and rendering functions. Intermediate edge or cloud components are therefore needed.
The following are indicative values for a potential solution and transmission format for different types of user representation:
  • A point cloud stream has a raw bandwidth requirement of up to 2 Gbps. The transmission bandwidth is expected to be lower after encoding and optimization.
  • Preliminary data from the MPEG V-PCC codec evaluation indicates compression ratios in the range of 100:1 to 300:1 [40]. For dynamic sequences of 1M points per frame, this could result in an encoding bitrate of "8 Mbps with good perceptual quality" [40]. For conversational services, lower compression ratios are expected.
  • 2D/RGB+Depth: >2.7 Mbps (1 camera @ 30 fps with a total resolution of 1080x960 [37]); >5.4 Mbps (2 cameras @ 30 fps with a total resolution of 1080x1920 [38]).
  • 3D Mesh: ~30 Mbps @ 20-25 fps (with a voxel grid resolution of 64x128x64 and 12-15k vertices) [39].
  • Preliminary data from 3D G-PCC shows that bitrates in the range of 5-50 Mbps @ 30 fps, with varying octree depth and varying JPEG QP, are expected [39].
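The raw point-cloud figure above can be sanity-checked with simple arithmetic. The per-point bit layout below (10-bit x/y/z geometry plus 8-bit RGB colour) is an illustrative assumption, not taken from this report; richer attributes such as normals or higher-precision coordinates push the result toward the 2 Gbps upper bound quoted above.

```python
# Back-of-the-envelope check of the raw point-cloud bandwidth figure.
# Assumed per-point layout (illustrative): 3 x 10-bit geometry + 3 x 8-bit colour.
BITS_PER_POINT = 3 * 10 + 3 * 8          # 54 bits per point
POINTS_PER_FRAME = 1_000_000             # 1M points, as in the V-PCC evaluation
FPS = 30

raw_bps = BITS_PER_POINT * POINTS_PER_FRAME * FPS
print(f"raw: {raw_bps / 1e9:.2f} Gbps")  # ~1.6 Gbps with this layout

# Applying the reported 100:1 to 300:1 V-PCC compression ratios:
for ratio in (100, 300):
    print(f"{ratio}:1 -> {raw_bps / ratio / 1e6:.1f} Mbps")
```

With these assumptions, the compressed bitrate lands between roughly 5 and 16 Mbps, which is consistent with the "8 Mbps with good perceptual quality" figure cited from [40].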
Potential Standardization Status and Needs
The following aspects may require standardization work:
  • Standardized formats for 3D volumetric representation of participants on AR glasses.
  • Cloud APIs for processing and rendering of 3D volumetric streams.
  • Conversational methods for call initiation.
  • Spatial audio formats and associated metadata.
  • Metadata for spatial characteristics of the AR environment (e.g. positioning of users).
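To make the last bullet concrete, a spatial-characteristics payload might carry a shared seat ordering that every client uses to reconstruct the same relative arrangement. The structure and field names below are hypothetical illustrations; no such format is defined in this report.

```python
# Hypothetical spatial-characteristics metadata (all field names are
# illustrative assumptions, not defined by TR 26.998).
placement_metadata = {
    "session_id": "conf-001",
    "arrangement": "shared",  # all participants see the same relative layout
    "participants": [
        {"id": "bob",   "seat_index": 0},
        {"id": "alice", "seat_index": 1},
        {"id": "tom",   "seat_index": 2},
    ],
}

def seats_consistent(metadata):
    """Check that seat indices are unique and contiguous from zero,
    so every client can reconstruct the same relative ordering."""
    seats = sorted(p["seat_index"] for p in metadata["participants"])
    return seats == list(range(len(seats)))

print(seats_consistent(placement_metadata))  # True
```

A validation step like `seats_consistent` matters because a duplicated or missing seat index would break the consistent-layout property the use case depends on.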

