| Use Case Name: |
|---|
| Haptics-enhanced media distribution is the basic scenario, extending the delivery of audio-visual media with additional channel(s) for haptics. |
| Description: |
| A service provider receives, from various content providers, media content such as movies, live sport feeds, 2D video or audio content (audio books, music, etc.) enhanced with haptics. The service provider distributes the content to various devices that can render and augment the audio-visual experience with haptics data. The haptic effects are encapsulated in additional channel(s) and delivered synchronously with the AV content to add physical feedback in the form of haptic effects. This haptics-enhanced experience is created by the content provider or creator to maximize user engagement. The user is enticed by this new type of media and consumes more content from their service provider. The UE performs the rendering and content adaptation based on the capabilities of the UE's haptic device(s). In some cases, adaptation of and selection among the available haptics data is necessary (either at the UE or by the service provider). This adaptation may include modifying the sampling rate, limiting the number of channels, converting a force-feedback signal to a vibration signal, etc. Haptic modalities included here: motion, thermal and vibrotactile. Haptic characteristics: 1-way, passive, single-user, low density and data-rates per channel but several channels. A channel may contain one or more haptic modalities. |
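The adaptation step described above can be sketched in code. The following is a minimal, hypothetical illustration of limiting channels, resampling and degrading force feedback to vibration; the function name, signal representation and parameters are assumptions for illustration, not part of any specification.

```python
# Hypothetical sketch of UE-side haptic adaptation, assuming a haptic track is
# decoded into per-channel lists of float samples. Names are illustrative.

def adapt_haptic_track(channels, src_rate_hz, device_rate_hz, max_channels,
                       supports_force_feedback=True):
    """Adapt decoded haptic channels to the rendering device's capabilities."""
    # 1. Limit the number of channels to what the device can render.
    adapted = channels[:max_channels]

    # 2. Naive sample-rate conversion by index mapping (a real renderer
    #    would use a proper resampling filter).
    ratio = device_rate_hz / src_rate_hz
    adapted = [
        [ch[int(i / ratio)] for i in range(int(len(ch) * ratio))]
        for ch in adapted
    ]

    # 3. If the device cannot render force feedback, fall back to vibration
    #    by keeping only the signal magnitude (a crude envelope).
    if not supports_force_feedback:
        adapted = [[abs(s) for s in ch] for ch in adapted]

    return adapted
```
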
| Categorization |
| Type: distributed audio-visual and haptics content. Delivery: broadcast, HTTP streaming. Device: smartphones, wearables, seats, suits. |
| Preconditions |
|
| Characteristics |
| Bit-rates = up to 8 kbit/s per channel for compressed parametric signals, up to 64 kbit/s per channel for time-sampled signals, depending on the density of the signal. Message size = For parametric compressed signals, the burst metadata message size is, on average, up to 550 bits per channel, while data packets are up to 2500 bits, with many silent units per channel. For time-sampled signals, the signal is continuous (no silent unit or burst) with a metadata message size, on average, up to 400 bits, and a message size, on average, up to 11600 bits per channel. Number of Channels = low to high, typically 1 to 32, with non-continuous packets or a continuous signal (depending on the format). Delay = The haptic media stream is synchronized with the AV media. The perceived delay or asynchronicity between haptic media and video or audio needs to be within the tolerable delay for a passive experience. Format: both signal (time samples) and parametric. |
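As a sanity check on the figures above, the worst-case aggregate haptic bandwidth remains modest compared with the accompanying video stream:

```python
# Worst-case aggregate haptic bandwidth for this use case, using the
# per-channel figures quoted above (8 kbit/s parametric, 64 kbit/s
# time-sampled, up to 32 channels).

PARAMETRIC_KBPS = 8      # compressed parametric signal, per channel
TIME_SAMPLED_KBPS = 64   # time-sampled signal, per channel
MAX_CHANNELS = 32

parametric_total = PARAMETRIC_KBPS * MAX_CHANNELS      # 256 kbit/s
time_sampled_total = TIME_SAMPLED_KBPS * MAX_CHANNELS  # 2048 kbit/s (~2 Mbit/s)

print(parametric_total, time_sampled_total)
```

Even the time-sampled worst case (~2 Mbit/s) is small next to typical AV bit-rates, so the haptic channels add little to the overall delivery budget.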
| Feasibility |
Enabling technologies have reached a sufficient maturity in terms of:
|
| Requirements and interoperability considerations |
| It should be possible to transmit and render haptic data with various applications, environments and devices. Support for a haptics media type is considered, with one to several data channels. RTP, ISOBMFF or DASH support for haptics media is needed. Adaptation to the rendering capabilities should be supported. User adaptation could be provided. |
| Potential Standardization Status and Needs |
| Use Case Name: |
|---|
|
Haptic-enhanced Communication
Including:
clause 5.9 of TR 22.856 "Synchronized predictive avatars"
clause 5.11 of TR 22.856 "IMS-based 3D Avatar Communication"
TR 26.813 UC1 "Avatar Communication"
|
| Description: |
| This use case addresses communication between people, directly or through an avatar representation. In the simple case, haptics is a new channel added to the traditional communication means, such as i) on text messaging services to add vibrations to emojis, ii) on top of an audio call to enhance the context or just for adding effects, iii) in addition to A/V in a video call to increase the emotional impact. In a more complex scenario, this use case is about one-to-one and multi-party communication (which may be an IMS multimedia telephony call using AR/MR/VR) with spatial audio and haptics rendering, where the avatars, audio and haptics of each participant in the avatar call are transmitted and spatially rendered in the direction of their geolocation. An avatar call is similar to a video call in that both are visual and interactive, providing live feedback to participants regarding their emotions, attentiveness and other social information. An avatar may interact with the environment, another avatar or an object, and, through direct or indirect communication, relay haptic feedback to one or more avatars in the shared space. For example, avatar interaction could include throwing a ball towards another participant, using a stick to interact with environmental objects, or interacting through another object with thermal sinks (cold/hot sources) within the environment or scene. Each participant is equipped with display devices (phones, AR glasses, etc.) with external or built-in headphones and a haptics renderer. Haptic modalities included here: mostly vibrotactile. Haptic characteristics: includes 2-way communication (talks, social media, phone calls…) and 1-way (alert messages, information). Multi-user, low to high data-rates per channel, limited number of channels. |
| Categorization |
| Type: 2D/3D audio-video sessions. Delivery: conversational, split. Device: headphones, smartphones, smartwatches. |
| Preconditions |
|
| Characteristics |
| Bit-rates = up to 5 kbit/s per channel depending on density, uplink and downlink. Message size = For parametric compressed signals, the burst metadata message size is, on average, up to 550 bits per channel, while data packets are, on average, up to 2500 bits, with many silent units per channel. For time-sampled signals, the signal is continuous (no silent unit or burst) with a metadata message size, on average, up to 400 bits, and a message size, on average, up to 1600 bits per channel. Number of Channels = limited, between 1 and 4, with sparse packets. Delay = The haptic media stream is synchronized with the AV media. The perceived delay or asynchronicity between haptic media and audio needs to be lower than 1 audio frame. Format: parametric |
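The asynchronicity bound of one audio frame can be illustrated with a small receiver-side check. A 20 ms frame duration is assumed here (typical for 3GPP speech codecs); all names and the resynchronization policy are illustrative, not specified behavior.

```python
# Sketch of a receiver-side haptics/audio synchronization check, assuming a
# 20 ms audio frame. Timestamps are in milliseconds on a common media clock.

AUDIO_FRAME_MS = 20.0

def haptics_in_sync(haptic_pts_ms, audio_pts_ms, threshold_ms=AUDIO_FRAME_MS):
    """True if the haptic/audio skew is below one audio frame."""
    return abs(haptic_pts_ms - audio_pts_ms) < threshold_ms

def resync_action(haptic_pts_ms, audio_pts_ms):
    """A renderer could drop late haptic packets or delay early ones."""
    if haptics_in_sync(haptic_pts_ms, audio_pts_ms):
        return "render"
    return "drop" if haptic_pts_ms < audio_pts_ms else "delay"
```
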
| Feasibility |
Enabling technologies have reached a sufficient maturity in terms of:
|
| Requirements and interoperability considerations |
| Haptic data should be distributed with low latency and rendered with various applications, environments and devices. RTP, or ISOBMFF and DASH, support for haptics media is needed. Adaptation to the rendering capabilities should be supported. QoS parameters need to be developed and should address the asynchronicity threshold. |
| Potential Standardization Status and Needs |
| Use Case Name: |
|---|
|
Immersive entertainment (live events, sport, gaming, movies, music)
Including:
clause 5.6 of TR 22.847 "Live Event Selective Immersion"
clause 5.6 of TR 22.856 "Mobile Metaverse for Immersive Gaming and Live Shows"
clause 5.22 of TR 22.856 "Mobile Metaverse Live Concert"
|
| Description: |
| Use cases from SA2 include complex scenarios for mobile live metaverse entertainment. This use case builds upon the SA2 use cases and considers simpler experiences, covering traditional 2D and future mobile immersive applications. The media considered here include immersive movies, immersive TV series, social networks, gaming, etc. The latter could be multi-party or single-user. In this scenario, users are playing content (local or streamed) with video (2D or 3D immersive), audio (potentially spatialized) and haptics. The haptic signal is synchronized with the content to add physical feedback in the form of haptic effects, and is also generated from triggers and actions in a gaming environment. The user uses at least a smartphone with vibrotactile actuators, or more advanced devices such as gaming controllers, XR glasses and potentially multiple wearable devices to spatialize haptic effects. The gaming use case is the most challenging, as low-delay interactive experiences are required. Haptic modalities included here: thermal, kinaesthetic and vibrotactile, but acceleration can be used for some events, especially gaming. Haptic characteristics: 2-way, interactive, spatialized, multi-user, potentially significant data-rates per channel and several channels. |
| Categorization |
| Type: VR, AR, mobile. Delivery: broadcast, streaming, real-time interactive communication. Device: HMD, headphones, cushion, chair, suits, gaming controller, mobile phone. |
| Preconditions |
|
| Characteristics |
| Bit-rates = up to 8 kbit/s per channel for high-density compressed parametric signals, up to 64 kbit/s per channel for high-density time-sampled signals. Message size = For parametric compressed signals, the burst metadata message size is, on average, up to 550 bits per channel, while data packets are, on average, up to 2500 bits, with many silent units per channel. For time-sampled signals, the signal is continuous (no silent unit or burst) with a metadata message size, on average, up to 400 bits, and a message size, on average, up to 1600 bits per channel. Number of Channels = low to high, typically 1 to 32, with non-continuous packets or a continuous signal (depending on the format). Live applications could be continuous if direct sensor information is recorded and distributed. Delay = The haptic media is synchronized with the AV stream. The perceived delay should be lower than 1 video frame (< 25 ms), except for gaming (< 15 ms). Format: both signal (time samples) and parametric. |
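The delay budgets quoted above can be captured in a simple lookup, for example when validating a measured motion-to-effect delay against the per-experience target. The dictionary structure is illustrative; the numeric values are those stated in this use case.

```python
# Perceived-delay budgets from this use case: < 1 video frame (25 ms) for
# passive AV consumption, tighter (15 ms) for interactive gaming.

DELAY_BUDGET_MS = {
    "passive_av": 25,   # haptics vs. video, < 1 video frame
    "gaming": 15,       # interactive, low-delay requirement
}

def meets_budget(use_case, measured_delay_ms):
    """True if the measured haptic delay is within the stated budget."""
    return measured_delay_ms < DELAY_BUDGET_MS[use_case]
```
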
| Feasibility |
Enabling technologies have reached a sufficient maturity in terms of:
|
| Requirements and interoperability considerations |
| Haptic data should be delivered with low delay and rendered with various applications, environments and devices. RTP, ISOBMFF or DASH support for haptics media is needed. Adaptation to the rendering capabilities should be supported. User adaptation could be provided. |
| Potential Standardization Status and Needs |
| Use Case Name: |
|---|
|
Immersive multi-modal XR and metaverse (VR, AR, multi-users gaming)
From:
clause 5.1 of TR 22.847 "Immersive multi-modal Virtual Reality (VR) application"
TR 26.813 UC3 "Multi-user Gaming"
clause 5.7 of TR 22.856 "AR Enabled Immersive Experience"
clause 5.12 of TR 22.856 "Virtual humans in metaverse"
|
| Description: |
| Immersive multi-modal VR application describes the case of a human interacting with virtual entities in a remote environment such that the perception of interaction with a real physical world is achieved. Users are expected to perceive multiple senses (vision, sound, touch) for full immersion in the virtual environment. Virtual humans (or digital representations of humans, also referred to as 'avatars' in this use case) are simulations of human beings on computers. There is a wide range of applications using avatars, such as games, film and TV productions, the financial industry (smart advisers), telecommunications (avatars), etc. In the coming era, the technology of virtual humans is one of the foundations of mobile metaverse services. A virtual human can be a digital representation of a natural person in a mobile metaverse service, driven by the natural person, or a digital representation of a digital assistant driven by an AI model. Mobile metaverse services offer an important opportunity for socialization and entertainment, where the user experience of the virtual world and the real world combine. This use case focuses on the scenario of a natural person's digital embodiment in a metaverse as a location-agnostic service experience. A virtual human is customized according to a user's personal characteristics and shape preferences. Users wear motion-capture devices, vibrating backpacks, haptic gloves and VR glasses to drive the virtual human in a metaverse space for semi-open exploration. The devices mentioned above are 5G UEs, which need to collaborate with each other to complete the user's actions and get real-time feedback. Haptic modalities included here: force, motion, thermal and vibrotactile. Haptic characteristics: bi-directional, interactive, spatialized, multi-user, low latency, high data-rates. |
| Categorization |
| Type: VR/AR/XR. Delivery: streaming, split, conversational. Device: HMD, glove, suit, AR/VR controller, motion platform. |
| Preconditions |
|
| Characteristics |
| Bit-rates = up to 8 kbit/s per channel for compressed parametric signals. Message size = For parametric compressed signals, the burst metadata message size is, on average, up to 550 bits per channel, while data packets are, on average, up to 2500 bits, with many silent units per channel. For time-sampled signals, the signal is continuous (no silent unit or burst) with a metadata message size, on average, up to 400 bits, and a message size, on average, up to 1600 bits per channel. Number of Channels = high, typically 6 to 32, with non-continuous packets. Delay = The haptic media stream is synchronized with the AV media for rendering. The perceived delay or asynchronicity between haptic media and video or audio needs to be lower than 1 audio frame. Format: mostly parametric (esp. virtual scenes) |
| Feasibility |
Enabling technologies have reached a sufficient maturity in terms of:
|
| Requirements and interoperability considerations |
| It should be possible to transmit haptic data with low delay and low latency and to render it with various applications, environments and devices. RTP, ISOBMFF or DASH support for haptics media is needed. Adaptation to the rendering capabilities should be supported. User adaptation could be provided. |
| Potential Standardization Status and Needs |
|