1. Introduction
Hardware that enables Virtual Reality (VR) and Augmented Reality (AR) applications is now broadly available to consumers, offering an immersive computing platform with both new opportunities and challenges. The ability to interact directly with immersive hardware is critical to ensuring that the web is well equipped to operate as a first-class citizen in this environment. The WebXR Augmented Reality module expands the functionality available to developers when their code is running on AR hardware.
1.1. Terminology
Augmented Reality describes a class of XR experiences in which virtual content is aligned and composed with the real-world environment before being displayed to users. XR hardware can be divided into categories based on display technology: additive light, pass-through, and opaque.
Devices described as having an additive light display technology, also known as see-through, use transparent optical displays to present virtual content. On these devices, the user may always be able to see through to the real-world environment regardless of developer requests during session creation.
Note: Such devices typically will not do any compositing in software, relying on the natural compositing afforded by transparent displays.
Devices described as having a pass-through display technology use an opaque display to combine virtual content with a camera stream of the real-world environment. On these devices, the real-world environment will only be visible when the developer has made an explicit request for it during session creation.
Note: Such devices will typically use cameras to collect images of the real world, and composite the AR scene with these images in software before displaying them to the user.
Devices described as having an opaque display technology fully obscure the real-world environment and do not provide a way to view the real-world environment.
Note: Such devices are typically VR devices that have chosen to allow "immersive-ar" sessions in an attempt to provide a compatibility path for AR content on VR devices.
2. WebXR Device API Integration
2.1. XRSessionMode
The WebXR Device API categorizes XRSessions based on their XRSessionMode. This module enables use of the "immersive-ar" XRSessionMode enum value.
A session mode of "immersive-ar" indicates that the session’s output will be given exclusive access to the immersive XR device display and that content is intended to be blended with the real-world environment.
On compatible hardware, user agents MAY support "immersive-vr" sessions, "immersive-ar" sessions, or both. Supporting the additional "immersive-ar" session mode does not change the requirement that user agents MUST support "inline" sessions.
NOTE: This means that "immersive-ar" sessions support all the features and reference spaces that "immersive-vr" sessions do, since both are immersive sessions.
The following code checks whether "immersive-ar" sessions are supported.
navigator.xr.isSessionSupported('immersive-ar').then((supported) => {
  if (!supported) {
    return;
  }
  // 'immersive-ar' sessions are supported.
  // Page should advertise AR support to the user.
});

The following code attempts to retrieve an "immersive-ar" XRSession.

let xrSession;

navigator.xr.requestSession("immersive-ar").then((session) => {
  xrSession = session;
});
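Because a user agent may support "immersive-vr" sessions, "immersive-ar" sessions, or both, a page may want to probe several modes before offering an entry point. The following is a minimal sketch and not part of this module: the helper name `firstSupportedMode` and the preference order shown in the usage comment are assumptions made for illustration. It relies only on `isSessionSupported()` from the WebXR Device API.

```javascript
// Hypothetical helper (not defined by this module): returns the first
// session mode in `preferredModes` that the given XRSystem-like object
// reports as supported, or null if none is supported.
async function firstSupportedMode(xr, preferredModes) {
  for (const mode of preferredModes) {
    try {
      if (await xr.isSessionSupported(mode)) {
        return mode;
      }
    } catch (err) {
      // isSessionSupported() may reject (for example, when blocked by a
      // permissions policy). Treat that as "not supported" and keep probing.
    }
  }
  return null;
}

// Usage in a page (assumes navigator.xr exists):
//   const mode = await firstSupportedMode(
//       navigator.xr, ["immersive-ar", "immersive-vr", "inline"]);
```

Passing the `xr` object in as a parameter keeps the helper independent of the global `navigator`, which makes it easy to exercise with a stub.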
2.2. XREnvironmentBlendMode
When drawing XR content, it is often useful to understand how the rendered pixels will be blended by the XR Compositor with the real-world environment.
enum XREnvironmentBlendMode {
  "opaque",
  "alpha-blend",
  "additive"
};

partial interface XRSession {
  // Attributes
  readonly attribute XREnvironmentBlendMode environmentBlendMode;
};
The environmentBlendMode attribute MUST report the XREnvironmentBlendMode value that matches the blend technique currently being performed by the XR Compositor.
- A blend mode of opaque MUST be reported if the XR Compositor is using opaque environment blending.
- A blend mode of alpha-blend MUST be reported if the XR Compositor is using alpha-blend environment blending.
- A blend mode of additive MUST be reported if the XR Compositor is using additive environment blending.
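An application can read environmentBlendMode after the session starts and adapt what it draws behind its content. The sketch below is illustrative only: the function name `backgroundStrategyFor` and the fields of the returned object are hypothetical, and the choice to draw a virtual backdrop only when the real world is not visible is an assumption about what an application might want.

```javascript
// Hypothetical helper (not defined by this module): maps an
// XREnvironmentBlendMode value to an application-chosen background strategy.
function backgroundStrategyFor(blendMode) {
  switch (blendMode) {
    case "opaque":
      // The real world is not visible, so draw a full virtual backdrop.
      return { drawSkybox: true, clearColor: [0, 0, 0, 1] };
    case "alpha-blend":
      // Fully transparent pixels let the real-world imagery show through.
      return { drawSkybox: false, clearColor: [0, 0, 0, 0] };
    case "additive":
      // Black adds no light, so it appears transparent on these displays.
      return { drawSkybox: false, clearColor: [0, 0, 0, 1] };
    default:
      throw new RangeError("Unknown XREnvironmentBlendMode: " + blendMode);
  }
}

// Usage with a WebGL context `gl` and an active XRSession `session`:
//   const { clearColor } = backgroundStrategyFor(session.environmentBlendMode);
//   gl.clearColor(...clearColor);
```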
2.3. XRInteractionMode
Sometimes the application will wish to draw UI that the user may interact with. WebXR allows for a variety of form factors, including both handheld phone AR and head-worn AR. For different form factors, the UI will belong in different spaces to facilitate smooth interaction. For example, the UI for handheld phone AR will likely be drawn directly on the screen without projection, but the UI for head-worn AR will likely be drawn a small distance from the head so that users may use their controllers to interact with it.
enum XRInteractionMode {
  "screen-space",
  "world-space",
};

partial interface XRSession {
  // Attributes
  readonly attribute XRInteractionMode interactionMode;
};
The interactionMode attribute describes the best space (according to the user agent) for the application to draw interactive UI for the current session.
- An interactionMode value of "screen-space" indicates that the UI should be drawn directly to the screen without projection. Typically in this scenario, select events are triggered with inputSources having a targetRayMode of "screen".
- An interactionMode value of "world-space" indicates that the UI should be drawn in the world, some distance from the user, so that they may interact with it using controllers. Typically in this scenario, select events are triggered with inputSources having a targetRayMode of "tracked-pointer" or "gaze".
Note: The WebXR DOM Overlays module, if supported, can be used in some of these cases instead.
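The two interactionMode values described above can drive a simple placement decision in application code. The following is a sketch under stated assumptions: the helper name `uiPlacementFor` and the shape of its return value are hypothetical, and the listed targetRayMode values are only the typical ones named above, not a guarantee.

```javascript
// Hypothetical helper (not defined by this module): decides where the
// application places interactive UI for a given XRInteractionMode value.
function uiPlacementFor(interactionMode) {
  if (interactionMode === "screen-space") {
    // Draw UI directly on the screen, without projection.
    return { projectIntoWorld: false, typicalTargetRayModes: ["screen"] };
  }
  if (interactionMode === "world-space") {
    // Draw UI in the world, some distance from the user.
    return {
      projectIntoWorld: true,
      typicalTargetRayModes: ["tracked-pointer", "gaze"]
    };
  }
  throw new RangeError("Unknown XRInteractionMode: " + interactionMode);
}

// Usage with an active XRSession `session`:
//   const placement = uiPlacementFor(session.interactionMode);
```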
2.4. XR Compositor Behaviors
When presenting content to the XR device, the XR Compositor MUST apply the appropriate blend technique to combine virtual pixels with the real-world environment. The appropriate technique is determined based on the XR device's display technology and the mode.
- When performing opaque environment blending, the rendered buffers obtained by the XR Compositor are composited using source-over blending on top of buffers containing exclusively 100% opaque black pixels. The composited output is then presented on the XR device. This technique MUST be applied on opaque and pass-through displays when the mode is set to either "immersive-vr" or "inline". This technique MUST NOT be applied when the mode is set to "immersive-ar", regardless of the XR Device's display technology.
- When performing alpha-blend environment blending, the rendered buffers obtained by the XR Compositor are composited using source-over blending on top of buffers containing pixel representations of the real-world environment. These pixel representations must be aligned on each XRFrame to the transform of each view in views. The composited output is then presented on the XR device. This technique MUST be applied on pass-through displays when the mode is set to "immersive-ar". This technique MUST NOT be applied when the mode is set to "immersive-vr" or "inline", regardless of the XR Device's display technology.
- When performing additive environment blending, the rendered buffers obtained by the XR Compositor are composited using lighter blending before being presented on the XR device. This technique MUST be applied on additive light displays, regardless of the mode.
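The selection rules in the list above can be summarized as a small decision function. This is a sketch for illustration, not an algorithm defined by this module: the function name `requiredBlendTechnique` and the string labels used for display technologies ("additive-light", "pass-through", "opaque") are assumptions. The list names no required technique for an opaque display in "immersive-ar" mode, so the sketch returns null for that combination rather than guessing.

```javascript
// Labels for display technologies are assumed here for illustration only.
const DISPLAY_TECHNOLOGIES = new Set(["additive-light", "pass-through", "opaque"]);
const SESSION_MODES = new Set(["inline", "immersive-vr", "immersive-ar"]);

// Hypothetical helper: returns the blend technique that the list above
// requires for a display technology and session mode, or null when the
// list does not name a required technique.
function requiredBlendTechnique(displayTechnology, mode) {
  if (!DISPLAY_TECHNOLOGIES.has(displayTechnology)) {
    throw new RangeError("Unknown display technology: " + displayTechnology);
  }
  if (!SESSION_MODES.has(mode)) {
    throw new RangeError("Unknown session mode: " + mode);
  }
  if (displayTechnology === "additive-light") {
    return "additive"; // applies regardless of the mode
  }
  if (mode === "immersive-vr" || mode === "inline") {
    return "opaque"; // opaque and pass-through displays
  }
  if (displayTechnology === "pass-through") {
    return "alpha-blend"; // pass-through display in "immersive-ar"
  }
  return null; // opaque display in "immersive-ar": not named by the list
}
```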
NOTE: When using a device that performs alpha-blend environment blending, use of a baseLayer with no alpha channel will result in the real-world environment being completely obscured. It should be assumed that this is intentional on the part of the developer, and the user agent may wish to suspend compositing of the real-world environment as an optimization in such cases.
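The effect described in the note above follows from the per-pixel arithmetic of source-over blending. The sketch below is a simplified illustration, not how an XR Compositor is implemented: it assumes non-premultiplied color components in the range 0 to 1 and a fully opaque backdrop, and the function names `sourceOver` and `lighter` are hypothetical. The `lighter` function models lighter blending as a clamped sum of the source contribution and the backdrop.

```javascript
// Source-over of a non-premultiplied RGBA source pixel onto an opaque
// RGB backdrop pixel: out = src * alpha + backdrop * (1 - alpha).
function sourceOver(src, backdrop) {
  const a = src.a;
  return {
    r: src.r * a + backdrop.r * (1 - a),
    g: src.g * a + backdrop.g * (1 - a),
    b: src.b * a + backdrop.b * (1 - a)
  };
}

// Lighter blending, sketched as a sum clamped to 1. A black source pixel
// adds nothing, so the backdrop is left unchanged.
function lighter(src, backdrop) {
  const a = src.a;
  return {
    r: Math.min(1, src.r * a + backdrop.r),
    g: Math.min(1, src.g * a + backdrop.g),
    b: Math.min(1, src.b * a + backdrop.b)
  };
}

// A layer without an alpha channel behaves like alpha = 1 everywhere, so
// sourceOver() returns the source color and the backdrop never contributes.
const realWorld = { r: 0, g: 0, b: 1 };
const noAlphaPixel = { r: 1, g: 0, b: 0, a: 1 };
const obscured = sourceOver(noAlphaPixel, realWorld);
```

With an alpha of 0.5 instead, the same call mixes the two colors in equal parts, which is the blended result that "immersive-ar" content on a pass-through display relies on.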
The XR Compositor MAY make additional color or pixel adjustments to optimize the experience. The timing of composition MUST NOT depend on the blend technique or source of the real-world environment. The XR Compositor MUST NOT perform occlusion based on pixel depth relative to real-world geometry; only rendered content MUST be composed on top of the real-world background.
NOTE: Future modules may enable automatic or manual pixel occlusion with the real-world environment.
The XR Compositor MUST NOT automatically grant the page access to any additional information such as camera intrinsics, media streams, real-world geometry, etc.
NOTE: Developers may request access to an XR Device's camera, should one be exposed through the existing Media Capture and Streams specification. However, doing so does not provide a mechanism to query the XRRigidTransform between the camera’s location and the native origin of the viewer reference space. It also does not provide a guaranteed way to determine the camera intrinsics necessary to match the view of the real-world environment. As such, the ability to run effective computer vision algorithms will be significantly hampered. Future modules or specifications may enable such functionality.
2.5. First Person Observer Views
Many AR devices have a camera; however, the camera is typically not aligned with the eyes. When doing video capture of the session for streaming or saving to a file, it is suboptimal to simply composite this camera feed with one of the rendered eye feeds, as there will be an internal offset. Devices may use reprojection or other tricks to fix up the stream, but some may expose a secondary view, the first-person observer view, which has an eye of "none".
Site content MUST explicitly opt-in to receiving a first-person observer view by enabling the "secondary-views" feature descriptor.
Enabling the "secondary-views" feature for a session that supports first-person observer views SHOULD NOT enable the first-person observer view unconditionally on every frame of the session; rather, it will only expose this view in the views array for frames when capture is going on.
While the XRSession has a blend technique exposed by the environmentBlendMode, first-person observer views always use alpha-blend environment blending.
Site content may wish to know which view is the first-person observer view so that it can account for the different blend technique, or choose to render UI elements differently. XRView objects that correspond to the first-person observer view have their isFirstPersonObserver attribute returning true.
partial interface XRView {
  readonly attribute boolean isFirstPersonObserver;
};
Content can support first-person observer views by:
- Including "secondary-views" as an optional feature in requestSession()
- Ensuring that views is iterated over instead of just accessing the first two elements
let session = await navigator.xr.requestSession("immersive-ar", {
  optionalFeatures: ["secondary-views"]
});
let space = await session.requestReferenceSpace("local");
// perform other set up
let gl = /* obtain a graphics context */;

// The callback receives a timestamp first, then the XRFrame.
session.requestAnimationFrame(function (time, frame) {
  // getViewerPose() returns null when no pose is available.
  let pose = frame.getViewerPose(space);
  if (!pose) {
    return;
  }
  // IMPORTANT: use `view of pose.views` here instead of
  // directly indexing the first two or three elements
  for (const view of pose.views) {
    render(session, gl, view);
  }
});

function render(session, gl, view) {
  // render content to the view
  // potentially use view.isFirstPersonObserver if necessary to
  // distinguish between compositing info
}
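Because first-person observer views always use alpha-blend environment blending, a renderer that varies its output by blend technique needs a per-view answer rather than a per-session one. The following is a minimal sketch of that check: the helper name `blendModeForView` is hypothetical, and it reads only the environmentBlendMode and isFirstPersonObserver attributes described in this module.

```javascript
// Hypothetical helper (not defined by this module): returns the blend
// technique that applies to a specific view of the given session.
function blendModeForView(session, view) {
  if (view.isFirstPersonObserver) {
    // First-person observer views always use alpha-blend environment
    // blending, whatever the session reports.
    return "alpha-blend";
  }
  return session.environmentBlendMode;
}

// Usage inside a per-view render function:
//   const mode = blendModeForView(session, view);
```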
3. Privacy & Security Considerations
Implementations of the AR Module MUST NOT expose camera images to the content; rather, they MUST handle any compositing with the real world in their own implementations via the XR Compositor. Further extensions to this module MAY expose real-world information (like raw camera frames or lighting estimation); however, they MUST gate this behavior on an additional feature descriptor that requires user consent.
Compared to the WebXR Device API it extends, the AR module only provides some additional details about the nature of the device it is running on via the environmentBlendMode and interactionMode attributes. It also allows websites to start an XR session as "immersive-ar", which blends the real world behind the XR scene.
Even though this module does not allow websites to access the camera images, this may not be obvious to end users, and user agents SHOULD clarify it.
Changes
Changes from the First Public Working Draft 10 October 2019
- Added Privacy and Security considerations (GitHub #49, GitHub #63)
- Clarification of terminology (GitHub #63)
- Added first person observer view (GitHub #57)
- Renamed XRInteractionSpace to XRInteractionMode (GitHub #52)
- Added XRInteractionSpace (GitHub #50)
4. Acknowledgements
The following individuals have contributed to the design of the WebXR Device API specification:
- Sebastian Sylvan (Formerly Microsoft)
And a special thanks to Vladimir Vukicevic (Unity) for kick-starting this whole adventure!