A step-by-step guide to green screen, mixed-reality video production for VR
Virtual reality (VR) delivers an incredible gaming experience for the player in the headset, but it’s a hard one to share. The first-person player perspective has a limited field of view, and can be jumpy, creating a dissatisfying viewing experience for anyone not in the headset, whether they’re watching it live on a screen or on a prerecorded video.
Green screen, mixed-reality video is an innovative technique that brings the external viewer into the VR universe very effectively. It does this by showing a third-person, in-game perspective that encompasses both the player and the game environment, creating a highly immersive 2D video solution. It’s a demanding but rewarding megatask that VR developers should definitely explore if they want to present VR experiences to their very best advantage in an increasingly competitive market.
Figure 1: Screenshot from the Kert Gartner-produced, mixed-reality trailer for Job Simulator*.
This technique can be used live for events or online streams, as well as for trailers and videos, by developers or YouTubers. Josh Bancroft and Jerry Makare from the Developer Relations Division at Intel have been working with the technique in-house and for event demos for the last year, developing relationships with others in the field, and honing their skills. The goal of this step-by-step, how-to guide is to share their knowledge with you, the VR development community, and to equip you to show your amazing creations to the world in the best way possible.
Figure 2: Cosplayer immersed in Circle of Saviors* using green screen, mixed-reality technology at the Tokyo Game Show in 2016.
This guide focuses on the workflow that Josh and Jerry have the most experience with—VR games built in Unity* for HTC Vive* using the SteamVR* plugin for mixed-reality enablement; MixCast VR Studio* for calibration; and Open Broadcaster Software* (OBS) for chroma key, compositing, and encoding for streaming and/or recording. There are numerous ways to successfully recreate the technique, with other hardware and tools available for each stage of the process, some of which are cited in this guide.
The content of this guide is as follows:
Hardware: An overview of recommendations for the physical kit needed to handle this megatask, including PCs, VR headset, camera, lenses, video capture, studio space, green screen, and lighting.
Software: Recommendations on software needs, including enabling the VR application itself, calibration of the in-game and physical cameras, capture and compositing software, and encoding software for streaming and/or recording.
Step-by-step: The calibration, compositing, encoding, and recording processes broken down into easy-to-follow stages.
Resources: Links to further information and resources throughout the guide, to give you everything you need to produce amazing mixed-reality videos of VR games.
Hardware Steps
First things first: PC hardware
The usual and most accessible setup for performing the mixed-reality video megatask requires two well-powered PCs—but the same result, or better, can be achieved by a single PC powered by the Intel® Core™ X-series processor family, with an Intel® Core™ i9+ processor of 12 or more cores. The extra cores give a single PC enough processing power to handle the VR title, mixed-reality rendering and capture, and encoding for high-quality recording and/or streaming.
The VR application needs to run smoothly with a bit of headroom for the other processes that are explained in detail later in this guide; namely, generating the third-person view in-game, and doing the live video capture and compositing. Running the VR application on the same machine as the capture and compositing is advisable in order to avoid latency issues.
In a two-PC setup, the second machine takes the composited video signal from the first, captures it in turn, and performs the encoding task for streaming and/or recording. This task is relatively heavy in processing terms, so the more encoding power you have, the higher quality the results will be. This is where the extra cores in the Intel Core X-series processor family come in handy.
The way you implement the process depends on the hardware available, and there's a lot of flexibility in how you manage the load balancing. But, if you have to split the work across multiple systems, the way that we found to be most efficient is to split it as evenly as possible in the manner described above.
Behind the mask: VR hardware
A key component, of course, is the VR headset itself, and the sensors that come with it. For this project, we used HTC Vive hardware because a lot of the work required to enable a VR title for this kind of mixed-reality production is built into the SteamVR Unity plugin, and many Vive titles built in Unity will just work with this process. Useful resources related to HTC Vive mixed-reality support can be found on the Steam Developer forums.
For Oculus Rift*, support is being added for third-person, mixed-reality capture, but, at the time of writing, developers need to do some programming to enable mixed-reality support in their title. Related documentation can be found on the Oculus website.
The sensors are key—not least because an additional one is required to mount on the physical camera that the mixed-reality process requires. The HTC Vive Tracker* is ideal for this, or you can use a third Vive hand controller.
Figure 3: SteamVR* needs to see a third controller (bottom left) in order for the mixed-reality functionality to work.
HTC Vive hardware is used for the purposes of this guide.
Live action: Video capture
The mixed-reality process involves filming the player with a camera, with the resulting video signal fed to the PC performing the compositing. The video needs to be captured at or above the resolution and frame rate you want for the final video. A good target for high-quality video is 1080p at 60 frames per second.
To capture the live video, a high-quality video camera, such as a digital single-lens reflex (DSLR) or mirrorless camera, delivers the best results. You can use USB or Peripheral Component Interconnect (PCI) High-Definition Multimedia Interface (HDMI) capture devices, from companies such as Magewell, Elgato and Blackmagic Design, which plug into a USB 3.0 port or PCI slot, and have an HDMI port to plug the camera into. This lets your external camera appear as a webcam to the compositing software.
There are also internal capture cards that fit into a PCI slot that we have used in the past (from Blackmagic, and others). However, these tend to require more drivers, take up space inside the PC, and need time to install. And, of course, they won’t work with a laptop.
Any video capture device—including a regular webcam—that supports the minimum resolution and frame rate that you want your final video to be, and shows up in your chosen compositing software, will work—but a higher quality camera will deliver better results.
Another hardware requirement is a 4K monitor. As explained later, the window rendered on the desktop is a quartered window, and you only capture and record a quarter of it at a time. These quarters then become the layers that are composited to create the final mixed-reality video. For a final video with 1080p resolution, the full resolution of the desktop must be at least four times that; that is, a minimum of 4K. You can use a lower resolution monitor, but remember that the final output resolution of your mixed-reality video will be one quarter of the size at which you render the quartered split-screen window.
In the studio: Setting up the space
For the studio, you need a bit more space than you would for just the VR play area. The minimum size that Vive requires is approximately two meters by two meters, plus the additional space needed for the physical camera to move while avoiding any collisions.
You need to be able to put up as much green screen as possible in the space. In theory, a single piece of green screen fabric behind the player could work, but that severely limits where you can point the camera. You can’t point it off the green screen, because that will show whatever is in the room outside the screen, and break the mixed-reality illusion.
Ideally, three of the four walls and the floor should be covered with green screen, but it depends on how much space you have, and the stands you use for the screen.
We have used a number of configurations, but one portable solution that has worked well for a number of demos is the Lastolite* panoramic background, which creates a three-wall green screen space 4m wide and 2.3m high.
Beyond surface coverage, make sure the fabric is pulled flat to avoid shadowing caused by ripples, and that the lighting is as even as possible (more on this later), as these factors have an impact on the ability of the chroma key filter to remove the green around the player.
Looking good: Camera spec
There’s a lot of flexibility regarding the camera you use, depending on what you have access to, but it’s important to use as good a quality camera as possible. The camera must have an HDMI output, and be able to shoot at the resolution and frame rate that you need, which should be a minimum of 1080p at 60 frames per second. Any good camera with video capabilities will work well—such as a DSLR or mirrorless camera, a camcorder or, ideally, a pro video camera.
We have used cameras including the Sony a7S*, which is a full-frame mirrorless camera, a professional Sony FS700* video camera, and a Panasonic GH5*, but many other brands and models perform equally well. An inexpensive webcam isn’t going to cut it, however, if you’re looking for high-quality results.
Looking better: Camera and tracker positioning
For this process, the third VR controller, or tracker, needs to be mounted securely on the camera in such a way that it cannot move in relation to the camera, as any movement will throw off the calibration. Attaching a controller securely can present problems because of its shape and hand-friendly design. We have used a number of ingenious ways to fix the controller to the camera, including ever-faithful duct tape, clamps, and a custom rig we built that goes through the center hole of the controller, and uses big washers, rubber gaskets, and a long bolt to secure it. The more recent and much more compact Vive Tracker has made this process easier, with inexpensive shoe mounts that allow easy, rigid fixing to a camera body. Using a cold shoe mount also usually aligns the tracker closely with the centerline of the camera lens, which helps with calibration.
Figure 4: The Vive Tracker* has a standard ¼” 20 screw fitting on its underside for easy mounting.
Making the tripod-mounted camera and attached tracker as level as possible before you start the calibration stage (which is explained later) is very important. A useful trick is to use the built-in compass/bubble-level app on your smartphone, if it has one. Placing the smartphone on top of the tracker, or camera, will tell you how many degrees off level the camera is in any direction, making it easy to adjust the camera until it is perfectly flat and level. Having the camera and attached tracker perfectly level at the start of the calibration process reduces the amount of adjustment needed in the three positional (X, Y, and Z), and three rotational (rX, rY, and rZ), axes later on, and also makes calibration far simpler and more accurate.
Once the calibration process is complete and the physical and virtual cameras are in lock step, the physical camera can be moved freely within the bounds of the green space, including removing it from the tripod. As long as the tracker and camera don’t move in relation to each other you can move the camera however you want, and it will be tracked in 3D space, just like your hands in VR. A stabilizer is recommended for professional results when holding the camera by hand. If no stabilizer is available, you’re likely to get better results leaving the camera on the tripod, and relying on horizontal and vertical panning for camera movement. Remember to always keep the camera’s field-of-view within the green screen area.
Looking the best: Lenses and focal length
Focal length and lenses are other important considerations. A 16mm or 24mm wide-angle lens has a short focal length and a wide field-of-view, which means the camera needs to be relatively close to the subject, increasing the risks of physical collision, and of the field-of-view accidentally slipping outside the green screen area.
At the other extreme, a longer focal length—such as a 70mm lens—gives a much narrower field-of-view, so the camera needs to be further away from the player to keep them in the frame. This can be problematic if space is limited. The best lens and focal length to use will depend on the space available, the extent of the green screen, how much of the player you want to shoot, and how you want to film (that is, static camera versus handheld). A 70mm lens is probably too long in most circumstances, and it’s likely that you’ll want to stick to something like a 24mm or 28mm lens.
What is very important, especially if you’re using a zoom lens where the focal length can be adjusted, is that you keep the focal length fixed once calibration is complete. Changing the focal length (that is, zooming in or out) will throw off the calibration, ruining the alignment between the virtual and physical camera, and forcing you to start again. With this in mind, fixed focal-length lenses are a safe bet.
In the spotlight: Lighting
Your lighting goal is to evenly illuminate both the subject and the green screen, avoiding any harsh shadows or dramatic differences in lighting that will show up in the chroma key. If possible, a three-light setup is ideal: one to light the subject, and two to light the green screen behind the subject—one from either side to minimize shadows. Fewer lights can be used depending on the conditions. Light emitting diode (LED) panels are great—they’re portable, and don’t produce as much heat as tungsten or halogen lights, which helps keep conditions comfortable for the player.
Figure 5: At Computex in Taipei, in May 2017, the Intel setup included frame-mounted green screen on three walls and the floor, and two fluorescent lighting panels.
Be sure to check that the frequency of the existing lighting in the space doesn’t cause excessive strobing on the camera image, especially if you’re relying on the ambient lighting rather than bringing in your own. This can occur with some LED lights, and other lights that flicker at certain frequencies. The only way to know for sure is to run a test in the space.
Software Steps
Soft choices: The VR application
The first key thing, from a software point of view, is that the game is enabled for mixed reality. This means that it allows the implementation and positioning of an additional third-person camera in-game, and can output a quadrant view comprised of separate background and foreground layers taken from that in-game, third-person view. This is the view that will be synced and calibrated with the physical camera, enabling the entire mixed-reality process.
Mixed-reality enablement is possible with games built in Unreal Engine*, and it’s also possible to code your own tools, as Croteam has done with their Serious Engine*, which supports the entire mixed-reality process, including capture and compositing. You can watch Croteam’s tutorial video on how to create mixed-reality videos in any of their VR games here.
For this guide, however, we will focus on the process as it relates to games built in Unity for HTC Vive using the SteamVR plugin, which automatically enables games for mixed-reality video.
Softly does it: Calibration
For mixed reality to work correctly, you need a way to calibrate the position offsets between the physical and virtual cameras. You could tweak the values manually, and calibrate by trial and error, but that’s a very painful process. We have primarily used MixCast VR Studio for the alignment and calibration of the in-game and physical cameras, a key process which is explained in detail later.
Soft results: Compositing and encoding
We use OBS Studio* for the compositing and encoding for recording and/or streaming. OBS Studio is open-source software with good support that is widely used by video creators and streamers. There are other solutions available such as XSplit*, but OBS Studio is used for the purposes of this guide.
Step by Step
Before we get into the calibration, let’s recap where we are so far, and what we need to do next. We have one PC running the VR application, and the video capture and compositing, and a second PC handling the encoding for recording and/or streaming (or a single PC for all tasks, if it’s powered by a processor from the Intel Core X-series family). We have a camera with an additional tracker, or controller, attached, perfectly level, in a studio space with as much green screen as possible.
The VR application we’re using is enabled for mixed reality (for example, built in Unity using the SteamVR plugin), with an additional in-game, third-person camera implemented, and outputting a quadrant view, including foreground and background layers. Then, we have capture and compositing software running.
The next stage is to calibrate the in-game and physical third-person cameras so they are perfectly in sync. This lets us film the player while the in-game camera tracks the physical camera, and the game tracks the player’s hand movements (and, by extension, any items or weapons the player is holding in the game), with a high level of precision. The result is that we can accurately combine the in-game and real-world video layers, and create a convincing mixed-reality composite.
Camera calibration
Before you start the calibration process, it's worth going back to double check that the camera and attached tracker are perfectly level. For the calibration process, the player (or a stand-in) needs to stand in front of the camera in the play volume, with the controllers in their hands. That person should also be level, meaning they should stand directly in front of the camera and square their shoulders to it, so they're aligned and centered with the centerline of the camera view.
This is important because when you’re doing the adjustments, you have six different values that you can change: the X, Y, and Z position, and X, Y, and Z rotation. The adjustments can be fiddly, but if you know the camera is level and flat, and the person is standing directly on the camera’s center line, you can minimize some of those offsets to get them close to zero and not have to adjust them later. It helps things go much more smoothly.
For the calibration process of aligning the in-game and physical cameras, we use MixCast VR Studio. Start up the software, make sure it can see the physical camera, and, using the drop-down menu, check that it knows which device is tracking your physical camera as the third controller (the Vive Tracker or controller attached to the camera). Before you start, you also need someone in the VR headset positioned in the play space, with a controller in each hand.
Quick setup
Next, launch the Quick Setup process, which walks you through the calibration. This will give you the XYZ position, XYZ rotation, and field-of-view values for the virtual camera that the VR application needs in order to line it up with the physical camera.
Figure 6: Select Quick Setup in MixCast VR Studio* to begin the calibration process.
The first step is to take one handheld controller and place its ring on the lens of the physical camera, lining it up as closely as possible. Click the side buttons on the controller to register the step as complete.
Figure 7: The first calibration step involves aligning the physical controller with the physical camera.
Next, the tool projects crosshairs at the corners of the screen. Move the hand controller to position the ring so it lines up as closely as possible with the center of the crosshair, and click the button.
Figure 8: The setup process initially provides two crosshairs, in opposite corners of the screen, with which to align.
Initially, there are two crosshair alignments to complete for the top-left and bottom-right corners of the screen. Once they’re done, there is an option to increase precision by clicking on the Plus button to bring up more crosshairs.
Figure 9: Clicking the Plus button brings up additional crosshairs for greater precision.
We have found that four crosshairs is the optimal number. With only two, the alignment isn’t quite close enough, and more than four also tends to throw it off. Four crosshairs will cover the four corners of the screen.
Figure 10: Completing the fourth-corner alignment operation.
By this point, a rough calibration is established. You will see two virtual hand controllers tracking the approximate positions of the physical controllers. From there, you use the additional refinement controls below the screen to adjust the camera position and rotation to bring them into as close alignment as possible.
Figure 11: The controls for XYZ position, and XYZ rotation, are for fine-tuning the camera position.
Fine tuning
To fine-tune the hand alignment, the person in VR holds their hands to the sides so that you see the virtual controllers drawn over the top of the real ones. If the virtual controllers are further apart than the physical controllers, you need to bring the drawn controllers closer together by pulling the virtual camera slightly back. To do this, the person in VR clicks the arrows to adjust the camera position.
Figure 12: Showing the VR drawn hand controller out of alignment with the real one.
Figure 13: Here, following adjustments, the drawn hand controller is in tighter alignment with the real one.
Each click results in a small, visible movement in the desired direction. It’s useful to keep track of the number of clicks, in case you overshoot and need to go back, or otherwise need to undo. Next, look at the up and down alignment by moving the controllers around in the space.
Figure 14: Once the camera is lined up, click the checkmark to confirm the settings.
It’s possible to hold the hand controller still and get it perfectly aligned, and then move it somewhere else only to find that the alignment is off. It may be aligned in one position but not in another, which means there is more fine-tuning to do. This is an iterative process—Josh describes it as a six-dimensional Rubik’s Cube* where you have to get one face right without messing up the others—but, through careful trial and error, you’ll eventually have them perfectly lined up.
Once you have the virtual and physical controllers well aligned, there is a selection of other objects that can be used to perform additional alignment checks, including weapons, sunglasses, and a crown. Play around with the items to make sure everything looks aligned.
Figure 15: The selection of objects available in MixCast VR Studio* for alignment purposes.
Figure 16: Holding the drumstick to check controller alignment.
The crown is particularly useful to check the alignment of the player’s head in VR. The player places it on their head, and then the camera position and rotation can be adjusted until the crown is perfectly centered and level.
Figure 17: Use the crown to check accurate alignment and position of the player’s head in relation to the camera.
Camera config
When you have the alignment values set, you’re calibrated and ready to go, and you can use those values with any VR application that uses the same method. You need to save the XYZ position, the XYZ rotation values, and the field-of-view value from MixCast VR Studio as an externalcamera.cfg file.
Figure 18: The externalcamera.cfg file showing the XYZ position, XYZ rotation, and field-of-view values that need to be saved following the calibration process.
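For reference, the file is plain text, with one key-value pair per line. The values below are placeholders only; yours come from the MixCast VR Studio calibration (the SteamVR plugin may also read optional keys, such as near and far clip planes, but these seven are the essential ones):

x=0.0
y=0.0
z=0.0
rx=0.0
ry=0.0
rz=0.0
fov=60.0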
For this Unity/SteamVR mixed-reality method, two conditions have to be met to trigger the quartered screen view that you need for compositing the mixed-reality view. The first is that a third controller, or tracker, has to be plugged in and tracking. The second condition is that the externalcamera.cfg file needs to be present in the root directory of the executable file of the VR application that you’re using.
Figure 19: Ensure the externalcamera.cfg file is saved in the same root folder as the VR application executable.
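As a purely hypothetical example of that layout, for a Steam-installed game the folder might look like this (names and paths are illustrative):

SteamLibrary\steamapps\common\YourVRGame\
    YourVRGame.exe
    externalcamera.cfg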
Launch the game
Now it’s time to fire up the VR application you’re using to create your mixed-reality video. With Unity titles, if you hold the Shift key while you launch the executable, a window pops up that lets you choose the resolution to launch at, and whether to launch in a window or full screen.
At this point, specify that the application should run at 4K (so that each quadrant of the quartered window is full high-definition 1080p), and full screen, not windowed (uncheck the windowed box, and select the 4K resolution from the list). Then, when you launch the application, it will start the quartered desktop window view running at 4K resolution.
Figure 20: Example of the quartered desktop view for Rick and Morty: Virtual Rick-ality*.
This quartered view is comprised of the following quadrants: game background view (bottom left); game foreground view (top left); game foreground alpha mask (top right); and first-person headset view (bottom right).
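Put another way, the 4K window is divided like this, with each quadrant rendering at 1920 x 1080:

+---------------------------+------------------------------+
|  Game foreground          |  Game foreground alpha mask  |
|  (top left)               |  (top right)                 |
+---------------------------+------------------------------+
|  Game background          |  First-person headset view   |
|  (bottom left)            |  (bottom right)              |
+---------------------------+------------------------------+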
Compositing
The open-source OBS Studio software is used for compositing for the purposes of this guide. Before you start, make sure you have the quadrant view on screen. The first step is to capture the background layer, which is the lower-left quadrant. Add a new Source in OBS of the type “Window Capture”. Select the window for the VR application. Rename this source “Background”. Next, crop away the parts of the screen that you don’t want to capture by adding a Crop Filter to this source (right-click on the Background source, Add Filter, Crop/Pad). The crop values represent how much of the window to crop from each of the four sides, in pixels. So, to capture the bottom-left quadrant for the background layer, use a value of 1080 for the top, and 1920 for the right (remember, at a 4K resolution of 3840 x 2160, this is exactly half of each dimension).
Figure 21: The cropping filter showing the quartered view, before entering the crop values.
Figure 22: The cropping filter showing only the bottom-left quarter (background view), after having entered the crop values.
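For reference, assuming a 3840 x 2160 desktop window, the Crop/Pad values (in pixels) that isolate each quadrant are:

Background (bottom left): top 1080, right 1920, bottom 0, left 0
Foreground (top left): top 0, right 1920, bottom 1080, left 0
Alpha mask (top right): top 0, right 0, bottom 1080, left 1920
First-person view (bottom right): top 1080, right 0, bottom 0, left 1920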
Once you’ve applied the crop filter, you’ll see it in the preview window, but it won’t take up the full screen. Right-click on the source, select Transform, and choose Fit to screen (or use the Ctrl+F shortcut). Every layer in the composite image needs to be full screen in OBS, so do this for each layer.
Chroma key
When you have your background, you then need to do the same thing for the physical camera view, and cut away the green background, leaving the player ready to be superimposed on the captured game background.
Figure 23: Defining the source for the physical camera in OBS*.
Add a new Video Capture source. Choose your camera capture device, and name this layer “Camera”. You should see your camera view. Next, right-click on the source, go to filters, and select "Chroma Key".
Figure 24: Selecting the chroma key filter in OBS*.
Figure 25: Once Chroma Key is selected, the green background will disappear, and the sliders can be used for fine tuning.
You can adjust the sliders in the chroma key settings until you get the person sharp against the background, with the green screen completely removed, and without erasing any of the person. The default values are usually pretty good, and should only need small adjustments. This is where you will see the benefit of good, even lighting. If you make too many changes, and mess up the chroma key filter, you can always delete the filter and re-add it to start fresh.
Figure 26: An interim test composite, minus the foreground layer, showing the player positioned on top of the game background.
When it looks good, apply “Fit to Screen” so it fills the preview window, and make sure the Camera layer is listed above the Background layer in the Source list (which means the camera layer is rendered on top). You should see your camera view against the VR background at this point.
Foreground
Next, follow the same process as the background layer for the foreground, which is the upper-left quadrant. Add a new window-capture source, capture the quadrant view, and apply a crop filter—this time cutting off the right 1920 and the bottom 1080 pixels to isolate the upper-left quadrant. Size it with Ctrl+F for “Fit to Screen” to make sure it fills the full preview window. Name this source “Foreground”.
Figure 27: Crop of the foreground view.
The foreground layer shows objects that are between the player and the camera (and therefore should be rendered on top of the live camera view). You’ll need to key-out the black parts of the foreground view to allow the background and camera layers to show through. Right-click on the Foreground source, and apply a “Color Key” filter.
Figure 28: Select the Color Key filter to remove the unwanted black areas of the foreground view.
OBS will ask what color you want to use for the key, and there’s an option for custom color. Go into the color picker, and choose black. To be sure, you can actually set the values, selecting the hexadecimal value #000000 for solid, absolute black. Apply that, and it will make the black part transparent, so the foreground objects can sit in front of the rest of the layers in the composite.
Figure 29: Select the color key hex value of #000000 for solid black.
Figure 30: The foreground layer with the color key applied, and the black area removed.
The upper-right quadrant is the alpha mask, which can be applied to the foreground layer, but is more complicated to use. However, if the foreground layer includes solid black elements that you don’t want to turn transparent, then applying the alpha mask is the way to do that. Setting up the alpha mask is beyond the scope of this guide, but you can find useful information on this topic in the OBS Project forums.
Composite image
With the Background, Camera, and Foreground sources properly filtered, and in the correct order (Foreground first, then Camera, then Background), you should be able to see your real-time, mixed-reality view.
Figure 31: The final composite image, comprised of game background, player, and foreground.
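As a quick sanity check, the OBS scene for this workflow ends up looking something like this, using the source names assigned above (the topmost source renders in front):

1. Foreground (Window Capture): crop right 1920, bottom 1080; Color Key on #000000
2. Camera (Video Capture Device): Chroma Key, green
3. Background (Window Capture): crop top 1080, right 1920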
Once you have the mixed-reality source, you can do other creative things in OBS. For example, you can set it up to switch between the mixed-reality view and first-person view. This is great for showing the difference between the two views, and can be useful during streams, or recording, to vary what the viewer sees.
You can also set it up to show a different camera by selecting it as another source, which is useful if you’re hosting a live stream and want to cut to a camera pointing to the host, for instance. It’s also possible to bring different graphics or stream plugins into your OBS workflow as needed, which again is an important capability for streamers and YouTubers.
Troubleshooting lag
One issue that could arise is lag in OBS, which may be caused by a mismatch in frames per second between the different video sources. If that happens, first make sure that the desktop capture (in OBS settings under Video) is set to 60 frames per second. Next, check that the frame rate of your camera capture device is set to 60 frames per second. Lastly, check that your video camera itself is also set to 60 frames per second.
We had a lag problem with a demo, and we discovered that one of the cameras had been set to 24 frames per second. Setting it to 60 frames per second instantly fixed the problem. It may be that your setup can’t handle higher than 30 frames per second, which is fine; but, in that case, all the settings noted above need to be at 30 frames per second.
Video out
The last stage of the process is taking the mixed-reality video signal and encoding it for recording or streaming. With a two-PC setup, the first machine does the capture and compositing, and the second one handles the encoding for recording or streaming. If you’re using a single PC with a high-core-count processor from the Intel Core X-series family, it should be able to handle all the tasks, including high-quality encoding, without needing to output to a second machine.
In a two-PC setup, to send the signal from the first PC to the second, OBS has a feature called Full-screen Projector, which allows you to do a full-screen preview. If you right-click on the preview window, you can pick which monitor to output it to. Then, you can take a DisplayPort* or HDMI cable, and plug it into your graphics processing unit (GPU) so that the computer running the VR and compositing thinks that it’s a second monitor (you can also run it through a splitter, so you actually do have it on a second monitor as well).
Send the signal into the second computer using a second USB HDMI capture device (or PCI card) with the same 1080p/60 frames per second capabilities. You also run OBS on that second computer, and set up a very simple scene where you have one video capture device, which is the USB HDMI capture box, or card. You then set all the recording and streaming quality settings on that second system.
The second computer is also where you would add other details to the video output, such as subscriber alerts or graphics. When it comes to switching scenes, it can get complicated across two machines. For example, switching between first- and third-person view needs to be done on the first computer, while you might have a number of other operations running on the second computer that you also need to manage.
Encoding
OBS output settings, by default, are set to simple, which uses the same encoder settings for recording and streaming. You will need to set it to advanced mode to be able to adjust the settings for streaming and recording separately.
You usually have the choice of using either the GPU or the CPU for encoding. When using a single PC for all tasks, we recommend using the x264 CPU encoder, as using the GPU encoder negatively impacts the frame rate of the VR experience, both for the person in VR and on the desktop.
The x264 CPU encoder also scales well with core count: if you have an Intel Core X-series processor with 12 or 18 cores, you can crank the quality settings up very high because the CPU, not the GPU, is doing the encoding, resulting in a better-quality final video.
Streaming
With streaming, the bitrate is limited by the upstream bandwidth of your internet connection, and the bitrate your stream provider supports. Twitch*, for example, is limited to about six megabits, so in OBS you select a bitrate of 6,000 kilobits per second. It’s best to use the maximum streaming bitrate that your stream provider, and your internet connection, can handle. If you have slower internet, or just want to lower the quality, you could drop that down to 4,000, or 2,500, kilobits per second for streaming.
Figure 32: Streaming settings in OBS*.
There’s also the key-frame interval to consider, which should be changed from the default zero (0) auto setting to two seconds. Other settings include using a constant bitrate (CBR), and selecting medium for the CPU usage preset. This gives you a high-quality video for your stream.
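Pulling those values together, a typical Twitch-oriented configuration in the advanced output settings looks something like this (treat it as a starting point, and adjust the bitrate to whatever your connection can sustain):

Encoder: x264 (CPU)
Rate control: CBR
Bitrate: 6,000 Kbps (or 4,000 or 2,500 Kbps on slower connections)
Keyframe interval: 2 seconds
CPU usage preset: medium
Output: 1080p at 60 frames per second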
Recording
The recording encoder settings are identical, except for the bitrate, which you can turn up much higher. For streaming on Twitch, you’re looking at a bitrate of 6 megabits per second; for recording, a maximum could be anything from 20 to 100 megabits per second, depending on the system. It’s here that a powerful system built on the Intel Core X-series processor family can really make a difference in terms of video quality—but if you set your bitrate too high, the file size may become unmanageable.
You’ll need to experiment with the bitrate value to find the sweet spot between quality, file size, and not overloading the encoder. Start low, around 15 megabits, then run a test recording using the whole mixed-reality stack. CPU usage varies depending on what’s happening, so watch the status bar and stats window in OBS for “encoder overloaded” or dropped-frames warnings. If you don’t get any warnings, your system might be able to handle a little more. Stop the recording and raise the bitrate a bit, if you want.
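To summarize, a reasonable hypothetical starting point for the recording settings, to be tuned through the testing described above, is:

Encoder: x264 (CPU)
Rate control: CBR
Bitrate: 15,000 Kbps to start, raised toward 20,000 to 100,000 Kbps as your system and file sizes allow
Keyframe interval: 2 seconds
CPU usage preset: medium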
Figure 33: Recording settings in OBS*.
That’s a wrap
The green screen, mixed-reality video process for VR requires a good deal of trial and error to get right but, once you’ve nailed it, the results are very satisfying. Covering every eventuality and permutation in a guide such as this is impossible; but the aim is that, by now, you have a grasp of what’s involved, and can experiment with producing mixed-reality videos of your own. We’ll be looking out for them.