
What's New? - Intel® VTune™ Amplifier XE 2016 Update 1


Intel® VTune™ Amplifier XE 2016 performance profiler

A performance profiler for serial and parallel performance analysis. Overview | Training | Support.

New for the 2016 Update 1! (Optional update unless you need...)

Compared to the 2016 initial release

  • General Exploration analysis for Intel microarchitecture code name Cherry Trail
  • Event-based sampling collection for multiple ranks per node with an arbitrary MPI launcher
  • Command-line option -knob event-config extended to display a list of PMU events available on the target system
  • Algorithm analysis views extended to display confidence indication (greyed out font) for metrics lacking sufficient samples
  • Event-based sampling collection support for .NET* processes (.NET 4.0 and higher) in the attach mode
  • Intel® Manycore Platform Software Stack (Intel® MPSS) version 3.6 support
  • Linux* kernel 4.1, 4.2 and 4.3 support

Note: We are now labeling analysis tool updates as "Recommended for all users" or "Optional update unless you need…".  Recommended updates will be available about once a quarter for users who do not want to update frequently.  Optional updates may be released more frequently, providing access to new processor support, new features, and critical fixes.

Resources

  • Learn (“How to” videos, technical articles, documentation, …)
  • Support (forum, knowledgebase articles, how to contact Intel® Premier Support)
  • Release Notes (pre-requisites, software compatibility, installation instructions, and known issues)

Contents

File: vtune_amplifier_xe_2016_update1.tar.gz

Installer for Intel® VTune™ Amplifier XE 2016 for Linux* Update 1

File: VTune_Amplifier_XE_2016_update1_setup.exe

Installer for Intel® VTune™ Amplifier XE 2016 for Windows* Update 1

File: vtune_amplifier_xe_2016_update1.dmg

Installer for Intel® VTune™ Amplifier XE 2016 Update 1 - OS X* host only

* Other names and brands may be claimed as the property of others.

Microsoft, Windows, Visual Studio, Visual C++, and the Windows logo are trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries.


JavaScript* Parser for Depth Photo


Abstract

The JavaScript parser for depth photo parses eXtensible Device Metadata (XDM) image files [1] and extracts metadata embedded in image files to generate XML files. In addition, this app analyzes XML files to extract color image data and depth map data. It is a fundamental building block for depth photography use cases, like the image viewer, refocus feature, parallax feature, and measurement feature. We have deployed the JavaScript parser in an open source project named Depthy [2] and proved its correctness and efficiency.

The input to this app is an XDM image file; the outputs include XML file(s), color image file(s), and depth map file(s).

XDM

First, we describe the input to this app, XDM image files. XDM is a standard for storing metadata in a container image while maintaining compatibility with existing image viewers. It is designed for Intel® RealSense™ Technology [3]. Metadata includes device-related information such as the depth map, device and camera pose, lens perspective model, vendor information, and point cloud. The following figure shows an example where the XDM file stores the depth map (right) as metadata with the color image (left).

[Figure: an XDM file storing a color image (left) with its depth map (right) as metadata]

Adobe XMP Standard

Currently, the XDM specification supports four types of container image formats: JPEG, PNG, TIFF, and GIF. XDM metadata is serialized and embedded inside a container image file, and its storage format is based on the Adobe Extensible Metadata Platform (XMP) standard [4]. This app is specifically developed for JPEG format. Next we briefly describe how XMP metadata is embedded in JPEG image files and how the parser parses XMP packets.

In the JPEG file format, 2-byte markers are interspersed among the data. The marker types 0xFFE0–0xFFEF are reserved for application data and are named APPn. By convention, an APPn segment begins with a string identifying its usage, called a namespace or signature string. An APP1 marker identifies Exif and TIFF metadata; an APP13 marker designates Photoshop Image Resources containing IPTC metadata; and one or more additional APP1 markers designate the location of the XMP packet(s).

The following table shows an entry format for the StandardXMP section in the JPEG, including:

  • 2-byte APP1 marker 0xFFE1
  • Length of this XMP packet, 2-bytes long
  • StandardXMP namespace, http://ns.adobe.com/xap/1.0/, 29-bytes long
  • XMP packet, less than 65,503 bytes

[Table: StandardXMP entry format in a JPEG file]

If the serialized XMP packet becomes larger than 64 KB, it can be divided into a main portion (StandardXMP) and an extended portion (ExtendedXMP), stored in multiple JPEG marker segments. The entry format for the ExtendedXMP section is similar to that for StandardXMP except that the namespace is http://ns.adobe.com/xmp/extension/.

The following image shows how StandardXMP and ExtendedXMP are embedded in a JPEG image file.

[Figure: layout of StandardXMP and ExtendedXMP segments within a JPEG file]

The following code snippet shows three functions:

  • findMarker. Parse the JPEG file (that is, buffer) from the specified location (that is, position) and search for the 0xFFE1 marker. If it is found, return the marker position; otherwise, return -1.
  • findHeader. Look for the StandardXMP namespace (http://ns.adobe.com/xap/1.0/) and the ExtendedXMP namespace (http://ns.adobe.com/xmp/extension/) in the JPEG file (that is, buffer) at the specified location (that is, position). If found, return the corresponding namespace; otherwise, return an empty string.
  • findGUID. Look for the GUID stored in xmpNote:HasExtendedXMP in the JPEG file (that is, buffer) from the start location (that is, position) to the end location (that is, position + size - 1) and return its position.
// Constants assumed by the snippets below (defined elsewhere in the full source);
// the values follow from the marker and namespaces described above.
var marker1 = 0xFF, marker2 = 0xE1;                       // APP1 marker 0xFFE1
var header1 = "http://ns.adobe.com/xap/1.0/";             // StandardXMP namespace
var header2 = "http://ns.adobe.com/xmp/extension/";       // ExtendedXMP namespace
var noHeader = "";                                        // returned when no namespace matches

// Return buffer index that contains marker 0xFFE1 from buffer[position]
// If not found, return -1
function findMarker(buffer, position) {
    var index;
    for (index = position; index < buffer.length; index++) {
        if ((buffer[index] == marker1) && (buffer[index + 1] == marker2))
            return index;
    }
    return -1;
}

// Return header/namespace if found; return "" if not found
// The namespace string starts 4 bytes past the marker (2 marker bytes + 2 length bytes)
function findHeader(buffer, position) {
    var string1 = buffer.toString('ascii', position + 4, position + 4 + header1.length);
    var string2 = buffer.toString('ascii', position + 4, position + 4 + header2.length);
    if (string1 == header1)
        return header1;
    else if (string2 == header2)
        return header2;
    else
        return noHeader;
}

// Return GUID position
// Note: string.search() returns -1 if the xmpNote:HasExtendedXMP property is not present
function findGUID(buffer, position, size) {
    var string = buffer.toString('ascii', position, position + size - 1);
    var xmpNoteString = "xmpNote:HasExtendedXMP=";
    var GUIDPosition = string.search(xmpNoteString);
    // +1 skips the quote character that follows the "=" sign
    var returnPos = GUIDPosition + position + xmpNoteString.length + 1;
    return returnPos;
}

A 128-bit GUID, stored as a 32-byte ASCII hex string, is written into each ExtendedXMP segment immediately following the ExtendedXMP namespace. It is also stored in the StandardXMP segment as the value of the xmpNote:HasExtendedXMP property. This way, a mismatched or modified ExtendedXMP can be detected.
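As a minimal sketch (the function name and variables here are illustrative, not part of the sample above), the GUID read from StandardXMP via findGUID can be checked against the 32-byte GUID that begins each ExtendedXMP segment before that segment's data is accepted:

// Sketch: verify an ExtendedXMP segment against the GUID taken from StandardXMP.
// standardGUID is the 32-byte ASCII hex string read via findGUID() above;
// segDataStart points just past the ExtendedXMP namespace in the JPEG buffer.
function matchesExtendedXMP(buffer, segDataStart, standardGUID) {
    // The 32-byte GUID immediately follows the ExtendedXMP namespace
    var segmentGUID = buffer.toString('ascii', segDataStart, segDataStart + 32);
    return segmentGUID === standardGUID;
}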

XML

XMP metadata can be directly embedded within an XML document [5]. According to the XDM specification, the XML data structure is defined as follows:

[Table: XDM XML data structure]

The image file contains the following items, as shown in the above table, formatted as RDF/XML. This describes the general structure:

  • Container image - The image external to the XDM, visible to normal non-XDM apps.
  • Device - The root object of the RDF/XML document, as in the Adobe XMP standard.
    • Revision - Revision of XDM specification
    • VendorInfo - Vendor-related information for the device
    • DevicePose - Device pose with respect to the world
    • Cameras - RDF sequence of one or more camera entities
      • Camera - All the information for a given camera. There must be a camera for any image. The container image is associated with the first camera, which is considered the primary camera for the image.
        • VendorInfo - Vendor-related information for the camera
        • CameraPose - Camera pose relative to the device
        • Image - Image provided by the camera
        • ImagingModel - Imaging (lens) model
        • Depthmap - Depth-related information including the depth map and noise model
          • NoiseModel - Noise properties for the sensor
        • PointCloud - Point-cloud data

The following code snippet is the main function of this app. It parses the input JPEG file by searching for the APP1 marker 0xFFE1. When a marker is found, it looks for the StandardXMP and ExtendedXMP namespace strings. For StandardXMP, it calculates the metadata size and starting point, extracts the metadata, and writes the StandardXMP XML file; for ExtendedXMP, it does the same and writes the ExtendedXMP XML file. The app’s outputs are two XML files.

var fs = require('fs');   // Node.js file system module used throughout

// outputXAPFile (StandardXMP XML) and outputXMPFile (ExtendedXMP XML) are the
// output file paths, defined elsewhere in the full source.

// Main function to parse XDM file
function xdmParser(xdmFilePath) {
    try {
        // Get JPEG file size in bytes
        var fileStats = fs.statSync(xdmFilePath);
        var fileSizeInBytes = fileStats["size"];

        var fileBuffer = new Buffer(fileSizeInBytes);

        // Get JPEG file descriptor
        var xdmFileFD = fs.openSync(xdmFilePath, 'r');

        // Read JPEG file into a buffer (binary)
        fs.readSync(xdmFileFD, fileBuffer, 0, fileSizeInBytes, 0);

        var bufferIndex, segIndex = 0, segDataTotalLength = 0, XMLTotalLength = 0;
        for (bufferIndex = 0; bufferIndex < fileBuffer.length; bufferIndex++) {
            var markerIndex = findMarker(fileBuffer, bufferIndex);
            if (markerIndex != -1) {
                // 0xFFE1 marker is found
                var segHeader = findHeader(fileBuffer, markerIndex);
                if (segHeader) {
                    // Header is found
                    // If no header is found, go find the next 0xFFE1 marker and skip this one
                    // segIndex starts from 0, NOT 1
                    // The 2-byte segment length is big-endian: high byte * 256 + low byte
                    var segSize = fileBuffer[markerIndex + 2] * 16 * 16 + fileBuffer[markerIndex + 3];
                    var segDataStart;

                    // 2-->segSize is 2 bytes long
                    // 1-->account for the last 0 at the end of the header, one byte
                    segSize -= (segHeader.length + 2 + 1);
                    // 2-->0xFFE1 is 2 bytes long
                    // 2-->segSize is 2 bytes long
                    // 1-->account for the last 0 at the end of the header, one byte
                    segDataStart = markerIndex + segHeader.length + 2 + 2 + 1;

                    if (segHeader == header1) {
                        // StandardXMP: skip the first 54 bytes of the packet before the XML payload
                        var GUIDPos = findGUID(fileBuffer, segDataStart, segSize);
                        var GUID = fileBuffer.toString('ascii', GUIDPos, GUIDPos + 32);
                        var segData_xap = new Buffer(segSize - 54);
                        fileBuffer.copy(segData_xap, 0, segDataStart + 54, segDataStart + segSize);
                        fs.appendFileSync(outputXAPFile, segData_xap);
                    }
                    else if (segHeader == header2) {
                        // ExtendedXMP: skip the first 40 bytes (the 32-byte GUID plus 8 bytes of
                        // full-length/offset information) before the XML payload
                        var segData = new Buffer(segSize - 40);
                        fileBuffer.copy(segData, 0, segDataStart + 40, segDataStart + segSize);
                        XMLTotalLength += (segSize - 40);
                        fs.appendFileSync(outputXMPFile, segData);
                    }
                    bufferIndex = markerIndex + segSize;
                    segIndex++;
                    segDataTotalLength += segSize;
                }
            }
            else {
                // No more markers can be found. Stop the loop
                break;
            }
        }
    } catch (ex) {
        console.log("Something bad happened! " + ex);
    }
}

The following code snippet parses the XML file and extracts the color image and depth map for depth photography purposes. It is straightforward. The function xmpMetadataParser() searches for the attribute named IMAGE:DATA and extracts the corresponding data into a JPEG file, which is the color image. If multiple matches are found, multiple JPEG files are created. The function also searches for the attribute named DEPTHMAP:DATA and extracts the corresponding data into a PNG file, which is the depth map. Again, if multiple matches are found, multiple PNG files are created. The app’s outputs are JPEG file(s) and PNG file(s).

// The sax module provides the streaming XML parser used below; the parser variable is
// shared with processXmpData(). inputJpgFile (the input file path) is defined elsewhere
// in the full source.
var sax = require('sax');
var parser;

// Parse XMP metadata and search attribute names for color image and depth map
function xmpMetadataParser() {
    var imageIndex = 0, depthImageIndex = 0, outputPath = "";
    parser = sax.parser();

    // Extract data when specific data attributes are encountered
    parser.onattribute = function (attr) {
        if ((attr.name == "IMAGE:DATA") || (attr.name == "GIMAGE:DATA")) {
            // Decode the base64-encoded color image and write it as a JPEG file
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_" + imageIndex + ".jpg";
            var atob = require('atob'), b64 = attr.value, bin = atob(b64);
            fs.writeFileSync(outputPath, bin, 'binary');
            imageIndex++;
        } else if ((attr.name == "DEPTHMAP:DATA") || (attr.name == "GDEPTH:DATA")) {
            // Decode the base64-encoded depth map and write it as a PNG file
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_depth_" + depthImageIndex + ".png";
            var atob = require('atob'), b64 = attr.value, bin = atob(b64);
            fs.writeFileSync(outputPath, bin, 'binary');
            depthImageIndex++;
        }
    };

    parser.onend = function () {
        console.log("All done!");
    };
}

// Process XMP metadata
function processXmpData(filePath) {
    try {
        var file_buf = fs.readFileSync(filePath);
        parser.write(file_buf.toString('utf8')).close();
    } catch (ex) {
        console.log("Something bad happened! " + ex);
    }
}
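For context, here is a minimal sketch of how the pieces above could be wired together; the file paths, and the inputJpgFile, outputXAPFile, and outputXMPFile globals assumed by the snippets above, are placeholders:

// Sketch: end-to-end flow using placeholder paths.
var inputJpgFile  = "sample_xdm.jpg";            // input XDM (JPEG) file
var outputXAPFile = "sample_xdm_standard.xml";   // StandardXMP XML written by xdmParser()
var outputXMPFile = "sample_xdm_extended.xml";   // ExtendedXMP XML written by xdmParser()

xdmParser(inputJpgFile);        // 1. Extract StandardXMP and ExtendedXMP XML from the JPEG
xmpMetadataParser();            // 2. Set up the SAX callbacks for IMAGE:DATA and DEPTHMAP:DATA
processXmpData(outputXMPFile);  // 3. Parse the XML and write the color image(s) and depth map(s)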

Conclusion

This white paper described the XDM file format, the Adobe XMP standard, and the XML data structure. The JavaScript parser app for depth photos parses an XDM image file and outputs a StandardXMP XML file and an ExtendedXMP XML file. It then parses the XML files to extract color image file(s) and depth map file(s). The app does not depend on any other programs and is a basic building block for depth photography use cases.

References

[1] “The eXtensible Device Metadata (XDM) specification, version 1.0,” https://software.intel.com/en-us/articles/the-extensible-device-metadata-xdm-specification-version-10

[2] Open source project Depthy. http://depthy.me/#/

[3] Intel® RealSense™ Technology: http://www.intel.com/content/www/us/en/architecture-and-technology/realsense-overview.html

[4] Adobe XMP Developer Center. http://www.adobe.com/devnet/xmp.html

[5] “XML 1.0 Specification,” World Wide Web Consortium. Retrieved 2010-08-22.

About The Author

Yu Bai is an application engineer in the Intel® Software and Services Group (SSG), working with external ISVs to ensure their applications run well on Intel® platforms. Before joining SSG, she worked for Rudolph Technologies as a senior software engineer, developing applications used in the operation of precision photolithography equipment for the semiconductor capital equipment industry. Prior to Rudolph, she worked for Marvell Semiconductor as a staff engineer working on power analysis and power modeling for the company's application processors. She joined Marvell through the company's acquisition of Intel® XScale technology in 2006.

Yu received her master's and doctorate degrees in Electrical Science and Computer Engineering from Brown University. Her graduate research focused on high-performance and low-power computer architecture design. Yu holds six U.S. patents and has published more than 10 journal and international conference papers on power/performance management and optimization.

Tutorial: Migrating Your Apps to DirectX* 12 – Part 2


Download PDF [PDF 471 KB]

Link to: Chapter 1: Overview of DirectX* 12

Chapter 2: DirectX 12 Tools

 

2.1 Visual Studio Graphics Diagnostics tools

We recommend that you use Visual Studio 2015 to develop DirectX 12 programs. The following content mainly covers the Visual Studio 2015 Graphics Diagnostics tools.

2.1.1 Overview of Graphics Diagnostics Tools

Visual Studio 2015 Graphics Diagnostics is a set of tools for recording and analyzing rendering and performance problems in Direct3D apps. Graphics Diagnostics can be used not only to diagnose programs running on your Windows PC and Windows Device Emulator, but also to debug programs running on a remote PC or device.

To get the most accurate analysis of how an app uses Direct3D, Graphics Diagnostics can directly capture the state of a running app and immediately analyze it, share it, or save it for later analysis. In addition to the command-line tool dxcap.exe, which lets developers enable and control capture manually, Visual Studio provides three different ways to capture frames: from the Visual Studio interface, from the app interface, and programmatically using the capture API.

To diagnose an app's performance problems, it is recommended to use a new Graphics Diagnostics feature called the Frame Analysis tool to analyze the captured frame data. Instead of manually modifying graphics parameters and repeatedly comparing performance before and after each change to decide whether the modification is appropriate, this tool automatically varies the way the app uses Direct3D and benchmarks all the parameters for developers, revealing where the potential for performance optimization resides.

The Visual Studio Graphics Analyzer window is used to examine rendering and performance problems in captured frames. Several built-in tools help developers understand the rendering behavior of the app. Each tool exposes different information about the captured frame and intuitively shows rendering problems, starting from the frame buffer.

The following graph shows a typical layout of tools in the Graphics Analyzer.

2.1.2 The Compatibility of Graphics Diagnostics

Graphics Diagnostics supports apps that use Direct3D 12, Direct3D 11, and Direct3D 10. It provides limited support for apps that use Direct2D. It does not support apps that use earlier versions of Direct3D, DirectDraw, or other graphics APIs.

2.1.3 Graphics Diagnostics Features in Visual Studio

1. Graphics Toolbar
The Graphics toolbar provides commands that allow quick access to Graphics Diagnostics.

2. Capturing Graphics Information
When apps are running in Graphics Diagnostics, Visual Studio displays a diagnostics session interface that developers can use to capture the current frame and to display the frame rate and frame time (GPU and CPU usage can only be seen after the GPU Usage tool is launched). The load display helps developers identify frames they might want to capture according to their performance characteristics. It is recommended not to use it for screen troubleshooting.

3. GPU Usage
The GPU Usage tool can be used to better understand the performance of Direct3D apps on the GPU and CPU. Developers can use it to determine whether an app's performance is bound by the CPU or the GPU, so as to understand how to use the platform hardware more effectively. The GPU Usage tool supports apps that use Direct3D 12, Direct3D 11, and Direct3D 10 (the VS2015 RTM release does not yet support DirectX 12 here, but support will be added in later updates); it does not support other graphics APIs such as Direct2D or OpenGL.

4. DirectX control panel
The DirectX control panel is a DirectX component that developers can use to change the way that DirectX behaves. For example, developers can enable the debug version of the DirectX Runtime component, select the type of debug message, and disable certain graphics hardware capabilities from being used to emulate hardware that is not supported. This level of control over DirectX can help you debug and test your DirectX app. You can access DirectX control panel from Visual Studio.

2.1.4 Reference Resources

For the latest information about the content in this chapter, refer to the following MSDN page:
https://msdn.microsoft.com/zh-cn/library/hh315751(v=vs.140).aspx

Please watch the following video to learn about the new Visual Studio 2015 features for DirectX development:
https://channel9.msdn.com/Series/ConnectOn-Demand/212

Coming Soon: Links to the Following Chapters

Chapter 3: Migrating From DirectX 11 to DirectX 12

Chapter 4: DirectX 12 Features

Chapter 5: DirectX 12 Optimization

Intel® RealSense™ SDK Background Segmentation Feature


Introduction

This whitepaper describes how developers can integrate the Intel® RealSense™ SDK background segmentation (BGS) middleware to create new immersive collaboration applications. It outlines the expected behaviors and performance under a variety of scenarios and presents some limitations that developers need to be aware of before shipping products to consumers. The primary audience is development teams implementing BGS, as well as OEMs.

Background and Scope

Background segmentation (also known as “BGS technology”) is the key product differentiator for the Immersive Collaboration and Content Creation category for the Intel® RealSense™ camera. The ability for users to segment out their backgrounds in real time, without the need for specialized equipment or post-processing, is a compelling value-add for existing teleconferencing applications.

There is ample potential for users to augment their existing uses or invent new ones based on BGS technology. For example, consumers can watch shared YouTube* content together with friends through sharing software during a video chat session. Co-workers can see each other overlaid onto a shared workspace during a virtual meeting. Developers can integrate the BGS middleware to create new uses, such as changing the background image or adding video to the background while running camera-based or sharing-based applications. Figures 1 and 2 illustrate applications that have immersive uses for the Intel RealSense camera. Developers can also consider uses such as taking selfies and changing the background, or using collaboration tools such as browser or office applications to share and edit with multiple parties, for example, creating a karaoke video with a different background.


Figure 1. Cyberlink® YouCam RX* -
http://www.cyberlink.com/stat/product/youcamrx/enu/YouCamRX.jsp


Figure 2. Personify® App - http://www.personify.com/realsense

Creating a BGS Sample Application

Prerequisites:

In this paper we explain how developers can replace the background with a video or image in a sample application. We also provide a snippet for blending the image output from the middleware with any background image, and describe what to expect in the way of performance.

The current implementation of the background segmentation middleware supports the YUY2 and RGB formats. Resolution varies from 360p to 720p for RGB and is 480p for the depth image.

Figure 3 shows the high-level pipeline for BGS. The depth and color frames are captured by the Intel RealSense camera and passed to the core SDK (that is, the Intel RealSense SDK runtime). Based on the request from the application, the frames are delivered to the User Extraction block, whose output is the segmented RGBA image. This image can be alpha-blended with any background RGB image to create the final output. Developers can use any mechanism to blend the images on screen, but using graphics hardware (the GPU) can yield the best performance.


Figure 3. BGS pipeline.

The following steps explain how to integrate 3D segmentation into a developer application.

1. Install the following components as part of the Intel RealSense SDK: 

  • Intel RealSense SDK core runtime
  • Background Segmentation module

2. Use the web setup or standalone installer to install only the core and personify components. The runtime can be installed only in UAC mode.

   intel_rs_sdk_runtime_websetup_x.x.x.xxxxxx --silent --no-progress --accept-license=yes --finstall=core,personify --fnone=all

   You can detect which runtime is installed on the system by using the following Intel RealSense SDK API:

   // session is a PXCSession instance
   PXCSession::ImplVersion sdk_version=session->QueryVersion();

3. Create an instance for using the 3D camera. This creates a pipeline construct for running any 3D-based algorithm.

   PXCSenseManager* pSenseManager = PXCSenseManager::CreateInstance();

4. Enable the module of the middleware that you need to use. It is recommended that you enable only the module that the application needs.

   pxcStatus result = pSenseManager->Enable3DSeg();

5. Identify which profile is needed by your application. Running at a higher resolution and frame rate can impact performance. Pass the profile to get a specific stream from the camera.

   PXC3DSeg* pSeg = pSenseManager->Query3DSeg();
   pSeg->QueryInstance<PXCVideoModule>()->QueryCaptureProfile(profile, &VideoProfile);
   pSenseManager->EnableStreams(&VideoProfile);

6. Initialize the pipeline for the camera and pass the first frame to the middleware. This stage is required by all middleware and is needed to make the pipeline work.

   result = pSenseManager->Init();

7. Retrieve the segmented image from the camera. The image output from the middleware is RGBA and contains only the segmented part.

   PXCImage *image = pSeg->AcquireSegmentedImage(...);

8. Blend the segmented image with your own background.

Note: Blending has a significant impact on performance if done on the CPU rather than the GPU. The sample application blends on the CPU.

  • You can use any technology to blend the background image pixels with the RGBA segmented image.
  • You can use zero copy to move data to system memory using the GPU instead of the CPU.
  • Direct3D* or OpenGL* can be used for blending, based on preference.

Here is a code snippet for getting the image passed to system memory, where srcData is of type pxcBYTE:

   segmented_image->AcquireAccess(PXCImage::ACCESS_READ,
   PXCImage::PIXEL_FORMAT_RGB32, &segmented_image_data);

   srcData = segmented_image_data.planes[0] + 0 * segmented_image_data.pitches[0];
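The blending step itself is ordinary per-pixel alpha compositing. The following sketch is written in JavaScript for consistency with the other code in this collection (the sample application does this in C++ or in a shader), and the buffer layouts are assumptions:

// Sketch: alpha-blend a segmented RGBA image over a background RGB image of the same size.
// seg: 4 bytes per pixel (R, G, B, A) from the segmentation output
// bg:  3 bytes per pixel (R, G, B) background image
// out: 3 bytes per pixel (R, G, B) blended result
function blend(seg, bg, out, width, height) {
    for (var p = 0; p < width * height; p++) {
        var a = seg[p * 4 + 3] / 255;          // normalized alpha from the segmented image
        for (var c = 0; c < 3; c++) {
            out[p * 3 + c] = Math.round(seg[p * 4 + c] * a + bg[p * 3 + c] * (1 - a));
        }
    }
}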

Steps for Blending and Rendering

  • Capture: read the color and depth streams from the camera.
  • Segment: discriminate between background and foreground pixels.
  • Copy the color and segmented image (depth mask) into textures.
  • Resize the segmented image (depth mask) to the same resolution as the color image.
  • (Optional) Load or update the background image (if replacing) into a texture.
  • Compile/load the shader.
  • Set the color, depth, and (optional) background textures for shader use.
  • Run the shader and present.
  • (For a videoconferencing application) Copy the blended image to an NV12 or YUY2 surface.
  • (For a videoconferencing application) Pass the surface to the Intel® Media SDK H.264 HW encoder.

Performance

The application’s behavior is affected by three factors:

  • FPS
  • Blending
  • Resolution

The table below shows CPU utilization on a 5th generation Intel® Core™ i5 processor.

                 No Render      Render on CPU    Render on GPU
720p/30fps       29.20%         43.49%           31.92%
360p/30fps       15.39%         25.29%           16.12%
720p/15fps       17.93%         28.29%           18.29%

To verify the impact of rendering on your own machine, run the sample application with and without the “-noRender” option.

BGS Technology Limitations

User segmentation is still evolving, and the quality is increasing with each new version of the SDK.  

Points to remember while evaluating quality:

  • Avoid dark objects on the body that are similar in color to the background, for example, a black shirt against a black background.
  • High-intensity light on head can impact hair quality.
  • Lying on the couch or bed can result in a poor user experience. A sitting position is better for video conferencing.
  • Translucent or transparent objects like a drinking glass won’t work as expected.
  • Hand webbing is an issue; expect quality to vary.
  • Hair on forehead may have segmentation issues.
  • Do not move your hand or head very fast; camera limitations impact quality.

Providing Feedback to Intel on BGS Technology

How can you help us continue to make the software better? The best way is to provide feedback. However, reproducing a scenario under a similar environment can be difficult when a developer wants to re-test on a new Intel RealSense SDK release.

To minimize run-to-run variance, it’s best to capture the input camera sequences used to replicate the issue so they can be replayed to see whether the quality improves.

The Intel RealSense SDK ships with a sample application that can help collect sequences to replay with new drops:

  • Important for providing feedback on quality
  • Not for performance analysis

In a default installation, the sample application is located at C:\Program Files (x86)\Intel\RSSDK\bin\win32\FF_3DSeg.cs.exe. Start the application and follow the steps shown in the screenshots below:


You will see yourself with the background removed.

Playing sequences

If you select the Record mode, you can save a copy of your session. You can then open the FF_3DSeg.cs.exe application and select playback mode to see the recording.


Summary

Intel RealSense technology background segmentation middleware brings a new immersive experience to consumers. These new usages include replacing the background with a video or picture, or creating a selfie with a segmented image.


User Feedback for Natural User Interfaces with Intel® RealSense™ Technology Part 2


By Justin Link on July 8, 2015

In my previous article, User Feedback for Natural User Interfaces with Intel® RealSense™ Technology Part 1, I discussed user expectations when interacting naturally with computers and the importance of and challenges involved with creating user feedback for natural user interfaces (NUIs). I also discussed user feedback when using hands, based on our experience creating the games Space Between* and The Risen*, which are based on Intel® RealSense™ technology. In this article, I discuss the other input modalities that Intel RealSense technology offers: head and voice.


An IDF 2015 attendee playing The Risen.

Interacting with your hands in an application (once you’ve gotten past the technical limitations of the recognition hardware and software) tends to be a natural and intuitive experience. We use our hands every day for nearly everything. They are our manipulators of the physical world, and so it makes sense that they would be used to manipulate virtual worlds as well. However, using your head and face to manipulate or control things is a much more abstract task. In contrast, using voice to control applications is intuitive but presents a different kind of challenge, one that mostly revolves around the state of the technology and our own expectations. In both cases, the challenges require specific design and attention to user feedback when developing an application for them.

A Quick Review of What User Feedback Is (and Why It’s Important)

In case you haven’t read Part 1 of this article, I want to quickly review what user feedback in an application is. To put it simply, user feedback in an application is any kind of notification the application gives the user to tell them that the software has recognized their input or has changed state. For example, a button in an application will usually change visually when the mouse hovers over it, when the mouse button is pressed down on it, and when the click is completed. These visual changes are important because they tell the user that this is an element that can be interacted with and what kind of input affects that element. This user feedback even extends outside the software itself into the hardware we use to interface with it. The mouse, when clicked, has an audible and tangible response to a user pushing a button for the same reasons a button in software does. All of this stems from the natural way we interact with people and our environments.

User feedback is important in NUIs for three reasons: (1) most NUIs are touchless, and so any kind of haptic feedback is impossible, (2) there are no standards for user feedback in NUIs as there are in other software input modalities, and (3) when using NUIs there is an expectation that they behave, well, naturally. Losing haptic feedback in any input modality means that the software has to account for this either visually or aurally. If you don’t design this in, your software will be much less intuitive and will create more frustration for a new user.

The lack of any kind of universal standard for this feedback means that not only will you have to figure out what works best, but your users will be unfamiliar with the kinds of feedback they’re getting and will have to learn it, increasing the learning curve for the application. Working around user expectations is the final challenge since the medium stems directly from (and is trying to be) human language. Unfortunately the current state of technology is not mature enough to replicate a natural conversation, and so you will find users acting with the software as they would with another person only to find that it doesn’t quite work the same in software.

Head Tracking in the Intel® RealSense™ SDK


Image 1. Head tracking in the Intel RealSense SDK showing the orientation of a user’s head.

The advantages of knowing where a user’s head is from inside an application aren’t immediately obvious. Unlike hands, you probably won’t want to develop intricate control mechanisms based on position tracking, or you’ll risk giving your users self-induced whiplash. However, light use of head position tracking can be a unique layer in your application and can even be used to immerse users further into your virtual worlds. The Intel RealSense SDK has a couple of limitations with head tracking that you’ll need to consider, especially when designing user feedback into your application.

Limitations of Head Tracking

The Tracking Volume


Image 2. The tracking volume for Intel RealSense SDK is finite and can be restrictive to the application.

As I discussed in Part 1, understanding that tracking in the Intel RealSense SDK happens within the camera’s field of view is critical to knowing how to use the device. This is the number one problem that users have when interacting with any Intel RealSense application, and it applies to all modalities of the Intel RealSense application except voice. This limitation is less pronounced when using the head-tracking module since users will generally be seated, but it can still be an issue especially if your application has the user leaning left and right.

Tracking is Based on Face Detection

Most of the time, users will be facing the camera and so the software’s need to detect a face for head tracking won’t be a huge issue. Face detection is needed the most during initial detection and in subsequent detections when tracking is lost. The software can especially have trouble picking up a user’s face when the camera is placed above or below the user’s head (causing a sharp perspective from the camera’s view). The solution, similar to hands, is to show the camera the thing it’s looking for—the face in this case. Needing to have a detected face has other implications for head tracking too, like not being able to track the back of the head (if the user turns around).

Head as a Cursor


Image 3. The Twilight Zone stage from Space Between used a glowing cursor to track the user’s head.

In Space Between, we used a head cursor to represent the player’s head position in two dimensions on the screen. While our usage didn’t have users selecting things with their head like you normally would with a cursor, we ended up basing the control of a whale off our “hands as a cursor” implementation.

Next I will talk about some of our challenges when designing this kind of head-tracking interaction, go over our implementation, and discuss from a usability perspective what worked and what didn’t.

Our Challenges

People often lean out of tracking bounds.
Again, this is the most common issue with new people using Intel RealSense applications, but different users had different levels of engagement in our application, which for some meant leaning much further than we anticipated. Leaving the tracking bounds wouldn’t be a problem if it didn’t require the Intel RealSense application to re-detect the user when that happened. We found that this could throw off our entire experience and so we needed to design around it.

Moving up/down wasn’t as intuitive for people as we expected.
Horizontal movement when leaning maps pretty easily to a cursor when using your head for input, but vertical movement isn’t quite the same. Literally raising and lowering your head didn’t make sense to use for vertical position since it would mean that users would have to stand up or crouch down to move their head up or down relative to the camera. Instead we chose to use the distance from the camera (leaning in and out) to control vertical cursor position, but we found that this wasn’t as intuitive for some as we had expected.

The Cursor in Space Between

For Space Between, we used the head cursor to control a whale in a stage called the Twilight Zone. In this stage, the player could lean left or right to swim left or right, and lean in and out to dive and ascend. The whale was fixed to a rail and leaning would make the whale swim within a certain distance of that rail, allowing players to pick up points along the way.

What Worked
From a user-feedback perspective, showing a cursor that mapped to the head’s position helped us understand how the head was being tracked. We also front-loaded each game with instructions and graphics to show what input modalities were being used, which helped prep players for exploring the input. Once people understood exactly what the cursor was showing (head position), the cursor also helped players intuitively learn where the camera’s field of view was, since a cursor on the edge of the screen meant the player’s head was on the edge of the camera’s field of view.


Image 4. A snip from our instructions in Space Between showing the input we’re using from the Intel RealSense SDK.

What Didn’t Work
While we did have an animation of the whale turning as you leaned left or right, it was pretty subtle and there were times when people didn’t know they were moving in the direction they were trying to. We needed a stronger visual indication that the leaning was directly related to moving the whale left, right, up, or down. There was also initial confusion sometimes over what the cursor was representing. To alleviate the confusion, I think we could have done a better job showing or explaining that the cursor was representing head position.

Takeaways

  • It’s important to prepare the user for the kind of input they will be doing.
  • To help eliminate loss of tracking, show in some way where the edge of the camera’s field of view is relative to the input being used.
  • The control is much more intuitive when the visual feedback is tied to what a user’s input is controlling.

Voice Recognition in Intel RealSense SDK

I saved the most challenging aspect for last, and I’ll start with a disclaimer: most of what we learned was about limitations and what didn’t work. For those not familiar, voice recognition in the Intel RealSense SDK comes in two flavors: command and dictation. In command mode you set the specific commands to listen for in the Intel RealSense SDK, while in dictation you’re given a string of recognized speech as it comes. While we tried some things that definitely improved the user experience when using the voice module, it has still been by far the most frustrating for users to use and for us to implement. The challenge here is to leverage user feedback to mitigate the technical limitations in voice recognition.

Limitations of Voice Recognition

The module’s accuracy does not meet user expectations

Most people have had experience with some kind of voice recognition software, such as Apple Siri*, Google Now*, or Microsoft Cortana*. In each of these solutions the software is cloud-based, leveraging tons and tons of data, complex algorithms, and so on. These capabilities aren’t available in a local solution such as what Intel RealSense SDK uses. Because user expectations are set based on the more functional cloud-based solutions, you’ll need to manage and mitigate the limitation through design and by providing instructions and user feedback.

There is sometimes a significant delay between spoken commands and recognized speech

Depending on the application, speech will sometimes have a significant delay between when the user speaks a command versus when Intel RealSense SDK processes that command and returns it as text.

Voice pitch, timbre, and volume play a role in voice-recognition accuracy

From our experience, adult male voices are recognized the best, while higher pitch and quiet voices are not recognized as well.

Accents play a role in voice-recognition accuracy

For English, there are two versions you can set in Intel RealSense SDK: American and British. This obviously does not cover the nuances of accents within those dialects, and so people with accents will have a harder time getting speech recognized.

Microphone quality plays a large role in voice-recognition accuracy

The microphone built into the Intel RealSense camera (F200) works well as an all-around webcam microphone, but for voice recognition we’ve found that headset mics work better.

Environment noise plays a larger role in voice-recognition accuracy

This is the biggest challenge with any voice recognition-enabled applications. Environments vary greatly in terms of ambient noise, and voice recognition works best in quiet environments where speech detected is clear and discernable. A headset mic helps mitigate this problem, but don’t expect voice recognition to work well outside of a relatively quiet home office.

Voice Recognition as an Input Controller


Image 5. Here I am giving a powerful command to my skeletons in The Risen.

Using your voice to command an application is one of the biggest ways you can knock down the wall between humans and computers. When it works, it’s magical. When it doesn’t, it’s frustrating. In our game The Risen we used voice to let players give real commands to their skeleton minions. Next I will talk about some of our challenges and how we approached them from a user feedback perspective.

Our Challenges

Voice commands often go unrecognized.
This alone is enough to make you think hard about whether to include voice at all, and designing user feedback to mitigate it within the technical limitations of the Intel RealSense SDK is a challenge.

Users often don’t know why their command wasn’t recognized.
Was it because I wasn’t loud enough, because I didn’t speak clearly, because the module didn’t initialize, or does it just not like me? These are some of the questions users have when trying to use voice recognition.

It was easy to forget what the commands were when playing for the first time.
We did our best to represent the voice commands with icons, but when you only see the words once they can be easy to forget.

Voice Recognition in The Risen

In The Risen, you could issue four simple commands to direct your skeletal minions: forward, back, attack, and defend. Each of these would put your skeletons into a specific attack state, allowing for high level control of their actions. Skeleton states were represented by colored icons in the GUI and effects on the skeletons themselves.

We also had GUI to give users feedback on when speech detected had begun and had ended, as well as a slider to control microphone input volume. For the feedback on detecting commands, we started playing an animation of a mouth moving on our GUI player skeleton when we received the alert LABEL_SPEECH_BEGIN and stopped playing it when we received LABEL_SPEECH_END. The microphone slider was there to increase the quality of speech recognized, but also changed color to indicate whether the detected speech was too loud or too quiet.


Image 6. The microphone slider in The Risen.

What worked
In terms of knowing what state the skeletons were in, the visual effects on the skeletons were the most informative. Before we had that, people would spam commands not knowing that they were already in that state, or wonder why specific skeletons were in a different state than the global skeleton state (an intended game mechanic). The visual effects also helped us better debug our skeleton AI.

The microphone volume slider actually ended up helping, so much so that I recommend that any game using voice recognition implement this feature. Not only did it give a way to dynamically adjust microphone input volume and improve the success rate of recognized commands, but it also gave a way to tell users why input might not be working. This is huge for mitigating user frustration because it implicitly told users that the microphone was working, commands were being recognized, and gave them a hint on how to improve their input.

What didn’t work
The animated player skeleton that was supposed to indicate when a user was talking didn’t quite work to tell people when commands were being recognized. I think this was because there were quite a few things to look for in the interface and so the animation detail was often overlooked. That said, we only created a short demo level for this game, and so users didn’t have much time to become familiar with the UI.

I also think that the icons we used to represent the skeleton state were mostly overlooked. This probably would have been fine for a non-voice controlled game, but with the goal of informing the user what command was just detected (and when), this was a problem. To show that a voice command is recognized, I think we need to flash the word on the screen for a second or so, to get the user’s attention and let them know that the system recognized their command. This approach would also help the user to remember the specific commands they need to use.

Takeaways

  • Tell the user when speech is being detected before speech has been processed to avoid frustration and command repetition while speech is being processed.
  • Make what the speech controls obvious and the changes between those states apparent.
  • Give users a microphone input volume slider that also tells when the speech being detected is too loud or too quiet.
  • Consider showing the command on the screen to help users remember what the system’s commands are.
  • Make it obvious to the user when a command has been recognized.

Man’s New Best Friend

Computers are finding their way into every aspect of our lives. As their technology continues to improve, we find new ways to utilize them and entrust them with even greater responsibilities. Fast approaching is the day when even we ourselves have bits of computers integrated into us. As such, our relationship with computers is becoming more and more human.

Intel RealSense technology and other NUIs are the first step in this direction. These technologies give us the capability to truly shift our perspective on how we see and interact with our world. But the relationship is still young, and as designers and developers we are responsible for guiding it in the right direction. One day our computers will be like our best friends, able to anticipate our intentions before we even start expressing them; but for now, they’re more like our pets and still need a little help telling us when they need to go outside.

About the Author

Justin Link is an Interactive Media Developer for Chronosapien Interactive in Orlando, Florida. His game Space Between placed 2nd in the Intel® Perceptual Computing Challenge. The game utilized the Intel Perceptual Computing gesture camera and focused on using gestures and voice to control underwater sea creatures in three mini games. In the top 10% of Intel Innovators, he has trained more than 1000 developers on perceptual computing, including the new 2014 Intel RealSense technology.

For More Information

User Feedback for Natural User Interfaces with Intel RealSense Technology Part 1 - by Justin Link
Using Intel® RealSense™ to Create "The Space Between" – Intel Software TV video
Space Between* Plumbs Ocean Depths with Intel® RealSense™ Technology – a case study with Ryan Clark of Chronosapien
Get the Intel RealSense SDK
Get an Intel® RealSense™ camera

Android and Crosswalk Cordova Version Code Issues


With the release of Apache* Cordova* CLI 5 by the Apache Cordova project, there was a change to how "App Version Codes" are written inside the Android and Android-Crosswalk APK files built with CLI 5 by the Intel XDK. The App Version Code is directly related to the Build Settings section of the Projects tab, and corresponds to the android:versionCode parameter found in the AndroidManifest.xml file of every Android and Android-Crosswalk APK:

In the past (CLI 4.1.2 and earlier), Cordova did not modify the Android version code, so the android:versionCode found in your Android APK was identical to the value provided (above) in the App Version Code field. However, in order to support the submission of multiple Android-Crosswalk APK files (e.g., ARM and x86) to an Android store, the Intel XDK build system did modify the version code for Android-Crosswalk embedded builds.

The "historic" behavior regarding the final Android version code that was inserted in your built APK files, when built with the Intel XDK using CLI 4.1.2 (or earlier) is:

  • no change to the version code for regular Android builds (android:versionCode = App Version Code)
  • no change to the version code for Android-Crosswalk shared library builds
  • add 60000 to the version code for Android-Crosswalk x86 embedded library builds
  • add 20000 to the version code for Android-Crosswalk ARM embedded library builds

So that means you will see the following android:versionCode values inside your built APK if you set the App Version Code field to one in the Build Settings section of the Projects tab and set the CLI version to 4.1.2:

  • "1" for a regular Android build
  • "1" for an Android-Crosswalk shared library build
  • "20001" for an Android-Crosswalk embedded library ARM build
  • "60001" for an Android-Crosswalk embedded library x86 build

Beginning with Cordova CLI 5, in order to maintain compatibility with standard Cordova, the Intel XDK no longer modifies the android:versionCode when building for Android-Crosswalk. Instead, the new Cordova CLI 5 encoding technique has been adopted for all Android builds. This change results in a discrepancy in the value of the android:versionCode that is inserted into your Android APK files when compared to building with CLI 4.1.2 (and earlier).

If you have never published an app to an Android store this change in behavior will have little or no impact on you. It might affect attempts to side-load your app onto a device, in which case the simplest workaround is to uninstall the previously side-loaded app before installing the new app. For those who have Android apps in stores that have been built with Cordova CLI 4.1.2 or earlier, please read on...

Here's what Cordova CLI 5 (Cordova-Android 4.x) does with the android:versionCode number (which you specify in the App Version Code field within the Build Settings section of the Projects tab):

  • multiplies your android:versionCode (App Version Code) by 10

then, if you are doing a Crosswalk (15+) build (with CLI 5):

  • adds 2 to the android:versionCode for ARM builds
  • adds 4 to the android:versionCode for x86 builds

otherwise, if you are performing a standard Android build (non-Crosswalk):

  • adds 0 to the android:versionCode if the Minimum Android API is < 14
  • adds 8 to the android:versionCode if the Minimum Android API is 14-19
  • adds 9 to the android:versionCode if the Minimum Android API is > 19 (i.e., >= 20)

So that means you will see the following android:versionCode values inside your built APK if you set the App Version Code field to one in the Build Settings section of the Projects tab and set the CLI version to 5.1.1 (or higher):

  • "10" for a regular Android build if the android:minSdkVersion is < 14 
  • "18" for a regular Android build if the android:minSdkVersion is 14-19
  • "19" for a regular Android build if the android:minSdkVersion is > 19
  • "12" for an Android-Crosswalk embedded library ARM build
  • "14" for an Android-Crosswalk embedded library x86 build
  • "TBD" for an Android-Crosswalk shared library build (not fully supported at this time)

NOTE: the Minimum Android API field in the Build Settings section of the Projects tab sets the value of the android:minSdkVersion number shown in the bullets above.
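The encoding described above can be summarized in a short sketch (illustrative only; this is not the Intel XDK build system's actual code):

// Sketch of the CLI 5 (Cordova-Android 4.x) version code encoding described above.
// appVersionCode: the App Version Code field from Build Settings
// minSdkVersion:  the Minimum Android API field (android:minSdkVersion)
// crosswalkArch:  "arm", "x86", or null for a standard Android (non-Crosswalk) build
function cli5VersionCode(appVersionCode, minSdkVersion, crosswalkArch) {
    var code = appVersionCode * 10;                  // multiply by 10
    if (crosswalkArch === "arm")       code += 2;    // Crosswalk embedded ARM build
    else if (crosswalkArch === "x86")  code += 4;    // Crosswalk embedded x86 build
    else if (minSdkVersion < 14)       code += 0;    // standard Android, API < 14
    else if (minSdkVersion <= 19)      code += 8;    // standard Android, API 14-19
    else                               code += 9;    // standard Android, API > 19
    return code;
}

// Matches the examples above (App Version Code = 1):
// cli5VersionCode(1, 19, null)  --> 18
// cli5VersionCode(1, 14, "arm") --> 12
// cli5VersionCode(1, 14, "x86") --> 14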

If you HAVE ALREADY PUBLISHED a Crosswalk app to an Android store this change may impact your ability to publish a newer version of that same app! In that case, if you are still building for Crosswalk, add 6000 (six with three zeroes) to your existing App Version Code field in the Crosswalk Build Settings section of the Projects tab.

If you have only published standard Android apps in the past and are still publishing only standard Android apps you should not have to make any changes to the App Version Code field in the Android Builds Settings section of the Projects tab.

The workaround described above only applies to Crosswalk CLI 5.1.1 (and higher) builds!

When you build with CLI 4.1.2 (which uses Cordova-Android 3.6) you will get the old Intel XDK behavior, where 60000 or 20000 (six with four zeroes or two with four zeroes) is added to the android:versionCode for Crosswalk embedded builds (x86 and ARM, respectively) and no change is made to the android:versionCode for standard Android builds.

NOTE:

  • Android API 14 corresponds to Android 4.0
  • Android API 19 corresponds to Android 4.4
  • Android API 20 corresponds to Android 4.4W; Android API 21 corresponds to Android 5.0
  • CLI 5.1.1 (Cordova-Android 4.x) does not allow building for Android 2.x or Android 3.x

Blend the Intel® RealSense™ Camera and the Intel® Edison Board with JavaScript*


Introduction

Smart devices can now connect to things we never before thought possible. This is being enabled by the Internet of Things (IoT), which allows these devices to collect and exchange data.

Intel has created Intel® RealSense™ technology, which includes the Intel® RealSense™ camera and the Intel® RealSense™ SDK. Using this technology, you can create applications that detect gestures and head movement, analyze facial data, perform background segmentation, read depth levels, recognize and synthesize voice, and more. Imagine that you are developing a super sensor that can detect many things. Combined with the versatile uses of the Intel® Edison kit and its outputs, you can build creative projects that are both useful and entertaining.

The Intel RealSense SDK provides support for popular programming languages and frameworks such as C++, C#, Java*, JavaScript, Processing, and Unity*. This means that developers can get started quickly in a programming environment they are already familiar with.

Peter Ma’s article, Using an Intel® RealSense™ 3D Camera with the Intel® Edison Development Platform, presents two examples of applications using C#. The first uses the Intel RealSense camera as input and the Intel® Edison board as output. The result is that if you spread your fingers in front of the Intel RealSense camera, it sends a signal to the Intel® Edison board to turn on a light.

In the second example, Ma reverses the flow, with the Intel® Edison board as input and the Intel RealSense camera as output. The Intel® Edison board provides data that comes from a sensor to be processed and presents it to us through the Intel RealSense camera as voice synthesis to provide more humanized data.

Ma’s project inspired me to build something similar, but using JavaScript instead of C#. I used the Intel RealSense SDK to read and send hand gesture data to a node.js server, which then sends the data to the Intel® Edison board to trigger a buzzer and LED that are connected to it.

About the Project

This project is written in JavaScript. If you are interested in implementing only a basic gesture, the algorithm module already included in the Intel RealSense SDK gives you everything you need.

Hardware

Requirements:

Intel Edison board with the Arduino* breakout board

The Intel® Edison board is a low-cost, general-purpose computer platform. It uses a 22 nm dual-core Intel® Atom™ SoC running at 500 MHz, supports 40 GPIOs, and includes 1 GB LPDDR3 RAM, 4 GB eMMC storage, dual-band Wi-Fi, and Bluetooth, all in a small form factor.

The board runs the Linux kernel and is compatible with Arduino, so it can run an Arduino implementation as a Linux program.

 


Figure 1. Intel® Edison breakout board kit.

Grove Starter Kit Plus - Intel® XDK IoT Edition

Grove Starter Kit Plus - Intel XDK IoT Edition is designed for the Intel® Galileo board Gen 2, but it is fully compatible with the Intel® Edison board via the breakout board kit.

The kit contains sensors, actuators, and shields, such as a touch sensor, light sensor, and sound sensor, and also contains an LCD display as shown in Figure 2. This kit is an affordable solution for developing an IoT project.

You can purchase the Grove Starter Kit Plus here: http://www.seeedstudio.com/depot/Grove-starter-kit-plus-Intel-IoT-Edition-for-Intel-Galileo-Gen-2-p-1978.html


Figure 2. Grove* Starter Kit Plus - Intel® XDK IoT Edition

Intel® RealSense™ Camera

The Intel RealSense camera is built for game interactions, entertainment, photography, and content creation with a system-integrated or a peripheral version. The camera’s minimum requirements are a USB 3.0 port, a 4th gen Intel Core processor, and 8 GB of hard drive space.

The camera (shown in Figure 3) features full 1080p color and a depth sensor, giving the PC a 3D visual and immersive experience.


Figure 3. Intel® RealSense™ camera

You can purchase the complete developer kit, which includes the camera here.

GNU*/Linux server

A GNU/Linux server is easy to set up. You can use an old computer or laptop, or you can put the server in the cloud. I used a cloud server running an Ubuntu* server. If you use a different Linux flavor for your server, just adapt the commands to your distribution.

Software

Before we start to develop the project, make sure you have the following software installed on your system. You can use the links to download the software.

Set Up the Intel® RealSense™ Camera

To set up the Intel RealSense camera, connect the Intel RealSense camera (F200) to the USB 3.0 port, and then install the driver once the camera is connected to your computer. Navigate to the Intel RealSense SDK location, and open the JavaScript sample in your browser:

Install_Location\RSSDK\framework\JavaScript\FF_HandsViewer\FF_HandsViewer.html

After the file opens, the script checks to see what platform you have. While the script is checking your platform, click the link in your web browser to install the Intel RealSense SDK WebApp Runtime.

When the installation is finished, restart your web browser, and then open the file again. You can check to see that the installation was a success by raising your hand in front of the camera. It should show your hand gesture data visualized on your web browser.

Gesture Set Up

The key gesture event data fired by the SDK looks like the following:

{"timeStamp":130840014702794340 ,"handId": 4,"state": 0,"frameNumber":1986 ,"name":"spreadfinger"
}

This sends "name":"spreadfingers" to the server to be processed.

Next, we will write some JavaScript code to stream gesture data from the Intel RealSense camera to the Intel® Edison board through the node.js server.

Working with JavaScript

Finally, we get to do some programming. I suggest that you first copy the whole folder, because the default installation location doesn’t allow the original files to be modified.

Copy the FF_HandsViewer folder from this location and paste it somewhere else. The folder’s location is:

\install_Location\RSSDK\framework\JavaScript\FF_HandsViewer\

This also gives you your own project folder, which keeps things organized.

Next, copy the realsense.js file from the location below and paste it inside the FF_HandsViewer folder:

Install_Location\RSSDK\framework\common\JavaScript

To make everything easier, let’s create one file named edisonconnect.js. This file will receive gesture data from the Intel RealSense camera and send it to the node.js server. Remember that you have to change the IP address in the socket variable so that it points to your node.js server:

// var socket = io('change this to the IP of your node.js server');
var socket = io('http://192.168.1.9:1337');

function edisonconnect(data){
  console.log(data.name);                  // log the gesture name
  socket.emit('realsense_signal', data);   // forward the gesture data to the server
}

Now for the most important step: editing sample.js so that the fired gesture data is intercepted and passed to edisonconnect.js. You don’t need to worry about overhead; this adds very little CPU or RAM usage.

// retrieve the fired gestures
for (g = 0; g < data.firedGestureData.length; g++){
  $('#gestures_status').text('Gesture: ' + JSON.stringify(data.firedGestureData[g]));

  // add script start - passing gesture data to edisonconnect.js
	edisonconnect(data.firedGestureData[g]);
  // add script end
}

Once the loop above runs and gesture data is fired, the HTML below completes the JavaScript side of the program. Note that you also have to update the realsense.js path in the script tag so it points to the copy you placed in your project folder.

It is critical to link the socket.io and edisonconnect.js files, as shown in the added script tags below:

<!DOCTYPE html>
<html>
<head>
<title>Intel&reg; RealSense&trade; SDK JavaScript* Sample</title>
<script src="https://aubahn.s3.amazonaws.com/autobahnjs/latest/autobahn.min.jgz"></script>
<script src="https://promisejs.org/polyfills/promise-6.1.0.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<script src="https://common/JavaScript/realsense.js"></script>
<script src="sample.js"></script>
<script src="three.js"></script>
<!-- add script start -->
<script src="https://cdn.socket.io/socket.io-1.3.5.js"></script>
<script src="edisonconnect.js"></script>
<!-- add script end -->
<link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>

The code is taken from the SDK sample and has been trimmed to keep it simple. Its job is to send gesture data to the server: the Intel RealSense SDK recognizes the gesture, and the page is then ready to forward it to the server.

Set Up the Server

We will use a GNU/Linux-based server. I use an Ubuntu server as the OS, but you can use any GNU/Linux distribution that you are familiar with. We will skip the server installation steps, because related tutorials are readily found on the Internet.

Log in as a root user through SSH to configure the server.

Because the server has just been installed, we need to update the repository list and upgrade the server. The commands below are for Ubuntu; use the equivalent commands for the GNU/Linux distribution that you are using.

# apt-get update && apt-get upgrade

Once the repository list is updated, the next step is to install node.js.

# apt-get install nodejs

We also need to install npm Package Manager.

# apt-get install npm

Finally, install socket.io express from npm Package Manager.

# npm install socket.io express

Remember to create file server.js and index.html.

# touch server.js index.html

Edit the server.js file using your favorite text editor, such as vim or nano:

# vim server.js

Write down this code:

var express   = require("express");
var app   	= express();
var port  	= 1337;

app.use(express.static(__dirname + '/'));
var io = require('socket.io').listen(app.listen(port));
console.log("Listening on port " + port);

io.on('connection', function(socket){'use strict';
  console.log('a user connected from ' + socket.request.connection.remoteAddress);

	// Check realsense signal
	socket.on('realsense_signal', function(data){
  	socket.broadcast.emit('realsense_signal',data);
  	console.log('Hand Signal: ' + data.name);
	});
  socket.on('disconnect',function(){
	console.log('user disconnected');
  });
});

var port = 1337; assigns the server to port 1337. console.log("Listening on port " + port); confirms that the server is up and listening. The key line is socket.broadcast.emit('realsense_signal', data); it takes the gesture data received from one client (the browser running the Intel RealSense sample) and rebroadcasts it to all other connected clients, such as the Intel® Edison board.

The last thing we need to do is run the server.js file with node. If "Listening on port 1337" displays as shown below, you have been successful.
# node server.js

root@edison:~# node server.js
Listening on port 1337
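Before wiring up the camera, you can check that server.js relays events correctly with a small test client. The sketch below is not part of the original sample; the file name test-client.js and the localhost address are assumptions, and it requires socket.io-client (npm install socket.io-client):

// test-client.js - optional sketch that emits a fake gesture event
var socket = require('socket.io-client')('http://localhost:1337');

socket.on('connect', function(){
  // Pretend the camera fired a "spreadfingers" gesture
  socket.emit('realsense_signal', { name: 'spreadfingers' });
  console.log('fake gesture sent');
});

socket.on('disconnect', function(){
  console.log('disconnected from server');
});

Run it with node test-client.js while server.js is running; the server console should print Hand Signal: spreadfingers, and any connected Intel® Edison board should switch its pins on.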

Set up the Intel® Edison Board

The Intel® Edison SDK is easy to deploy. Refer to the following documentation:

Now it's time to put the code onto the Intel® Edison board. This code connects to the server and listens for any broadcast that comes from it, just like the server setup and listening steps above. If any gesture data is received, the Intel® Edison board switches its digital pins on or off.

Open the Intel XDK IoT Edition and create a new project from Templates, using the DigitalWrite template, as shown in the screenshot below.

Edit line 9 in package.json by adding the socket.io-client dependency, as shown below. Adding the dependency makes the build install the socket.io client on the Intel® Edison board if it is not already there.

"dependencies": {"socket.io-client":"latest" // add this script
}

Find the file named main.js. You need to connect to the server and make sure the server is ready and listening. Then check whether the gesture data name "spreadfingers" arrives; if it does, set Digital Pin 2 and Digital Pin 8 to 1 (on), and otherwise set them back to 0 (off).
Change the server IP address to match your node.js server. If you want to use different pins, make sure you change them in mraa.Gpio(selectedpins) too.

var mraa  = require("mraa");

var pins2 = new mraa.Gpio(2);
	pins2.dir(mraa.DIR_OUT);

var pins8 = new mraa.Gpio(8);
	pins8.dir(mraa.DIR_OUT);

var socket = require('socket.io-client')('http://192.168.1.9:1337');

socket.on('connect', function(){
  console.log('i am connected');
});

socket.on('realsense_signal', function(data){
  console.log('Hand Signal: ' + data.name);
  if(data.name=='spreadfingers'){
	pins2.write(1);
	pins8.write(1);
  } else {
	pins2.write(0);
	pins8.write(0);
  }
});

socket.on('disconnect', function(){
  console.log('i am not connected');
});

Select Install/Build, and then select Run after making sure the Intel® Edison board is connected to your computer.

Now make sure the server is up and running, and the Intel RealSense camera and Intel® Edison board are connected to the Internet.

Conclusion

Using Intel RealSense technology, this project modified the JavaScript framework sample script to send captured gesture data to the Node.js server. But this project is only a beginning for more to come.

The code is straightforward. The server broadcasts the gesture data to any socket client that is listening. The Intel® Edison board, with socket.io-client installed, listens for broadcasts from the server. When the gesture name spreadfingers arrives, the digital pins change state from 0 to 1 and back again.

The possibilities are endless. The Intel RealSense camera is lightweight and easy to carry and use, and the Intel® Edison board is a powerful embedded PC. If we blend and connect the Intel® Edison board and the Intel RealSense camera with JavaScript, it is easy to package, code, and build an IoT device. You can create something great and useful.

About the Author

Aulia Faqih - Intel® Software Innovator

Intel® RealSense Technology Innovator based in Yogyakarta, Indonesia, currently lecturing at UIN Sunan Kalijaga Yogyakarta. Love playing with Galileo / Edison, Web and all geek things.

Deploying an Apache Hadoop* Cluster? Spend Your Time on BI, Not DIY


Achieve near-real-time results from your Apache Hadoop* cluster, without days or weeks of specialized tuning

Do-it-yourself (DIY) Apache Hadoop* clusters are appealing to many organizations because of the apparent cost savings from using commodity hardware and free software distributions. Despite these apparent savings, many organizations are opting to purchase pre-built clusters, such as the Oracle Big Data Appliance*, because they know that a commodity cluster requires considerable time and specialized engineering skills to deploy, optimize, and tune for real-time data analysis.

Powered by fast, efficient Intel® Xeon® processors, the Oracle Big Data Appliance is a pre-built and optimized solution for Apache Hadoop big data analysis that can be deployed in minutes. It can then be tuned for near-real-time analysis in minutes or hours, instead of the days or weeks it would take to tune a DIY Apache Hadoop cluster. This paper describes the performance tuning techniques used in benchmark testing that resulted in an Oracle Big Data Appliance performing nearly two times faster than a comparable DIY cluster.

View entire white paper (PDF): Download Intel_Oracle_BDA_Opt.pdf

 


Rendering Researchers: Kai Xiao


Kai Xiao

Kai joined the Advanced Rendering Technology team in July 2015, after finishing his Ph.D. in Computer Science at the University of Notre Dame. Prior to Intel, he studied texture sampling as an intern at NVIDIA Research and GPGPU computing at Notre Dame. His current research focus is real-time rendering and ray tracing.


Building a Personality-Driven Poker AI for Lords of New York*


by Dan Higgins, owner and lead programmer of Lunchtime Studios, LLC

Download PDF

Writing artificial intelligence (AI) might be the best job in games. It’s creative, challenging, and blurs the line between game design and programming. AI is used for a variety of tasks ranging from the mechanical (such as auto-attacking enemies) and bot AI, to flocking group intelligence, even to deep-thinking military generals. Games that emphasize story and character-based immersion such as Lords of New York require an additional, essential AI ingredient: personality.

Lords of New York

What is Lords of New York?

Lords of New York is a 1920s poker RPG. You play Vince, a gifted card-playing mobster competing in the secret poker tournament Lords of New York. Overlooked as a contender by your boss, your talents for cheating and intimidation at the poker table keep your head above water as you slug out victory after victory. With only six months left in the tournament, you must travel around New York, improving your skills, gaining items, running boxing rackets, and defeating the best card players in the city. As if being the best wasn’t difficult enough, you’re soon accused of murder—one you didn’t commit! Nevertheless, you are arrested, booked, and sent to prison.

Prison is awful. While you stare at the walls, the world outside carries on. You see your turf chipped away, you lose contact with your boxer, and find your position within the tournament buried and forgotten. Meanwhile, you must survive inside and continue playing poker, albeit for different types of currency. If you find a way back to the streets, time will not be on your side. You'll have to raise cash, defeat opponents, and quickly become one of the front-runners again.

Lords of New York is a fully voice-acted, story-driven game that features several new technologies, with two being most notable. The first is a new type of 2D animation system built to allow dynamic body language with the characters. The other is the personality-driven AI where playing poker isn’t about the cards, it’s about the story and the people. The combination of these two technologies brings the characters to life and, while the 2D animation is something that everyone can see, the poker AI is less obvious.

Lords of New York

Genesis

The idea of Lords of New York popped into my head during the fall of 2003, just when the poker boom was heating up. I watched my friend Keith playing online poker—eight games at once. He clicked quickly from room to room, chatting away with me about code, paying little attention to any one game. He’d fold, fold, check, fold, check, and so on.

“That looks boring,” I said.

He nodded and added, “It is, but if you add enough games, it’s all right.”

“Why don’t you just play a single game for more money?”

“Poker’s too slow for that. The right move is almost always to fold,” Keith explained, clicking away.

“What if I made a poker game that made each hand exciting?”

“I like the idea,” he said, his Boston accent turning “idea” into “idear,” “…but I’m still not going to play it.”

“Why not?” I asked, a bit offended that one of my best friends was already canning my make-believe game.

He finally looked over at me and said, “Why would I play any [poker] game that wasn’t for real money?”

I didn’t have an answer. Instead, I dramatically replied, “I’m going to make it—and you will love it.”

“Yeah, yeah,” Keith dismissed, and returned to staring at his screen.

Keith was right, of course. Poker is a game about the risk/reward/thrill of money. It’s a game designed to encourage folding. It involves real people, real money. With that, I put the idea out of my mind—or so I thought.

Birth of Lunchtime Studios

Some things just don’t go away. When people ask how Lords of New York began, in truth, I can’t remember. There was the conversation with Keith, and then it’s almost as if the idea began working on itself within my mind and I was just along for the ride.

Birth of Lunchtime Studios

In 2006, writing AI for real-time strategy games was my dream job. Sadly, that same year I lost it when Stainless Steel Studios, one of the premier real-time strategy game companies, shut its doors. To help recover, I began building my own games and game technology. While I had several titles in mind, the one screaming the loudest at me was Lords of New York. It didn’t have that name then, but I still had the idea of taking out the boring parts of poker and making each hand fun.

In 2009, my wife and I formed Lunchtime Studios with the mission of building high-quality, story-based games that are innovative and memorable. Like our role-model companies The Walt Disney Studios and Pixar Animation Studios, we focus on compelling stories and build the technology needed to realize our vision. With that in mind, we knew Lords of New York, with its gameplay angle and strong story focus, was the right game for Lunchtime Studios’ first big step. We needed more than just story, though. The fundamental problems of a single-player poker rpg were still there, and we had no idea if we had a solution that worked.

Lunchtime Studios


Proof

After many hard years of building our own game engine, creating a new way to animate in 2D, holding down full-time jobs, as well as being parents, we began to show Lords of New York to the public. We got great, valuable fan feedback, which we used to make better and better demos. We started winning awards and after showcasing at PAX, my old friend and greatest critic Keith called me up. I sent him a link to our demo and waited.

“So…I checked it out,” said Keith, giving nothing away.

“And what did you think?”

For the next 15 minutes, I heard a laundry list of flaws, flaws an AAA game developer would find. Keith was right again; all were things that need to be done or redone in the final year of development.

“I didn’t get the story, where’s the tournament?” Keith asked.

“It’s a prequel. The story begins 2 years later. It’s just to give you the flavor. It’s where you (Vince) learn to play poker. We’ll have a tutorial—things like that.” I said.

“Oh…Makes sense. The voices were funny, most of the time. But I clicked through some. I really just wanted to get to the poker.”

“How was it?” I asked, hoping my voice didn’t crack.

“Good. Really good. I love the poker abilities,” Keith said.

“Really?” I asked, my voice pitching up.

“Yeah, and I absolutely loved that quests motivate poker.”

The easiest way to explain what I thought about his feedback is to try and visualize a 1980s glam-rock concert. Imagine the lead singer standing, legs apart, holding his arm and microphone high in the air over his huge, hair-sprayed locks. Fireworks erupt behind him—the final note of his catchiest song blasts into the audience. Fair to say, I had a hair-spray moment.

PAX Prime Demo 2015

PAX Prime Demo 2015 - https://www.youtube.com/watch?v=6VAqSthNYMM

Audience and Design

As a child, I remember standing in the doorway to our computer room, watching over my dad’s shoulder as he played games like Bards Tale* and Civilization*. Years later, on visits home I’d find him at the same desk, with the same coffee mug, deeply engrossed in the same type of single-player games. As much as I love playing MMORPGS (massively multiplayer online role-playing games) and other multiplayer games, I knew I’d always be a voice for people like my dad, the single-player gamers.

While writing AI for Empire Earth*, I found that other developers on the team had different notions about the goal of our AI. One of those developers (ok, it was Keith) began a conversation with:

“Since the point of the AI is to train people on how to play multiplayer games…”

Keith was surprised when I told him that multiplayer training wasn’t our goal. He argued against us building an AI targeted at single-player gamers. All these years later, I wonder how different the AI would have been if multiplayer training had been our goal.

The AI for Lords of New York is also a single-player one. It’s geared toward story, immersion, and personality. We don’t worry about training people for playing poker in Vegas. Our focus is on the emotion of the moment. We want to reward the gamers by making their experiences as rich as possible. Our AI must enhance the story and create moments that are powerful, funny, relatable—human.

The philosophy behind how we implement single-player AI is as important as choosing its goal. Typically, AI developers fall into two camps: One that prefers traditional AI, where there are established algorithms and rules. Traditional AIs tend to create difficult and fair opponents, such as IBM’s Deep Blue chess brain. The other is user-focused AI, where the emphasis is entirely on the user experience. Here, the rules are out the window. Much like the movies, a magic show, or novels, the entire focus is on the customer and building an amazing experience for them—whatever it takes.

Cheating and AI

If you’re ever curious about what type of AI developer you’re dealing with, ask them if it’s ok if an AI cheats. If the answer is no, then they build traditional AI. If yes, they had better follow up with, “But never get caught.” Nothing ruins an experience more than the illusion of the world being shattered by obvious AI cheats or incredibly predictable behaviors with no variance.

The funny thing about AI cheating is that it’s the key to having fun. In RTS (real-time strategy) games, a common cheat I would employ is to watch the military strength of my opponent. If the AI got too powerful compared to the player (depending on difficulty), then I’d find some of my own poor, unsuspecting soldiers standing off-screen and kill them until I was a better match for the human player. Another type of cheat I used was rewarding a player's decisions. If I saw that my opponent built lots of anti-aircraft guns, I’d send over more airplanes to be shot down. This gave the player that “Ah, good thing I built those guns” feeling. Cheating, as dirty as it sounds, is especially important in user-focused AI because the user experience trumps everything.

AI cheating in Lords of New York

Luckily, cheating is built into Lords of New York as RPG (role-playing game) talents. The AI has them, and so do you. For example, if you have a great hand and continually raise, you will show anxiety on your character through body language. If you face an opponent with skills in reading body language, they will pick up that you have a good hand and may even call you out to the others. Before going deep and big into a hand, you may have to use defensive skills like a poker face while controlling your reactions for a time. If you face a cheating player with the peek ability, like you, they can catch a glimpse of one of your cards on the pre-flop. If you end up switching a bad card for one up your sleeve, however, they won’t know.

The most difficult cheating decision we face, about whether the AI should know your hand or not, is in preventing the early-game-ruining exploit of: “Start a poker match, go all in on a no-limit game. If they don’t win, reload.” That’s a tricky problem to solve since you can recover—and may even need to lose—your money at times in Lords of New York. We won’t eliminate early-game big moves, but we may do some very early game checking on the player to make sure he or she has the kind of hand that would allow a big move early in the game.

Will I hesitate to let the AI know your hand in key situations to make the game more enjoyable? Not at all, but it’s mostly unnecessary. The AI has its own talents to use and doesn’t need to know what you have in your hand to deploy them.

Random Numbers in AI

One of the big surprises about AI in games is that at times, a random number can be the perfect decision maker, be fun, and save large amounts of developer time. While developing Empire Earth, I designed a detailed attack strategy for the AI whereby it would find the weak spot of your town, put guys on a helicopter, and drop them at the most ignored spot they could find. Once players discovered they were being struck from the inside, they would shout out with amazement at the cleverness of an attack, and how they’d never seen such coordination and complex behavior in a game before.

Did it work? Well, the sneak attacks did, but only once did I ever hear about it through word of mouth or from user feedback on forums. That attack took quite a bit of developer time to code, and only once did anyone ever mention they saw (and understood) what was happening.

I learned my lesson, until Rise and Fall*, where this happened again. I would have the AI group up into divisions, combine into a massive force, and charge with a curved front line like you’d see in a battle scene from Lord of the Rings or Troy.

AI group up into divisions

As it turns out, just like the Romans fighting in the woods of Germania, trees don’t make large formations easy to use. I did a lot of work for something that looked impressive and cinematic but could almost never happen given the terrain of the game, since we had very few massive, open spaces.

The feedback from gamers often produces some of the best ideas. I’d read forum debates about what our AI was thinking. I’d bite my lip and stay silent when they speculated on a sophisticated AI decision when I knew it was just a random number. Sometimes, however, the feedback would give me an idea for something purposeful. I’d change the random number solution to do exactly what that gamer thought the AI was doing in the first place. Don’t worry, your favorite game’s AI isn’t just random numbers. If that was all we needed, AIs would be easy to write and quite boring and confusing to play against. Random numbers are, however, an important element in AI, as many decisions are random numbers compared against a (it’s hoped) well-thought-out chance percent.
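As a minimal sketch of that pattern (illustrative JavaScript only, not code from any of the games mentioned), a decision is often just a roll against a tuned chance percent:

// Illustrative only: compare a random roll against a tuned chance percent.
function decides(chancePercent) {
  return Math.random() * 100 < chancePercent;
}

// e.g., a cautious AI might attempt a sneak attack about 15% of the times it checks
if (decides(15)) {
  console.log('AI attempts the sneak attack this update');
}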

 

Build an AI Personality

Personality is the glue that binds story to poker. Building characters who make sense both in the story and at the poker table requires that our poker AI actions must make sense with actions in the story. To build characters who are believable in and out of poker, we combine poker logic and story, and craft a noticeable personality. We only achieve harmony if the personality traits match the poker talents, which match the character’s story. For example, Tony’s (playable expansion pack character) skills are in statistics and probabilities. It’s unlikely you’ll see him cheating, but he’s going to have a better idea of his chance to win than say, The Bull, who likes to taunt players.

When we build a character, we lay out its personality and role in the story, and then match its talents. Lastly, we set personality trait values to match the character’s skills and story. This means a player like Tony, who is incredibly smart at poker, will have higher scores in intelligence, memory, and risk analysis than other characters. In addition, while not related to a talent, since Tony is a fairly even-keeled guy, his temperament is also above average.

Information Centers

Personality can’t carry the AI alone. If we have an AI with no notion of good/strong hands, table positions, etc., then our characters wouldn’t work. We give the AI the tools to understand the game and world around them by building repositories of world information, called “information centers.”

Information centers are responsible for computing or knowing about a single area of information. Not all are just knowledge bases. Some track an AI’s mood, or may just occasionally answer questions such as “Is this a must-win hand?” A few information centers are:

Poker Odds. Each player takes what they know of the current hand and computes the odds that he or she will win.

History. What is the play style of the current players? When do they raise and win, when do they fold? Lots of different stats are tracked and are available to be queried by the AI.

Situation. Answer questions for special-case situations.

Motivation. Determine how badly the AI wants to win the current hand (vengeance, desperation, and so on).

Quest. Our poker revolves around story. While there are games that can be played for just money, often there may be a quest-specific response to a hand that will trump some other decision factors.

Mood. Important in when and how to use players’ poker talents.

Every time we need a new AI question that relates to the world, we use the information center pattern. The power of this pattern is apparent once we apply filters (or lenses).

Poker Odds information center displayed on each character’s portrait for debugging
Poker Odds information center (and current hand) displayed on each character’s portrait for debugging.

Filters

Filters live between the AI brain and the information centers. We want information centers to contain or compute correct information regardless of the type of opponent or his personality. It’s in the filter that we alter the data to make it match the opponent’s personality.

Think of a filter as a lens. If you look at text in a book through a fisheye lens, it looks much different than if you look at text through normal glass. The ability to distort the truth based on a character’s stats (such as intelligence, memory, temperament, and so on) is what allows us to build a base of poker AI truth that makes sense for the person you’re playing against.

One of the main heuristics used to alter how a filter changes the data coming out of an information center is a linked personality trait. For example, the more intelligent a player, the less the filter will alter the scores of “Is this a good hand?”

The process of getting data out of an information center as seen in Figure 1 involves the following steps:

1. A request is made from the AI brain to an information center.

2. Input passes through a filter and may be altered on the way to the information center.

3. The information center computes the accurate result.

4. The result coming back from the information center may be altered by the filter.

5. The AI brain uses the filtered data, believing it is unfiltered data.

The process of sending a request and getting a result from an information center

Figure 1: The process of sending a request and getting a result from an information center.

Here’s an example of accessing hand odds:

1. Given a hand of 9 of clubs and Q of hearts in a two-player game, ask for the odds.

2. Data is unaltered and flows into the Hand Odds information center.

3. Hand Odds computes that there is a 55.3 percent chance of winning in this pre-flop round.

4. The filter uses intelligence to shift the result within a range, resulting in a 42.8 percent chance.

5. The AI now thinks it has a 42.8 percent chance to win and uses this as one of several data points needed to make a “check, raise/bet, fold, use ability” decision.
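The following is a minimal sketch of the information center and filter pattern, written in JavaScript purely for illustration; the function names, the distortion formula, and the trait values are assumptions, not code from the Lords of New York engine:

// Information center: computes the accurate result (stand-in value here).
function handOddsInformationCenter(hand, playerCount) {
  // A real implementation would run a poker odds calculation for the hand.
  return 0.553; // 55.3 percent true chance to win in the pre-flop example
}

// Filter: distorts the result based on the linked personality trait.
function intelligenceFilter(trueOdds, intelligence) {
  var maxError = 0.25 * (1 - intelligence);        // smarter players get a narrower error range
  var error = (Math.random() * 2 - 1) * maxError;  // random shift within +/- maxError
  return Math.min(1, Math.max(0, trueOdds + error));
}

var tony = { intelligence: 0.9, memory: 0.85, temperament: 0.8 };
var perceivedOdds = intelligenceFilter(handOddsInformationCenter(['9c', 'Qh'], 2), tony.intelligence);
// The AI brain now uses perceivedOdds as if it were the unfiltered truth.

With a less intelligent character the same filter widens the error range, which is how a true 55.3 percent chance can come out the other side looking like 42.8 percent.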

Strategy

Strategy in Lords of New York

The primary strategy in Lords of New York is to learn about a character’s personality, then work to exploit his weaknesses in poker. Using filters to manipulate how the AI sees data means that the gamer (human) can target those elements of a character's personality to influence play. For example, you are Vince, skilled in cheating and intimidation, playing against Tony, an intelligent player skilled in statistics and probability. He’s such a smart guy that to get him off his game, you’ll want to rely on things that affect his intelligence and memory, such as poison or spiking his drink to cloud his mind.

Matching your tactics to the right player’s personality is key. Targeting intelligence against a player like The Bull, who’s an aggressive, intimidating guy, would likely have little impact.

All Together Now

In cooking, there are so many things that can go wrong with a dish. Great ingredients have to be added in the correct proportions and cooked just the right way to unlock their secrets and lead to perfect textures and flavors. Even with an expertly crafted dish, all that work can result in a bland experience without the right seasoning. Building a character for Lords of New York is no different. Just like a fantastic meal is the result of ingredients, cooking, and seasoning coming together in the right way, we blend together animation, voice, story, poker talents, and personality to build believable, flawed, human characters.

We match a character's story with his poker talents and those to his personality. We want gamers to figure out their opponents through their own innate knowledge of human behavior. They won’t realize they are doing this, and that’s the point. Be so immersed in the story that poker and the reason for playing poker makes intuitive sense. We’ve a ways to go, but an article written by SPONG summed up Lords of New York perfectly: “…it instead focuses on what people really care about: other people.”

A game about people—not poker, but people. If building a game was like cooking a meal, then personality is the salt in our dish. Personality alone isn’t impactful, but when added correctly, it enhances what’s already there. Without personality, gamers won’t connect to our characters, won’t know how to predict their actions, and won’t know how to defeat their opponents. If all ingredients are added just right, gamers won’t think about characters as poker opponents, but as people, and that’s what Lords of New York is all about.

Lords of New York All-Together-Now

Intel® Collaboration Suite for WebRTC Application Workshop Invitation (Test)


Dear Guest:

WebRTC (Web-Based Real-Time Communication) is an open audio/video communication protocol jointly defined by the W3C and IETF and is considered one of the key technologies for implementing a unified-communications strategy. Industry analysts predict that by 2019 there will be 6.7 billion WebRTC-capable devices. Thanks to its excellent communication quality and seamless in-application embedding, WebRTC-based communication solutions have been deployed and used by many well-known companies worldwide in recent years.

The Intel® Collaboration Suite for WebRTC is a complete audio/video communication software solution built on the WebRTC standard. It makes one-to-one, one-to-many, and many-to-many communication easy to implement, is deeply optimized for Intel platforms, and supports PC, Android, and iOS clients. It is already widely used in the Americas, Europe, and Asia-Pacific markets, in areas such as video social networking, live streaming, telemedicine, e-learning, video surveillance, enterprise collaboration, wearables, and smart homes.

With the release of the latest version 2.8, we cordially invite you to attend the Intel® Collaboration Suite for WebRTC application workshop hosted by Intel's System Technologies and Optimization division, featuring face-to-face developer training and hands-on exercises.

Date: November 12, 2015
Venue: Function Room A, Crowne Plaza Beijing Zhongguancun


Agenda

Time             Topic                                                   Speaker
9:00 – 9:30      Sign-in                                                 Doug Sommer, Director, CSO, Software and Services Group, Intel Corporation
9:30 – 9:40      Opening remarks                                         Zhang Qi, Director, Web Technology and Optimization, Intel Corporation
9:40 – 10:10     Intel® CS for WebRTC Product Strategy
10:10 – 10:20    Tea break
10:20 – 11:20    Intel® CS for WebRTC Client SDK                         Senior software engineer, WebRTC project, Intel Corporation
11:20 – 12:00    Intel® CS for WebRTC MCU & Gateway                      Senior software engineer, WebRTC project, Intel Corporation
12:00 – 13:00    Lunch
13:00 – 13:40    WebRTC Performance Optimization by Intel Hardware       Senior software engineer, WebRTC project, Intel Corporation
13:40 – 14:20    Customer Case Study Sharing
14:20 – 14:30    Tea break
14:30 – 16:30    Hands-on Lab (development exercises)                    Senior software engineer, WebRTC project, Intel Corporation

    

Venue

Crowne Plaza Beijing Zhongguancun, Function Room A
Address: No. 106 Zhi Chun Road, Haidian District, Beijing 100086, P.R. China
Tel: (86 10) 5993 8888

  

Copyright © 2015 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.

 

Android* platform: What game engines, libraries, and APIs should I choose?


There are a lot of games available for the Android* platform. Independent developers can find it difficult to choose which tool, library, or API to use when developing a game. This article describes the best tools and engines to use for game development.

Google Play* Games Services

Google Play Games Services provides the Android SDK, which is equipped with all the tools and software to help developers to produce a fairly solid game. The Android SDK is intended for developers to package APIs that allow implementation with Google+ services. Because it’s a cloud-connected SDK and platform, developers can store data about players, game progress, achievements, and more in the cloud.

With Google+, developers get Google’s handy tools to make their games more social. To use game services, developers can set up the Google Play services SDK and study the game services samples to learn how to use the major components of the SDK. The SDK contains detailed documentation for Google Play game services. For quick access while developing apps, the API reference is available.

One more interesting feature is the ability to sync game data between Web and Android games. In this way, the same game can be played on multiple platforms and data can be stored in the cloud. All Android devices running Android 2.2 or later that have the Google Play Store are equipped with Play Games capabilities.

Unity* game engine

The differences between platforms often mean having to use different programming languages and separate APIs, and dealing with different behaviors. Multiplatform game engines have become the go-to tool. One such game engine that is most popular among Android developers is Unity.

Unity can be used to create games that run on computers, smartphones, the iPhone*, PlayStation* 3, and even Xbox*. Unity provides an entire ecosystem for game development. This game development tool consists of a powerful rendering engine, fully integrated with a complete set of intuitive tools and fast workflows to create interactive 3D content, easy publishing on multiple platforms, and thousands of high-quality ready-made resources in the Asset Store.

The supported assets are quite diverse, ranging from the simplest 2D asset to complex 3D ones. Moreover, Unity can import assets from software such as Autodesk 3ds Max*, Autodesk Maya*, Softimage*, Blender*, MODO*, ZBrush*, Cinema 4D*, Cheetah 3D*, Adobe Photoshop*, Adobe Fireworks*, and Allegorithmic Substance*.

The game engine also supports application development languages such as C# and UnityScript* (a form of JavaScript*), and can be integrated with Boo, a Python-like scripting language. Games developed in Unity can target multiple platforms including iOS*, Android, Windows* 8, Windows Phone* 8, BlackBerry* 10, Mac*, Windows, Linux*, Web Player*, PlayStation 3, Xbox 360, and Wii* U.

Some famous games built on the Unity engine are Dead Trigger*, Bad Piggies*, Temple Run* 2, Three Kingdoms Online*, DreadOut*, Galactic Rush*, Roly Poly Penguin*, Eyes On Dragon*, and many more. Unity 4 is free.

App Game Kit*

The App Game Kit is a cross-platform game development language and library. The tools provided allow AGK Basic apps to be wirelessly broadcast to devices for testing. The App Game Kit community is very helpful, and the developers frequently publish tutorials in addition to regular documentation. Games can be developed through the AGK IDE in AGK Basic, or the libraries can be used with either C++ or Pascal. The software produced with the App Game Kit is written in a language called AGK Script. This language has powerful commands, including commands for 2D graphics, physics, and networking. The commands make use of the platforms' native functions to improve performance. They are also designed to enhance code readability. The AGK Script commands have extensive online documentation, and the language contains many commands for OpenGL* 3D graphics and shader deployment.

One of the problems of using this tool is that it has a lot of bugs. A quick look at the release notes for each new version shows that more time seems to be spent fixing issues with the existing command set than introducing new and improved features. For example, the latest version has a serious bug whereby Android apps that are placed in the background show just a black screen. Another problem is that documentation beyond the command reference is minimal.

Cocos2D*

Cocos2d-x is an open source cross-platform game framework written in C++/JavaScript/Lua. It can be used to build games, apps, and other interactive programs. Cocos2d-x allows developers to exploit their existing C++, Lua, and JavaScript knowledge for cross-platform deployment into iOS, Android, Windows Phone, Mac OS X*, Windows desktop, and Linux, saving time, effort, and cost. Cocos2d-x is fast, easy to use, and loaded with powerful features.

Cocos2D-x is not only open source but also supported by Chukong Technologies of China and the United States. The framework is regularly updated, and support is regularly added for the latest technologies. 2014 has already seen the release of version 3, a new Cocos Studio development toolkit (optional), and support for new technologies like skeleton animation systems Spine* and Adobe DragonBone*. This tool supports Lua and JavaScript with full-feature support. Especially with Cocos2d-JS developers can develop games—cross-web and native—and the native solutions have great performance with JS Bindings, much better than with using a hybrid solution. Unfortunately this tool is not popular among Android developers, so users won’t be able to find a lot of games in Google Play that use this engine.

Monkey* X Pro

The Monkey engine is a next-generation game programming language. Developers can create apps on multiple platforms with great ease. The engine works by translating Monkey code into one of a number of different languages at compile time, including C++, C#, Java*, JavaScript, and ActionScript*. It is possible to write code once for multiple platforms, including iOS, Android, Windows Phone, HTML5, Flash*, Windows, OS X*, Linux, and more, while developing on Windows, OS X, or Linux.

Monkey X has a selection of great built-in modules: graphics, audio, input, data and file systems, networking, math, text and strings, collections, and online services.

Developers are not restricted to only the modules they get from the official release. They can even build an "app" module. It feels limitless. In comparison to other cross-platform solutions, with Monkey X developers actually get the translated source code, which they can play with.

Monkey is an easy-to-learn language that's object-oriented, modular, statically typed, and garbage collected. Language features include classes, inheritance, generics, interfaces, reflection, exceptions, pre-processor directives, and native code support.

Like the App Game Kit, this tool has poor documentation. The documentation contains a reasonably detailed language overview and a somewhat generated list of the included modules, classes, and methods. Module descriptions are rather lax, but usually present. Method descriptions tend to be short, and a majority of them contain no usage snippets; most parameters have minimal descriptions. And besides GitHub, there are no community collaboration features to help improve it.

Godot*

Godot is a fully featured, open source, MIT-licensed game engine. It focuses on having great tools and a visually oriented workflow that can export to PC, mobile, and Web platforms with no hassle. The editor, language, and APIs are feature rich, yet simple to learn, allowing developers to become productive in a matter of hours.

Godot has its own scripting language called GDscript. The scripting language is easy to learn with a Python-like format, but it is not Python. Rather, it is a mix of JavaScript, PHP, and C++. It's very powerful,  and it's free of unnecessary things because it's designed for one purpose.

It can be used to add custom behaviors to any object by extending it with scripting, using the built-in editor with syntax highlighting and code completion.


A built-in debugger with breakpoints and stepping can be used and graphs for possible bottlenecks can be checked.

Conclusion

In this article we described several engines and tools for game development. All of the tools are powerful. For fast development of mobile games, developers should choose the tool that is easiest to use. Developers also need to determine the tool that can best meet the needs of their tasks. 

Related Articles and Resources

About the Author

Vitaliy Kalinin works in the Software and Services Group at Intel Corporation. He is a PhD student at Lobachevsky State University in Nizhny Novgorod, Russia. He has a bachelor’s degree in economics and mathematics and a master’s degree in applied economics and informatics. His main interest is mobile technologies and game development.

An Introduction to MPI-3 Shared Memory Programming


The Message Passing Interface (MPI) standard is a widely used programming interface for distributed memory systems. Hybrid parallel programming on many-core systems most often combines MPI with OpenMP*. This MPI/OpenMP approach uses an MPI model for communicating between nodes while utilizing groups of threads running on each computing node in order to take advantage of multicore/many-core architectures such as Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors.

The MPI-3 standard introduces another approach to hybrid programming that uses the new MPI Shared Memory (SHM) model.1 The MPI SHM model, supported by Intel® MPI Library Version 5.0.2, enables incremental changes to existing MPI codes in order to accelerate communication between processes on shared-memory nodes.

In this article, we present a tutorial on how to start using MPI SHM on multinode systems using Intel Xeon with Intel Xeon Phi. The article uses a 1-D ring application as an example and includes code snippets to describe how to transform common MPI send/receive patterns to utilize the MPI SHM interface. The MPI functions that are necessary for internode and intranode communications will be described. A modified MPPTEST benchmark has been used to illustrate performance of the MPI SHM model with different synchronization mechanisms on Intel Xeon and Intel Xeon Phi based clusters. With the help of Intel MPI Library Version 5.0.2, which implements the MPI-3 standard, we show that the shared memory approach produces significant performance advantages compared to the MPI send/receive model.

Download complete article (PDF)

Mastering Performance Challenges with the New MPI-3 Standard


This article demonstrates the performance benefits of the MPI-3 nonblocking collective operations supported by the Intel® MPI Library 5.0 and the Intel® MPI Benchmarks (IMB) 4.0 products. We’ll show how to measure the overlap of communication and computation, and demonstrate how an MPI application can benefit from the nonblocking collective communication.

The Message Passing Interface (MPI) standard is a widely used programming interface for distributed memory systems. The latest MPI-3 standard contains major new features such as nonblocking and neighbor collective operations, extensions to the Remote Memory Access (RMA) interface, large count support, and new tool interfaces. The use of large counts lets developers seamlessly operate on large amounts of data; the fast, one-sided operations strive to speed up RMA-based applications; and the nonblocking collectives enable developers to better overlap the computation and communication parts of their applications and exploit potential performance gains.

Here, we concentrate on two major MPI-3 features: nonblocking collective operations (NBC) and the new RMA interface. We evaluate the effect of communication and computation overlap with NBC, and describe potential benefits of the new RMA functionality.

Download complete article (PDF)

 

Enabling IPP on OpenCV (Windows* and Linux* Ubuntu*)


To set up the environment (Windows* systems):

  • Configuration of OpenCV 3.0.0 – Enabling IPP
    • Download OpenCV 3.0.0 (http://www.opencv.org/) and CMake 3.2.3 (http://www.cmake.org/download/).
    • Extract OpenCV to a location of your choice, then install and run CMake.
    • Add OpenCV’s location as the source location and choose the location where you want the build to be created.
    • To enable IPP you have two options: you can use ‘ICV’, a special free IPP build for OpenCV, or you can use the IPP from an Intel® software tool suite (Intel® System Studio or Intel® Parallel Studio) if you have one.
    • To go with ICV, just turn WITH_IPP on. The ICV package downloads automatically and the CMake configuration picks it up.
    • To enable IPP from an Intel® software suite, you need to manually add an entry for IPP in addition to setting WITH_IPP. Click ‘Add Entry’, name it ‘IPPROOT’, choose its type as PATH, and enter the location of your IPP installation.
    • If the configuration completes without problems, you are ready to go.

 

To set up the environment (Linux* Ubuntu* systems):

  • Configuration of OpenCV 3.0.0 – Enabling IPP
    • Download OpenCV 3.0.0 (http://www.opencv.org/).
    • Extract OpenCV to a location of your choice.
    • Open a terminal and go to where you extracted OpenCV.
    • As in the Windows case, you can go with either ICV or IPP.
    • For ICV, type 'cmake -D WITH_IPP=ON .'
    • Example configuration result for ICV
    • For IPP, type 'cmake -D WITH_IPP=ON -D IPPROOT=<Your IPP Location> .'
    • Example configuration result for IPP
    • If the configuration went without a problem, then proceed and type 'make -j4'.
    • When the build is done, type 'make install' to finally install the library.

 


Using the Intel® RealSense™ Camera with TouchDesigner*: Part 1


Download Demo Files ZIP 35KB

TouchDesigner*, created by Derivative, is a popular platform/program used worldwide for interactivity and real-time animations during live performances as well as rendering 3D animation sequences, building mapping, installations and recently, VR work. The support of the Intel® RealSense™ camera in TouchDesigner makes it an even more versatile and powerful tool. Also useful is the ability to import objects and animations into TouchDesigner from other 3D packages using .fbx files, as well as taking in rendered animations and images.

In this two-part article I explain how the Intel RealSense camera is integrated into and can be used in TouchDesigner. The demos in Part 1 use the Intel RealSense camera TOP node. The demos in Part 2 use the CHOP node. In Part 2, I also explain how to create VR and full-dome sequences in combination with the Intel RealSense camera. I show how TouchDesigner’s Oculus Rift node can be used in conjunction with the Intel RealSense camera. Both Part 1 and Part 2 include animations and downloadable TouchDesigner files (.toe files), which can be used to follow along. To get the TouchDesigner (.toe) files, click the button at the top of the article. In addition, a free noncommercial copy of TouchDesigner, which is fully functional (except that the highest resolution is limited to 1280 by 1280), is available.

Note: There are currently two types of Intel RealSense cameras, the short range F200, and the longer-range R200. The R200 with its tiny size is useful for live performances and installations where a hidden camera is desirable. Unlike the larger F200 model, the R200 does not have finger/hand tracking and doesn’t support "Marker Tracking." TouchDesigner supports both the F200 and the R200 Intel RealSense cameras.

To quote from the TouchDesigner web page, "TouchDesigner is revolutionary software platform which enables artists and designers to connect with their media in an open and freeform environment. Perfect for interactive multimedia projects that use video, audio, 3D, controller inputs, internet and database data, DMX lighting, environmental sensors, or basically anything you can imagine, TouchDesigner offers a high performance playground for blending these elements in infinitely customizable ways."

I asked Malcolm Bechard, senior developer at Derivative, to comment on using the Intel RealSense camera with TouchDesigner:

"Using TouchDesigner’s procedural node-based architecture, Intel RealSense camera data can be immediately brought in, visualized, and then connected to other nodes without spending any time coding. Ideas can be quickly prototyped and developed with an instant-feedback loop.Being a native node in TouchDesigner means there is no need to shutdown/recompile an application for each iteration of development.The Intel RealSense camera augments TouchDesigner capabilities by giving the users a large array of pre-made modules such as gesture, hand tracking, face tracking and image (depth) data, with which they can build interactions. There is no need to infer things such as gestures by analyzing the lower-level hand data; it’s already done for the user."

Using the Intel® RealSense™ Camera in TouchDesigner

TouchDesigner is a node-based platform/program that uses Python* as its main scripting language. There are six distinct categories of nodes that perform different operations and functions: TOP nodes (textures), SOP nodes (geometry), CHOP nodes (animation/audio data), DAT nodes (tables and text), COMP nodes (3D geometry nodes and nodes for building 2D control panels), and MAT nodes (materials). The programmers at Derivative, consulting with Intel programmers, designed two special nodes, the Intel RealSense camera TOP node and the Intel RealSense camera CHOP node, to integrate the Intel RealSense camera into the program.

Note: This article is aimed at those familiar with using TouchDesigner and its interface. If you are unfamiliar with TouchDesigner and plan to follow along with this article step-by-step, I recommend that you first review some of the documentation and videos available here:

Learning TouchDesigner

Note: When using the Intel RealSense camera, it is important to pay attention to its range for best results. On this Intel web page you will find the range of each camera and best operating practices for using it.

Intel RealSense Camera TOP Node

The TOP nodes in TouchDesigner perform many of the same operations found in a traditional compositing program. The Intel RealSense camera TOP node adds to these capabilities utilizing the 2D and 3D data feed that the Intel RealSense camera feeds into it. The Intel RealSense camera TOP node has a number of setup settings to acquire different forms of data.

  • Color. The video from the Intel RealSense camera color sensor.
  • Depth. A calculation of the depth of each pixel. 0 means the pixel is 0 meters from the camera, and 1 means the pixel is the maximum distance or more from the camera.
  • Raw depth. Values taken directly from the Intel® RealSense™ SDK. Once again, 0 means the pixel is 0 meters from the camera and 1 is the maximum range or more away from the camera.
  • Visualized depth. A gray-scale image from the Intel RealSense SDK that can help you visualize the depth. It cannot be used to actually determine a pixel’s exact distance from the camera.
  • Depth to color UV map. The UV values from a 32-bit floating RG texture (note, no blue) that are needed to remap the depth image to line up with the color image. You can use the Remap TOP node to align the images to match.
  • Color to depth UV map. The UV values from a 32-bit floating RG texture (note, no blue) that are needed to remap the color image to line up with the depth image. You can use the Remap TOP node to align the two.
  • Infrared. The raw video from the infrared sensor of the Intel RealSense camera.
  • Point cloud. Literally a cloud of points in 3D space (x, y, and z coordinates) or data points created by the scanner of the Intel RealSense camera.
  • Point cloud color UVs. Can be used to get each point’s color from the color image stream.

Note: You can download the .toe file, RealSensePointCloudForArticle.toe, to use as a simple beginning template for creating a 3D animated geometry from the data of the Intel RealSense camera. This file can be modified and changed in many ways. Together, the three Intel RealSense camera TOP nodes—the Point Cloud, the Color, and the Point Cloud Color UVs—can create a 3D geometry composed of points (particles) with the color image mapped onto it. This creates many exciting possibilities.


Point Cloud Geometry. This is an animated geometry made using the Intel RealSense camera. This technique would be exciting to use in a live performance. The audio of the character speaking could be added as well. TouchDesigner can also use the data from audio to create real-time animations.

Intel RealSense Camera CHOP Node

Note: There is also an Intel RealSense camera CHOP node that controls the 3D tracking/position data that we will discuss in Part 2 of this article.

Demo 1: Using the Intel RealSense Camera TOP Node

Click on the button on top of the article to get the First TOP Demo: settingUpRealNode2b_FINAL.toe

Demo 1, part 1: You will learn how to set up the Intel RealSense camera TOP node and then connect it to other TOP nodes.

  1. Open the Add Operator/OP Create dialog window.
  2. Under the TOP section, click RealSense.
  3. On the Setup parameters page for the Intel RealSense camera TOP node, for Image select Color from the drop-down menu. In the Intel RealSense camera TOP node, the image of what the camera is pointing to shows up, just as in a video camera.
  4. Set the resolution of the Intel RealSense Camera to 1920 by 1080.
     


    The Intel RealSense camera TOP node is easy to set up.

  5. Create a Level TOP and connect it to the Intel RealSense camera TOP node.
  6. In the Pre parameters page of the Level TOP Node, choose Invert and slide the slider to 1.
  7. Connect The Level TOP node to an HSV To RGB TOP node and then connect that to a Null TOP node.


The Intel RealSense camera TOP node can be connected to other TOP nodes to create different looks and effects.

Next we will put this created image into the Phong MAT (Material) so we can texture geometries with it.

Using the Intel RealSense Camera Data to Create Textures for Geometries

Demo 1, part 2: This exercise shows you how to use the Intel RealSense camera TOP node to create textures and how to add them into a MAT node that can then be assigned to the geometry in your project.

  1. Add a Geometry (geo) COMP node into your scene.
  2. Add a Phong MAT node.
  3. Take the Null TOP node and drag it onto the Color Map parameter of your Phong MAT node.
     


    The Phong MAT using the Intel RealSense camera data for its Color Map parameter.

  4. On the Render parameter page of your Geo COMP, for the Material parameter, type phong1 to make the geometry use the phong1 node as its material.
     


    The Phong MAT using the Intel RealSense camera data for its Color Map added into the Render/Material parameter of the Geo COMP node.

Creating the Box SOP and Texturing it with the Just Created Phong Shader

Demo 1, part 3: You will learn how to assign the Phong MAT shader you created using the Intel RealSense camera data to a box Geometry SOP.

  1. Go into the geo1 node to its child level, (/project1/geo1).
  2. Create a Box SOP node, a Texture SOP node, and a Material SOP node.
  3. Delete the Torus SOP node that was there and connect the box1 node to the texture1 node and the material1 node.
  4. In the Material parameter of the material1 node enter: ../phong1 which will refer it to the phong1 MAT node you created in the parent level.
  5. To put the texture on each face of the box, in the parameters of the texture1 node set Texture/Texture Type to Face and set Texture/Offset to .5 .5 .5.
     


    At the child level of the geo1 COMP node, the Box SOP node, the Texture SOP node, and the Material SOP node are connected. The Material SOP is now getting its texture info from the phong1 MAT node, which is at the parent level (../phong1).

Animating and Instancing the Box Geometry

Demo 1, part 4: You will learn how to rotate a Geometry SOP using the Transform SOP node and a simple expression. Then you will learn how to instance the Box geometry. We will end up with a screen full of rotating boxes with the textures from the Intel RealSense camera TOP node on them.

  1. To animate the box rotating on the x-axis, insert a Transform SOP node after the Texture SOP node.
  2. Put an expression into the x component (first field) of the Rotate parameter in the transform1 SOP node. This expression is not dependent on the frames so it will keep going and not start repeating when the frames on the timeline run out. I multiplied by 10 to increase the speed: absTime.seconds*10
     


    Here you can see how the cube is rotating.

  3. To make the boxes, go up to the parent level (/project1) and in the Instance page parameters of the geo1 COMP node, for Instancing change it to On.
  4. Add a Grid SOP node and a SOP to DAT node.
  5. Set the grid parameters to 10 Rows and 10 Columns and the size to 20 and 20.
  6. In the SOP to DAT node parameters, for SOP put grid1 and make sure Extract is set to Points.
  7. In the Instance page parameters of the geo1 COMP, for Instance CHOP/DAT enter: sopto1.
  8. Fill in the TX, TY, and TZ parameters with P(0), P(1), and P(2) respectively to specify which columns from the sopto1 node to use for the instance positions.
     


    Click on the button on top of the article to download this .toe file to see what we have done so far in this first Intel RealSense camera TOP demo.

    TOP_Demo1_forArticle.toe

  9. If you prefer to see the image in the Intel RealSense camera unfiltered, disconnect or bypass the Level TOP node and the HSV to RGB TOP node.
     

Rendering or Performing the Animation Live

Demo 1, part 5: You will learn how to set up a scene to be rendered and either performed live or rendered out as a movie file.

  1. To render the project, add in a Camera COMP node, a Light COMP node, and a Render TOP node. By default the camera will render all the Geometry components in the scene.
  2. Translate your camera about 20 units back on the z-axis. Leave the light at the default setting.
  3. Set the resolution of the render to 1920 by 1080. By default the background of a render is transparent (alpha of 0).
  4. To make this an opaque black behind the squares, add in a Constant TOP node and change the Color to 0,0,0 so it is black while leaving the Alpha as 1. You can choose another color if you want.
  5. Add in an Over TOP node and connect the Render TOP node to the first hook up and the Constant TOP node to the second hook up. This makes the background pixels of the render (0, 0, 0, 1), which is no longer transparent.

Another way to change the alpha of a TOP to 1 is to use a Reorder TOP and set its Output Alpha parameter to Input 1 and One.


Shows the rendered scene with the background being set to opaque black.


Here you can see the screen full of the textured rotating cubes.

If you prefer to render out the animation instead of playing it in real time in a performance, open the Export Movie dialog box under File in the top bar of the TouchDesigner program. In the parameter for the TOP Video, enter null2 for this particular example; otherwise, enter whichever TOP node you want to render.


Here is the Export Movie panel, and null2 has been pulled into it. If I had an audio CHOP to go along with it, I would pull or place that into the CHOP Audio slot directly under where I put null2.

Demo 1, part 6: One of the things that makes TouchDesigner a special platform is the ability to do real-time performance animations with it. This makes it especially good when paired with the Intel RealSense Camera.

  1. Add a Window COMP node and in the operator parameter enter your null2 TOP node.
  2. Set the resolution to 1920 by 1080.
  3. Choose the Monitor you want in the Location parameter. The Window COMP node lets you perform the entire animation in real time projected onto the monitor you choose. Using the Window COMP node you can specify the monitor or projector you want the performance to be played from.
     


    You can create as many Window COMP nodes as you need to direct the output to other monitors.

Demo 2: Using the Intel RealSense Camera TOP Node Depth Data

The Intel RealSense camera TOP node has a number of other settings that are useful for creating textures and animation.

In demo 2, we use the depth data to apply a blur on an image based on depth data from the camera. Click on the button on top of the article to get this file: RealSenseDepthBlur.toe

First, create an Intel RealSense camera TOP and set its Image parameter to Depth. The depth image has pixels that are 0 (black) if they are close to the camera and 1 (white) if they are far away from the camera. The range of the pixel values is controlled by the Max Depth parameter, which is specified in meters. By default it has a value of 5, which means pixels 5 or more meters from the camera will be white. A pixel with a value of 0.5 will be 2.5 meters from the camera. Depending on how far you are from the camera, it may be useful to change this value to something smaller. For this example we’ve changed it to 1.5 meters.

Next we want to process the depth a bit to remove objects outside our range of interest, which we will do using a Threshold TOP.

  1. Create a Threshold TOP and connect it to the realsense1 node. We want to cull out pixels that are beyond a certain distance from the camera, so set the Comparator parameter to Greater and set the Threshold parameter to 0.8. This makes pixels that are greater than 0.8 (which is 1.2 meters or greater if we have Max Depth in the Intel RealSense camera TOP set to 1.5) become 0 and all other pixels become 1.
     

  2. Create a Multiply TOP and connect the realsense1 node to the first input and the thresh1 node to the 2nd input. Multiplying the pixels we want by 1 leaves them as-is, and multiplying the others by 0 makes them black. The multiply1 node now has non-zero pixels only in the part of the image that should control the blur we will apply next.
  3. Create a Movie File in TOP, and select a new image for its File parameter. In this example we select Metter2.jpg from the TouchDesigner Samples/Map directory.
  4. Create a Luma Blur TOP and connect moviefilein1 to the 1st input of lumablur1 and multiply1 to the 2nd input of lumablur1.
  5. In the parameters for lumablur1 set White Value to 0.4, Black Filter Width to 20, and White Filter Width to 1. This makes pixels where the first input is 0 have a blur filter width of 20, and pixels with a value of 0.4 or greater have a blur width of 1.
     


    The whole layout.

The result is an image where the pixels where the user is located are not blurred while other pixels are blurry.


Displaying the Luma Blur TOP shows how the background of the image is blurred.

Demo 3: Using the Intel RealSense Camera TOP Node Depth Data with the Remap TOP Node

Click on the button on the article top to get this file: RealSenseRemap.toe

Note: The depth and color cameras of the Intel RealSense camera are in different physical locations, so their resulting images do not line up by default. For example, if your hand is positioned in the middle of the color image, it won’t be in the middle of the depth image; it will be off a bit to the left or right. The UV remap fixes this by shifting the pixels around so they align on top of each other. Notice the difference between the aligned and unaligned TOPs.


The Remap TOP aligns the depth data from the Intel RealSense camera TOP with the color data from the Intel RealSense camera TOP, using the depth to color UV data, putting them in the same world space.

Demo 4: Using Point Cloud in the Intel RealSense Camera TOP Node

Click on the button on top of the article to get this file: PointCloudLimitEx.toe

In this exercise you learn how to create animated geometry using the Intel RealSense camera TOP node Point Cloud setting and the Limit SOP node. Note that this technique is different from the point cloud example file shown at the beginning of this article. The previous example uses GLSL shaders, which makes it possible to generate far more points, but it is more complex to do and outside the scope of this article.

  1. Create a RealSense TOP node and set the parameter Image to Point Cloud.
  2. Create a TOP to CHOP node and connect it to a Select CHOP node.
  3. Connect the Select CHOP node to a Math CHOP node.
  4. In the topto1 CHOP node parameter, TOP, enter: realsense1.
  5. In the Select CHOP node parameters, Channel Names, enter r g b leaving a space between the letters.
  6. In the math1 CHOP node for the Multiply parameter, enter: 4.2.
  7. On the Range parameters page, for To Range, enter: 1 and 7.
  8. Create a Limit SOP node.

To quote from the information on the www.derivative.ca online wiki page, "The Limit SOP creates geometry from samples fed to it by CHOPs. It creates geometry at every point in the sample. Different types of geometry can be created using the Output Type parameter on the Channels Page."

  1. In the limit1 SOP Channels parameters page, enter r in the X Channel, g in the Y Channel, and b in the Z Channel.
     

    Note: Switching the r, g, and b to different X, Y, or Z channels changes the geometry being generated. So you might want to try this later: In the Output parameter page, for Output Type select Sphere at Each Point from the drop-down. Create a SOP to DAT node. In the parameters page, for SOP put in limit1 or drag your limit1 SOP into the parameter. Keep the default setting of Points in the Extract parameter. Create a Render TOP node, a Camera COMP node, and a Light COMP node. Create a Reorder TOP, make Output Alpha be Input 1 and One, and connect it to the Render TOP.


    As the image in the Intel RealSense camera changes, so does the geometry. This is the final layout.


    Final images in the Over TOP node. By changing the order of the channels in the Limit SOP parameters you change the geometry, which is based on the point cloud.

In Part 2 of this article we will discuss the Intel RealSense camera CHOP and how to create content both rendered and in real-time for performances, Full Dome shows, and VR. We will also show how to use the Oculus Rift CHOP node.

About the Author

Audri Phillips is a visualist/3D animator based out of Los Angeles, with a wide range of experience that includes over 25 years working in the visual effects/entertainment industry in studios such as Sony, Rhythm and Hues, Digital Domain, Disney, and Dreamworks feature animation. Starting out as a painter, she was quickly drawn to time-based art. Always interested in using new tools, she has been a pioneer in using computer animation/art in experimental film work, including immersive performances. Now she has taken her talents into the creation of VR. Samsung recently curated her work into their new Gear Indie Milk VR channel.

Her latest immersive work/animations include: Multi Media Animations for "Implosion a Dance Festival" 2015 at the Los Angeles Theater Center, 3 Full dome Concerts in the Vortex Immersion dome, one with the well-known composer/musician Steve Roach. She has a fourth upcoming fulldome concert, "Relentless Universe", on November 7th, 2015. She also created animated content for the dome show for the TV series, “Constantine” shown at the 2014 Comic-Con convention. Several of her Fulldome pieces, “Migrations” and “Relentless Beauty”, have been juried into "Currents", The Santa Fe International New Media Festival, and Jena FullDome Festival in Germany. She exhibits in the Young Projects gallery in Los Angeles.

She writes online content and a blog for Intel. Audri is an Adjunct professor at Woodbury University, a founding member and leader of the Los Angeles Abstract Film Group, founder of the Hybrid Reality Studio (dedicated to creating VR content), a board member of the Iota Center, and she is also an exhibiting member of the LA Art Lab. In 2011 Audri became a resident artist of Vortex Immersion Media and the c3: CreateLAB.

Multithreaded code optimization in PARSEC* 3.0: BlackScholes


Introduction

The Princeton Application Repository for Shared-Memory Computers (PARSEC) is a benchmark suite composed of multithreaded programs. The suite focuses on emerging workloads and was designed to be representative of next-generation shared-memory programs for chip-multiprocessors.

The benchmark suite with all its applications and input sets is available as open source free of charge. Some of the benchmark programs have their own licensing terms, which might limit their use in some cases.

The Black-Scholes benchmark is one of the 13 benchmarks in PARSEC. This benchmark does option pricing with the Black-Scholes partial differential equation (PDE). The Black-Scholes equation is a differential equation that describes how, under a certain set of assumptions, the value of an option changes as the price of the underlying asset changes.

The Black-Scholes model yields a closed-form formula for the price of a European call option (given later in the BlkSchlsEqEuroNoDiv section); the formula for a put option is similar. The cumulative normal distribution function, CND(x), gives the probability that a normally distributed random variable will have a value less than x. There is no closed-form expression for this function, so it must be evaluated numerically. Alternatively, the values of this function may be pre-computed and hard-coded in a table; in that case, they can be obtained at runtime using a table lookup.

Based on this formula, one can compute the option price analytically from the five input parameters. With this analytical approach to pricing options, the limiting factor is the amount of floating-point computation a processor can perform.

The Hotspots

If we look at the results of Intel® VTune™ Amplifier XE profiling, we see two major hotspots.

  • Read from input file and write to output file.
  • Black-Scholes calculations.

Let’s look at each one in more detail.

Read from input file and write to output file

The problem is that the input file contains 10 million rows of data. Every row is an element of the struct OptionData, which contains nine parameters. That’s why, before we start the calculations, we have to spend a lot of time reading and parsing the data from the input file into an array of OptionData structs. This array is called data.


Obviously the same problem occurs after all the calculations, and you have to write the results in an output file.

These problems can be solved by reading and writing in parallel using file pointers at different offsets. For more details, see Optimization of Data Read/Write in a Parallel Application | Intel® Developer Zone.
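For illustration, here is a minimal, hedged sketch of the idea (the helper name and chunking scheme are assumptions for this example, not the code from the referenced article): each OpenMP thread opens its own file handle, seeks its pointer to the byte offset of its chunk, and reads that chunk concurrently, so the 10-million-row file is no longer read serially. Chunk boundaries would still need to be snapped to line breaks before parsing.

#include <stdio.h>
#include <omp.h>

/* Read 'file_size' bytes of 'path' into 'dst' using 'nchunks' concurrent readers. */
void parallel_read(const char *path, long file_size, int nchunks, char *dst)
{
    #pragma omp parallel for
    for (int c = 0; c < nchunks; c++) {
        long begin = c * (file_size / nchunks);
        long end   = (c == nchunks - 1) ? file_size : (c + 1) * (file_size / nchunks);
        FILE *f = fopen(path, "rb");            /* private file pointer per thread */
        fseek(f, begin, SEEK_SET);              /* position the pointer at this chunk */
        fread(dst + begin, 1, end - begin, f);  /* each thread fills its own slice of dst */
        fclose(f);
    }
}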

Black-Scholes calculations

We will consider this case in more detail. All of the calculation is contained in two functions: CNDF and BlkSchlsEqEuroNoDiv.

CNDF

This function implements the cumulative distribution function of the standard normal distribution. For more details, refer to any textbook on probability theory.

In our case, this function is as follows:

#define inv_sqrt_2xPI 0.39894228040143270286
fptype CNDF ( fptype InputX )
{
    int sign;
    fptype OutputX;
    fptype xInput;
    fptype xNPrimeofX;
    fptype expValues;
    fptype xK2;
    fptype xK2_2, xK2_3;
    fptype xK2_4, xK2_5;
    fptype xLocal, xLocal_1;
    fptype xLocal_2, xLocal_3;
    // Check for negative value of InputX
    if (InputX < 0.0) {
        InputX = -InputX;
        sign = 1;
    } else
        sign = 0;
    xInput = InputX;
     // Compute NPrimeX term common to both four & six decimal accuracy calcs
    expValues = exp(-0.5f * InputX * InputX);
    xNPrimeofX = expValues;
    xNPrimeofX = xNPrimeofX * inv_sqrt_2xPI;
    xK2 = 0.2316419 * xInput;
    xK2 = 1.0 + xK2;
    xK2 = 1.0 / xK2;
    xK2_2 = xK2 * xK2;
    xK2_3 = xK2_2 * xK2;
    xK2_4 = xK2_3 * xK2;
    xK2_5 = xK2_4 * xK2;
    xLocal_1 = xK2 * 0.319381530;
    xLocal_2 = xK2_2 * (-0.356563782);
    xLocal_3 = xK2_3 * 1.781477937;
    xLocal_2 = xLocal_2 + xLocal_3;
    xLocal_3 = xK2_4 * (-1.821255978);
    xLocal_2 = xLocal_2 + xLocal_3;
    xLocal_3 = xK2_5 * 1.330274429;
    xLocal_2 = xLocal_2 + xLocal_3;

    xLocal_1 = xLocal_2 + xLocal_1;
    xLocal   = xLocal_1 * xNPrimeofX;
    xLocal   = 1.0 - xLocal;

    OutputX  = xLocal;

    if (sign) {
        OutputX = 1.0 - OutputX;
    }
       return OutputX;
}

BlkSchlsEqEuroNoDiv

This is known as the Black-Scholes model, which gives a theoretical estimate of the price of European-style options. All we need to know for our purposes is the Black-Scholes formula, which will be used for the calculations. The formula is as follows:

C(S,t) = N(d1)S - N(d2)Ke^(-r(T-t)) - the value of a call option for a non-dividend-paying underlying stock.

P(S,t) = N(-d2)Ke^(-r(T-t)) - N(-d1)S - the price of the corresponding put option.

d1 = (ln(S/K) + (r + σ²/2)(T-t)) / (σ√(T-t))

d2 = (ln(S/K) + (r - σ²/2)(T-t)) / (σ√(T-t))

For both, as above:

  • N(∙) is the CNDF function.
  • T - t is the time to maturity.
  • S is the spot price of the underlying asset.
  • K is the strike price.
  • r is the risk-free interest rate.
  • σ is the volatility of returns of the underlying asset.

In our case, this function is as follows:

fptype BlkSchlsEqEuroNoDiv( fptype sptprice,
                            fptype strike, fptype rate, fptype volatility,
                            fptype time, int otype, float timet )
{
    fptype OptionPrice;

    // local private working variables for the calculation
    fptype xStockPrice;
    fptype xStrikePrice;
    fptype xRiskFreeRate;
    fptype xVolatility;
    fptype xTime;
    fptype xSqrtTime;

    fptype logValues;
    fptype xLogTerm;
    fptype xD1;
    fptype xD2;
    fptype xPowerTerm;
    fptype xDen;
    fptype d1;
    fptype d2;
    fptype FutureValueX;
    fptype NofXd1;
    fptype NofXd2;
    fptype NegNofXd1;
    fptype NegNofXd2;

    xStockPrice = sptprice;
    xStrikePrice = strike;
    xRiskFreeRate = rate;
    xVolatility = volatility;

    xTime = time;
    xSqrtTime = sqrt(xTime);

    logValues = log( sptprice / strike );

    xLogTerm = logValues;


    xPowerTerm = xVolatility * xVolatility;
    xPowerTerm = xPowerTerm * 0.5;

    xD1 = xRiskFreeRate + xPowerTerm;
    xD1 = xD1 * xTime;
    xD1 = xD1 + xLogTerm;

    xDen = xVolatility * xSqrtTime;
    xD1 = xD1 / xDen;
    xD2 = xD1 -  xDen;

    d1 = xD1;
    d2 = xD2;

    NofXd1 = CNDF( d1 );
    NofXd2 = CNDF( d2 );

    FutureValueX = strike * ( exp( -(rate)*(time) ) );
    if (otype == 0) {
        OptionPrice = (sptprice * NofXd1) - (FutureValueX * NofXd2);
    } else {
        NegNofXd1 = (1.0 - NofXd1);
        NegNofXd2 = (1.0 - NofXd2);
        OptionPrice = (FutureValueX * NegNofXd2) - (sptprice * NegNofXd1);
    }

    return OptionPrice;
}

Optimizations

The only function called from main() is bs_thread(), which is as follows:

#ifdef WIN32
DWORD WINAPI bs_thread(LPVOID tid_ptr){
#else
int bs_thread(void *tid_ptr) {
#endif
    int i, j;
    fptype price;
    fptype priceDelta;
    int tid = *(int *)tid_ptr;
    int start = tid * (numOptions / nThreads);
    int end = start + (numOptions / nThreads);

  for (j=0; j<NUM_RUNS; j++) {
#ifdef ENABLE_OPENMP
#pragma omp parallel for private(i, price, priceDelta)
        for (i=0; i<numOptions; i++) {
#else  //ENABLE_OPENMP
        for (i=start; i<end; i++) {
#endif //ENABLE_OPENMP
            /* Calling main function to calculate option value based on
             * Black & Scholes's equation.
             */
            price = BlkSchlsEqEuroNoDiv( sptprice[i], strike[i],
                                         rate[i], volatility[i], otime[i],
                                         otype[i], 0);
            prices[i] = price;

#ifdef ERR_CHK
            priceDelta = data[i].DGrefval - price;
            if( fabs(priceDelta) >= 1e-4 ){
                printf("Error on %d. Computed=%.5f, Ref=%.5f, Delta=%.5f\n",
                       i, price, data[i].DGrefval, priceDelta);
                numError ++;
            }
#endif
        }
    }
    return 0;
}
#endif //ENABLE_TBB

Let’s move the loop from 0 to numOptions into BlkSchlsEqEuroNoDiv and add one more parameter to this function: the prices array. Of course, BlkSchlsEqEuroNoDiv will now take the sptprice, strike, rate, volatility, otime, and otype arrays in their entirety. We also need to add OpenMP* pragmas to parallelize and vectorize this loop:

#pragma omp parallel for private(i)
#pragma simd
#pragma vector aligned

Thus we will get a performance gain by vectorization and parallelization.
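A minimal sketch of the restructured kernel is shown below. It assumes the same globals and helpers as the original benchmark (fptype, CNDF, and the sptprice, strike, rate, volatility, otime, and otype arrays); the pragma combination follows the listing above and targets the Intel compiler, and the loop body simply inlines the arithmetic of BlkSchlsEqEuroNoDiv for each option.

#include <math.h>

void BlkSchlsEqEuroNoDiv_all(int numOptions, fptype *prices)
{
    int i;
#pragma omp parallel for private(i)
#pragma simd
#pragma vector aligned
    for (i = 0; i < numOptions; i++) {
        fptype xSqrtTime  = sqrt(otime[i]);
        fptype xLogTerm   = log(sptprice[i] / strike[i]);
        fptype xPowerTerm = 0.5f * volatility[i] * volatility[i];
        fptype xDen       = volatility[i] * xSqrtTime;
        fptype d1 = ((rate[i] + xPowerTerm) * otime[i] + xLogTerm) / xDen;
        fptype d2 = d1 - xDen;

        fptype NofXd1 = CNDF(d1);
        fptype NofXd2 = CNDF(d2);
        fptype FutureValueX = strike[i] * exp(-rate[i] * otime[i]);

        if (otype[i] == 0)   /* call option */
            prices[i] = sptprice[i] * NofXd1 - FutureValueX * NofXd2;
        else                 /* put option */
            prices[i] = FutureValueX * (1.0 - NofXd2) - sptprice[i] * (1.0 - NofXd1);
    }
}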

If we read more about the CDF function of the standard normal distribution we will see that this function can be expressed by using the error function:

N(x) = (1/2)(1 + erf(x/√2))

Let’s create this implementation of the CDF function and try to use it:

#   define ERF(x)      erff(x)
#   define INVSQRT(x)  1.0f/sqrtf(x)
#   define HALF        0.5f
fptype cdfnorm (fptype input)
{
	fptype buf;
	fptype output;
	buf = input*INVSQRT(2);
	output = HALF + HALF*ERF(buf);
	return output;
}

Before we use it in a calculation, I want to mention that I used the erff(x) function, which is provided by the Intel® math library. I added #include <mathimf.h> at the top of the source file.

Now we are ready to conduct the experiments!

The system

The system I used to test my modifications was a dual-socket Intel® Xeon® processor E5-2697 v3 server. The Intel Xeon E5-2697 v3 processor contains 14 cores running at 2.6 GHz and has Intel® Hyper-Threading Technology enabled, giving 28 threads and 35 MB of L3 cache per package. In total we have 56 available threads.

The compiler

I used the Intel® C++ Compiler 2013.

The compiler flags that I used are as follows:

export CFLAGS="-O2 -xCORE-AVX2 -funroll-loops -opt-prefetch -g -traceback"

These flags can be changed in the configuration file "icc.bldconf".

Experiments

I launched the blackscholes benchmark, and the results were as follows:

Blackscholes benchmark

Summary

Thus, we gained performance from the modifications to the computational functions.
Combined with the I/O parallelization method referenced earlier in this article, we can achieve a significant overall performance gain.

Level-Up Winner AuraLab Demonstrates Good Karma


Download PDF 801 KB

By Garret Romaine

Alexander Kuvshinov used to dream of creating something more than a simple Adobe Flash* banner. But a string of low-level technical jobs at an architectural firm, a printer, and an advertising agency kept his creativity hidden. Then he teamed up with Andrey Sharapov and Roman Povolotsky, two veterans of the Russian game industry, to found AuraLab. Their first title, Karma.Incarnation* 1, has already won major awards for art, design, sound, and storytelling; players around the world are awaiting this Russian team’s advance from a demo to a commercial game.

Karma.Incarnation* 1 is a 2D adventure game in a whimsical, psychedelic world reminiscent of the 1968 cartoon classic Yellow Submarine*, or possibly inspired by 60s artist Peter Max. In the game, the hero, a blob-like tentacled character named Pip, must learn from his environment and understand the cause-and-effect logic behind what happens as he travels forth, solving puzzles and conquering mazes. When Pip takes ineffective actions, his appearance is altered in a threatening way; when he takes desired actions his appearance becomes more appealing.

Getting game development to its current point has been a challenge for the AuraLab team; just moving from doodles on paper to digitized animation took several months. Advancing to a playable demo was another challenging push, and even more hard work is on the team’s horizon to bring this game fully to market. But with major wins at the 2015 Intel® Level Up Game Developer Contest, it’s worth pulling the team aside for a deeper dive into some of their secrets.


Figure 1: Karma is a classic point-and-click arcade-style adventure game in a psychedelic landscape.

What Inspired This Team?

The inspiration for Pip, the main character of Karma, actually came from an art contest Kuvshinov had entered for creating a design for the back panel of a tablet. The contest initially intrigued him because the winner was to receive an Apple iPad* with the winning design printed on the tablet’s back. Kuvshinov started playing around with a black Apple logo, using the bite to form a kind of sideways mouth.


Figure 2: The design inspiration for Karma’s main character, Pip, started as a take on the Apple logo that the designer entered into a local contest.

He quickly had a whole herd of these creatures hanging around on his computer—he’d sketch some random blot, throw in some eyes and tentacles, and play around with that. “I didn’t win the main prize, but by then it didn’t matter much, because I had my new creations to occupy my mind,” he said.


Figure 3: Kuvshinov soon created an entire “herd” of characters that eventually became Pip.

His cartoon grew in sophistication, but in 2013, the agency where he worked disbanded, which turned out to be a blessing in disguise. He and his friends soon formed their own gaming company – AuraLab. They set right to work bringing Pip to life.

Leveling Up Leads to Pre-Production Rush

After the team finished their pre-alpha engine based on Adobe Flash they had three levels, or “locations” in the game. Pip could move between those locations, and the game had some simple interactions as well. “Still, we had a long way to go,” Kuvshinov remembered.

Fortunately, the team already had decent design and iterating skills based on their deep experience in technical positions. When asked what skills were most important for the team at this point in time, Kuvshinov explained without hesitation that their prior familiarity with the entire game-development process was key. “We knew what skills we needed for our team, and we had them,” he said. Those abilities included technical skills, experience with development processes, the ability to iterate and test, and a strong background in design.

The team worked from home, using only free software. Google* Docs* was their main collaborative tool during the pre-production phase of the project. For task tracking as they plowed through more dependent work, they used Redmine, which they installed on a virtual server that they purchased. “We didn’t buy any hardware—we all have PC’s, notebooks, and iPads, and that was enough,” Sharapov said. “And after we built our demo on Unity*, we actually won a Unity Pro license as the Best Unity Game at the Winter Nights ’15 conference, so we never had to purchase that license, either.”

Kuvshinov saw an announcement for the Intel® Level-Up Game Developer Contest on http://www.promoterapp.com. He met the qualifications and had a decent prototype demo that took advantage of the touch screen and adaptability of 2 in 1 devices, so he entered the contest. “Then, I forgot about it,” he admitted. “I didn’t expect anything, to be honest. We were all very surprised to hear about the prizes we won.”

So unlike some teams that have to put in a mad scramble to produce a demo worthy of submission, AuraLab really got busy after its win. The team plugged into the global indie game-developer circuit and rolled up their sleeves. “The main work began after we received the message that we won,” Sharapov said. “That was when we started to prepare the build for the exhibition at PAX Prime, as well as to prepare the Steam* version.”


Figure 4: AuraLab uses intense shading and surreal art to create a unique mood for players.

Unity*—The Engine of Choice

One early decision the team debated was which game-development engine to choose. Two popular engines are Unreal Engine* 4 and Unity Engine, and while they both were strong contenders, AuraLab chose Unity.

“Unity allows us to make cross-platform games easily,” Sharapov said. “We already had some experience with Unity before, and that helped. Also, it is much easier to find good Unity programmers for the project in Russia.”

Once AuraLab converted to the Unity Engine, they initiated some serious game design sessions and developed the real demo that the public can play. They used the 2D toolkit plugin found in Unity 5, but they had a few challenges to overcome. So lead programmer Yuriy Kuptshevich created several tools to speed up the following tasks:

  • Upload and unload atlases in runtime.
  • Upload a series of atlases sequentially.
  • Calculate the number of textures that can fit in an atlas.

Kuptshevich also conducted several experiments with graphics compression, but he quickly learned it was problematic to try uploading a texture from a *.png file asynchronously. The Unity team has promised him a fix in version 5.3.

Helpful Hints—Lessons Learned

Because the team so recently struggled to get their own game noticed by the broader public, they know the challenges that developers face. In that spirit, and maybe to build up some good karma of their own, Kuptshevich is eager to share the following tips and tricks with his fellow indie developers:

How to bind the zoom of particles and objects in a scene:

BorderLeft and BorderRight are pivots at the edges of the object:

    public void ScaleParticles ()
    {
        float size = Vector3.Magnitude(Camera.main.WorldToViewportPoint(BorderLeft.position) -
                                       Camera.main.WorldToViewportPoint(BorderRight.position));
        _particleRend.maxParticleSize = size;
    }

How to create a deferred-in-time lambda expression via the delegate of Action or call them in the loop for a set time, or both:

    public void InvokeWithDelay (Action method, float delay)
    {
        StartCoroutine(InvokeMethodWithDelay(method, delay));
    }

    private IEnumerator InvokeMethodWithDelay (Action method, float delay)
    {
        yield return new WaitForSeconds(delay);
        method.Invoke();
    }

Kuptshevich’s best advice is this: “Keep it simple. The main thing is to not do complicated things if at all possible.” He also recommends that developers look for places where a tool can help with repeating processes. His team expanded the Unity editor with their own tools, doing everything possible in an automated way. “You might think you will spend too much time on the tool at first, but automation will make your life much easier and speed things up quite a bit when you are deep into the main part of the development process.”

The team would do a few things differently given the chance, Kuptshevich admitted. Still, experience is a great teacher. “This was not the first game for any of us,” he said. “We didn’t make any critical errors in the development process because we all have a lot of experience in the pre-production phase. That’s where the planning and the design decisions are very important.”

AuraLab made a great, straightforward game with fewer locations but more intensive gameplay. That resulted in a shorter and cheaper production plan. “The moral is, if you can split your work into sections and do that first part faster, with the same or better quality, just do it,” Sharapov said. And once you make a decision, he says to stick with it—“with no mercy.”

Live Demos Provide Positive Feedback

Without a sizable testing team behind them, AuraLab relied on in-person demos at gaming events, conferences, conventions, and more, for testing their game at various stages. “Our testing methodology was to run each scene separately,” Kuptshevich said. “In a game with a lot of related scenes, it is also important to be able to easily test from any location in the storyline. It sounds trite, but it is necessary to remember about it from the very beginning, and development in general will be easier.”

To describe the cutscene script the team decided to use coroutines inside other coroutines. These components generalize subroutines for nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations. The choice gave them the flexibility to manage asynchronous events, after writing some custom code to make it easier to manage events associated with triggers in animation.

They used Mecanim’s Animation State Machines for logic. According to the Unity documentation, “Mecanim’s Animation State Machines provide a way to overview all of the animation clips related to a particular character and allow various events in the game (for example, user input) to trigger different animations.” (For more details, the team refers the reader to Code Project, which has a link to Simon Jackson’s book on Mastering Unity 2D Game Development, which covers state machines in depth. You’ll also find a link to his sample project code with state machines.)

The AuraLab team stressed that developers should take the time to create custom inspectors for commonly used classes, as they add a lot of power and flexibility to most workflows. “In the long term, it will save you time,” Kuptshevich said.

As for feedback directly from players, Sharapov said that users quickly identified some annoying bottlenecks in gameplay that had to be fixed. As a result, they reworked parts of the demo, making it easier for the user to pass through from beginning to end. However, more difficult riddles are already in development and are sure to please the conference crowds.


Figure 5: The AuraLab team succeeded in creating scenes in Karma that are whimsical, engaging, and beautiful.

Karma has an interesting, original soundtrack, with engaging effects and a soothing musical flow. The team credits that to their collaboration with the Zmeiraduga band, whose beautiful, improvised music has developed a bit of a following. The band’s 100-percent improvised music means a completely different concert experience each time out. To create the rest of the sound effects, Kuvshinov used a microphone, distributed free samples, and a vivid imagination.

Multi-Platform Success

Because the Unity Engine offers flexibility with its output targets, AuraLab had no trouble creating demos for multiple platforms. They currently support Microsoft Windows* 7/8, Windows Vista*, and Windows XP* SP3, as well as Mac* and Android*. The team is planning to develop for AppleTV*, Microsoft Xbox*, and Sony PlayStation* after they release the first commercial version on PC, Mac, iOS*, and Android.

“The Unity Engine allows us to create these multiple releases without too much effort. However, we have to refactor our game several times before we have a stable build on multiple platforms,” Sharapov said. All the different versions have unique features, so it helps to use a game-development engine that knows those subtleties.

“After several experiments and months of refactoring the code, we are to the point where we have a stable architecture for the project and a solid development process that allows us to make builds for each of our supported platforms at one time with a minimum of hand work,” Sharapov said.

One bit of planning really helped at this stage—the team always planned for their gameplay to work on both mobile and desktop versions. They knew they would have to accommodate many different types of user controls, from gesture to touch to point-and-click. “We planned the gameplay from the very start so we could handle both mobile and desktop versions,” Sharapov said. By planning for multiple versions up front, they avoided coding in bottlenecks that would take longer to resolve down the road.

For example, the team kept in mind the idea that users might not have a mouse or may only use a touch screen. One place that showed up was when they planned out a “hint” icon. They saw in testing that they couldn’t show hints by requiring the user to mouse over an object. They realized also that they couldn’t draw small clickable elements too close to each other, because they needed big tap zones for mobile phone users.

Testing sessions have frequently led to fixes they would never have considered. “We learned how the camera position and zoom at the cutscenes for a mobile phone is different from those for the desktop version,” Sharapov said. “We are learning fast.”

The team inserted a hint icon on the bottom-right corner, building it so that when a new hint is coming in, the lamp on the icon begins to shine. Unfortunately, when they tested the hint on an iPad, they learned that most of the users didn’t see the hint icon because of the natural tendency to cover that area with a hand when holding the tablet. Their response was to add additional support for the hint icon right into the game viewport. Now the lamp shines over the head of the main character AND in the bottom-right corner.

Next Steps

The future looks bright for Pip and AuraLab. At the time of this writing, no announcement date has been set yet for the full download of Karma across all four main platforms, but the team is hard at work. Already they are planning a second episode of the game to fully explore Pip’s incarnation. They’ll continue the storyline and come up with more riddles, puzzles, and mazes, and it should be a fun ride. And based on their willingness to share tips, tricks, and codes, maybe they’ve built a little positive karma for themselves.

Resources

YouTube* trailer: https://www.youtube.com/watch?t=1&v=d_PeeemZIBA

Coub* channel with six unique mini-videos: http://coub.com/karma.incarnation1

Karma demos: http://karma-adventure.com/demo.html

Unity Engine download: https://unity3d.com/get-unity

2015 Intel® Level Up Contest: https://software.intel.com/sites/campaigns/levelup2015/

Tutorial: Migrating Your Apps to DirectX* 12 – Part 3


Download PDF [471 KB]

                                                 

 

Chapter 3 Migrating From DirectX 11 to DirectX 12

3.0 Links to the Previous Chapters

Chapter 1: Overview of DirectX* 12
Chapter 2: DirectX 12 Tools

3.1 Interface Mapping

If your upper-level rendering logic is written against DirectX 11, then the best way to migrate is to build an interface layer that is fully compatible with DX11, because the upper-level logic then does not need a lot of code refactoring to adapt to DX12. This kind of migration is very fast. In our practice, we spent a total of only about six weeks to complete the migration and testing of the vast majority of functions. Nevertheless, this approach also has some disadvantages. Because many of DX11’s render objects have been merged or removed in DX12, the DX12 wrapper classes need to perform a lot of runtime state translation. These operations consume some CPU time and cannot be completely removed. So if you have plenty of development time, it is recommended that you instead abstract DX12-style graphics interfaces and adapt them backwards to DX11 features; DX12 will be the future trend, after all.

In order to adapt to the DX11 APIs, we re-implemented almost all of the interfaces in the D3D11.h header. The following is part of the code sample.

For example:

class CDX12DeviceChild : public IUnknown
{
public:
	void GetDevice(ID3D11Device **ppDevice);
	HRESULT GetPrivateData(REFGUID guid, UINT *pDataSize, void *pData);
	HRESULT SetPrivateData(REFGUID guid, UINT DataSize, const void *pData);
	HRESULT SetPrivateDataInterface(REFGUID guid, const IUnknown *pData);
	HRESULT QueryInterface(REFIID riid, void **ppvObject);
};
class CDX12Resource : public CDX12DeviceChild
{
public:
	void GetType(D3D11_RESOURCE_DIMENSION *pResourceDimension);
	void SetEvictionPriority(UINT EvictionPriority);
	UINT GetEvictionPriority(void);
};
typedef class CDX12Resource ID3D11Resource;
typedef class CDX12DeviceChild ID3D11DeviceChild;

It’s important to note here that the project must not include the D3D11 headers; otherwise, definition conflicts might occur.

3.2 Pipeline State Object

The Pipeline State Object (PSO) is a core concept of D3D12. It consists of the Shaders, RasterizerState, BlendState, DepthStencilState, InputLayout and other data. Once a PSO is delivered to the system, all the states associated with it are set at the same time. At the D3D11 interface layer, however, these rendering parameters are set using different APIs. In order to complete the adaptation, we must use a queryable runtime container to manage them. The most common choice is a HashMap, which can be used to avoid redundant PSOs and the corresponding API calls.

Before using the HashMap, we must first prepare a resource ID. The first thing you might think of is the memory address of the resource. It is globally unique within the entire app life cycle, but it has a drawback: it takes up too much space, especially the 8 bytes needed on 64-bit systems. Practical analysis shows that most apps do not use such a huge number of objects, so we can shrink the space used to represent resource objects by means of sequential numbers, that is, a monotonically increasing integer value for each resource object. The same integer can represent different resources as long as those resources are of different types; for example, RasterizerState and BlendState can use separate resource counters. An important benefit of this approach is that it makes the encoding space for resources more compact and produces shorter Hash values. By contrast, if you build the Hash value by concatenating memory addresses, the Hash value occupies many more bytes, which affects not only the storage of PSOs but also the query speed. The upper limit for each counter needs to be found in practice; different projects may differ greatly, but you can start with a larger value during testing and add an assertion where the sequential number is assigned. Once it exceeds the upper limit, the system will trigger an alarm, and you can then decide whether to modify the underlying implementation or adjust the upper-level logic.
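As a concrete illustration, here is a minimal sketch of such a cache (an assumption for this article, not the authors' actual code): the compact per-type state IDs described above are packed into a 64-bit key, and an std::unordered_map either returns an existing ID3D12PipelineState or lazily creates one from a prepared D3D12_GRAPHICS_PIPELINE_STATE_DESC.

#include <cstdint>
#include <unordered_map>
#include <wrl/client.h>
#include <d3d12.h>

struct PSOKey {
    uint64_t packed; // e.g., shaderId | rasterId<<16 | blendId<<24 | depthId<<32 | layoutId<<40 | topology<<48
    bool operator==(const PSOKey& o) const { return packed == o.packed; }
};
struct PSOKeyHash {
    size_t operator()(const PSOKey& k) const { return std::hash<uint64_t>()(k.packed); }
};

class PSOCache {
public:
    ID3D12PipelineState* Get(ID3D12Device* device, const PSOKey& key,
                             const D3D12_GRAPHICS_PIPELINE_STATE_DESC& desc)
    {
        auto it = m_cache.find(key);
        if (it != m_cache.end())
            return it->second.Get();           // reuse: avoids a redundant PSO and API call

        Microsoft::WRL::ComPtr<ID3D12PipelineState> pso;
        if (FAILED(device->CreateGraphicsPipelineState(&desc, IID_PPV_ARGS(&pso))))
            return nullptr;
        return m_cache.emplace(key, pso).first->second.Get();
    }
private:
    std::unordered_map<PSOKey, Microsoft::WRL::ComPtr<ID3D12PipelineState>, PSOKeyHash> m_cache;
};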

To further reduce the number of PSO instances, when generating RasterizerState, BlendState and DepthStencilState, we need to observe the state dependency between them. For example, when we disable the depth test in DepthStencilState, the settings for depth offset in RasterizerState can be ignored. To avoid producing redundant objects, we use default values in those cases.

RTVs and DSVs are also related to PSOs. Since the DSV controls whether Depth and Stencil in the depth map are readable and writable, when the depth test is enabled and depth write is disabled you need to set a read-only DSV in the system. A DSV has three read-only modes: 1) Depth Read Only, 2) Stencil Read Only, and 3) Depth And Stencil Read Only. In addition, PSOs also need the Format information of the RTVs and DSV, so it is best to defer the OMSetRenderTargets operation until the PSO is set.

The ScissorEnable property has been removed from RasterizerState. The Scissor test is always enabled on the hardware side, so if the app needs to disable the Scissor test, you should set the width and height of the ScissorRect to match the viewport or the maximum resolution allowed by the hardware, such as 16K.

The primary topology type of the Primitive needs to be set in the PSO; it is one of Point, Line, Triangle, or Patch. We can use a pre-built conversion table when calling IASetPrimitiveTopology to convert the detailed topology directly into one of these primary topology types. The PSO HashMaps can also be partitioned by primary topology type, with one HashMap per topology type, so the right map can be located directly by array subscript.

3.3 Resource Binding

Before looking at resource binding, we must first understand a core concept: the RootSignature. D3D12 and D3D11 differ greatly in their resource binding models. Resource binding in D3D11 is fixed: the runtime provides a certain number of resource slots for each Shader, and the app only needs to call the corresponding interface to bind resources to the Shader. In D3D12, the resource binding process is very flexible and limits neither the way in which you bind resources nor the number of resources you bind; you can define the binding layout yourself. The two most commonly used approaches are the Descriptor Table and the Root Descriptor. The Descriptor Table method is a little more complicated: it places the Descriptors of a set of resources in a Descriptor Heap in advance, so that when a DrawCall needs to reference these resources, you only need to set a handle to the start of the range. The Shader finds all subsequent Descriptors based on this handle. This is somewhat like a pointer array, which means the Shader needs a second addressing step to locate the final resources. The advantage of the Root Descriptor is that, instead of placing Descriptors in a Descriptor Heap in advance, you set the GPU address of the resource directly into the Command List, which is equivalent to dynamically constructing a Descriptor in the Command List, so the Shader can locate the resource with only one addressing step. However, a Root Descriptor consumes twice as much root-parameter space as a Descriptor Table. Since the maximum size of a RootSignature is limited, a reasonable balance between Root Descriptors and Descriptor Tables is very important.

Under normal circumstances, we put SRVs and UAVs in Descriptor Tables (Samplers can only exist in a Descriptor Table) but place CBVs in Root Descriptors. Since most resources accessed through CBVs are dynamic and their addresses change frequently, using a Descriptor Table for them may cause a combinatorial explosion: not only does the amount of memory occupied increase sharply, it is also troublesome to manage. By contrast, the combinations of Samplers, SRVs and UAVs vary much less than CBVs, especially Samplers; as long as the upper rendering logic is properly designed, the number of Sampler combinations can stay below 128. Therefore, it is more appropriate to place them directly into a Descriptor Heap.

Here, in order to reuse the Descriptor combinations in a Descriptor Heap, we apply the same object-management technique used for PSOs: first number each Sampler, SRV and UAV, then, according to the needs of the Shader, put them together and generate a unique Hash value which is used to create and index the Descriptor combinations in the Descriptor Heap. Since the maximum number of Samplers used in a Shader is 16, each Sampler combination can be laid out with a fixed span of 16 slots. SRVs and UAVs can be managed the same way, so it is best to also limit the number of them referenced by a Shader to 16. Of course, a variable combination span is also an option, but it is not very convenient for reuse across frames: when the texture pointed to by an SRV is released, its sequence number is reclaimed by the app, and all Descriptor combinations referencing it are marked as deleted. At this point, if the combination blocks in the Descriptor Heap vary in size and are discontinuous, they are very difficult to reassign, much like memory-pool fragments, unless you make time-consuming anti-fragmentation efforts. For this reason, a reasonable compromise is to use a fixed span length for Descriptor combinations.
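The following is a hedged sketch (not the article's code) of a root signature that matches the arrangement described above: one CBV root descriptor at register b0, plus a single descriptor table covering a fixed 16-slot SRV span starting at t0.

#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

ComPtr<ID3D12RootSignature> CreateExampleRootSignature(ID3D12Device* device)
{
    D3D12_DESCRIPTOR_RANGE srvRange = {};
    srvRange.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_SRV;
    srvRange.NumDescriptors = 16;                 // fixed 16-slot span, as discussed above
    srvRange.BaseShaderRegister = 0;              // t0
    srvRange.OffsetInDescriptorsFromTableStart = 0;

    D3D12_ROOT_PARAMETER params[2] = {};
    params[0].ParameterType = D3D12_ROOT_PARAMETER_TYPE_CBV;   // Root Descriptor: one addressing step
    params[0].Descriptor.ShaderRegister = 0;                   // b0
    params[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;

    params[1].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;  // table: handle into a heap
    params[1].DescriptorTable.NumDescriptorRanges = 1;
    params[1].DescriptorTable.pDescriptorRanges = &srvRange;
    params[1].ShaderVisibility = D3D12_SHADER_VISIBILITY_PIXEL;

    D3D12_ROOT_SIGNATURE_DESC desc = {};
    desc.NumParameters = 2;
    desc.pParameters = params;
    desc.Flags = D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT;

    ComPtr<ID3DBlob> blob, error;
    D3D12SerializeRootSignature(&desc, D3D_ROOT_SIGNATURE_VERSION_1, &blob, &error);

    ComPtr<ID3D12RootSignature> rootSig;
    device->CreateRootSignature(0, blob->GetBufferPointer(), blob->GetBufferSize(),
                                IID_PPV_ARGS(&rootSig));
    return rootSig;
}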

At most two Descriptor Heaps can be set on a Command List at a time, one of each type: “Sampler” and “SRV/UAV/CBV” are two different Descriptor Heap types and cannot be mixed.

For the sake of efficiency, when we need to rewrite descriptors in a Descriptor Heap, we can first complete the update in a CPU-visible Descriptor Heap and then copy the contents of that heap to a GPU-visible Descriptor Heap via the CopyDescriptors* commands. If each view lives in only one location, it should be created or updated directly in the shader-visible heap.
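A minimal sketch of that staging pattern (the function and parameter names are illustrative): cpuHeap is assumed to be a non-shader-visible CBV/SRV/UAV heap used for CPU-side writes, and gpuHeap the shader-visible heap that is set on the command list.

#include <d3d12.h>

void CommitDescriptorBlock(ID3D12Device* device, ID3D12DescriptorHeap* cpuHeap,
                           ID3D12DescriptorHeap* gpuHeap, UINT srcIndex, UINT dstIndex)
{
    const UINT stride =
        device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);

    D3D12_CPU_DESCRIPTOR_HANDLE src = cpuHeap->GetCPUDescriptorHandleForHeapStart();
    src.ptr += (SIZE_T)srcIndex * stride;

    D3D12_CPU_DESCRIPTOR_HANDLE dst = gpuHeap->GetCPUDescriptorHandleForHeapStart();
    dst.ptr += (SIZE_T)dstIndex * stride;

    // Copy one contiguous block of 16 descriptors (one SRV "combination", per the fixed span above).
    device->CopyDescriptorsSimple(16, dst, src, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
}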

3.4 Resource Management

3.4.1 Static Resources

In D3D11, there are two approaches to initializing static resources. The first one applies to Immutable resources: it allows the data within these resources to be set only once, by passing the initial data to the Create* interface. The second one applies to Default resources: their data can be changed several times, but only with the help of Staging resources.

In D3D12, the initialization processes of these two kinds of resources are merged into one, the second approach: data is uploaded to the Default Heap through resources in an Upload Heap. As in D3D11, resources allocated from the Default Heap cannot be mapped, which means the app has no direct access to their CPU addresses and therefore needs an intermediate resource allocated from the Upload Heap as a bridge to push the data from the CPU side to the GPU side. One thing that needs attention here is when to delete this intermediate resource. In D3D11 the intermediate resource can be deleted directly after the Copy command is issued, but this is not feasible in D3D12, because the D3D12 runtime does not provide resource lifetime management. All of that work must be done by the app, so the app needs to know whether the asynchronously executed Copy tasks have completed; in other words, when the GPU no longer references these resources. We can easily obtain this information through the Fence feature of the Command Queue. In addition, we can also perform resource uploads through a shared memory pool of dynamic resources. After all, allocating one Upload Heap resource for every Default Heap resource is rather inefficient: the reuse rate is very low and it easily produces excessive fragmentation in the system. Using the dynamic-resource memory pool technique described later avoids these problems.

You should record the current frame number each time a resource is referenced by a Command List. When the resource is released, this frame number can be used to determine whether it can be deleted immediately, namely when the gap between that frame and the current frame exceeds the total number of in-flight Command Lists; otherwise, you buffer it in the Command List’s deferred-release list and release it after the Command List finishes execution on the GPU.
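A minimal sketch of such fence-based deferred release, under the assumption of a fixed number of frames in flight and a single direct command queue (the class and member names are illustrative):

#include <vector>
#include <wrl/client.h>
#include <d3d12.h>

static const UINT kFrameCount = 3;   // assumed number of frames in flight

struct DeferredDeleter {
    Microsoft::WRL::ComPtr<ID3D12Fence> fence;
    UINT64 frameFenceValues[kFrameCount] = {};
    std::vector<Microsoft::WRL::ComPtr<ID3D12Resource>> garbage[kFrameCount];
    UINT64 nextFence = 1;

    void Init(ID3D12Device* device) {
        device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
    }
    // Call when a resource (e.g., an upload-heap staging buffer) is no longer needed by the CPU.
    void Discard(UINT frameIndex, Microsoft::WRL::ComPtr<ID3D12Resource> res) {
        garbage[frameIndex].push_back(res);
    }
    // Call once per frame after ExecuteCommandLists: signal, then free anything the GPU has finished with.
    void EndFrame(ID3D12CommandQueue* queue, UINT frameIndex) {
        frameFenceValues[frameIndex] = nextFence;
        queue->Signal(fence.Get(), nextFence);
        ++nextFence;
        for (UINT i = 0; i < kFrameCount; ++i)
            if (fence->GetCompletedValue() >= frameFenceValues[i])
                garbage[i].clear();          // GPU passed that frame's fence; safe to release
    }
};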

3.4.2 Dynamic Resources

In D3D11, Dynamic Usage resources should be very familiar to us. They are widely used for the Vertex Buffer, Index Buffer, and Constant Buffer; typical application scenarios include particles and user interfaces. The Map function normally provides a Write Discard option that allows the app to reuse the same resource repeatedly, provided that the initial size of the resource meets the needs of the rendering logic. As mentioned earlier, D3D APIs are executed asynchronously: the return of an API call does not mean the task has finished at that moment; it is likely still some time away from completion. At this point, if subsequent DrawCalls modify this resource, there is a certain probability of contention for the resource. Using the Write Discard option prevents this from happening, because the runtime or driver automatically renames the resource: externally it looks like a reference to the same object, but internally it has been switched to another free resource, which takes over from the old one for external updates. To avoid prolonged occupation of a large amount of memory, the system puts these old resources in a memory pool for unified management; when they are no longer referenced by the GPU, the system reclaims and recycles them.

In D3D12, we must implement a similar mechanism ourselves. First, we need to establish a resource pool consisting of a list of resources. As the size requested each time may vary, it is best to allocate a larger resource in advance and then sub-allocate child ranges at different offsets for the upper-level logic to use. By doing so, we reduce the number of system allocations as well as the fragmentation caused by discontinuous memory. In general, we recommend using resource blocks in units of 4 MB. When the resource pool is ready, we can start to allocate resources, but before that we also need the memory address of each resource. In D3D11 the memory address comes from the Map function, and D3D12 also returns the memory address through this function. The difference is that, unlike D3D11, D3D12 does not require you to call Unmap each time you map and fill in data. The mapping of D3D12 dynamic resources is persistent, meaning their memory addresses remain valid, so there is no need to tell the system to cancel the mapping through the Unmap function. Under normal circumstances, in the lifecycle of a dynamic resource, the Map function only needs to be called once; you can save the returned memory address for repeated use. Before the resource is released, call Unmap once to ensure that the address space can be reclaimed by the system.

Reclaiming occupied resources is also very simple. We just need to put the resource blocks allocated by the current frame into a pending queue tagged with the current Command List number. Each frame, we detect whether the Command List corresponding to this queue has finished; if it has, we link all the resources in this queue to the Free List for subsequent distribution and reuse. The above method suits data that is updated and used every frame. Resources that are referenced across frames and may be updated only once every several frames need to be maintained differently: record the current Command List number before each time the resource is referenced, and when it is to be renamed again, first check whether the Command List with that number has already been executed. If not, put the resource into the to-be-reclaimed list of the current Command List and wait for that Command List to complete before reclaiming and reusing it. Since this method does not discard resources every frame, it supports the use of resources across frames, but it needs to check the last usage of the resource every time the resource is renamed.
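The following sketch pulls the above together under stated assumptions (the class name is illustrative; block chaining and the per-frame pending queues are omitted): a 4 MB upload-heap block is mapped once and kept mapped, per-frame data is sub-allocated from it linearly by offset, and the block is reset only after the Command List that used it has completed on the GPU.

#include <cstdint>
#include <d3d12.h>
#include <wrl/client.h>

class UploadRing {
public:
    void Init(ID3D12Device* device, UINT64 size = 4 * 1024 * 1024) {   // 4 MB blocks, as suggested above
        D3D12_HEAP_PROPERTIES heap = {};
        heap.Type = D3D12_HEAP_TYPE_UPLOAD;
        D3D12_RESOURCE_DESC desc = {};
        desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
        desc.Width = size;  desc.Height = 1;  desc.DepthOrArraySize = 1;  desc.MipLevels = 1;
        desc.SampleDesc.Count = 1;
        desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
        device->CreateCommittedResource(&heap, D3D12_HEAP_FLAG_NONE, &desc,
                                        D3D12_RESOURCE_STATE_GENERIC_READ, nullptr,
                                        IID_PPV_ARGS(&m_buffer));
        m_buffer->Map(0, nullptr, reinterpret_cast<void**>(&m_cpu));    // map once, keep the pointer
        m_gpu = m_buffer->GetGPUVirtualAddress();
        m_size = size;
        m_offset = 0;
    }
    // Returns a CPU write pointer and the matching GPU address for 'bytes' of per-frame data.
    bool Allocate(UINT64 bytes, UINT64 align, void** cpu, D3D12_GPU_VIRTUAL_ADDRESS* gpu) {
        UINT64 start = (m_offset + align - 1) & ~(align - 1);
        if (start + bytes > m_size) return false;    // a real pool would chain another 4 MB block here
        *cpu = m_cpu + start;
        *gpu = m_gpu + start;
        m_offset = start + bytes;
        return true;
    }
    void Reset() { m_offset = 0; }   // call only after the GPU has finished the Command List that used this block
private:
    Microsoft::WRL::ComPtr<ID3D12Resource> m_buffer;
    uint8_t* m_cpu = nullptr;
    D3D12_GPU_VIRTUAL_ADDRESS m_gpu = 0;
    UINT64 m_size = 0, m_offset = 0;
};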

Because a dynamic buffer's GPU address changes with every allocation request, it is better for the external rendering logic to issue buffer update requests before the resource is bound to the Command List; otherwise you will need to defer binding the resource until the draw call is issued.

Special reminder: CPU-side logic must not read from the memory mapped out of resources allocated in the Upload Heap; doing so causes a significant performance loss because this memory is meant to be accessed in Write-Combine mode.

3.4.3 Update of Dynamic Texture

In D3D11, a dynamic texture is updated in basically the same way as a dynamic buffer: you take the memory address returned by Map and fill the texture with data, although, unlike a buffer, you must also account for the Row Pitch and Depth Pitch strides. In D3D12 you cannot fill a texture the way you did in D3D11, because textures are stored in a swizzled layout on the GPU, whereas a buffer's memory layout is linear, so only buffers can be filled directly on the CPU side without conversion. For GPU read efficiency, a texture has to be uploaded differently. As with a buffer, the first step is to allocate an Upload Heap resource of a suitable size. The GetCopyableFootprints API tells us the layout the texture data must have in the Upload Heap in order to be uploaded to the Default Heap, and a Copy* command then uploads the data filled in on the CPU side. As this description suggests, the texture the GPU actually reads is a static resource. Looking at how D3D11 is implemented, a mappable texture resource went through a similar process internally; D3D12 simply externalizes this work, giving apps more opportunities for optimization.
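A sketch of this upload path is shown below, assuming a device, a recording command list, an upload buffer that is large enough, a default-heap texture already in the COPY_DEST state, and CPU pixel data with a given row size; the d3dx12.h helpers are used and all names are placeholders.

void UploadTexture(ID3D12Device* device, ID3D12GraphicsCommandList* cmdList,
                   ID3D12Resource* defaultTexture,   // already in COPY_DEST state
                   ID3D12Resource* uploadBuffer,     // large enough for totalBytes
                   const BYTE* texData, UINT64 texRowBytes)
{
    D3D12_RESOURCE_DESC texDesc = defaultTexture->GetDesc();

    D3D12_PLACED_SUBRESOURCE_FOOTPRINT footprint = {};
    UINT   numRows = 0;
    UINT64 rowSizeInBytes = 0, totalBytes = 0;
    device->GetCopyableFootprints(&texDesc, 0, 1, 0,
                                  &footprint, &numRows, &rowSizeInBytes, &totalBytes);

    // Fill the upload buffer row by row, honoring the RowPitch the footprint requires.
    BYTE* mapped = nullptr;
    uploadBuffer->Map(0, nullptr, reinterpret_cast<void**>(&mapped));
    for (UINT row = 0; row < numRows; ++row)
    {
        memcpy(mapped + footprint.Offset + row * footprint.Footprint.RowPitch,
               texData + row * texRowBytes,
               rowSizeInBytes);
    }
    uploadBuffer->Unmap(0, nullptr);

    // Schedule the linear-to-swizzled copy on the GPU.
    CD3DX12_TEXTURE_COPY_LOCATION dst(defaultTexture, 0);
    CD3DX12_TEXTURE_COPY_LOCATION src(uploadBuffer, footprint);
    cmdList->CopyTextureRegion(&dst, 0, 0, 0, &src, nullptr);
}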

3.4.4 Readback of GPU Data

In D3D11, there are two kinds of GPU data readback. The first is reading back the GPU data of a buffer or texture, which is handled through a Staging resource: you copy the static resource that needs to be read back to the Staging resource, then call Map to obtain an address the CPU can read. Because the Copy operation is asynchronous, before actually reading the data you must also determine whether the copy into the readback memory has completed. The process in D3D12 is similar. Besides the Upload Heap mentioned earlier, D3D12 provides a heap type dedicated to data readback, and the Readback Heap is used almost the same way as the Upload Heap: first allocate a resource from the heap, then copy the Default Heap resource you want to read back into it with a Copy* function. What differs from D3D11 is that the Map function does not wait or check whether the readback has completed. Here we apply the mechanism described in the static and dynamic resource management sections to the readback operation and use a Fence to determine when the copy is done. Persistently mapping readback (write-back) memory is not something we want to encourage: you can keep it mapped, but before reading data that the GPU has written, you should call Map again with a range to ensure the caches are coherent. This is free on systems that do not need it and ensures correctness on systems that do.
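A condensed readback sketch follows. It assumes the d3dx12.h helpers plus an existing device, queue, recording command list, fence, fence event, fence counter, and a Default Heap buffer already in the COPY_SOURCE state; every name is illustrative.

void ReadBackBuffer(ID3D12Device* device, ID3D12CommandQueue* cmdQueue,
                    ID3D12GraphicsCommandList* cmdList,
                    ID3D12Resource* gpuBuffer,        // already in COPY_SOURCE state
                    UINT64 byteSize,
                    ID3D12Fence* fence, HANDLE fenceEvent, UINT64& fenceValue)
{
    ID3D12Resource* readbackBuffer = nullptr;
    CD3DX12_HEAP_PROPERTIES heapProps(D3D12_HEAP_TYPE_READBACK);
    CD3DX12_RESOURCE_DESC   desc = CD3DX12_RESOURCE_DESC::Buffer(byteSize);
    device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &desc,
                                    D3D12_RESOURCE_STATE_COPY_DEST, nullptr,
                                    IID_PPV_ARGS(&readbackBuffer));

    cmdList->CopyResource(readbackBuffer, gpuBuffer);
    cmdList->Close();
    ID3D12CommandList* lists[] = { cmdList };
    cmdQueue->ExecuteCommandLists(1, lists);

    // Use a fence to wait for the copy before touching the data on the CPU.
    cmdQueue->Signal(fence, ++fenceValue);
    if (fence->GetCompletedValue() < fenceValue)
    {
        fence->SetEventOnCompletion(fenceValue, fenceEvent);
        WaitForSingleObject(fenceEvent, INFINITE);
    }

    // Map with an explicit read range so caches are made coherent where required.
    void* data = nullptr;
    CD3DX12_RANGE readRange(0, byteSize);
    readbackBuffer->Map(0, &readRange, &data);
    // ... consume the data ...
    CD3DX12_RANGE emptyRange(0, 0);
    readbackBuffer->Unmap(0, &emptyRange);   // the CPU wrote nothing back
    readbackBuffer->Release();
}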

The second kind is the readback of hardware queries. The process is basically the same as for buffers and textures, except that a Resolve function is used instead of a Copy function. Because the Resolve function can read back query data in bulk, you do not need a create call for every single query object when allocating the Query Heap; instead, you allocate the query collection in one contiguous block of memory and address individual queries by offset within the resource.
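For instance, a timestamp query heap resolved in bulk might look like the sketch below; the device, command list, and a Readback Heap buffer in the COPY_DEST state are assumed to exist, and in real code the query heap would be created once rather than per recording.

void RecordTimestampQueries(ID3D12Device* device, ID3D12GraphicsCommandList* cmdList,
                            ID3D12Resource* readbackBuffer)  // Readback Heap, COPY_DEST
{
    const UINT kQueryCount = 64;

    D3D12_QUERY_HEAP_DESC heapDesc = {};
    heapDesc.Type  = D3D12_QUERY_HEAP_TYPE_TIMESTAMP;
    heapDesc.Count = kQueryCount;

    ID3D12QueryHeap* queryHeap = nullptr;
    device->CreateQueryHeap(&heapDesc, IID_PPV_ARGS(&queryHeap));

    // Individual queries are addressed by index; no per-query create call is needed.
    cmdList->EndQuery(queryHeap, D3D12_QUERY_TYPE_TIMESTAMP, 0);
    // ... record the draw calls being measured ...
    cmdList->EndQuery(queryHeap, D3D12_QUERY_TYPE_TIMESTAMP, 1);

    // Resolve both results in one call; the data lands at offset 0 of the buffer.
    cmdList->ResolveQueryData(queryHeap, D3D12_QUERY_TYPE_TIMESTAMP,
                              0, 2, readbackBuffer, 0);
}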

Since we wrap the D3D11 interface, both upload and readback operations may use a Staging resource externally, so how do we distinguish between them internally? We check whether the first Map on the resource happens before or after the Copy* command executes. Normally, if Map happens before the Copy* command, we assume the user wants to upload data to the GPU; when the user wants to read data back from the GPU, Map happens after the Copy* command. There is a precondition: for any given resource, the external logic must not use it both to upload CPU data and to read back GPU data, otherwise the ambiguity cannot be resolved internally. Fortunately, such cases rarely occur in practice.

3.5 Resource Barrier

This is a new concept. Before D3D12, resource state management was handled by the driver. D3D12 strips it out of the driver layer and lets the app decide when and how to handle it.

There are three types of resource barriers: Transition, Aliasing, and UAV. The most common is the Transition barrier, which switches a resource from one state to another. When the usage scenario of a resource changes, we place a corresponding Resource Barrier before the resource is used.

A very common Transition barrier in practice occurs when a resource switches back and forth between a Shader Resource View and a Render Target View. We therefore add a member variable to the resource's wrapper class to record its current state. When the upper-level logic calls OMSetRenderTargets, we first check whether the current state is already render target; if not, we place a barrier whose StateBefore is the value stored in the member variable and whose StateAfter is the render-target state. If the rendering logic calls XXSetShaderResources, we follow the same process, except that StateAfter is the shader-resource state.
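A sketch of such a state-tracking helper is shown below; TrackedResource and TransitionIfNeeded are illustrative names, not part of the D3D12 API.

struct TrackedResource
{
    ID3D12Resource*       resource;
    D3D12_RESOURCE_STATES currentState;   // e.g. D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE
};

void TransitionIfNeeded(ID3D12GraphicsCommandList* cmdList,
                        TrackedResource& res,
                        D3D12_RESOURCE_STATES targetState)
{
    if (res.currentState == targetState)
        return;                            // already in the right state, no barrier needed

    D3D12_RESOURCE_BARRIER barrier = {};
    barrier.Type  = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier.Transition.pResource   = res.resource;
    barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier.Transition.StateBefore = res.currentState;   // the state stored in the wrapper
    barrier.Transition.StateAfter  = targetState;
    cmdList->ResourceBarrier(1, &barrier);

    res.currentState = targetState;
}

// Before OMSetRenderTargets:  TransitionIfNeeded(cmdList, res, D3D12_RESOURCE_STATE_RENDER_TARGET);
// Before binding as an SRV:   TransitionIfNeeded(cmdList, res, D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE);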

It is better to defer issuing the Transition barrier until the resource is actually about to be used, so as to avoid unnecessary synchronization, because a barrier blocks the execution of subsequent commands.

Resources passed to Copy*, Resolve*, and Clear* commands also have corresponding target states that must be set.

3.6 Command List/Queue

D3D12 Command Lists and Command Queues are the evolution of the D3D11 Device Context. A Command List buffers rendering commands, which are built into hardware commands the driver understands and finally executed through the Command Queue. Since each Command List can record rendering commands independently without any internal lock protection, recording is faster than its counterpart in D3D11.

A Command List can be reused repeatedly. If the app wants to re-submit the same Command List for GPU execution, it must wait until the previous execution of that Command List on the GPU has completed; otherwise, the behavior is undefined. The app may, however, reset a Command List once it is closed, regardless of whether it is still executing on the GPU. A reset Command List is equivalent to a blank context: it inherits none of the previous rendering state, so state such as the PSO, viewport, scissor rect, RTVs, and DSV must be set again.

In general, to avoid each frame waiting synchronously for the Command List executed in the previous frame, we can prepare several spare Command Lists and check the previously submitted ones at the end of each frame. If a recently submitted Command List has finished, the earlier ones have finished too, because Command Lists on a queue execute strictly in submission order, like a FIFO queue. To determine whether a Command List has finished, we use a Fence object. After the Command Queue calls ExecuteCommandLists, calling Signal tells the system to notify the Fence object as soon as the Command List completes, by setting the expected value passed to Signal into the Fence. Under normal circumstances we use the accumulated frame number as the expected value passed to Signal each frame. To query whether a Command List has completed, check whether the value returned by GetCompletedValue is greater than or equal to the expected value.
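A minimal sketch of this pattern, with illustrative names, looks like the following.

void SubmitFrame(ID3D12CommandQueue* cmdQueue,
                 ID3D12GraphicsCommandList* cmdList,
                 ID3D12Fence* fence,
                 UINT64 frameNumber)   // monotonically increasing counter kept by the app
{
    ID3D12CommandList* lists[] = { cmdList };
    cmdQueue->ExecuteCommandLists(1, lists);
    // The fence reaches frameNumber only after the GPU has executed everything
    // submitted on this queue up to and including cmdList.
    cmdQueue->Signal(fence, frameNumber);
}

bool IsFrameComplete(ID3D12Fence* fence, UINT64 frameNumber)
{
    // >= also covers later frames having completed, which implies this one has too,
    // since a queue executes its Command Lists strictly in submission order.
    return fence->GetCompletedValue() >= frameNumber;
}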

The Command Queue is bound to the SwapChain, which is why the first argument passed in when creating the SwapChain is the Command Queue. The common SwapChain mode today is the Flip* model. In this mode BufferCount must be greater than one, meaning the SwapChain holds more than one back buffer. To render them alternately, you switch the next back buffer in as the current render target after each Present, rewinding automatically based on the total number of back buffers you created. If a frame has not performed the Present operation, you must not switch the back buffer; otherwise the system will crash.
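Below is a sketch of creating a flip-model swap chain against the Direct Command Queue and advancing the back-buffer index after Present; the factory, queue, and window handle are assumed to exist, and the width, height, and buffer count are arbitrary example values.

IDXGISwapChain3* CreateFlipSwapChain(IDXGIFactory4* factory,
                                     ID3D12CommandQueue* cmdQueue, HWND hwnd)
{
    DXGI_SWAP_CHAIN_DESC1 scDesc = {};
    scDesc.BufferCount      = 3;                             // more than one back buffer
    scDesc.Width            = 1920;
    scDesc.Height           = 1080;
    scDesc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
    scDesc.BufferUsage      = DXGI_USAGE_RENDER_TARGET_OUTPUT;
    scDesc.SwapEffect       = DXGI_SWAP_EFFECT_FLIP_DISCARD; // flip model
    scDesc.SampleDesc.Count = 1;

    IDXGISwapChain1* swapChain1 = nullptr;
    factory->CreateSwapChainForHwnd(cmdQueue,                // the queue, not the device
                                    hwnd, &scDesc, nullptr, nullptr, &swapChain1);

    IDXGISwapChain3* swapChain = nullptr;
    swapChain1->QueryInterface(IID_PPV_ARGS(&swapChain));
    swapChain1->Release();
    return swapChain;
}

// Per frame, after recording: Present, then query the index of the next back buffer.
//   swapChain->Present(1, 0);
//   UINT backBufferIndex = swapChain->GetCurrentBackBufferIndex();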

There are three types of Command Queue: Direct, Copy, and Compute. These queue types can execute in parallel.

  • The Direct Command Queue handles graphics rendering commands.
  • The Copy Command Queue handles data upload and readback operations.
  • The Compute Command Queue handles general-purpose computation commands (nothing to do with rasterization).

For example, while a worker thread uses the Direct Command Queue to render the scene, another thread can use the Copy Command Queue to upload texture data. Any work on the Direct Command Queue that references the texture data uploaded in the background must wait until the Copy Command Queue has finished executing.
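This cross-queue dependency can be expressed with a shared fence, as in the sketch below; all names are assumed, and note that ID3D12CommandQueue::Wait stalls the queue on the GPU, not the CPU thread.

void UploadThenDraw(ID3D12CommandQueue* copyQueue, ID3D12CommandQueue* directQueue,
                    ID3D12Fence* sharedFence, UINT64& fenceValue,
                    ID3D12CommandList* const* copyLists, UINT numCopyLists,
                    ID3D12CommandList* const* drawLists, UINT numDrawLists)
{
    const UINT64 uploadDone = ++fenceValue;

    // Background thread: upload the texture data on the Copy queue.
    copyQueue->ExecuteCommandLists(numCopyLists, copyLists);
    copyQueue->Signal(sharedFence, uploadDone);

    // Rendering thread: the Direct queue waits on the GPU until the upload completes,
    // so the draw work that samples the new texture starts only afterwards.
    directQueue->Wait(sharedFence, uploadDone);
    directQueue->ExecuteCommandLists(numDrawLists, drawLists);
}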

Coming Soon:  Links to the Following Chapters

Chapter 4: DirectX 12 Features

Chapter 5: DirectX 12 Optimization

DrugGraph: A Big Data Approach To Predicting Drug Characteristics


Big data holds the promise to transform industries and research and to spawn new solutions to a range of challenges in healthcare and the life sciences. With massive scalability, breakthrough economics, and a vibrant ecosystem, big data platforms enable the capture of diverse data at very high volumes and velocities. Yet the anticipated data insights remain elusive. Data scientists are scarce, and their efforts are often taxed by the effort and complexity required to program and integrate a diversity of advanced, open-source analytics tools. The typical data science workflow is not conducive to iteration and collaboration, further slowing time to insight. Finally, many of the tools data scientists use focus on answering known questions, sometimes on sampled data, at the expense of opportunities to use big data to discover answers to questions no one has thought to ask.

Intel and the Icahn School of Medicine at Mount Sinai have initiated a project called DrugGraph to explore the applicability of data science advances to therapeutic drug discovery, using data science capabilities found in the open source software project Trusted Analytics Platform (TAP) to greatly reduce the complexity of big data analytics processes. The project employs hardware and software advances in data science, including big data analytics, graph analytics, and machine learning, to help predict the clinical efficacy of existing drugs and compounds, and make the management and discovery of drug-related information more efficient. Success in this joint research can lead to faster, less expensive discovery of new drug therapies; improved patient outcomes by reducing toxic drug reactions; and reduced cost of treatment by predicting novel uses for existing compounds.

Using TAP, data scientists can achieve efficient knowledge discovery and modeling on big data. TAP provides data scientists with extensible tools, scalable algorithms and powerful engines to train and deploy predictive models.

This paper will explore the building of DrugGraph, the scientific challenges in doing so, and the use of TAP.

Download the complete whitepaper (PDF): Intel_MtSinai_Whitepaper.pdf
