Channel: Intel Developer Zone Articles

Single Node Caffe Scoring and Training on Intel® Xeon E5-Series Processors


    As Deep Neural Network (DNN) applications grow in importance in areas such as internet search engines and medical imaging, Intel teams are working on software solutions to accelerate these workloads that will become available in future versions of Intel® Math Kernel Library (Intel® MKL) and Intel® Data Analytics Acceleration Library (Intel® DAAL). This technical preview demonstrates the performance that can be achieved on Intel platforms with software we have under development. The current technical preview only works on a processor with Intel® Advanced Vector Extensions 2 (Intel® AVX2) support. In an upcoming article we will demonstrate what is possible with a distributed multinode configuration.

    Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC) and one of the most popular community frameworks for image recognition. Together with AlexNet, a neural network topology for image recognition, and ImageNet, a database of labeled images, Caffe is often used as a benchmark.

    While Caffe can take advantage of the optimized mathematical routines provided in Intel MKL, there are further opportunities to improve performance on Intel® Xeon processor-based systems by applying code modernization techniques. With the right use of Intel MKL, vectorization, and parallelization, it is possible to achieve an 11x increase in training performance and a 10x increase in classification performance compared to a non-optimized Caffe implementation.

With these optimizations, the time to train the AlexNet* network on the full ILSVRC-2012 dataset to 80% top-5 accuracy drops from 58 days to about 5 days.

Getting started

While we are working to bring new functionality in our software offerings, you can use the technology preview package attached to this article to reproduce the demonstrated performance results and even train AlexNet on your own dataset.

The package supports the AlexNet topology and introduces the 'intel_alexnet' model, which is similar to bvlc_alexnet with the addition of two new 'IntelPack' and 'IntelUnpack' layers, as well as optimized convolution, pooling, and normalization layers. Additionally, we changed the validation parameters to facilitate vectorization: we increased the validation minibatch size from 50 to 256 and reduced the number of test iterations from 1000 to 200, which keeps the number of images used in the validation run approximately constant (256 × 200 = 51,200 versus the original 50 × 1,000 = 50,000). The package contains the intel_alexnet model in these files:

  • models/intel_alexnet/deploy.prototxt
  • models/intel_alexnet/solver.prototxt
  • models/intel_alexnet/train_val.prototxt

The 'intel_alexnet' model allows you to train and test the ILSVRC-2012 training set.

To start working with the package, make sure that all regular Caffe dependencies listed in ‘System Requirements and Limitations’ are installed on your system, then:

  • Unpack the package
  • Specify the paths to the database, snapshot location, and image mean file in these 'intel_alexnet' model files:
    • models/intel_alexnet/deploy.prototxt
    • models/intel_alexnet/solver.prototxt
    • models/intel_alexnet/train_val.prototxt
  • Set up a runtime environment for the software tools listed in the ‘System Requirements and Limitations’ section
  • Add the path to ./build/lib/libcaffe.so to the LD_LIBRARY_PATH environment variable
  • Set the threading environment:
    $> export OMP_NUM_THREADS=<N_processors * N_cores>
    $> export KMP_AFFINITY=compact,granularity=fine
  • Run timing on a single node using the command:
    $> ./build/tools/caffe time \
           -iterations <number of iterations> \
           --model=models/intel_alexnet/train_val.prototxt
  • Run training on a single node using the command:
    $> ./build/tools/caffe train \
           --solver=models/intel_alexnet/solver.prototxt

System requirements and limitations

The package has the same software dependencies as non-optimized Caffe, plus Intel MKL 11.3 or later.

Hardware compatibility:

The package requires a processor with Intel AVX2 support (see above). The software was validated with the AlexNet topology only and may not work in other configurations.

Support:

Please direct questions and comments on this package to intel.mkl@intel.com.


Performance Considerations for Resource Binding in Microsoft DirectX* 12


By Wolfgang Engel, CEO of Confetti

With the release of Windows* 10 on July 29 and the release of the 6th generation Intel® Core™ processor family (code-named Skylake), we can now take a closer look at resource binding, specifically for Intel® platforms.

The previous article “Introduction to Resource Binding in Microsoft DirectX* 12” introduced the new resource binding methods in DirectX 12 and concluded that with all these choices, the challenge is to pick the most desirable binding mechanism for the target GPU, types of resources, and their frequency of update.

This article describes how to pick among the different resource binding mechanisms to run an application efficiently on specific Intel GPUs.

Tools of the Trade

To develop games with DirectX 12, you need the following tools:

  • Windows 10
  • Visual Studio* 2013 or higher
  • DirectX 12 SDK (included with Visual Studio)
  • DirectX 12-capable GPU and drivers

Overview

A descriptor is a block of data that describes an object to the GPU, in a GPU-specific opaque format. DirectX 12 offers the following descriptors, previously named “resource views” in DirectX 11:

  • Constant buffer view (CBV)
  • Shader resource view (SRV)
  • Unordered access view (UAV)
  • Sampler view (SV)
  • Render target view (RTV)
  • Depth stencil view (DSV)
  • and others

These descriptors or resource views can be considered a structure (also called a block) that is consumed by the GPU front end. The descriptors are roughly 32–64 bytes in size and hold information like texture dimensions, format, and layout.

Descriptors are stored in a descriptor heap, which represents a sequence of structures in memory.

A descriptor table holds offsets into this descriptor heap. It maps a contiguous range of descriptors to shader slots by making them available through a root signature. This root signature can also hold root constants, root descriptors, and static samplers.

Descriptors, descriptor heap, descriptor tables, root signature

Figure 1. Descriptors, descriptor heap, descriptor tables, root signature.

Figure 1 shows the relationship between descriptors, a descriptor heap, descriptor tables, and the root signature.

The code that Figure 1 describes looks like this:

// the init function sets the shader registers
// parameters: type of descriptor, num of descriptors, base shader register
// the first descriptor table entry in the root signature in
// Figure 1 sets shader registers t1, b1, t4, t5
// performance: order from most frequently to least frequently used
CD3DX12_DESCRIPTOR_RANGE Param0Ranges[3];
Param0Ranges[0].Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 1, 1); // t1
Param0Ranges[1].Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 1); // b1
Param0Ranges[2].Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 2, 4); // t4-t5

// the second descriptor table entry in the root signature
// in Figure 1 sets shader registers u0 and b2
CD3DX12_DESCRIPTOR_RANGE Param1Ranges[2];
Param1Ranges[0].Init(D3D12_DESCRIPTOR_RANGE_TYPE_UAV, 1, 0); // u0
Param1Ranges[1].Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 2); // b2

// set the descriptor tables in the root signature
// parameters: number of descriptor ranges, descriptor ranges, visibility
// visibility to all stages allows sharing binding tables
// with all types of shaders
CD3DX12_ROOT_PARAMETER Param[4];
Param[0].InitAsDescriptorTable(3, Param0Ranges, D3D12_SHADER_VISIBILITY_ALL);
Param[1].InitAsDescriptorTable(2, Param1Ranges, D3D12_SHADER_VISIBILITY_ALL);
// root descriptor
Param[2].InitAsShaderResourceView(0); // t0
// root constants
Param[3].InitAsConstants(4, 0); // b0 (4x32-bit constants)

// writing into the command list (handles/addresses are placeholders)
cmdList->SetGraphicsRootDescriptorTable(0, srvGpuDescriptorHandle);
cmdList->SetGraphicsRootDescriptorTable(1, uavGpuDescriptorHandle);
cmdList->SetGraphicsRootShaderResourceView(2, srvBufferGpuVirtualAddress);
UINT rootConstants[4] = { 1, 3, 3, 7 };
cmdList->SetGraphicsRoot32BitConstants(3, 4, rootConstants, 0);

The source code above sets up a root signature that has two descriptor tables, one root descriptor, and one root constant. The code also shows that root constants have no indirection and are provided directly with the SetGraphicsRoot32BitConstants call. They are routed directly into shader registers; there is no actual constant buffer, constant buffer descriptor, or binding happening. Root descriptors have only one level of indirection, because they store a pointer to memory (descriptor -> memory), and descriptor tables have two levels of indirection (descriptor table -> descriptor -> memory).

Descriptors live in different heaps depending on their types, such as SV and CBV/SRV/UAV. This is due to wildly inconsistent sizes of descriptor types on different hardware platforms. For each type of descriptor heap, there should be only one heap allocated because changing heaps could be expensive.

In general DirectX 12 offers an allocation of more than one million descriptors upfront, enough for a whole game level. While previous DirectX versions dealt with allocations in the driver on their own terms, with DirectX 12 it is possible to avoid any allocations during runtime. That means any initial allocation of a descriptor can be taken out of the performance “equation.”
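As an illustration of the allocate-once approach, a single shader-visible CBV/SRV/UAV heap can be created at startup and sub-allocated for the rest of the level. This is a minimal sketch; the capacity and the device variable are assumptions for illustration, chosen to match the "more than one million descriptors" figure above:

// Allocate one large shader-visible CBV/SRV/UAV descriptor heap once at
// startup and never reallocate at runtime ("device" is assumed to be an
// already-created ID3D12Device).
D3D12_DESCRIPTOR_HEAP_DESC heapDesc = {};
heapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
heapDesc.NumDescriptors = 1000000;
heapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;

ID3D12DescriptorHeap* descriptorHeap = nullptr;
device->CreateDescriptorHeap(&heapDesc, IID_PPV_ARGS(&descriptorHeap));

// Descriptors are then sub-allocated from this heap in increments of:
UINT increment = device->GetDescriptorHandleIncrementSize(
    D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);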

Note: With 3rd generation Intel® Core™ processors (code-name Ivy Bridge)/4th generation Intel® Core™ processor family (code-name Haswell) and DirectX 11 and the Windows Display Driver Model (WDDM) version 1.x, resources were dynamically mapped into memory based on the resources referenced in the command buffer with a page table mapping operation. This way copying data was avoided. The dynamic mapping was important because those architectures only offer 2 GB of memory to the GPU (Intel® Xeon® processor E3-1200 v4 product family (code-name Broadwell) offers more).
With DirectX 12 and WDDM version 2.x, it is no longer possible to remap resources into the GPU virtual address space as necessary, because resources have to be assigned a static virtual address when created and therefore the virtual address of resources cannot change after creation. Even if a resource is “evicted” from GPU memory, it maintains its virtual address for later when it is made resident again.
Therefore the overall available memory of 2 GB in Ivy Bridge/Haswell can become a limiting factor.

As stated in the previous article, a perfectly reasonable outcome for an application might be a combination of all types of bindings: root constants, root descriptors, descriptor tables for descriptors gathered on-the-fly as draw calls are issued, and dynamic indexing of large descriptor tables.

Different hardware architectures will show different performance trade-offs between using sets of root constants and root descriptors versus using descriptor tables. Therefore it might be necessary to tune the ratio between root parameters and descriptor tables depending on the hardware target platforms.

Expected Patterns of Change

To understand which kinds of change incur an additional cost, we have to analyze first how game engines typically change data, descriptors, descriptor tables, and root signatures.

Let’s start with what is called constant data. Most game engines store all constant data in “system memory.” The game engine changes the data in CPU-accessible memory; later in the frame, a whole block of constant data is copied/mapped into GPU memory and then read by the GPU through a constant buffer view or through a root descriptor.

If the constant data is provided through SetGraphicsRoot32BitConstants() as a root constant, the entry in the root signature does not change, but the data might. If it is provided through a CBV (a descriptor) and a descriptor table, the descriptor does not change, but the data might.

In case we need several constant buffer views—for example, for double or triple buffered rendering—the CBV or descriptor might change for each frame in the root signature.

For texture data, it is expected that the texture is allocated in GPU memory during startup. Then an SV (a descriptor) will be created, stored in a descriptor table or as a static sampler, and then referenced in the root signature. The data and the descriptor or static sampler do not change after that.

For dynamic data like changing texture or buffer data (for example, textures with rendered localized text, or buffers of animated vertices or procedurally generated meshes), we allocate a render target or buffer and provide an RTV or UAV (which are descriptors); these descriptors might not change from then on, but the data in the render target or buffer might change.

In case we need several render targets or buffers—for example, for double or triple buffered rendering—the descriptors might change for each frame in the root signature.

For the following discussion, a change is considered important for binding resources if it does the following:

  • Changes/replaces a descriptor in a descriptor table, for example, the CBVs, RTVs, or UAVs described above
  • Changes any entry in the root signature

Descriptors in Descriptor Tables with Haswell/Broadwell

On platforms based on Haswell/Broadwell, the cost of changing one descriptor table in the root signature is equivalent to changing all descriptor tables. Changing one argument means that the hardware has to make a copy (version) of all the current arguments. The number of root parameters in a root signature is the amount of data that the hardware has to version when any subset changes.

Note: All the other types of memory in DirectX 12, like descriptor heaps, buffer resources, and so on, are not versioned by hardware.

In other words, changing all of the parameters is roughly the same cost as just changing one (see [Lauritzen] and [MSDN]). Changing none is still the cheapest, but not that useful.

Note: Other hardware that has, for example, a split between fast and slow (spill) root argument storage only has to version the region of memory where the argument changed, either the fast area or the spill area.

On Haswell/Broadwell, an additional cost of changing descriptor tables can come from the limited size of the binding table in hardware.

Descriptor tables on those hardware platforms use “binding table” hardware. Each binding table entry is a single DWORD that can be considered an offset into the descriptor heap. The entries are stored in a 64 KB ring buffer, which can therefore hold 16,384 binding table entries.

In other words, the amount of binding table memory consumed per draw call depends on the total number of descriptors that are indexed in descriptor tables and referenced through the root signature.

If we run out of the 64 KB of memory for binding table entries, the driver will allocate another 64 KB binding table. The switch between those tables leads to a pipeline stall, as shown in Figure 2.

Pipeline stall (courtesy of Andrew Lauritzen)

Figure 2. Pipeline stall (courtesy of Andrew Lauritzen).

For example, if a root signature references 64 descriptors in a descriptor table, the stall will happen every 16,384 / 64 = 256 draw calls.

Because changing a root signature is considered cheap, having multiple root signatures with a low number of descriptors in their descriptor tables is favorable over having fewer root signatures with a larger number of descriptors in their descriptor tables.

Therefore it is favorable on Haswell/Broadwell to keep the number of descriptors referenced in descriptor tables as low as possible.
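A back-of-envelope helper makes this arithmetic explicit; it is only an estimate derived from the 64 KB ring and DWORD-per-entry figures above:

// Estimate how many draw calls fit into one 64 KB binding table ring on
// Haswell/Broadwell before the driver must switch to a new table and the
// pipeline stalls. Each descriptor referenced per draw costs one DWORD entry.
unsigned DrawsPerBindingTable(unsigned descriptorsPerDraw)
{
    const unsigned kBindingTableEntries = 64 * 1024 / 4; // 16,384 entries
    return kBindingTableEntries / descriptorsPerDraw;    // e.g. 64 -> 256
}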

What does this mean for renderer designs? Using more descriptor tables with fewer descriptors, and therefore more root signatures, should increase the number of pipeline state objects (PSOs), because of the one-to-one relationship between root signatures and PSOs.

Having more pipeline state objects might lead to a larger number of shaders that, in this case, might be more specialized, instead of longer shaders that offer a wider range of features, which is the common recommendation.
 

Root Constants/Descriptors on Haswell/Broadwell

Just as changing one descriptor table costs the same as changing all of them, changing one root constant or root descriptor is equivalent to changing all of them (see [Lauritzen]).

Root constants are implemented as “push constants”: a buffer that the hardware uses to prepopulate Execution Unit (EU) registers. Because the values are immediately available when the EU thread launches, it can be a performance win to store constant data as root constants instead of storing it in descriptor tables.

Root descriptors are implemented as “push constants” as well. They are just pointers passed as constants to the shader, reading data through the general memory path.

Descriptor Tables versus Root Constants/Descriptors on Haswell/Broadwell

Now that we have looked at the way descriptor tables, root constants, and root descriptors are implemented, we can answer the main question of this article: is one favorable over the other? Because of the limited size of the binding table hardware and the potential stalls from crossing that limit, changing root constants and root descriptors is expected to be cheaper on Haswell/Broadwell hardware because they do not use the binding table hardware. Root descriptors and root constants are especially recommended when the data changes every draw call.
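For example, per-draw constant data can be bound through a root descriptor by writing only a GPU virtual address, bypassing the binding table hardware entirely. A sketch, assuming a root signature whose parameter 0 is a root CBV and an upload buffer sliced into 256-byte pieces (both assumptions for illustration):

// Bind each draw's constants via a root CBV: one GPU virtual address per
// draw, no binding table entries consumed. Root CBV addresses must be
// 256-byte aligned.
D3D12_GPU_VIRTUAL_ADDRESS base = uploadBuffer->GetGPUVirtualAddress();
for (UINT draw = 0; draw < drawCount; ++draw)
{
    cmdList->SetGraphicsRootConstantBufferView(0, base + draw * 256);
    cmdList->DrawInstanced(vertexCountPerDraw, 1, 0, 0);
}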

Static Samplers on Haswell/Broadwell

As described in the previous article, it is possible to define samplers in the root signature or right in the shader with HLSL root signature language. These are called static samplers.

On Haswell/Broadwell hardware, the driver will place static samplers in the regular sampler heap. This is equivalent to putting them into descriptors manually. Other hardware implements samplers in shader registers, so static samplers can be compiled directly into the shader.

In general static samplers should be a win on many platforms, so there is no downside to using them. On Haswell/Broadwell hardware there is still the chance that by increasing the number of descriptors in a descriptor table, we end up more often with a pipeline stall, because descriptor table hardware has only 16,384 slots to offer.

Here is the syntax for a static sampler in HLSL:

StaticSampler( sReg,
               [ filter = FILTER_ANISOTROPIC,
               addressU = TEXTURE_ADDRESS_WRAP,
               addressV = TEXTURE_ADDRESS_WRAP,
               addressW = TEXTURE_ADDRESS_WRAP,
               mipLODBias = 0.f,     maxAnisotropy = 16,
               comparisonFunc = COMPARISON_LESS_EQUAL,
               borderColor = STATIC_BORDER_COLOR_OPAQUE_WHITE,
               minLOD = 0.f, maxLOD = 3.402823466e+38f,
               space = 0, visibility = SHADER_VISIBILITY_ALL ])

Most of the parameters are self-explanatory because they are similar to the C++-level usage. The main difference is the border color: the C++ level offers a full color range, while the HLSL level is restricted to opaque white/black and transparent black. An example of a static sampler is:

StaticSampler(s4, filter=FILTER_MIN_MAG_MIP_LINEAR)
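For comparison, a sampler with the same settings can be declared as a static sampler on the C++ level as part of the root signature description; this sketch mirrors the s4 example and the HLSL defaults above:

// C++-level static sampler baked into the root signature; note that the
// border color is an enum here too, not a full RGBA value.
D3D12_STATIC_SAMPLER_DESC sampler = {};
sampler.Filter = D3D12_FILTER_MIN_MAG_MIP_LINEAR;
sampler.AddressU = D3D12_TEXTURE_ADDRESS_MODE_WRAP;
sampler.AddressV = D3D12_TEXTURE_ADDRESS_MODE_WRAP;
sampler.AddressW = D3D12_TEXTURE_ADDRESS_MODE_WRAP;
sampler.MipLODBias = 0.f;
sampler.MaxAnisotropy = 16;
sampler.ComparisonFunc = D3D12_COMPARISON_FUNC_LESS_EQUAL;
sampler.BorderColor = D3D12_STATIC_BORDER_COLOR_OPAQUE_WHITE;
sampler.MinLOD = 0.f;
sampler.MaxLOD = D3D12_FLOAT32_MAX;
sampler.ShaderRegister = 4; // s4
sampler.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;
// The sampler is then passed in the NumStaticSamplers/pStaticSamplers
// fields of D3D12_ROOT_SIGNATURE_DESC.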

Skylake

Skylake allows dynamic indexing of the entire descriptor heap (~1 million resources) in one descriptor table. That means one descriptor table could be enough to index all the available descriptor heap memory.
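On the C++ level this can be expressed as a single descriptor table with an unbounded range, which the shader then indexes dynamically. A sketch; the register space and visibility are assumptions, and unbounded ranges require a resource binding tier that supports them:

// One table, one unbounded SRV range: UINT_MAX signals "unbounded", so the
// shader can dynamically index (potentially) the entire descriptor heap.
CD3DX12_DESCRIPTOR_RANGE range;
range.Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, UINT_MAX, 0); // t0..., unbounded

CD3DX12_ROOT_PARAMETER param;
param.InitAsDescriptorTable(1, &range, D3D12_SHADER_VISIBILITY_ALL);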

Compared to previous architectures, it is not necessary to change descriptor table entries in the root signature as often. That also means that the number of root signatures can be reduced. Obviously different materials will require different shaders and therefore different PSOs. But those PSOs can reference the same root signatures.

With modern rendering engines utilizing fewer shaders than their DirectX 9 and 11 ancestors, so that they can avoid the cost of changing shaders and the attached states, reducing the number of root signatures, and therefore the number of PSOs, is favorable and should result in a performance gain on any hardware platform.

Conclusion

Focusing on Haswell/Broadwell and Skylake, the recommendations for developing performant DirectX 12 applications depend on the underlying platform. While for Haswell/Broadwell the number of descriptors in a descriptor table should be kept low, for Skylake it is recommended to keep this number high and decrease the number of descriptor tables.

To achieve optimal performance, the application programmer can check during startup for the type of hardware and then pick the most efficient resource binding pattern. (There is a GPU detect example that shows how to detect different Intel® hardware architectures at https://software.intel.com/en-us/articles/gpu-detect-sample/.) The choice of resource binding pattern will influence how shaders for the system are written.
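A minimal sketch of such a startup check using DXGI follows; mapping the device ID to a specific architecture (Haswell, Broadwell, Skylake) is left to a lookup table such as the one in the GPU detect sample linked above:

#include <dxgi1_4.h>

// Returns true if the default adapter is an Intel GPU (vendor ID 0x8086).
bool IsIntelAdapter()
{
    IDXGIFactory1* factory = nullptr;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return false;

    IDXGIAdapter1* adapter = nullptr;
    bool isIntel = false;
    if (factory->EnumAdapters1(0, &adapter) != DXGI_ERROR_NOT_FOUND)
    {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);
        isIntel = (desc.VendorId == 0x8086);
        adapter->Release();
    }
    factory->Release();
    return isIntel;
}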

About the Author

Wolfgang is the CEO of Confetti. Confetti is a think-tank for advanced real-time graphics research and a service provider for the video game and movie industry. Before cofounding Confetti, Wolfgang worked as the lead graphics programmer in Rockstar's core technology group RAGE for more than four years. He is the founder and editor of the ShaderX and GPU Pro book series, a Microsoft MVP, the author of several books and articles on real-time rendering, and a regular contributor to websites and conferences worldwide. One of the books he edited, ShaderX4, won the Game Developer Front Line Award in 2006. Wolfgang serves on many advisory boards throughout the industry, among them Microsoft's Graphics Advisory Board for DirectX 12, and he is an active contributor to several future standards that drive the game industry. You can find him on Twitter at wolfgangengel. Confetti's website is www.conffx.com.

Acknowledgement

I would like to thank the reviewers of this article:

  • Andrew Lauritzen
  • Robin Green
  • Michal Valient
  • Dean Calver
  • Juul Joosten
  • Michal Drobot

References and Related Links

** Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

GIANTS Software Optimizes Farming Simulator* 15 with Intel® Graphics Performance Analyzers


Download [PDF 1.59 MB]

Advanced, physically based rendering technologies in games like GIANTS Software Farming Simulator 15 push the limits of technology. Intel® Graphics Performance Analyzers (Intel® GPA) play a significant role in finding bugs and optimizing the latest generation of physically based renderers.

Farming Simulator 15 is a realistic farming simulation in which users can take on all the challenges of farming life, including working with animals (cattle, chickens, and sheep), cultivating land, and harvesting crops. Players can grow their own farm in a huge, open world using more than 100 vehicles and tools.

Farming Simulator 15 is available for multiple platforms and provides multiplayer support over the Internet. The primary version supports Windows* and OS X*, Sony PlayStation*3 and PlayStation*4, as well as Xbox One* and Xbox 360*. In addition to the primary version, a mobile version is available that supports iOS* and Android* as well as Nintendo 3DS* and PlayStation Vita. GIANTS updates the primary version every two years and releases the mobile version in between.

Innovative features that GIANTS provides in Farming Simulator 15 include simulated tree cutting with an advanced mesh-cutting algorithm and the simulation of a trailer filling up with a heap.

Mesh Cutting

Cutting any type of mesh at any position, like in this game’s woodcutting simulation, is a technical challenge.

In Farming Simulator 15, mesh cutting consists of three steps:

  1. Cut the visual mesh (XYZs, UVs, normals), and then tessellate the cross-sections.
  2. Cut the collision skin.
  3. Compute mass properties for the pieces.

Visual Mesh

For fast retrieval of the cross-section and cut pieces of the visual mesh, GIANTS relies on a boundary representation (B-rep). The triangle-based B‑rep is loosely based on the Corner Table, as described by Jarek Rossignac, Alla Safonova, and Andrzej Szymczak in their paper “3D Compression Made Simple: Edgebreaker on a Corner Table” (see “For More Information” for a link). Winged- and half-edge B‑reps are less suitable because of their larger memory footprint and longer traversal times.
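The appeal of the corner table is its compactness: per triangle corner it stores only the incident vertex and the opposite corner, and all other adjacency is derived arithmetically. A minimal sketch of the structure (the naming is ours, not GIANTS'):

#include <vector>

// Corner-table B-rep after Rossignac et al.: corner c belongs to triangle
// c / 3; V maps corners to vertex indices, O maps corners to the opposite
// corner in the adjacent triangle (-1 on a boundary edge).
struct CornerTable {
    std::vector<int> V; // corner -> vertex index
    std::vector<int> O; // corner -> opposite corner, or -1

    static int Next(int c) { return (c % 3 == 2) ? c - 2 : c + 1; }
    static int Prev(int c) { return (c % 3 == 0) ? c + 2 : c - 1; }
    static int Triangle(int c) { return c / 3; }
};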

Cross-Sections

After cutting the visual mesh, the cross-sections are tessellated using the OpenGL* Utility Library (GLU). It's important that the cross-section texture be aligned so that growth rings do not appear to leave the bark. The alignment is computed from a principal component analysis of the vertices.

Collision Skin

The collision skin consists of convex polyhedra. Splitting a convex by a plane results in convex pieces.

Mass Properties

The mass and inertia of the cut pieces are computed from the collision skin. The engine uses the AABBs of the convex polyhedra rather than the actual polyhedra because mass properties computed from AABBs are cheaper to compute and still physically sound.
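What makes AABB mass properties cheap is that they reduce to the closed-form formulas for a solid box. A sketch under an assumed uniform density (an illustration, not GIANTS' actual code):

// Mass and diagonal inertia tensor of a solid axis-aligned box of uniform
// density; dx, dy, dz are the AABB extents. Standard closed-form result:
// Ixx = m/12 * (dy^2 + dz^2), with cyclic permutations for Iyy and Izz.
struct MassProperties { float mass, Ixx, Iyy, Izz; };

MassProperties BoxMassProperties(float dx, float dy, float dz, float density)
{
    MassProperties p;
    p.mass = density * dx * dy * dz;
    p.Ixx  = p.mass / 12.0f * (dy * dy + dz * dz);
    p.Iyy  = p.mass / 12.0f * (dx * dx + dz * dz);
    p.Izz  = p.mass / 12.0f * (dx * dx + dy * dy);
    return p;
}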

Mesh cutting is only one example of the advanced features of GIANTS Software's engine.

Development

GIANTS Software employs 16 people: eight artists and eight developers who have backgrounds in computer science and work experience at companies like Sony Computer Entertainment and NVIDIA. GIANTS operates with two eight-person teams in parallel. One team constantly works on Farming Simulator, delivering a primary version every two years and a mobile version every other year. When a game reaches the final development stages, both teams come together to finish off the release.

Farming Simulator 15 uses a new rendering approach called physically based rendering. This new renderer produces almost photorealistic output, notably for the numerous vehicles in the game, by simulating how light behaves on materials like metal, plastic, and glass. GIANTS has been developing the rendering engine for the past 10 years using C++ and Microsoft Visual Studio*. Farming Simulator 15 is the fifth generation of the game, and it provides the team with a great basis on which to add new features.

GIANTS uses the engine, however, for more than Farming Simulator. The company has two other games, as well: Demolition Company and Ski Region Simulator. Users can extend the game through Lua scripts, and there’s a big community of fans who provide their own mods. Although the game itself isn’t CPU bound, the user scripts can put a lot of load on the CPU, so here’s where it helps for players to have a computer with an Intel® Core™ i7 processor available to keep gameplay smooth.

Dedicated Server Version

Farming Simulator supports multiplayer gaming over the Internet. Players can host their own sessions or use a game server provider to host the game for them. The game server providers run Windows Server* instances so that they can host as many games in parallel as possible.

To enable game server providers to run as many instances of Farming Simulator 15 as possible in parallel, GIANTS Software has developed a dedicated server version. The server version is headless and optimized for dual-socket Intel® Xeon® processors with 16 cores and hyperthreading. This highly optimized server version enables game server providers to run up to 50 instances of Farming Simulator 15 in parallel, driving down costs and increasing the scalability of the offering dramatically.

Challenges

Supporting a multitude of platforms with their various proprietary APIs is a lot of work for the team. They support Microsoft DirectX* version 9 and later, GLU, and a host of console APIs for Xbox and Nintendo.

In Farming Simulator 15, Intel GPA helped the team find and fix a bug in the rendering engine in which, using DirectX 9, it produced black triangles. Using the Graphics Frame Analyzer for DirectX and the Graphics Monitor for DirectX workloads, the developers were able to identify the problem in the rendering code and fix it in no time—a crucial step toward a timely and high-quality release of Farming Simulator 15.

Looking Forward

In coming versions of Farming Simulator, GIANTS Software will focus on using Metal* on OS X and DirectX 11 on Windows. Metal provides low-overhead access to the GPU, enabling developers to maximize the graphics and compute potential of the gaming engine. Features like precompiled shaders and support for efficient multithreading will help GIANTS Software improve its engine. Focusing on DirectX 11 will simplify development on Windows; version 11 APIs add new features such as tessellation, compute shaders, and dynamic shader linking.

The engine currently uses deferred rendering, in which all the geometry is passed down the pipe first. Only after all geometry has been passed down is the final image rendered, with shading applied as a last step. The advantage of deferred rendering is that it reduces the object count, which in turn leads to a reduced fragment count. Deferred rendering performs the lighting calculations only on the pixels visible on the screen, using the resolution size instead of the total fragment count. Deferred rendering is beneficial when using many dynamic lights.

The benefits of deferred rendering come at a cost, though: it uses big buffers, which need a lot of bandwidth to process. Deferred rendering also lacks important features like transparent objects and anti-aliasing, so developers have to work around those limitations using edge detection and similar algorithms. To be able to use multiple materials, developers need to modify the deferred renderer as well.

Forward rendering works differently. The graphics card projects the objects and breaks them down into vertices. The vertices are then transformed and split into fragments, and the final rendering is done before the objects are passed to the screen. GIANTS Software is planning to switch to forward rendering for Farming Simulator 17.

About GIANTS Software GmbH

GIANTS Software GmbH is a Swiss video game development studio based in Zurich. Since 2004, GIANTS has produced many innovative games and technological products. In addition to developing its successful games Farming Simulator and Demolition Company, GIANTS offers its own game engine. For more information, visit www.giants-software.com.

For More Information

Jarek Rossignac, Alla Safonova, and Andrzej Szymczak, “3D Compression Made Simple: Edgebreaker on a Corner-Table,” http://www.cc.gatech.edu/~jarek/papers/CornerTableSMI.pdf.

Notices

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. Check with your system manufacturer or retailer or learn more at intel.com.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

Intel, the Intel logo, Intel Core, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© 2015 Intel Corporation.

Connect the Intel® Edison Board to IBM IoT Foundation


This article describes how to create a Bluemix application, register a device, set up triggers using the Node-RED flow editor, and visualize data using Rickshaw JS.

Create a Bluemix application with IoTF Starter boilerplate

   1.  Log in to the Bluemix console. Visit console.ng.bluemix.net and select LOG IN.

   2.  After logging in, you will be in the DASHBOARD view.

   3.  Click CREATE APP.


   4.  Select WEB.

   5.  Select the Browse Boilerplates option, then click BROWSE BOILERPLATES.

   6.  Select the Internet of Things Starter from the available boilerplates.


   7.  Name your app, then click CREATE.

   8.  Wait for your application to finish staging. A message indicating that "Your app is running" will be displayed when done.

   9.  Once your application has been created by IBM Cloud, from your application's dashboard, locate and click ADD A SERVICE OR API.


   10.  You can see a list of services available. Type Internet of Things in the search bar present at the top of the page.


   11.  Select the Internet of Things Service from the displayed list of services.


   12.  A window pops up with an auto-generated name. You can keep it or rename it as per your requirement.


   13.  Click CREATE.

   14.  After the service has been created, a popup window requesting to restage your application will appear. Click RESTAGE.


   15.  Wait for your application to finish staging. A message indicating that "Your app is running" will be displayed when done.


Get the device ID of your Intel® Edison device

  1. Create a folder named ibm-iotf.

  2. Change to the directory ibm-iotf.

  3. Install the MQTT node package on your board (for example, over an SSH session using PuTTY). Run:

    $npm install mqtt
  4. Install the getmac node package to get the MAC address of the Intel® Edison device:

    $npm install getmac

  5. Create file index.js.

    $vi index.js
  6. Open the index.js file and create references to the mqtt and getmac modules:

    var mqtt = require('mqtt');
    var getmac= require('getmac');
  7. Print the MAC address using the getMac function provided by the getmac module:
    getmac.getMac(function(err, macAddress) {
      if (err) throw err;
      var deviceId = macAddress.toString().replace(/:/g, '').toLowerCase();
      console.log("Device ID: " + deviceId);
    });

    Note: We are using the MAC address as the device ID.
  8. Save the file and run index.js.
    $node index.js
  9. Copy the device ID (MAC address) printed on the console.

Register your Intel® Edison device

You need to register your device in order to communicate with the Bluemix cloud.

   1.  Go to console.ng.bluemix.net and open the created application from left panel.
        From your application's dashboard, locate and click the Internet of Things service.


   2.  You will be redirected to the Internet of Things Foundation homepage.


   3.  Click Launch Dashboard. The dashboard opens in a new page.

   4.  Locate and click + Add a device.


   5.  A window pops up with Add Device as the title. Click Create device type to define a type for your Intel® Edison device.


   6.  Type edison for the name and click Next located at the bottom right of the page.


   7.  Click Next in the following pages until you see Create Device Type. Click Create.

   8.  You will be returned to the Add Device page. Click Next.

   9.  Copy the device ID (MAC address) you got in the previous step and paste it into the Device ID field.


   10.  Click Next until you see Add Device. Click Add.

   11.  Save your device credentials into a file named config.json; you will use them to send data to the cloud.


Send Data to IBM Cloud

   1.  Make sure you have the file config.json in the folder ibm-iotf along with the index.js file. It should be in the following format, with the values you got in the previous step:


org - organization name

port - port number (1883 fixed)

type - device type

id - device id

auth-method - Authentication type

password - Authentication token

username - Username (use-token-auth)

   2.  Open the index.js file and load the saved configuration (this require line is needed so the config values used below are available):

var config = require('./config.json');

   3.  Create a variable host to store the hostname.

var host = config.org + ".messaging.internetofthings.ibmcloud.com";

   4.  Create a variable clientId to store the client ID which is a combination of organization, device type and device ID.

var clientId = "d:" + config.org + ":" + config.type + ":" + config.id;

   5.  Create a variable topic_pub to store the topic to publish the data.

var topic_pub = "iot-2/evt/status/fmt/json";

where status is the event type and json is the data format.

   6.  Create a variable client by calling the mqtt API mqtt.connect with the required parameters.

var client = mqtt.connect(
 {
   host: host,
   port: config.port,
   username: config.username,
   password : config.password,
   clientId : clientId
 });

   7.  Create a function sendMessage to publish data to IBM Cloud.

function sendMessage(){
   // Generates a random value; you can use an actual sensor value instead.
   var value = Math.floor((Math.random() * 30) + 60);
   var message = {"d": {"value": value}};
   client.publish(topic_pub, JSON.stringify(message));
}

   8.  Call the sendMessage function periodically using the JavaScript built-in function setInterval once the connection to the IBM Cloud is established.

client.on('connect', function () {
   console.log('Connected to IBM');
   setInterval(sendMessage, 1000);
 });

   9.  Run the application.

$node index.js

   10.  Go to console.ng.bluemix.net and launch the Internet of Things dashboard. See the Register your Intel® Edison device section for how to launch the dashboard.

   11.  On the dashboard, go to DEVICES tab and click on your device.


   12.  A popup window opens in which you can view the messages being received as events.


 

Create API Key

IBM provides an interface for responding to real-time events or data using the Node-RED flow editor. To implement a Node-RED application that processes real-time events from devices in your IoT organization, you need to generate an API key to access the data generated by the device.

   1.  From your IoT organization dashboard, go to the ACCESS tab and click API Keys.


   2.  Click Generate API Key.


   3.  Copy and save the access key, which will be used in the Node-RED flow editor to access the data.


Note: You can view the key only once, but you can create multiple keys.

 

Create Triggers

   1.  From your application's dashboard, locate and click the route.


   2.  Go to the Node-RED Bluemix flow editor by clicking the red button.


   3.  Double-click on IBM IoT App In and change Authentication to API Key.


   4.  Click the edit icon next to the API Key.


   5.  In the popup window, enter Name as API. Enter API Key and Token obtained from the Generate API Key section. Click Add.


   6.  Enter values for Device Type, Device Id, and Format, leaving the Event field unchanged. Click OK.


   8.  Double-click the temp threshold node and change the value from 40 to 70.

   9.  Drag and drop the ibmiot output node from the left pane.


   10.  Double-click the output node. In the edit popup window, select the same API key used for the input node, set Output Type to Device Command, and fill in the rest of the fields. Click OK.


   11.  Now connect this ibmiot output node to the danger template.


   12.  Double-click the danger template, change it to JSON, and update the template.


   13.  Click Deploy on the top-right corner.

   14.  If there are no errors, you can select the device data or cpu status nodes to view the messages in the debug console.

Subscribe to triggers

  1. Open index.js file and create a variable topic_sub to store the topic to subscribe to.

    var topic_sub = "iot-2/cmd/trigger/fmt/json";
  2. Subscribe to the topic upon connecting to IBM. Add the subscribe call inside the on('connect') handler.

    client.on('connect', function () {
       client.subscribe(topic_sub);
       console.log('Connected to IBM');
       setInterval(sendMessage, 1000);
    });
  3. On receiving a message, log it to the console.

    client.on('message', function(topic, message) {
         console.log(JSON.parse(message));
    });


Diagnostic xxxxx: "ebx in __asm and stk frame alignment undone"


The diagnostic message "ebx in __asm and stk frame alignment undone" is emitted by the Intel® C++ Compiler.

Cause: 

This diagnostic message relates to the fact that inlining is normally disabled when the callee has an aligned stack frame and the caller uses ebx in __asm insertions. Similarly, inlining is disabled when the callee uses ebx in __asm insertions and the caller has an aligned frame.

However, if the callee is marked with "forceinline" and the alignment is not required for correctness but is merely desirable, then forceinline will override the desired alignment, and the diagnostic is emitted. Alignment is required if the rdecl stack alignment is greater than or equal to 16 bytes.
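As a hypothetical illustration of the pattern the diagnostic describes (a 32-bit target is assumed, since __asm insertions are not available in 64-bit mode; this example is ours, not from the compiler documentation):

#include <xmmintrin.h>

// The callee wants a 16-byte-aligned stack frame for its SSE local.
__forceinline void callee()
{
    __declspec(align(16)) float data[4];
    _mm_store_ps(data, _mm_setzero_ps());
}

void caller()
{
    // The caller uses ebx in an __asm insertion; combined with the callee's
    // aligned frame, inlining would normally be disabled. With __forceinline
    // the compiler inlines anyway and gives up the desirable (but not
    // strictly required) alignment, which is what the diagnostic reports.
    __asm { xor ebx, ebx }
    callee();
}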

“That’s the Beauty of Unity*!” Intel® x86 and Unity Contest Challenge Winners Share Their Experiences


Download PDF

As game developers, you’re always looking for a way to reach wider audiences and drive greater performance for your games. Here at Intel, we’re always looking for ways to support you in achieving those goals. We recently spoke about these topics with some of the game developers who won our joint contest with Unity Technologies. They had some interesting insight to share on how easy it is to add support for the Android* platform in Unity, the new access they gained to important markets as a result, and the performance gains they enjoyed in their games as well. Some of the developers said that they plan to include x86 support in their games going forward. Overall, participating in the contest was a positive and motivating experience, as you’ll read here.

Adding x86 Support to the Android Platform in Unity: A Piece of Cake

Racing through a Tron-like world

Figure 1: Racing through a Tron-like world.

Participating in the contest was a piece of cake. James Carmichael of Fingerbait, whose title HARDKOUR – Parkour Runner Free* was a winner, said, “In all honesty, it took me zero effort to make it x86 ready. I started Unity, built the updated package, and it was ready to roll. It might have been a different story if I were building the projects manually myself, but because of the build process within Unity, I didn’t need to think about it too long. That’s the beauty of Unity!”

Michael Bowen of Bowen Games LLC said it was effortless to add x86 support to his game Farming USA*. “I use the Unity game engine to develop my games,” he explained, “And so it was easy to port my game to x86 Android devices. I was able to use the beta builds of Unity releases that included x86 support and get my already-published Android games working almost immediately!”

A Nice Boost: Performance Gains in x86 for Android

Dodging lethal aliens and perilous traps

Figure 2: Dodging lethal aliens and perilous traps.

After the winners’ games were made available for x86, several developers noticed a boost in performance. Andrea Sancio, whose game Gear Jack Black Hole* was among the winners, said, “We saw noticeable performance improvements on our internal Android Intel test devices, and the live analytics did show improvements in average [frames per second].” Bill Yeung of Element Cell Game Limited, whose Shepherd Saga* also won, noted that since his company started offering an x86 version, he’s seen far fewer crash reports and complaints about game performance than before.

So, you might ask, do the winning game developers have any special tips and tricks to share about how best to add x86 Android support for games in Unity? Many of them said the process was so easy that anyone could do it. “Honestly, there is no secret to share. Unity did it all,” said Alex Nabrozidis of Gibs and Gore, whose Micronytes Director’s Cut* was honored. Arian Sohn of SOGWARE, whose game MOB: The Prologue* took home a prize, offered a pointer for his fellow developers. He suggested making sure to run “enough execution tests (on both x86 and ARM* devices) before release.”

Winning the Contest: a Great Experience

Surveying crops and developing the farm

Figure 3: Surveying crops and developing the farm.

The contest winners were thrilled to have their games recognized. “It was awesome to be a contest winner,” reported Michael Bowen, “as I am the owner and only dev for my own new game dev business. It gives me encouragement to continue to work hard making games and expanding into all device and hardware options.”

Bill Yeung observed that winning the contest helped his company reach its local market: “We were glad to see the support for developing x86 Android games. We found that a significant number of players in our local market (Hong Kong and Taiwan) were using x86 devices. Having native x86 support greatly improved the performance and stability on those devices.”

x86 Support for the Android Platform: A Natural Fit Going Forward

Caring for adorable sheep

Figure 4: Caring for adorable sheep.

Several of the developers we spoke with plan to include x86 support for the Android platform going forward. Andrea Sancio summed up his decision succinctly: “Adding support was painless and fast, my players saw great performance improvements, so why not?” Added James Carmichael, “I’m actually going to concentrate on Android releases as my primary platform.” He is currently developing a helicopter game for Android, so stay tuned! Element Cell Game Limited has already released two new games with x86 support included—Shepherd Saga 2 and Dragoon Maiden*. Keep an eye out for its upcoming title, Clash of Heavens*, which is also expected to include x86 support.

The verdict is in. It’s incredibly easy to include x86 support for the Android platform in Unity, and you can even see some great performance gains as a result. Have you tried it yet? If not, you can get started in no time with our article on how to produce a fat APK that includes both x86 and ARM libraries. Let us know how it goes!

How Intugine Integrated the Nimble* Gesture Recognition Platform with Intel® RealSense™ Technology


Shwetha Doss, Senior Application Engineer, Intel Corporation

Harshit Shrivastava, Founder and CEO, Intugine Technologies

Abstract

Intel® RealSense™ technology helps developers enable a natural user interface (NUI) for their gesture recognition platforms. The gesture recognition platform seamlessly integrates with Intel RealSense technology for NUI across segments of applications on Microsoft Windows* platforms. The gesture recognition platform handles all interactions with the user and the Intel® RealSense™ SDK, ensuring that no code changes are required for individual applications.

This paper highlights how Intugine (http://www.intugine.com/) enabled its gesture recognition platforms for Intel® RealSense™ technology. It also discusses how the same methodology can be applied to other applications related to games and productivity applications.

Introduction

Intel® RealSense™ technology adds “human-like” senses to computing devices. Intel® is working with OEMs to create future computing devices that will be able to hear, see, and feel the environment, as well as understand human emotion and a human’s sensitivity to context. These devices will interact with humans in immersive, natural, and intuitive ways.

Intel® RealSense™ technology understands four important modes of communication: hands, the face, speech, and the environment around you. This multi-modal processing will enable the devices to behave more like humans.

The Intel® RealSense™ Camera

The Intel® RealSense™ camera uses depth-sensing technology so that computing devices see more like you do. To harness the possibilities of Intel® RealSense™ technology, developers need to use the Intel® RealSense™ SDK along with the Intel® RealSense™ camera. There are two camera options: the F200 and the R200. These Intel-developed depth cameras support full VGA depth resolution and full 1080p RGB resolution, and require USB 3.0. Both cameras support depth and IR processing at 640×480 resolution at 60 frames per second (FPS).

There are many OEM devices with integrated Intel® RealSense™ cameras available, including Ultrabooks*, tablets, notebooks, 2 in1s, and all-in-one form factors.


Figure 1. Intel® RealSense™ cameras.

The Intel® RealSense™ camera (F200)

Figure 2. The Intel® RealSense™ camera (F200).

The infrared (IR) laser projector on the Intel RealSense camera (F200) sends non-visible patterns (coded light) onto the object. The IR camera captures the reflected patterns. These patterns are processed by the ASIC, which assigns depth values to each pixel to create a depth video frame.

Applications see both depth and color video streams. The ASIC syncs the depth stream with the color stream (texture mapping) using a UVC time stamp and generates data flags for each depth value (valid, invalid, or motion detected). The range of the F200 camera is about 120 cm.

The Intel® RealSense™ camera (R200)

Figure 3. The Intel® RealSense™ camera (R200).

The R200 camera actually contains three cameras, providing RGB (color) and stereoscopic IR to produce depth. With the help of a laser projector, the camera does 3D scanning for scene perception and enhanced photography. The indoor range is approximately 0.5–3.5 meters, and the outdoor range is up to 10 meters.

Intel® RealSense™ SDK

The Intel® RealSense™ SDK includes a set of pattern detection and recognition algorithm implementations exposed through standardized interfaces. These algorithm implementations let application developers focus on innovating with the algorithms rather than coding their details.

Intel® RealSense™ SDK Architecture

The SDK library architecture consists of several components. The essence of the SDK functionality lies in the I/O modules and the algorithm modules. The I/O modules retrieve input from the input device or send output to an output device.

The algorithm module includes various pattern detection and recognition algorithms related to face recognition, gesture recognition, and speech recognition.

The Intel® RealSense™ SDK architecture

Figure 4. The Intel® RealSense™ SDK architecture.

The Intel® RealSense™ SDK

Figure 5. The Intel® RealSense™ SDK provides 78-point face landmarks.

The Intel® RealSense™ SDK provides skeletal tracking

Figure 6. The Intel® RealSense™ SDK provides skeletal tracking.

Intugine Nimble*

Intugine Nimble* is a high-accuracy, motion-sensing wearable device. The setup consists of a USB sensor and two wearable devices: a ring and a finger clip. The sensor tracks the movement of the rings in 3D space with sub-millimeter accuracy and low latency. The device is based on computer vision: the rings emit a specific pattern in a narrow wavelength band, and the sensor is filtered to see only that wavelength. The software algorithm on the host device recognizes the emitted pattern and tracks the rings individually, generating coordinates at a high frame rate of over 60 coordinates per second for each ring.

The Intugine Nimble

Figure 7. The Intugine Nimble* effectively replaces the mouse and keyboard.


Applications With Nimble

Some of the available applications that Nimble can control are games such as Fruit Ninja*, Angry Birds*, and Counter-Strike* and utility applications such as Microsoft PowerPoint* and media players. These applications are currently controlled by mouse and keyboard inputs. To control them with Nimble, we need to generate the keyboard and mouse events programmatically.

The software module that takes care of the keyboard and mouse events is called the interaction layer. Nimble uses a proprietary software interaction layer to interact with existing games and applications. The interaction layer maps the user’s fingertip coordinates to the application/OS recognizable mouse and keyboard events.
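Conceptually, such an interaction layer translates normalized fingertip coordinates into synthetic OS input events. A minimal Win32 sketch of the idea (ours, not Intugine's actual implementation):

#include <windows.h>

// Move the mouse cursor to a fingertip position given in normalized [0,1]
// coordinates; SendInput absolute coordinates span 0..65535 on each axis.
void MoveCursorToFingertip(float nx, float ny)
{
    INPUT input = {};
    input.type = INPUT_MOUSE;
    input.mi.dx = static_cast<LONG>(nx * 65535.0f);
    input.mi.dy = static_cast<LONG>(ny * 65535.0f);
    input.mi.dwFlags = MOUSEEVENTF_MOVE | MOUSEEVENTF_ABSOLUTE;
    SendInput(1, &input, sizeof(INPUT));
}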

Nimble with the Intel® RealSense™ SDK

The Intel® RealSense™ SDK can detect IR emissions at 860 nm. The patterned emission of the Nimble rings can be customized to a certain wavelength range. By replacing the emission source in the ring with an 860 nm emitter, the ring emits similar patterns in the 860 nm range. The Intel® RealSense™ SDK can sense these emissions, which can be captured as an image stream and then tracked using the SDK. By implementing Nimble pattern recognition and tracking algorithms on top of the Intel® RealSense™ SDK, we get the coordinates of individual rings at 60 FPS.

The Intel® RealSense™ SDK’s design corrects most lens and curvature defects, which allows better-scaled motion tracking of the Nimble rings. The IR resolution of 640×480 generates refined spatial coordinate information. The Intel® RealSense™ SDK supports up to 300 FPS in the IR stream, which provides almost zero latency in Nimble’s tracking and an extremely responsive experience.

Nimble technology is designed to track only the emissions of rings and thus misses the details of skeletal tracking that might be required for a few applications.

The Intugine Nimble

Figure 8. The Intugine Nimble* along with Intel® RealSense™ technology.

Value proposition for Intel® RealSense™ Technology

Nimble along with Intel® RealSense™ technology can support a wide range of existing applications. Currently over 100 applications work seamlessly without needing any source-code modifications, and potentially most Microsoft Windows* and Android* applications can work with this solution.

Currently the Intel® RealSense™ camera (F200) supports a range of 120 cm. With the addition of Nimble, this range can extend to over 15 feet.

Nimble allows sub-millimeter accurate finger tracking within a range of 3 feet and sub-centimeter accurate tracking within a range of 15 feet. This enables many high-accuracy games and applications to be used with better control.

Nimble along with Intel® RealSense™ technology reduces the application latency to less than 5 milliseconds.

Nimble along with Intel® RealSense™ technology can support multiple rings together; we have tested up to eight rings with Intel® RealSense™ technology.

Summary

Nimble’s interaction layer along with Intel® RealSense™ technology can help add gesture support to any application without any changes to the source code. Using this technology, applications on Windows* and Android* platforms can add gesture support with minimal effort.

For More Information

  1. Intel® RealSense™ technology: http://www.intel.in/content/www/in/en/architecture-and-technology/realsense-overview.html
  2. Intugine: http://www.intugine.com/
  3. https://software.intel.com/en-us/articles/realsense-r200-camera

Optimizing for Intel® Xeon Processors


Optimization Steps

The key to performance measurement is twofold: know exactly what you are measuring, and collect your baseline data. Next, profile your application and identify a specific and realistic performance goal based on the profiling data. Follow these steps to optimize your software.

Vectorization Toolkit

Fundamental Concepts

The Intel Compilers provide a number of features for generating vectorized code. Auto-vectorization is the method used by the Intel Compilers to generate vectorized code for a given application without requiring code changes. Developers can also implement simple coding changes in the source code to enforce vectorization behavior. 
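A minimal sketch of both approaches on a loop the compiler can vectorize; the function is illustrative only (compile with the Intel compiler's -qopenmp-simd or /Qopenmp-simd option so that the pragma is honored):

#include <cstddef>

void saxpy(float a, const float *x, float *__restrict y, std::size_t n)
{
    // No loop-carried dependencies and unit-stride accesses: a candidate
    // for auto-vectorization. The pragma makes the request explicit.
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}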

Intel Compiler Auto-vectorization (C++ | Fortran)

Vectorization Essentials

Common Vectorization Tips

Performance Essentials with OpenMP 4.0 Vectorization
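
As a minimal illustration of an auto-vectorizable loop (the file name and compile flags in the comment are hypothetical), the restrict qualifiers below assert that the arrays do not overlap, which removes a dependence that would otherwise block vectorization:

/* saxpy.c -- a loop the Intel compiler can typically auto-vectorize,
 * e.g.: icc -O2 -c saxpy.c
 * The restrict qualifiers promise the compiler that x and y do not alias. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];   /* independent iterations map to SIMD lanes */
}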

Intermediate Techniques

Proven techniques for code optimization and change recommendations are listed here. Note that these recommendations depend entirely upon the application. 

Fortran Array Data and Arguments and Vectorization

Data Alignment to Assist Vectorization

Program Optimization through Loop Vectorization

Outer Loop Vectorization

Random Number Function Vectorization

Optimization Reports

Code changes may be required in order to facilitate vectorization even further. Once a developer has made changes to the code, how does one verify that the changes elicit the expected behavior? Use the compiler optimization reports to guide source code changes and to confirm that the code does indeed vectorize; a brief example of requesting a report follows the links below.

Vectorization and Optimization Reports

Overview of Vectorization Reports and the -vec-report6 Option
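
For example (a hypothetical invocation; the source file name is arbitrary), the report option discussed above can be added to a compile line as:

$> icc -c -vec-report6 saxpy.c

This emits, for each loop, whether it was vectorized and, if not, which condition or dependence prevented it.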

Advanced Methods

The techniques offering the most control require greater application knowledge and skill in knowing where they should be applied. But these more intensive techniques, such as intrinsics, can result in greater performance when properly used.

Getting Started with Intel® Cilk™ Plus SIMD Vectorization and SIMD-enabled Functions

Outer Loop Vectorization via Intel Cilk Plus Array Notations

Intel Intrinsics Guide

References

Intel® Fortran Vectorization Diagnostics

Vectorization Diagnostics for Intel(R) C++ Compiler


How to import perf data in VTune amplifier XE



Intel® VTune™ Amplifier XE supports importing *.perf files. (You can also use the import action to import VTune Amplifier *.tb5/*.tb6/*.perf/*.csv/*.sw1/*.sww1/*.ww1 data collection files and convert them into a result.)

Please notice the prerequisites for importing a *.perf file.
Run the Perf collection with the predefined command line options:
• For application analysis:
          >perf record -o <trace_file_name>.perf --call-graph dwarf -e cpu-cycles,instructions <application_to_launch>
• For process analysis:
          >perf record -o <trace_file_name>.perf --call-graph dwarf -e cpu-cycles,instructions -p <PID> sleep 15
 
     where the -e option specifies the list of events to collect, and the optional --call-graph option configures samples to be collected together with the thread call stack at the moment a sample is taken.
     See Linux Perf documentation on possible call stack collection options (for example, dwarf) and its availability in different OS kernel versions.

You can collect performance data remotely with the Intel® VTune™ Amplifier collectors (for example, SEP collector or Intel SoC Watch collector) or Linux* Perf* collector, import this data to the VTune Amplifier project, and view the data in the graphical or command line interface.

More details are in the "Importing Results in the VTune Amplifier GUI" and "Importing Results from the Command Line" articles of the VTune Amplifier help.
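
As a sketch (the result directory name r001 is arbitrary), a command-line import might look like:

$> amplxe-cl -import <trace_file_name>.perf -result-dir r001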

 

3D people full-body scanning system with Intel® RealSense™ 3D cameras and Intel® Edison: How we did it


By Konstantin Popov of Cappasity

Cappasity has been developing 3D scanning technologies for two years. This year we are going to release a scanning software product for Ultrabook™ devices and tablets with Intel® RealSense™ cameras: Cappasity Easy 3D Scan*. Next year we plan to create hardware and software solutions to scan people and objects. 
 
As an Intel Software Innovator, and with the help of the Intel team, we were invited to show the prototype of the people scanning system much earlier than planned. We had limited time for preparations, but we decided to take on the challenge anyway. In this article I'll explain how we created our demo for the Intel Developer Forum 2015, held August 18–20 in San Francisco.

Cappasity instant 3D body scan

Our demo is based upon previously developed technology that combines the multiple depth cameras and the RGB cameras into a single scanning system (U.S. Patent Pending). The general concept is as follows: we calibrate the positions, angles, and optical properties of the cameras. This calibration allows us to merge the data for subsequent reconstruction of the 3D model. To capture the scene in 3D we can place the cameras around the scene, rotate the camera system around the scene, or rotate the scene itself in front of the cameras.
 
We selected the Intel RealSense camera because we believe that it's an optimum value-for-money solution for our B2B projects. At present we are developing two prototype systems using several Intel RealSense cameras: a scanning box with several 3D cameras for instant scanning and a system for full-body people scanning.
 
We demonstrated both prototypes at IDF 2015. The people scanning prototype operated with great success for the three days of the conference, scanning many visitors who came to our booth.

A system for full-body people scanning

Now let's see how it works. We attached three Intel RealSense cameras to a vertical bar so that the bottom camera is aimed at the feet and lower legs, the middle camera captures the legs and the body, and the top-most camera films the head and the shoulders.

Three Intel RealSense cameras attached to a vertical bar

Each camera is connected to a separate Intel® NUC computer, and all the computers are connected to the local area network.
 
Since the cameras are mounted onto a fixed bar, we used a rotating table to rotate the person being filmed. The table construction is quite basic: a PLEXIGLAS® pad, roller bearings, and a stepper motor. The table is connected to the PC via an Intel® Edison board; it receives commands through the USB port.

The table is connected to the PC via an Intel® Edison board

a simple lighting system to steadily illuminate the front

We also used a simple lighting system to steadily illuminate the front of the person being filmed. In the future, all these components will be built into a single box, but since we were demonstrating an early prototype of the scanning system, we had to assemble everything from commercially available components.

Cappasity fullbody scan

Our software operates based on the client-server architecture, but the server part can be run on almost any modern PC. That is, any computer that performs our calculations is a "server" in our system. We often use an ordinary Ultrabook with Intel® HD Graphics as a server. The server sends the recording command to the Intel NUC computers, gets the data from them, then analyzes and rebuilds the 3D model. 
 
Now, let's look at some particular aspects of the task we are trying to solve. The 3D rebuilding technology that we use in the Cappasity products is based upon our implementation of the Kinect* Fusion algorithm. But in this case our challenge was much more complex: we had only one month to create an algorithm to reconstruct the data from several sources. We called it "Multi-Fusion." In its present state the algorithm can merge the data from an unlimited number of sources into a single voxel volume. For scanning people three data sources were enough.
 
Calibration is the first stage. The Cappasity software allows the devices to be calibrated pairwise. Our studies from the year we spent in R&D came in pretty handy in preparation for IDF 2015. In just a couple of weeks we reworked the calibration procedure and implemented support for voxel volumes after Fusion. Previously the calibration process was more involved with processing the point cloud. The system needs to be calibrated just once, after the cameras are installed. Calibration takes no more than 5 minutes.
 
Then we had to come up with a data-processing approach, and after doing some research we chose post-processing. That is, first we record the data from all cameras, then we upload the data to the server via the network, and then we begin the reconstruction process. All cameras record color and depth streams. As a result, we have the complete data set for further processing. This is convenient because the post-processing algorithms are constantly being improved; the ones we used were written just a couple of days before IDF.
 
Compared to the Intel RealSense camera (F200), the long-range Intel RealSense camera (R200) performs better with black colors and complex materials, and we saw few glitches in tracking. The most important thing, however, is that the cameras allow us to capture images at the required range. We have optimized the Fusion reconstruction algorithm for OpenCL™ to achieve good performance even on Intel HD Graphics 5500 and later. To remove noise, we used Fusion plus additional data segmentation after a single mesh was composed.

Fusion plus additional data segmentation after a single mesh was composed

High resolution texture mapping algorithm

In addition, we have refined the high-resolution texture mapping algorithm. We use the following approach: we capture the image at the full resolution of the color camera, and then we project the image onto the mesh. We are not using voxel color since it causes the texture quality to degrade. The projection method is quite complex to implement, but it allows us to use both built-in and external cameras as color sources. For example, the scanning box we are developing operates using DSLR cameras to get high-resolution textures, which is important for our e-commerce customers.
 
However, even the built-in Intel RealSense cameras with RGB provide perfect colors. Here is a sample after mapping the textures:

Sample after mapping the textures

We are developing a new algorithm to eradicate the texture shifting. We plan to have it ready by the release of our Easy 3D Scan software product. 
 
Our seemingly simple demo is based upon complex code that allows us to compete with expensive scanning systems in the USD 100K+ price range. The Intel RealSense cameras are budget-friendly, which will help them revolutionize the B2B market.
 
Here are the advantages of our people scanning system:

  • It is an affordable solution, and it’s easy to set up and operate. Only a press of a button is needed.
  • Small size: the scanning system can be placed in retail areas, recreational centers, medical institutions, casinos, and so on.
  • The quality of the 3D models is suitable for 3D printing and for developing content for AR/VR applications.
  • The precision of the resulting 3D mesh is suitable for taking measurements.

 
We understand that the full potential of the Intel RealSense cameras is yet to be uncovered. We are confident that at CES 2016 we'll be able to demonstrate significantly improved products.

Intel® C++ Compiler for Windows* - Fatal link error LNK1104 when using Intel® C++ Compiler with Boost* libraries


When building an application that uses the Boost libraries with the Intel® C++ Compiler, you may get linker errors like the ones shown below due to incorrect libraries being linked to the application:

fatal error LNK1104: cannot open file 'libboost_thread-iw-mt-1_33_1.lib'
fatal error LNK1104: cannot open file 'libboost_thread-iw-1_33_1.lib'
...

The root cause is missing Boost libraries for the Intel® C++ Compiler.

The preferred solution is to recompile all required Boost libraries with the Intel® C++ Compiler (this creates libraries with the "iw" infix). However, this is not mandatory. The libraries provided for the different Microsoft Visual Studio* versions are safe to use as well. Perform the following steps to use them instead:

  1. Open the Boost configuration file "auto_link.hpp".
  2. Search for the following section:

    #elif defined(__ICL)
       // Intel C++, no version number:
    #  define BOOST_LIB_TOOLSET "iw"
  3. Change "iw" depending on which Microsoft Visual Studio version you're using:

    "vc71": Microsoft Visual Studio .NET 2003
    "vc80": Microsoft Visual Studio 2005
    "vc90": Microsoft Visual Studio 2008
    "vc100": Microsoft Visual Studio 2010
    "vc110": Microsoft Visual Studio 2012
    "vc120": Microsoft Visual Studio 2013
    "vc140": Microsoft Visual Studio 2015

  4. Rebuild your application to resolve the linker errors.
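
For example, to use the Boost libraries built for Microsoft Visual Studio 2013, the modified section of auto_link.hpp would read as follows (a sketch of the edit described above):

    #elif defined(__ICL)
       // Intel C++: link against the Visual Studio 2013 Boost libraries
    #  define BOOST_LIB_TOOLSET "vc120"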
 
 

Intel® C++ Composer XE 2013 SP1 for Linux*, Update 6


Intel® C++ Composer XE 2013 SP1 Update 6 includes the latest Intel C/C++ compilers and performance libraries for IA-32, Intel® 64, and Intel® Many Integrated Core (Intel® MIC) architecture systems. This new product release now includes: Intel® C++ Compiler XE Version 14.0 Update 6, GNU* Project Debugger (GDB*) 7.5, Intel® Debugger 13.0, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Integrated Performance Primitives (Intel® IPP) Version 8.1 Update 1, Intel® Threading Building Blocks (Intel® TBB) Version 4.2 Update 5.

New in this release:

Note:

  1. For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File:  l_ccompxe_online_2013_sp1.6.214.sh
Online installer

File:  l_ccompxe_2013_sp1.6.214.tgz
Product for developing 32-bit and 64-bit applications

File:  l_ccompxe_2013_sp1.6.214_redist.tgz
Redistributable Libraries

File:  get-ipp-8.1-crypto-library.htm
Cryptography Library

Intel® Fortran Composer XE 2013 SP1 for Linux*, Update 6


Intel® Fortran Composer XE 2013 SP1 Update 6 includes the latest Intel Fortran compilers and performance libraries for IA-32, Intel® 64, and Intel® Many Integrated Core (Intel® MIC) architecture systems. This new product release now includes: Intel® Fortran Compiler XE Version 14.0.6, Intel® Debugger Version 13.0.0, GNU* Project Debugger (GDB*) 7.5, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4

New in this release:

Note:

  1. For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

     

Resources

Contents
File:  l_fcompxe_online_2013_sp1.6.214.sh
Online installer

File:  l_fcompxe_2013_sp1.6.214.tgz 
Product for developing 32-bit and 64-bit applications

File:  l_fcompxe_2013_sp1.6.214_redist.tgz
Redistributable Libraries

Intel® C++ Composer XE 2013 SP1 for Windows*, Update 6


Intel® C++ Composer XE 2013 SP1 Update 6 includes the latest Intel C/C++ compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® C++ Compiler XE Version 14.0.6, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Integrated Performance Primitives (Intel® IPP) Version 8.1 Update 1, Intel® Threading Building Blocks (Intel® TBB) Version 4.2 Update 5, Intel(R) Debugger Extension 7.5-1.0 for Intel(R) Many Integrated Core Architecture.

New in this release:

Note:  For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File: w_ccompxe_online_2013_sp1.6.241.exe
Online installer

File: w_ccompxe_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications

File:  w_ccompxe_redist_msi_2013_sp1.6.241.zip
Redistributable Libraries for 32-bit and 64-bit msi files

File:  get-ipp-8.1-crypto-library.htm
Cryptography Library

Intel® Visual Fortran Composer XE 2013 SP1 for Windows*, Update 6


Intel® Visual Fortran Composer XE 2013 SP1 Update 6 includes the latest Intel Fortran compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® Visual Fortran Compiler XE Version 14.0.6, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Debugger Extension 7.5-1.0 for Intel® Many Integrated Core Architecture (Intel® MIC Architecture)

New in this release:

Note:  For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File:  w_fcompxe_novsshell_online_2013_sp1.6.241.exe
Online installer (for customers who have Microsoft Visual Studio* already installed)

File:  w_fcompxe_novsshell_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (for customers who have Microsoft Visual Studio* already installed)

File:  w_fcompxe_redist_msi_2013_sp1.6.241.zip 
Redistributable Libraries for 32-bit and 64-bit msi files


Intel® Visual Fortran Composer XE 2013 SP1 for Windows* with Microsoft Visual Studio 2010 Shell & Libraries*, Update 6


Intel® Visual Fortran Composer XE 2013 SP1 Update 6 includes the latest Intel Fortran compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® Visual Fortran Compiler XE Version 14.0.6, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Debugger Extension 7.5-1.0 for Intel® Many Integrated Core Architecture (Intel® MIC Architecture)

New in this release:

Note:  For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File:  w_fcompxe_online_2013_SP1.6.241.exe
Online installer

File:  w_fcompxe_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, English version)

File:  w_fcompxe_all_jp_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, Japanese version)

File:  w_fcompxe_redist_msi_2013_sp1.6.241.zip 
Redistributable Libraries for 32-bit and 64-bit msi files

Intel® Visual Fortran Composer XE 2013 SP1 for Windows* with IMSL*, Update 6


Intel® Visual Fortran Composer XE 2013 SP1 Update 6 includes the latest Intel Fortran compilers and performance libraries for IA-32 and Intel® 64 architecture systems. This new product release now includes: Intel® Visual Fortran Compiler XE Version 14.0.6, Intel® Math Kernel Library (Intel® MKL) Version 11.1 Update 4, Intel® Debugger Extension 7.5-1.0 for Intel® Many Integrated Core Architecture (Intel® MIC Architecture), IMSL* Fortran Numerical Library Version 7.0.1

New in this release:

Note:  For more information on the changes listed above, please read the individual component release notes. See the previous release's ReadMe to see what was new in that release.

Resources

Contents
File:  w_fcompxe_online_2013_sp1.6.241.exe
Online installer

File:  w_fcompxe_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, English version)

File:  w_fcompxe_all_jp_2013_sp1.6.241.exe
Product for developing 32-bit and 64-bit applications (with Microsoft Visual Studio 2010 Shell & Libraries*, Japanese version)

File:  w_fcompxe_redist_msi_2013_sp1.6.241.zip 
Redistributable Libraries for 32-bit and 64-bit msi files

File:  w_fcompxe_imsl_2013_sp1.0.024.exe 
IMSL* Library for developing 32-bit and 64-bit applications

Android* Low-Latency Audio on x86-based Mobile Devices



Objective

This document explains how Android* low-latency audio is implemented on x86 devices starting with the Intel® Atom™ processor-based (codenamed Bay Trail) platform. You can use this guide to aid your investigation of low-latency audio development methods on Intel® devices with low-latency Android build (4.4.4).

Note: Android M Release audio is still under investigation.

Introduction

Android has long been unsuccessful at producing a low-latency audio solution for applications that are focused on sound creation. High latencies negatively impact music creation, game, DJ, and karaoke apps. User interactions on these applications create sound, and end users find the resulting delay in the audible signal to be too high, thus negatively impacting their user experience.

Latency is the delay between when an audio signal is created and when it is played back. Round-trip latency (RTL) is the time from when an input action by the system or user requests a signal to when the outbound signal is generated.

Users experience audio latency in Android applications when they touch an object to generate sound and the sound is delayed before it reaches the speaker. On most ARM*- and x86-based devices, audio RTL measures between roughly 300 ms and 600 ms, mostly in applications developed using the Android method for audio found here: Design for Reduced Latency. These ranges are not acceptable to users: the desired latencies are well below 100 ms, and in most cases an RTL under 20 ms is preferred. What also has to be taken into account is the overall latency Android generates in touch-based music applications, which is the total of touch latency, audio processing latency, and buffer queuing.

This document focuses only on reducing audio latency, not total latency; audio latency does, however, account for the bulk of the total.

The Android Design for Audio

Android's audio Hardware Abstraction Layer (HAL) connects the higher level, audio-specific framework APIs in android.media to the underlying audio driver and hardware.

You can see how the audio framework is diagramed here: https://source.android.com/devices/audio/index.html

OpenSL ES*

Android specifies the use of OpenSL ES APIs to develop the most robust method for efficiently processing round-trip audio. Though it is not the best option for low-latency audio support, it is the recommended option. This is primarily due to the buffer queuing mechanism OpenSL utilizes, making it more efficient within the Android Media Framework. Since the implementation is native code, it may deliver better performance because native code is not subject to Java* or Dalvik VM overheads. We assume that this is the way forward for audio development on Android. As specified in the Android Native Development Kit (NDK) documents for Open SL, Android releases will continue to improve upon Open SL implementations.

This document will examine the use of the OpenSL ES API through the NDK. As an introduction to OpenSL, examine the three layers that make up the code base for Android Audio using OpenSL.

  • The top-level application programming environment is the Android SDK, which is Java* based.
  • The lower-level programming environment, called the NDK, allows developers to write C or C++ code that can be used in the application via the Java Native Interface (JNI).
  • The OpenSL ES API has been available since Android 2.3 and is built into the NDK.

OpenSL operates, like several other APIs, by employing a callback mechanism. In OpenSL the callback can only be used to notify the application that a new buffer can be queued (for playback or for recording). In other APIs, the callback also handles pointers to the audio buffers that are to be filled or consumed. But in OpenSL, by choice, the API can be implemented so the callbacks operate as a signaling mechanism to keep all the processing in your audio processing thread. This would include queuing the required buffers after the assigned signals are received.

Google recommends scheduling high-performance audio threads with the SCHED_FIFO policy and moving audio data between them with a ring, or circular, buffer technique.

Sched_FIFO Policy

Since Android is based on Linux*, it uses the Linux Completely Fair Scheduler (CFS). The CFS may allocate CPU resources in unexpected ways. For example, it may take the CPU away from a thread with numerically low niceness and give it to a thread with numerically high niceness. In the case of audio, this can result in buffer timing issues.

The primary solution is to avoid CFS for high-performance audio threads and use the SCHED_FIFO scheduling policy instead of the SCHED_NORMAL (also called SCHED_OTHER) scheduling policy implemented by CFS.
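
For illustration only, the POSIX call used to request SCHED_FIFO for a thread is sketched below. Note that this is background rather than a recipe: on stock Android an unprivileged app cannot normally grant itself SCHED_FIFO (the platform applies it to its own fast audio threads), so the call typically fails without the required privileges.

#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Request the SCHED_FIFO real-time policy for the calling thread.
 * Returns 0 on success; on stock Android this usually fails with EPERM. */
static int request_sched_fifo(int priority)
{
    struct sched_param param;
    param.sched_priority = priority;   /* e.g., a low real-time priority such as 2 */
    int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if (rc != 0)
        fprintf(stderr, "pthread_setschedparam failed: %d\n", rc);
    return rc;
}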

Scheduling latency

Scheduling latency is the time between when a thread becomes ready to run and when the resulting context switch completes so that the thread actually runs on a CPU. The shorter the latency the better, and anything over two milliseconds causes problems for audio. Long scheduling latency is most likely to occur during mode transitions, such as bringing up or shutting down a CPU, switching between a security kernel and the normal kernel, switching from full-power to low-power mode, or adjusting the CPU clock frequency and voltage.

A Circular Buffer Interface

The first thing to do to test that the buffer is implemented correctly is to prepare a circular buffer interface that the code can use. You need four functions for this: 1) create a circular buffer, 2) write to it, 3) read from it, and 4) destroy it.

Code example:

circular_buffer* create_circular_buffer(int bytes);
int read_circular_buffer_bytes(circular_buffer *p, char *out, int bytes);
int write_circular_buffer_bytes(circular_buffer *p, const char *in, int bytes);
void free_circular_buffer (circular_buffer *p);

The intended effect is that the read operation will only read the number of requested bytes up to what has been written in the buffer already. The write function will only write the bytes for which there is space in the buffer. They will return a count of read/written bytes, which can be anything from zero to the requested number.

The consumer thread (the audio I/O callback in the case of playback, or the audio processing thread in the case of recording) reads from the circular buffer and then does something with the audio. At the same time, asynchronously, the supplier thread is filling the circular buffer, stopping only if it gets filled up. With an appropriate circular buffer size, the two threads will cooperate seamlessly.
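
A minimal single-reader/single-writer implementation sketch of this interface is shown below. The struct layout and the keep-one-byte-free convention are assumptions of this sketch, not part of the interface above; production code would also need atomic read/write positions or memory barriers for the two cooperating threads.

#include <stdlib.h>

typedef struct circular_buffer {
    char *buffer;
    int   wp;    /* write position */
    int   rp;    /* read position  */
    int   size;  /* capacity in bytes (one byte is kept free) */
} circular_buffer;

circular_buffer* create_circular_buffer(int bytes)
{
    circular_buffer *p = (circular_buffer *) calloc(1, sizeof(circular_buffer));
    if (p == NULL) return NULL;
    p->buffer = (char *) calloc(bytes, 1);
    if (p->buffer == NULL) { free(p); return NULL; }
    p->size = bytes;
    return p;
}

/* bytes currently available to read */
static int used_bytes(const circular_buffer *p)
{
    return (p->wp - p->rp + p->size) % p->size;
}

int read_circular_buffer_bytes(circular_buffer *p, char *out, int bytes)
{
    int avail = used_bytes(p);
    int n = bytes > avail ? avail : bytes;
    for (int i = 0; i < n; i++)
        out[i] = p->buffer[(p->rp + i) % p->size];
    p->rp = (p->rp + n) % p->size;
    return n;
}

int write_circular_buffer_bytes(circular_buffer *p, const char *in, int bytes)
{
    int space = p->size - 1 - used_bytes(p);  /* one byte kept free to tell full from empty */
    int n = bytes > space ? space : bytes;
    for (int i = 0; i < n; i++)
        p->buffer[(p->wp + i) % p->size] = in[i];
    p->wp = (p->wp + n) % p->size;
    return n;
}

void free_circular_buffer(circular_buffer *p)
{
    if (p == NULL) return;
    free(p->buffer);
    free(p);
}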

Audio I/O

Using the interface as created in the example before, audio I/O functions can be written to use OpenSL callbacks. An example of an input stream I/O function is:

// this callback handler is called every time a buffer finishes recording
void bqRecorderCallback(SLAndroidSimpleBufferQueueItf bq, void *context)
{
 OPENSL_STREAM *p = (OPENSL_STREAM *) context;
 int bytes = p->inBufSamples*sizeof(short);
 write_circular_buffer_bytes(p->inrb, (char *) p->recBuffer,bytes);
 (*p->recorderBufferQueue)->Enqueue(p->recorderBufferQueue,p->recBuffer,bytes);
}
// gets a buffer of size samples from the device
int android_AudioIn(OPENSL_STREAM *p,float *buffer,int size){
 short *inBuffer;
 int i, bytes = size*sizeof(short);
 if(p == NULL || p->inBufSamples == 0) return 0;
 bytes = read_circular_buffer_bytes(p->inrb, (char *)p->inputBuffer,bytes);
 size = bytes/sizeof(short);
 for(i=0; i < size; i++){
 buffer[i] = (float) p->inputBuffer[i]*CONVMYFLT;
 }
 if(p->outchannels == 0) p->time += (double) size/(p->sr*p->inchannels);
 return size;
}

In the callback function (lines 2-8), which is called every time a new full buffer (recBuffer) is ready, all of the data is written into the circular buffer. Then the recBuffer is ready to be queued again (line 7). The audio processing function, lines 10 to 21, tries to read the requested number of bytes (line 14) into inputBuffer and then copies that number of samples to the output (converting it into float samples). The function reports the number of copied samples.

Output Function:

// puts a buffer of size samples to the device
int android_AudioOut(OPENSL_STREAM *p, float *buffer,int size){

short *outBuffer, *inBuffer;
int i, bytes = size*sizeof(short);
if(p == NULL || p->outBufSamples == 0) return 0;
for(i=0; i < size; i++){
p->outputBuffer[i] = (short) (buffer[i]*CONV16BIT);
}
bytes = write_circular_buffer_bytes(p->outrb, (char *) p->outputBuffer,bytes);
p->time += (double) size/(p->sr*p->outchannels);
return bytes/sizeof(short);
}

// this callback handler is called every time a buffer finishes playing
void bqPlayerCallback(SLAndroidSimpleBufferQueueItf bq, void *context)
{
 OPENSL_STREAM *p = (OPENSL_STREAM *) context;
 int bytes = p->outBufSamples*sizeof(short);
 read_circular_buffer_bytes(p->outrb, (char *) p->playBuffer,bytes);
 (*p->bqPlayerBufferQueue)->Enqueue(p->bqPlayerBufferQueue,p->playBuffer,bytes);
}

The audio processing function (lines 2-13) takes in a certain number of float samples, converts them to shorts, and then writes the full outputBuffer into the circular buffer, reporting the number of samples written. The OpenSL callback (lines 16-22) reads all of the samples and queues them.

For this to work properly, the number of samples read from the input needs to be passed along with the buffer to the output. Below is the processing loop code that loops the input back into the output:

while(on) {

samps = android_AudioIn(p,inbuffer,VECSAMPS_MONO);

for(i = 0, j=0; i < samps; i++, j+=2)
 outbuffer[j] = outbuffer[j+1] = inbuffer[i];
 android_AudioOut(p,outbuffer,samps*2);
 }

In this snippet, lines 5-6 loop over the read samples and copy them to the output channels. It is a stereo-out/mono-in setup, and for this reason the input samples are copied into two consecutive output buffer locations. Now that the queuing is happening in the OpenSL threads, in order to start the callback mechanism, we need to queue a buffer for recording and another for playback after we start audio on the device. This will ensure the callback is issued when buffers need to be replaced.
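
A sketch of that priming step is shown below, reusing the OPENSL_STREAM fields from the examples above (the exact point in the initialization sequence and the helper name are assumptions):

/* Prime both queues once, right after the recorder and player have been
 * set to their recording/playing states, so the first callbacks fire. */
static void prime_buffer_queues(OPENSL_STREAM *p)
{
    (*p->recorderBufferQueue)->Enqueue(p->recorderBufferQueue, p->recBuffer,
                                       p->inBufSamples * sizeof(short));
    (*p->bqPlayerBufferQueue)->Enqueue(p->bqPlayerBufferQueue, p->playBuffer,
                                       p->outBufSamples * sizeof(short));
}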

This is an example of how to implement an audio I/O track thread for processing through OpenSL. Each implementation is going to be unique and will require modifications to the HAL and ALSA driver to get the most from the OpenSL implementation.

x86 Design for Android Audio

OpenSL implementations do not guarantee a low-latency path to the Android “fast mixer” for all devices with a desirable rate of delay (under 40ms). However, with the modifications to the Media Server, HAL, and the ALSA driver, different devices can achieve varying success in low-latency audio. While conducting research on what is required to drive latencies down on Android, Intel has implemented a low-latency audio solution on the Dell* Venue 8 7460 tablet.

The result of the experiment is a hybrid media processing engine in which the input processing thread is managed by a separate low-latency input server that processes the raw audio and then passes it to the Android-implemented media server that still uses the “fast mixer” thread. Both the input and output servers use the scheduling in the OpenSL Sched_FIFO policy.

Figure 1. Implementation Diagram.

Implementation Diagram

Diagram provided by Eric Serre

The result of this modification is a very satisfactory 45-ms RTL. This implementation is part of the Intel Atom SoC and tablet design used for this effort. This test is conducted on an Intel Software Development Platform and is available through the Intel Partner Software Development Program.

The implementation of OpenSL and the SCHED_FIFO policy exhibits efficient processing of the round-trip, real-time audio on the above-specified hardware platform and is not available for all devices. Testing any application using the examples given in this document must be conducted on specific devices and can be made available to partner software developers.

Summary

This article discussed how to use OpenSL to create a callback and buffer queue in an application that will adhere to the Android audio development methods. It also demonstrated the efforts put forth by Intel to provide one option for low-latency audio signaling using the modified Media Framework. To conduct this experiment and test the low latency path, developers must follow the Android development design for audio using OpenSL and an Intel Software Development Platform on Android Kit Kat 4.4.2 or higher.

Contributors:
Eric Serre, Intel Corporation
Victor Lazzarini

Intel® XDK FAQs - Cordova

Q1: How do I set app orientation?

If you are using Cordova* 3.X build options (Crosswalk* for Android*, Android*, iOS*, etc.), you can set the orientation under the Projects panel > Select your project > Cordova* 3.X Hybrid Mobile App Settings - Build Settings. Under the Build Settings, you can set the Orientation for your desired mobile platform.  

If you are using the Legacy Hybrid Mobile App Platform build options (Android*, iOS* Ad Hoc, etc.), you can set the orientation under the Build tab > Legacy Hybrid Mobile App Platforms Category- <desired_mobile_platform> - Step 2 Assets tab. 

[iPad] Create a plugin (directory with one file) that only has a config xml that includes the following: 

<config-file target="*-Info.plist" parent="UISupportedInterfaceOrientations~ipad" overwrite="true"><string></string></config-file><config-file target="*-Info.plist" parent="UISupportedInterfaceOrientations~ipad" overwrite="true"><array><string>UIInterfaceOrientationPortrait</string></array></config-file> 

Add the plugin on the build settings page. 

Alternatively, you can use this plugin: https://github.com/yoik/cordova-yoik-screenorientation. You can import it as a third-party Cordova* plugin using the Cordova* registry notation:

  • net.yoik.cordova.plugins.screenorientation (includes latest version at the time of the build)
  • net.yoik.cordova.plugins.screenorientation@1.3.2 (specifies a version)

Or, you can reference it directly from the GitHub repo: 

The second reference provides the git commit referenced here (we do not support pulling from the PhoneGap registry).

Q2: Is it possible to create a background service using Intel XDK?

Background services require the use of specialized Cordova* plugins that need to be created specifically for your needs. Intel XDK does not support development or debug of plugins, only the use of them as "black boxes" with your HTML5 app. Background services can be accomplished using Java on Android or Objective C on iOS. If a plugin that backgrounds the functions required already exists (for example, this plugin for background geo tracking), Intel XDK’s build system will work with it.

Q3: How do I send an email from my App?
You can use the Cordova* email plugin or use web intent - PhoneGap* and Cordova* 3.X.
Q4: How do you create an offline application?
You can use the technique described here by creating an offline.appcache file and then setting it up to store the files that are needed to run the program offline. Note that offline applications need to be built using the Cordova* or Legacy Hybrid build options.
Q5: How do I work with alarms and timed notifications?
Unfortunately, alarms and notifications are advanced subjects that require a background service. This cannot be implemented in HTML5 and can only be done in native code by using a plugin. Background services require the use of specialized Cordova* plugins that need to be created specifically for your needs. Intel XDK does not support the development or debug of plugins, only the use of them as "black boxes" with your HTML5 app. Background services can be accomplished using Java on Android or Objective C on iOS. If a plugin that backgrounds the functions required already exists (for example, this plugin for background geo tracking) the Intel XDK’s build system will work with it. 
Q6: How do I get a reliable device ID? 
You can use the Phonegap/Cordova* Unique Device ID (UUID) plugin for Android*, iOS* and Windows* Phone 8. 
Q7: How do I implement In-App purchasing in my app?
There is a Cordova* plugin for this. A tutorial on its implementation can be found here. There is also a sample in Intel XDK called ‘In App Purchase’ which can be downloaded here.
Q8: How do I install custom fonts on devices?
Fonts can be considered an asset included with your app, private to the app and not shared with other apps on the device, just like images and CSS files. (It is possible to share some files between apps using, for example, the SD card space on an Android* device.) If you include the font files as assets in your application, there is no download time to consider: they are part of your app and already exist on the device after installation.
Q9: How do I access the device’s file storage?
You can use HTML5 local storage and this is a good article to get started with. Alternatively, there is a Cordova* file plugin for that.
Q10: Why isn't AppMobi* push notification services working?
This seems to be an issue on AppMobi’s end and can only be addressed by them. PushMobi is only available in the "legacy" container. AppMobi* has not developed a Cordova* plugin, so it cannot be used in the Cordova* build containers. Thus, it is not available with the default build system. We recommend that you consider using the Cordova* push notification plugin instead.
Q11: How do I configure an app to run as a service when it is closed?
If you want a service to run in the background you'll have to write a service, either by creating a custom plugin or writing a separate service using standard Android* development tools. The Cordova* system does not facilitate writing services.
Q12: How do I dynamically play videos in my app?

1) Download the JavaScript and CSS files from https://github.com/videojs

2) Add them in the HTML5 header. 


 3) Add a panel ‘main1’ that will be playing the video. This panel will be launched when the user clicks on the video in the main panel.

<div class="panel" id="main1" data-appbuilder-object="panel" style=""><video id="example_video_1" class="video-js vjs-default-skin" controls="" preload="auto" width="200" poster="camera.png" data-setup="{}"><source src="JAIL.mp4" type="video/mp4"><p class="vjs-no-js">To view this video please enable JavaScript*, and consider upgrading to a web browser that <a href="http://videojs.com/html5-video-support/" target="_blank">supports HTML5 video</a></p></video><a onclick="runVid3()" href="#" class="button" data-appbuilder-object="button">Back</a></div>

 4) When the user clicks on the video, the click event sets the ‘src’ attribute of the video element to what the user wants to watch. 

function runVid2(){

      document.getElementsByTagName("video")[0].setAttribute("src","appdes.mp4");

      $.ui.loadContent("#main1",true,false,"pop");

}

 5) The ‘main1’ panel opens waiting for the user to click the play button.

Note: The video does not play in the emulator, so you will have to test using a real device. The user also has to stop the video using the video controls; clicking the back button results in the video playing in the background.

Q13: How do I design my Cordova* built Android* app for tablets?
This page lists a set of guidelines to follow to make your app of tablet quality. If your app fulfills the criteria for tablet app quality, it can be featured in Google* Play's "Designed for tablets" section.
Q14: How do I resolve icon related issues with Cordova* CLI build system?

Ensure icon sizes are properly specified in the intelxdk.config.additions.xml. For example, if you are targeting iOS 6, you need to manually specify the icon sizes that iOS* 6 uses. 

<icon platform="ios" src="images/ios/72x72.icon.png" width="72" height="72" /><icon platform="ios" src="images/ios/57x57.icon.png" width="57" height="57" />

These are not added by the build system by default, so you will have to include them in the additions file. 

For more information on adding build options using intelxdk.config.additions.xml, visit: /en-us/html5/articles/adding-special-build-options-to-your-xdk-cordova-app-with-the-intelxdk-config-additions-xml-file

Q15: Is there a plugin I can use in my App to share content on social media?

Yes, you can use the PhoneGap Social Sharing plugin for Android*, iOS* and Windows* Phone.

Q16: Iframe does not load in my app. Is there an alternative?
Yes, you can use the inAppBrowser plugin instead.
Q17: Why are intel.xdk.istablet and intel.xdk.isphone not working?
Those properties are quite old and are based on the legacy AppMobi* system. An alternative is to detect the viewport size instead. You can get the user’s screen size using the screen.width and screen.height properties (refer to this article for more information) and control the actual view of the webview by using the viewport meta tag (this page has several examples). You can also look through this forum thread for a detailed discussion.
Q18: How do I work with the App Security plugin on Intel XDK?

Select the App Security plugin on the plugins list of the Project tab and build your app as a Cordova Hybrid app. Building it as a Legacy Hybrid app has been known to cause issues when compiled and installed on a device.

Q19: Why does my build fail with Admob plugins? Is there an alternative?

Intel XDK does not support the library project newly introduced in the com.google.playservices@21.0.0 plugin. Admob plugins depend on "com.google.playservices", which adds the Google* Play services jar to the project. "com.google.playservices@19.0.0" is a simple jar file that works quite well, but "com.google.playservices@21.0.0" uses a new feature to include a whole library project. It works if built locally with the Cordova CLI, but fails when using Intel XDK.

To stay compatible with the Intel XDK, the Admob plugin's dependency should be changed to "com.google.playservices@19.0.0".

Q20: Why does the intel.xdk.camera plugin fail? Is there an alternative?
There seem to be some general issues with the camera plugin on iOS*. An alternative is to use the Cordova camera plugin instead, changing the version to 0.3.3.
Q21: How do I resolve Geolocation issues with Cordova?

Give this app a try; it contains lots of useful comments and console log messages. However, use the Cordova 0.3.10 version of the geo plugin instead of the Intel XDK geo plugin. The Intel XDK buttons in the sample app will not work in a built app because the Intel XDK geo plugin is not included; they will partially work in the Emulator and Debug tabs. If you test it on a real device without the Intel XDK geo plugin selected, you should be able to see what is working and what is not on your device. Note that the Intel XDK geo plugin cannot be used in the same build as the Cordova geo plugin. Do not use the Intel XDK geo plugin, as it will be discontinued.

Geo fine might not work because of the following reasons:

  1. Your device does not have a GPS chip
  2. It is taking a long time to get a GPS lock (if you are indoors)
  3. The GPS on your device has been disabled in the settings

Geo coarse is the safest bet to quickly get an initial reading. It gets a reading based on a variety of inputs; it is usually not as accurate as geo fine, but generally accurate enough to know what town you are in and your approximate location within it. Geo coarse will also prime the geo cache so there is something to read when you try to get a geo fine reading. Ensure your code can handle situations where you might not get any geo data, as there is no guarantee you will get a geo fine reading at all, or within a reasonable period of time. Success with geo fine depends heavily on parameters that are typically outside of your control.

Q22: Is there an equivalent Cordova* plugin for intel.xdk.player.playPodcast? If so, how can I use it?

Yes, there is and you can find the one that best fits the bill from the Cordova* plugin registry.

To make this work you will need to do the following:

  • Detect your platform (you can use uaparser.js or you can do it yourself by inspecting the user agent string)
  • Include the plugin only on the Android* platform and use <video> on iOS*.
  • Create conditional code to do what is appropriate for the platform detected 

You can force a plugin to be part of an Android* build by adding it manually into the additions file. To see what the basic directives are to include a plugin manually:

  1. Include it using the "import plugin" dialog, perform a build and inspect the resulting intelxdk.config.android.xml file.
  2. Then remove it from your Project tab settings, copy the directive from that config file and paste it into the intelxdk.config.additions.xml file. Prefix that directive with <!-- +Android* -->. 

More information is available here and this is what an additions file can look like:

<preference name="debuggable" value="true" /><preference name="StatusBarOverlaysWebView" value="false" /><preference name="StatusBarBackgroundColor" value="#000000" /><preference name="StatusBarStyle" value="lightcontent" /><!-- -iOS* --><intelxdk:plugin intelxdk:value="nl.nielsad.cordova.wifiscanner" /><!-- -Windows*8 --><intelxdk:plugin intelxdk:value="nl.nielsad.cordova.wifiscanner" /><!-- -Windows*8 --><intelxdk:plugin intelxdk:value="org.apache.cordova.statusbar" /><!-- -Windows*8 --><intelxdk:plugin intelxdk:value="https://github.com/EddyVerbruggen/Flashlight-PhoneGap-Plugin" />

This sample forces a plugin included with the "import plugin" dialog to be excluded from the platforms shown. You can include it only in the Android* platform by using conditional code and one or more appropriate plugins.

Q23: How do I display a webpage in my app without leaving my app?

The most effective way to do so is by using inAppBrowser.

Q24: Does Cordova* media have callbacks in the emulator?

While Cordova* media objects have proper callbacks when using the debug tab on a device, the emulator doesn't report state changes back to the Media object. This functionality has not been implemented yet. Under emulation, the Media object is implemented by creating an <audio> tag in the program under test. The <audio> tag emits a bunch of events, and these could be captured and turned into status callbacks on the Media object.

Q25: Why does the Cordova version not match between the Projects tab Build Settings, Emulate tab, App Preview and my built app?

This is due to the difficulty in keeping different components in sync and is compounded by the version convention that the Cordova project uses to distinguish build tools (the CLI version) from frameworks (the Cordova version) and plugins.

The CLI version you specify in the Projects tab Build Settings section is the "Cordova CLI" version that the build system will use to build your app. Each version of the Cordova CLI tools comes with a set of "pinned" Cordova framework versions, which vary as a function of the target platform. For example, the Cordova CLI 5.0 platformsConfig file is "pinned" to the Android Cordova framework version 4.0.0, the iOS Cordova framework version 3.8.0 and the Windows 8 Cordova framework version 3.8.1 (among other targets). The Cordova CLI 4.1.2 platformsConfig file is "pinned" to Android Cordova 3.6.4, iOS Cordova 3.7.0 and Windows 8 Cordova 3.7.1.

This means that the Cordova framework version you are using "on device" with a built app will not equal the version number that is in the CLI field that you specified in the Build Settings section of the Projects tab when you built your app. Technically, the target-specific Cordova frameworks can be updated [independently] within a given version of CLI tools, but our build system always uses the Cordova framework versions that were "pinned" to the CLI when it was released (that is, the Cordova framework versions specified in the platformsConfig file).

The reason you may see Cordova framework version differences between the Emulate tab, App Preview and your built app is:

  • The Emulate tab is built against one specific Cordova framework version. We try to make that version match, as closely as possible, the framework pinned to the default Intel XDK version of the Cordova CLI.
  • App Preview is released independently of the Intel XDK and, therefore, may support a different version than what you will see reported by the Emulate tab and your built app. Again, we try to release App Preview so it matches the version of the Cordova framework that is the default version of the Intel XDK at the time App Preview is released; but since the various tools are not released in perfect sync, that is not always possible.
  • Your app always uses the Cordova framework version that is determined by the Cordova CLI version you specified in the Projects tab's Build Settings section, when you built your app.
  • BTW: the version of the Cordova framework that is built into Crosswalk is determined by the Crosswalk project, not by the Intel XDK build system. There is some customization the Crosswalk project team must do to the Cordova framework to include Cordova as part of the Crosswalk runtime engine. The Crosswalk project team generally releases each Crosswalk version with the then current version of the Android Cordova framework. Thus, the version of the Android Cordova framework that is included in your Crosswalk build is determined by the version of Crosswalk you choose to build against.

Do these Cordova framework version numbers matter? Not that much. There are some issues that come up that are related to the Cordova framework version, but they tend to be few and far between. The majority of the bugs and compatibility issues you will experience in your app have more to do with the versions and mix of Cordova plugins you choose and the specific webview present on your test devices. See this blog for more details about what a webview is and why the webview matters to your app: When is an HTML5 Web App a WebView App?.

p.s. The "default version" of the CLI that the Intel XDK uses is rarely the most recent version of the Cordova CLI tools distributed by the Cordova project. There is always a lag between Cordova project releases and our ability to incorporate those releases into our build system and the various Intel XDK components. Also, we are unable to implement every release that is made by the Cordova project; thus the reason why we do not support every Cordova release that is available to Cordova CLI users.

Q26: How do I add a third party plugin?
Please follow the instructions on this doc page to add a third-party plugin: Adding Plugins to Your Intel® XDK Cordova* App -- otherwise the plugin will not be included as part of your app. You will see it in the build log if it was successfully added to your build.
Q27: How do I make an AJAX call that works in my browser work in my app?
Please follow the instructions in this article: Cordova CLI 4.1.2 Domain Whitelisting with Intel XDK for AJAX and Launching External Apps.
Q28: I get an "intel is not defined" error, but my app works in Test tab, App Preview and Debug tab. What's wrong?

When your app runs in the Test tab, App Preview or the Debug tab the intel.xdk and core Cordova functions are automatically included for easy debug. That is, the plugins required to implement those APIs on a real device are already included in the corresponding debug modules.

When you build your app you must include the plugins that correspond to the APIs you are using in your build settings. This means you must enable the Cordova and/or XDK plugins that correspond to the APIs you are using. Go to the Projects tab and ensure that the plugins you need are selected in your project's plugin settings. See Adding Plugins to Your Intel® XDK Cordova* App for additional details.

Q29: How do I target my app for use only on an iPad or only on an iPhone?

There is an undocumented feature in Cordova that should help you (the Cordova project provided this feature but failed to document it for the rest of the world). If you use the appropriate preference in the intelxdk.config.additions.xml file you should get what you need:

<preference name="target-device" value="tablet" />     <!-- Installs on iPad, not on iPhone --><preference name="target-device" value="handset" />    <!-- Installs on iPhone, iPad installs in a zoomed view and doesn’t fill the entire screen --><preference name="target-device" value="universal" />  <!-- Installs on iPhone and iPad correctly -->

If you need info regarding the additions.xml file, see the blank template or this doc file: Adding Intel® XDK Cordova Build Options Using the Additions File.

Q30: Why does my build fail when I try to use the Cordova* Capture Plugin?

The Cordova* Capture plugin has a dependency on the File plugin. Please make sure you have both plugins selected on the Projects tab.

Q31: How can I pinch and zoom in my Cordova* app?

For now, using the viewport meta tag is the only option to enable pinch and zoom. However, its behavior is unpredictable in different webviews. Testing a few sample apps has led us to believe that this feature works better on Crosswalk for Android. You can test this by building the Hello Cordova sample app for both Android and Crosswalk for Android: pinch and zoom will work only on the latter, even though both builds use the same viewport meta tag.

Please visit the following pages to get a better understanding of when to build with Crosswalk for Android:

http://blogs.intel.com/evangelists/2014/09/02/html5-web-app-webview-app/

https://software.intel.com/en-us/xdk/docs/why-use-crosswalk-for-android-builds

Another device-oriented approach is to enable it by turning on Android accessibility gestures.

Q32: How do I make my Android application use the fullscreen so that the status and navigation bars disappear?

The Cordova* fullscreen plugin can be used to do this. For example, in your initialization code, call AndroidFullScreen.immersiveMode(null, null);.

You can get this third-party plugin from here https://github.com/mesmotronic/cordova-fullscreen-plugin

Q33: How do I add XXHDPI and XXXHDPI icons to my Android or Crosswalk application?

The Cordova CLI 4.1.2 build system will support this feature, but our 4.1.2 build system (and the 2170 version of the Intel XDK) does not handle the XX and XXX sizes directly. Use this workaround until these sizes are supported directly:

  • copy your XX and XXX icons into your source directory (usually named www)
  • add the following lines to your intelxdk.config.additions.xml file
  • see this Cordova doc page for some more details

Assuming your icons and splash screen images are stored in the "pkg" directory inside your source directory (your source directory is usually named www), add lines similar to these into your intelxdk.config.additions.xml file (the precise name of your png files may be different than what is shown here):

<!-- for adding xxhdpi and xxxhdpi icons on Android --><icon platform="android" src="pkg/xxhdpi.png" density="xxhdpi" /><icon platform="android" src="pkg/xxxhdpi.png" density="xxxhdpi" /><splash platform="android" src="pkg/splash-port-xhdpi.png" density="port-xhdpi"/><splash platform="android" src="pkg/splash-land-xhdpi.png" density="land-xhdpi"/>

The precise names of your PNG files are not important, but the "density" designations are very important and, of course, the respective resolutions of your PNG files must be consistent with Android requirements. Those density parameters specify the respective "res-drawable-*dpi" directories that will be created in your APK for use by the Android system. NOTE: splash screen references have been added for reference, you do not need to use this technique for splash screens.

You can continue to insert the other icons into your app using the Intel XDK Projects tab.

Q34: Which plugin is the best to use with my app?

We are not able to track all the plugins out there, so we generally cannot give you a "this is better than that" evaluation of plugins. Check the Cordova plugin registry to see which plugins are most popular and check Stack Overflow to see which are best supported; also, check the individual plugin repos to see how well the plugin is supported and how frequently it is updated. Since the Cordova platform and the mobile platforms continue to evolve, those that are well-supported are likely to be those that have good activity in their repo.

Keep in mind that the XDK builds Cordova apps, so whichever plugins you find being supported and working best with other Cordova (or PhoneGap) apps would likely be your "best" choice.

See Adding Plugins to Your Intel® XDK Cordova* App for instructions on how to include third-party plugins with your app.

Q35: What are the rules for my App ID?

The precise App ID naming rules vary as a function of the target platform (e.g., Android, iOS, Windows). Unfortunately, the App ID naming rules are further restricted by the Apache Cordova project and sometimes change with updates to the Cordova project. The Cordova project is the underlying technology that your Intel XDK app is based upon; when you build an Intel XDK app you are building an Apache Cordova app.

CLI 5.1.1 has more restrictive App ID requirements than previous versions of Apache Cordova (the CLI version refers to Apache Cordova CLI release versions). In this case, the Apache Cordova project decided to restrict acceptable App IDs to the most restrictive common subset of the rules for all platforms. We hope to eliminate this restriction in a future release of the build system, but for now (as of the 2496 release of the Intel XDK), the current requirements for CLI 5.1.1 are listed below (see the sketch after the list for an illustration):

  • Each section of the App ID must start with a letter
  • Each section can only consist of letters, numbers, and the underscore character
  • Each section cannot be a Java keyword
  • The App ID must consist of at least 2 sections (each section separated by a period ".").
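
As an illustration only, here is a minimal sketch of those rules expressed as a JavaScript check. The isValidAppId() helper is hypothetical (it is not part of the Intel XDK or Cordova), and the Java keyword list is abbreviated, so treat this as a rough guide rather than the build system's actual validation:

// Hypothetical helper illustrating the CLI 5.1.1 App ID rules above.
// NOTE: the keyword list is abbreviated for illustration; Java defines
// roughly fifty reserved words.
var JAVA_KEYWORDS = ["abstract", "class", "for", "if", "int",
                     "new", "package", "static", "switch", "while"];

function isValidAppId(appId) {
    var sections = appId.split(".");
    // The App ID must consist of at least 2 sections.
    if (sections.length < 2) {
        return false;
    }
    return sections.every(function (section) {
        // Each section must start with a letter and contain only
        // letters, numbers, and the underscore character.
        if (!/^[A-Za-z][A-Za-z0-9_]*$/.test(section)) {
            return false;
        }
        // No section may be a Java keyword.
        return JAVA_KEYWORDS.indexOf(section) === -1;
    });
}

// isValidAppId("com.example.my_app") -> true
// isValidAppId("com.example.2app")   -> false (section starts with a digit)
// isValidAppId("com.new.app")        -> false ("new" is a Java keyword)
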
Q36: iOS /usr/bin/codesign error: certificate issue for iOS app?

If you are getting an iOS build failure message in your detailed build log that includes a reference to a signing identity error, you probably have a bad or inconsistent provisioning profile. The "no identity found" message in the build log excerpt below means that the provisioning profile does not match the distribution certificate that was uploaded with your application during the build phase.


Signing Identity:     "iPhone Distribution: XXXXXXXXXX LTD (Z2xxxxxx45)"
Provisioning Profile: "MyProvisioningFile"
                      (b5xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxe1)

    /usr/bin/codesign --force --sign 9AxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxA6 --resource-rules=.../MyApp/platforms/ios/build/device/MyApp.app/ResourceRules.plist --entitlements .../MyApp/platforms/ios/build/MyApp.build/Release-iphoneos/MyApp.build/MyApp.app.xcent .../MyApp/platforms/ios/build/device/MyApp.app
9AxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxA6: no identity found
Command /usr/bin/codesign failed with exit code 1

** BUILD FAILED **


The following build commands failed:
	CodeSign build/device/MyApp.app
(1 failure)

The excerpt shown above will appear near the very end of the detailed build log. The unique number patterns in this example have been replaced with "xxxx" strings for security reasons. Your actual build log will contain hexadecimal strings.

Q37: iOS Code Sign error: bundle ID does not match app ID?

If you are getting an iOS build failure message in your detailed build log that includes a reference to a "Code Sign error", you may have a bad or inconsistent provisioning profile. The "Code Sign" message in the build log excerpt below means that the bundle ID you specified in your Apple provisioning profile does not match the app ID you provided to the Intel XDK when you uploaded your application during the build phase.


Code Sign error: Provisioning profile does not match bundle identifier: The provisioning profile specified in your build settings (MyBuildSettings) has an AppID of my.app.id which does not match your bundle identifier my.bundleidentifier.
CodeSign error: code signing is required for product type 'Application' in SDK 'iOS 8.0'

** BUILD FAILED **

The following build commands failed:
	Check dependencies
(1 failure)
Error code 65 for command: xcodebuild with args: -xcconfig,...

The message above translates to: "the bundle ID you entered in the project settings of the Intel XDK does not match the bundle ID (app ID) that you created on Apple's developer portal and then used to create a provisioning profile."

Back to FAQs Main 

Code Samples for "From Serial to Awesome: Advanced Code Vectorization and Optimization"


The attached zip file hosts the code samples that demonstrate the concepts explained in the webinar titled "From Serial to Awesome: Advanced Code Vectorization and Optimization".
