Channel: Intel Developer Zone Articles

How to Detect Persistent Memory Programming Errors Using Intel® Inspector - Persistence Inspector


Overview

Persistent memory is an emerging class of memory storage technology with great potential to improve application performance and reliability. However, developers need to address several programming challenges to get the best-performing code. One challenge is that a store to persistent memory does not become persistent immediately, due to caching. The data persists only after it has left the cache hierarchy and become visible to the memory system. And because of processor out-of-order execution and caching, the order in which stores become persistent may not match the order in which they were issued.

Intel® Inspector - Persistence Inspector, currently available as a technology preview, is a new run-time tool developers can use to detect these programming errors in persistent memory programs. In addition to missing cache flushes, this tool detects

  • Redundant cache flushes and memory fences
  • Out-of-order persistent memory stores
  • Incorrect undo logging for the Persistent Memory Development Kit (PMDK)

This article describes the features of Intel Inspector - Persistence Inspector and includes information to help you get started using it.

Background

Persistent memory devices using new technologies such as 3D XPoint™ media developed by Intel and Micron* can be directly attached to memory controllers. Such a device is often referred to as a non-volatile dual in-line memory module (NVDIMM). Data in NVDIMMs is byte-addressable and can survive system or program crashes. The access latencies of NVDIMMs are comparable to those of DRAMs. Programs read from and write to NVDIMMs using regular CPU load/store instructions. Consider the following code example:

Example 1: Write an Address Book to Persistent Memory

#include <stdio.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <string.h>

struct address {
	char name[64];
	char address[64];
	int valid;
};

int main()
{
	struct address *head = NULL;
	int fd;

	fd = open("addressbook.pmem", O_CREAT|O_RDWR, 0666);
	posix_fallocate(fd, 0, sizeof(struct address));

	head = (struct address *)mmap(NULL, sizeof(struct address),
			PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);

	/* Direct load/store access to the mapped persistent memory */
	strcpy(head->name, "Clark Kent");
	strcpy(head->address, "344 Clinton St, Metropolis, DC 95308");
	head->valid = 1;

	munmap(head, sizeof(struct address));

	return 0;
}

In Example 1, persistent memory is exposed as the file addressbook.pmem and mapped into the process address space using regular file system APIs. Once the persistent memory is mapped in, the program directly accesses the memory, starting with the calls to strcpy. If there is a power loss before the call to munmap, one of the following scenarios can occur in the persistent memory:

  • None of head->name, head->address, and head->valid has made its way to the memory system and become persistent.
  • All of them have become persistent.
  • Any one or two of them, but not all, have become persistent.

The caching effect presents a challenge to persistent memory software development. To guarantee data is recoverable and consistent after a power failure or system crash, developers need to reason about where and when to explicitly flush data out of the cache hierarchy to the memory system.
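To make Example 1 crash-consistent, the payload can be flushed explicitly before the valid flag is published. The sketch below uses msync() as a portable stand-in for a cache-line flush (real persistent-memory code would use CLWB/CLFLUSHOPT plus a fence, or pmem_persist() from PMDK); persist_range() is a hypothetical helper written for this illustration, not part of any tool or library API:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

/* Hypothetical helper: force a byte range out to the backing store.
 * msync() stands in for a cache flush here; on real persistent memory
 * you would use pmem_persist() or CLWB followed by a fence. */
static int persist_range(void *addr, size_t len)
{
	uintptr_t pagesz = (uintptr_t)sysconf(_SC_PAGESIZE);
	uintptr_t start  = (uintptr_t)addr;
	uintptr_t base   = start & ~(pagesz - 1);  /* msync needs page alignment */

	return msync((void *)base, len + (start - base), MS_SYNC);
}
```

With such a helper, Example 1 would flush head->name and head->address first, then store and flush head->valid, so a crash can never leave the flag set while the payload is missing.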

Persistent Memory Application Behavior

Upon restarting after an unfortunate event such as a power failure or system crash, which can leave data corrupted or inconsistent, a persistent memory application must validate and recover the data stored in the memory. The unfortunate event effectively divides the application into two phases: one that executes before the event and one that executes after it. In the before-unfortunate-event phase, the application runs its normal flow, reading from and writing to persistent memory. The after-unfortunate-event phase checks data consistency and restores inconsistent data to a consistent state before the application resumes normal operations.

Figure: Before- and after-unfortunate-event phases of a persistent memory application.

How to Use Intel Inspector - Persistence Inspector

Here is the usage workflow of Intel Inspector - Persistence Inspector:

Figure: Usage workflow of Intel® Inspector - Persistence Inspector.

Setting Up the Environment

After the tool files are installed in a directory of your choice, for example, /home/joe/pmeminsp, add the Intel Inspector - Persistence Inspector paths to the PATH and LD_LIBRARY_PATH environment variables. For example:

$ export PATH=/home/joe/pmeminsp/bin64:$PATH

$ export LD_LIBRARY_PATH=/home/joe/pmeminsp/lib64:$LD_LIBRARY_PATH

To verify the tool is installed and set up correctly, run pmeminsp with no arguments:

$ pmeminsp
Type 'pmeminsp help' for usage.

Preparing Your Application

As stated earlier, a persistent memory application typically consists of two phases. To use Intel Inspector - Persistence Inspector, you need to identify the code in each of these two phases.

  • Before-unfortunate-event phase. Executes the code you want the tool to check.
  • After-unfortunate-event phase. Executes the code you would run after a power failure or system crash.

If your application implements persistent memory support for transactions using the Persistent Memory Development Kit (PMDK), the PMDK transaction runtime is responsible for data consistency and recovery inside the transaction block. The after-unfortunate-event code resides in the PMDK transaction runtime, so there is no need to use Intel Inspector - Persistence Inspector to verify that phase.

Notifying Intel® Inspector - Persistence Inspector of the After-Unfortunate-Event Phase

Once you have identified the after-unfortunate-event phase, we highly recommend notifying the tool of exactly where that phase starts and stops; doing so significantly reduces analysis time. Intel Inspector - Persistence Inspector provides a set of APIs the application can use to notify the tool at run time of the start and stop of the after-unfortunate-event phase analysis.

#define PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT 0x2

void __pmeminsp_start(unsigned int phase);
void __pmeminsp_pause(unsigned int phase);
void __pmeminsp_resume(unsigned int phase);
void __pmeminsp_stop(unsigned int phase);

To notify the tool of the start of the after-unfortunate-event phase, call __pmeminsp_start(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT) right before the after-unfortunate-event phase starts in your application. Similarly, to notify the tool of the stop of the phase, call __pmeminsp_stop(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT) right after the phase ends in your application.

The __pmeminsp_pause(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT) and __pmeminsp_resume(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT) calls give you finer control over pausing and resuming analysis after it has started and before it has stopped.

For example, if the after-unfortunate-event phase is the duration of a call to the function recover(), you can simply place __pmeminsp_start(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT) at the entry of the function and __pmeminsp_stop(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT) at the exit of the function.

Example 2: Notify Intel Inspector - Persistence Inspector of the Start and Stop of the After-Unfortunate-Event Phase.

#include "pmeminsp.h"

…

void recover(void)
{
    __pmeminsp_start(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT);

    …

    __pmeminsp_stop(PMEMINSP_PHASE_AFTER_UNFORTUNATE_EVENT);
}

int main()
{
    …

    … = mmap(…);

    …

    recover();
}

The Intel Inspector - Persistence Inspector APIs are defined in libpmeminsp.so. When you build your application, make sure to specify the correct compiler and linker options. For example:

-I /home/joe/pmeminsp/include -L /home/joe/pmeminsp/lib64 -lpmeminsp

Analyzing the Before-Unfortunate-Event Phase

The command to run the before-unfortunate-event analysis phase is

pmeminsp check-before-unfortunate-event [options] --

If your application maps the persistent memory file directly using the system API, you also need to specify the path to that file (even if it doesn't exist before the application runs) using the -pmem-file option. For example:

pmeminsp check-before-unfortunate-event -pmem-file ./addressbook.pmem [options] --

If the application creates several files, you can instead specify the path to the folder containing them; any file mmap'ed from that location is treated as a persistent memory file.

Note that these options are unnecessary if your application is PMDK-based: Intel Inspector - Persistence Inspector automatically traces all persistent memory files managed by PMDK even when these options are absent.

Analyzing the After-Unfortunate-Event Phase

The command to run the after-unfortunate-event analysis phase is

pmeminsp check-after-unfortunate-event [options] --

Make sure that the location of the persistent memory file is specified unless your application uses PMDK to operate with persistent memory. Option names are similar to those of the check-before-unfortunate-event command.
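Taken together, analyzing the address-book example from earlier might look like the following session (the writeaddressbook and readaddressbook binary names are illustrative, and additional options are omitted):

```
$ pmeminsp check-before-unfortunate-event -pmem-file ./addressbook.pmem -- ./writeaddressbook
$ pmeminsp check-after-unfortunate-event -pmem-file ./addressbook.pmem -- ./readaddressbook
```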

Reporting Issues Detected

To generate a report of persistent memory issues detected, run

pmeminsp report [option] --

Here are some examples of diagnostics generated by Intel Inspector - Persistence Inspector:

Missing Cache Flush

A missing cache flush of a persistent memory store (the first store) is always reported with reference to a later persistent memory store (the second store). The potential adverse effect is that, if an unfortunate event occurs after the second store, the second store is made persistent but the first is not.

The first memory store,

in /home/joe/pmeminsp/addressbook/writeaddressbook!main at writeaddressbook.c:24 - 0x6ED, in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main, at: - 0x21F43 in /home/joe/pmeminsp/addressbook/writeaddressbook!_start at: - 0x594

,

is not flushed before the second memory store,

in /home/joe/pmeminsp/addressbook/writeaddressbook!main at: writeaddressbook.c:26 - 0x73F, in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main, at: - 0x21F43, in /home/joe/pmeminsp/addressbook/writeaddressbook!_start at: - 0x594

.

Meanwhile, the memory load from the location of the first store,

in /lib/x86_64-linux-gnu/libc.so.6!strlen at: - 0x889DA

depends on the memory load from the location of the second store,

in /home/joe/pmeminsp/addressbook/readaddressbook!main at readaddressbook.c:22 - 0x6B0.

Redundant or Unnecessary Cache Flushes

A redundant or unnecessary cache flush is one that can be removed from the analyzed execution path without affecting the correctness of the program. Although such a flush does not affect correctness, it can degrade performance.

Cache flush

in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!test_redundant_flush at main.cpp:134 - 0x1721, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!create_data_file at main.cpp:52 - 0x151F,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!create_data_file at main.cpp:48 - 0x14FD,in /home/joe/pmeminsp//tests/pmemdemo/src/pemmdemo!main at main.cpp:231 - 0x1C74,in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main, at: - 0x21F43,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!_start at: - 0x12A4

is redundant with regard to cache flush

in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!test_redundant_flush at main.cpp:135 - 0x1732,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!create_data_file at main.cpp:52 - 0x151F,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!create_data_file at main.cpp:48 - 0x14FD,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!main at main.cpp:231 - 0x1C7,in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!_start at: - 0x12A4

of the memory store

in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!test_redundant_flush at main.cpp:133 - 0x170F,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!create_data_file at main.cpp:52 - 0x151F,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!create_data_file at main.cpp:48 - 0x14FD,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!main at main.cpp:231 - 0x1C74,in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!_start at: - 0x12A4

Out-of-Order Persistent Memory Stores

Out-of-order persistent memory stores are two stores whose required persistence order cannot be enforced. It is also worth pointing out that explicit cache flushes alone will not enforce a correct order.
Detection of out-of-order persistent memory stores can be enabled with the -check-out-of-order-store option of the report command.

The memory store

in /home/joe/pmemcheck/mytest/writename6!writename at writename6.c:13 - 0x6E0, in /home/joe/pmemcheck/mytest/writename6!main at writename6.c:21 - 0x72D, in /home/joe/pmemcheck/mytest/writename6!main at writename6.c:20 - 0x721, in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43,in /home/joe/pmemcheck/mytest/writename6!_start at: - 0x594

should be after the memory store

in /home/joe/pmemcheck/mytest/writename6!writename at writename6.c:14 - 0x6EB,in /home/joe/pmemcheck/mytest/writename6!main at writename6.c:21 - 0x72D,in /home/joe/pmemcheck/mytest/writename6!main at writename6.c:20 - 0x721, in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43,in /home/joe/pmemcheck/mytest/writename6!_start at: - 0x594

Update Without Undo Logging

It is the developer's responsibility to undo log a memory location before it is updated in a PMDK transaction. Usually, developers call the PMDK function pmemobj_tx_add_range() or pmemobj_tx_add_range_direct(), or use the macro TX_ADD(), to undo log a memory location. If the memory location is not undo logged before it is updated, PMDK will fail to roll back changes to that location if the transaction is not successfully committed.
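The mechanism can be illustrated with a toy, PMDK-free sketch: before a location is updated, its old bytes are snapshotted so that an aborted transaction can restore them. The names below mimic PMDK's API but are a simplified illustration written for this article, not the PMDK implementation (which keeps its undo log in persistent memory and replays it during recovery):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy undo log holding a single snapshot. */
static struct {
	void *addr;
	unsigned char saved[64];
	size_t len;
} undo;

/* Snapshot a range before it is updated -- the analogue of
 * pmemobj_tx_add_range(). Ranges over 64 bytes are not handled. */
static void toy_tx_add_range(void *addr, size_t len)
{
	undo.addr = addr;
	undo.len = len;
	memcpy(undo.saved, addr, len);
}

/* Roll the range back to its snapshot -- what PMDK does when a
 * transaction is not successfully committed. */
static void toy_tx_abort(void)
{
	memcpy(undo.addr, undo.saved, undo.len);
}
```

If toy_tx_add_range() is skipped before the update, toy_tx_abort() has nothing to restore; that is precisely the "update without undo logging" error the tool reports.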

An issue of update without undo logging is reported with reference to a memory store and a transaction in which the memory is updated.

Memory store

in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!test_tx_without_undo at main.cpp:190 - 0x1963,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:175 - 0x1877,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:163 - 0x180D,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!main at main.cpp:245 - 0x1CF4,in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!_start at: - 0x12A4

not undo logged in transaction

in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!test_tx_without_undo at main.cpp:185 - 0x1921,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:175 - 0x1877,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:163 - 0x180D,in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!main at main.cpp:245 - 0x1CF4,in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!_start at: - 0x12

Undo Log Without Update

As noted above, developers undo log a memory location by calling pmemobj_tx_add_range() or pmemobj_tx_add_range_direct(), or by using the TX_ADD() macro. If a memory location is undo logged but never updated inside a PMDK transaction, performance may be degraded, and/or the memory may be rolled back to a dirty, uncommitted, or stale value if the transaction is not successfully committed.

An issue of undo log without update is reported with reference to a PMDK undo logging call and the transaction in which the memory is undo logged.

Memory region is undo logged in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!test_tx_without_update  at main.cpp:190 - 0x1963, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:175 - 0x1877, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:163 - 0x180D, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!main at main.cpp:245 - 0x1CF4, in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!_start at: - 0x12A4

but is not updated in transaction

in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo! test_tx_without_update at main.cpp:185 - 0x1921, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:175 - 0x1877, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!run_tx_test at main.cpp:163 - 0x180D, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!main at main.cpp:245 - 0x1CF4, in /lib/x86_64-linux-gnu/libc.so.6!__libc_start_main at: - 0x21F43, in /home/joe/pmeminsp/tests/pmemdemo/src/pemmdemo!_start at: - 0x12A4.

Conclusion

Persistent memory is an exciting new technology. As we discussed in this article, it presents some programming challenges. These challenges can be mitigated with tools like Intel Inspector - Persistence Inspector, which let you discover issues earlier in the program life cycle and offer a very high return on investment.

Next Steps

To join the beta of Intel® Inspector - Persistence Inspector, visit the beta registration site and download the technology preview.


Get Noticed: Expanding the Pool of Customers for Your Indie Game


You’re a game developer, you’ve got a game that you think is really cool, the people you show it to agree with you, but what next? You need to spread the word. It’s never too early to start building a community of interested people, with the ultimate goal of turning them into customers. This is the essence of lead generation in promotional marketing.

Lead generation and relationship marketing are a lot like dating: you grab someone’s attention, start a committed relationship together, based on mutual exchange and respect, then, when the time’s right, enter into a binding contract. Unlike dating, however, the contract you want isn’t marriage, but a purchase—and your goal is to enter into that contract with as many people as possible.

Funneling Fans

When talking about this process, marketers often refer to the customer journey and the sales funnel. Various versions of the sales funnel exist in marketing literature, all of which work in essentially the same way. Their aim is to use different marketing tactics to get as many people as possible into the wide top of your funnel (leads), then pull them through (the journey), to eventually become customers at the end. The number of leads you turn into customers is your conversion rate.

One version of the funnel is the AIDA model: Attention, Interest, Decision, Action. These stages describe the customer journey from never having heard of your game, to eventually buying it. Different marketing tactics are appropriate for each stage of this journey.

AIDA model

This might look like jargon, but it’s a useful framework when it comes to marketing your game. Let’s break it down into concrete actions.

Grabbing Attention: Lead Generation

This is first contact—that magical moment when a starry-eyed gamer sees your creation for the first time and adds it to their mental “follow” list. But don’t just trust them to remember you—you need to be able to contact them. That means gathering their data, using it to build relationships, and turning contacts into leads.

What you want is permission to communicate with your leads on an ongoing basis—via email, a website newsletter sign-up, or a like/follow/subscription option on a social-media channel. Here are some of the principal tactics and how to use them:

Channel Partners

As an independent developer, it’s very difficult to go it completely alone. Whether or not you end up with a publisher, you’ll always need the support of channel partners to help get the word out there. Channel partner go-to-market programs generally take the form of a negotiated cooperation—potentially involving a discount or exclusive content—rather than being paid for directly, so it’s an efficient avenue to explore for independents. Channel partners include physical retailers such as Gamestop*; online retail portals Steam*, Green Man Gaming, and Humble Bundle; and hardware manufacturers.

The value each kind of partner can bring varies, but among the things they can offer are: access to their existing customer databases, promotion at point-of-sale (whether an online storefront feature or racking in store), pre-order campaigns, product bundling, and post-launch promotional campaigns—all of which can help generate leads and start relationships.

Calls-to-action: Use promotional messages sent via channel partners (email or website, for example) to push people towards your own social channels or website signup.

Whether you end up working with channel partners to bring your game to market or do it yourself, the following tools and tactics will help you fill that sales funnel with leads.

Website

Having a website for your game is absolutely essential, whether it’s dedicated or part of your studio’s site. Make sure the site looks great on mobile as well as desktop, and that you have the ability to capture email addresses. Keep things simple: apply Hick’s law of user experience (UX) to make user choices easy. Find more tips here.

Call-to-action: Encourage visitors to sign up for your email newsletter with incentives for exclusive and up-front info.

Social Media

One of the first places you’ll probably think about communicating your game’s virtues is on social media. Managing and feeding content to social-media communities takes resources, planning, and creativity, and you need to be social—personally responding to as many people as you can, and always with a smile.

The analytics provided by platforms such as Facebook* offer great insights into where and who your followers are, which helps you spot opportunities and target them more effectively. Services such as Followerwonk*, Sprout Social* and Hootsuite* can help you identify leads and automate your social-media management process.

If you have an offer for a free trial, use it.

Calls-to-action: Every social post needs a link to follow, a video trailer to watch, an opinion to share, or a question to answer, letting you keep the conversation going. Never let a potential customer or influencer go cold.

Facebook Analytics
Figure 1. Snapshot of some of the useful data Facebook Analytics can provide.

Paid Social Media Campaigns

Getting content to go viral organically on social networks can be difficult, as the social networks naturally want you to pay for the convenience of reaching their millions of users. Paid campaigns for specific pieces of content can gather page likes and followers, are economical to run, and the targeting tools on networks such as Facebook are surprisingly powerful.

Calls-to-action: Ask viewers of your content to like/follow/subscribe, depending on the platform.

Kickstarter*

Kickstarter is primarily a way to help fund game development during the early stages of a project, but it’s also a very powerful tool in building community and generating leads. As anyone who has run one will tell you, a Kickstarter campaign is a full-time job that requires careful planning of time, resources, and content. Done well, however, it will give you an audience who are hungry for interaction and information. 

Keep those relationships alive once the campaign is over. Use the email-update tool to send calls-to-action to followers and drive them to your social channels. Respond to questions and keep the conversation going. This Kickstarter post-mortem from independent studio Antagonist shows what’s involved and suggests best practices.

Through the Woods
Figure 2. Through the Woods* by Antagonist was successfully funded on Kickstarter.

Events

Gaming events—be they trade or public shows—give you opportunities to show your game, gather contact information and valuable feedback, and, with a bit of luck, generate some social-media buzz. Speaking at events is also a great way to meet potential leads and is surprisingly easy to get into. If you don’t come home with a four-inch stack of business cards, you’re not doing it right.

If you speak at an event, make sure to upload your presentation to SlideShare* afterwards. Not only can this improve your profile, search visibility, and ranking (it’s owned by LinkedIn, incidentally), but you can redirect viewers directly back to your website landing page and turn them into leads.

Calls to action: Ask everyone you meet for their email address, to follow you on social media, and to directly share their experience of your game or talk with their own online networks.

Video Trailers

The importance of a great video trailer in creating impact for your game is hard to overstate, but it’s equally important to add a call-to-action at the video’s close.

Calls-to-action: Add your website URL on the endslate, ask viewers to follow you on social media, and add a YouTube* endslate overlay to ask viewers to subscribe.

Public Relations (PR)

PR is all about building relationships with the media and journalists, with the aim of having them say good things about your game to their audiences. Start by researching who to target (high-value media with relevant audiences); next, issue a press release using an email tool such as MailChimp*; lastly, and most importantly, proactively and continuously reach out to interested parties to turn them into friends who want to help.

When it comes to the emails themselves, make sure your email signature contains your contact information and logo, and, if you’ve won any awards or received notable praise for your game, add a note. Pro tip: to increase the chances of recipients opening your email (the “open rate”), include video content in the email and the word “video” in the subject line.

It’s useful to create an online press kit where media professionals can instantly get their hands on basic assets such as bios, logos, press releases, video, screenshots, and art. Rami Ismail from independent developer Vlambeer created a simple, free online press kit solution aimed at helping developers manage PR more efficiently.

Calls-to-action: Press releases should contain a website URL, social-media links, embedded video, and a press-kit link, so your media contacts have everything they need about your game to pass on to their audiences.

Young Horses
Figure 3. The online press kit for Young Horses, developers of Octodad*, built using presskit.

YouTube* Gamers/Streamers

YouTube gamers and streamers have become vitally important communication channels. The most popular influencers can be an elusive bunch, even when they’re being paid, offering few guarantees of coverage. However, contacting as many relevant influencers as you can is worth your time and effort, as you might strike gold. People with smaller followings, but with an affinity for your game, can also prove valuable in reaching your niche. It’s worth looking at tools such as Keymailer that can do a lot of the heavy lifting for you.

Calls-to-action: Ask YouTube gamers/streamers to direct their viewers to your lead social-media channel.

Keymailer
Figure 4. Check out Keymailer and similar services to see if they can help reach video influencers.

Data Management

Social-media platforms keep your contacts for you, but you need to store other contacts yourself. Google Sheets* is a good place to start, and be sure to keep it organized, updated, and clean. From there you can export email addresses to email marketing software such as MailChimp.

Sparking Interest: Direct Relationships

Awareness becomes interest through the information you share, the conversations you have, and the relationships you foster. Your aim is to turn every single contact whose data you have collected into a lasting relationship. The key to this is the value exchange. Put yourself in your leads’ shoes—if you want them to interact with you, they need a good reason to give you their precious time.

Value comes in many shapes and forms. Here are a few examples:

  • Exclusive content: Offer your leads something they can’t get anywhere else. This could be new information (which also acts as social currency), or it could be something more tangible, such as merchandise.
  • Developer access: Game fans love a direct line to the developers, and to feel that their opinion is valued. Make yourself available, and respond, personally, to as many people as you can. Tell your story—everyone loves a good story.
  • Closed beta/early access: Letting them try out an early version of your game is one of the most powerful ways you can directly engage with your leads. If the experience is positive, it also drives advocacy (more on that soon).

Twitter feedback
Twitter feedback
Figure 5. Independent developer Fourattic responds to the majority of Twitter feedback on Crossing Souls*, and works to give value to its community.

If you really want to stand out, you can go further and create a unique community experience for your leads. Developer No More Robots did this for its upcoming game Descenders, creating a custom meta-game for beta leads on the game’s official Discord channel. On joining a particular team, fans become a part of a special community with access to a private Discord channel, events, and exclusive prizes.

Creating Advocates

If you give your leads value, respond when they call, and feed them great content, not only is it going to be much easier to persuade them to buy your game, but there's a good chance they will become advocates.

The loyalty ladder is a similar marketing model to the sales funnel—it’s concerned with the journey from lead to customer, to repeat customer (client), and finally to the top of the ladder, where they become an advocate. This essentially means that they like your game so much, they tell their friends. (Advocacy applies equally to the pre-launch phase, too.)

This is why driving word-of-mouth—virality, in other words—is such a preoccupation of marketers. While we can’t make people tell their friends, we can maximize the chances of it happening. The useful “STEPPS” model for this was proposed by professor and author Jonah Berger, in his book Contagious.

STEPPS
Figure 6. Jonah Berger’s STEPPS model for driving word-of-mouth viral communication.

Email Marketing

According to research, email is today’s consumers’ preferred way to receive marketing messages. You’ll find email marketing a cheap and easy way to stay in touch with thousands of followers, and send them enticing content, offers, and calls-to-action. Build your email database by incentivizing people to opt-in at every opportunity—on your website, social-media profiles, videos, and at events. You can also buy or rent email addresses, although it’s a much better idea to build your own list. This blog from online marketing agency HubSpot provides some useful tips.

Use an automated tool such as AWeber or MailChimp to send great-looking and engaging monthly newsletter emails to your subscriber base of potential customers. Take some time to understand how to use the powerful analytical tools for tracking and monitoring your subscribers’ behavior, and improving your emails’ open rates, click-throughs, and read times.

Community Management

Gaming communities of fans live in a number of different places online, from Facebook and Twitter, to Reddit and specialist forums. You don’t need to be proactive on all of them, but you do need to make sure you feed the ones that are most important to you. You also need to monitor the rest for mentions of your game.

Whenever an opportunity arises to insert yourself into a conversation, do it. More often than not, people love hearing from a game’s development team. Just remember to always keep a virtual smile on your face—never respond in anger. If you’re not sure your response is the right one, park it, ask someone else, and come back to it later. Remember, once posted, anyone can see what you write, and every one of those people is a potential customer.

Using Data and Analytics

Gathering masses of data on your leads, fans and followers, and then not doing anything with that information is an easy situation to fall into—either through lack of time or just not knowing where to start. Nearly every marketing platform from MailChimp to Facebook provides great data-analysis tools and advice, and you will benefit by planning time each week to review what’s happening with your fan base and to make notes. Only by doing that can you see what works, what doesn’t, and figure out what you need to do next.

MailChimp
Figure 7. Example of MailChimp* email marketing analytics on its reporting dashboard.

Decision Time: Influencing Choice

One key thing to remember—that seems almost counter-intuitive in this era of hyper-choice—is that marketing wants to make our decisions easier, and it does this by reducing the choices we’re faced with.

The way we do this depends on the situation. If it’s a first-time purchase—for example, a brand-new game IP—your potential customers are going to be looking at what other people are saying. Before the Internet opened the floodgates on user reviews and YouTube gamers started earning six-figure salaries, specialist game-journalists were the real influencers. They still have an important role (not least when it comes to reviews on Metacritic, which has become a go-to yardstick for game quality) and often have very large or specialized audiences.

The power of traditional media, however, has been unseated by user reviews and video influencers. We now tend to favor the opinions of people who are just like us. You can’t control what people write and say in reviews and streams, but, if you have a good relationship with as many fans, followers, and influencers as possible, you can build a groundswell of positivity for your game. 

We love hype trains. The phenomenon of social proof states that people imitate others because it’s the right thing in a given social context. This means that buzz—good or bad—snowballs. That buzz can manifest in the form of reviews or posts on social media. If you’ve done your relationship marketing well and the game is good, the chances are better that the buzz will be good and word will spread.

Spurring Action: Buy Your Game

If you’ve turned leads into relationships, pushed the right buttons to help their decision-making, and your game is as good as you think it is, then this part should be a shoo-in.

Sales promotion is the general name for marketing tactics that are concerned with nudging people to make that all-important purchase. This includes pre-order incentives, discounts, value-add deals, and point-of-sale retail marketing. The latter is channel-partner marketing, and covers everything from visibility in physical retail stores to being featured on the front page of Steam or GOG.com at launch, or during a sale. Physical retail marketing is an expensive business; but when it comes to online stores, if you have good interest in your game and can get a foot in the door, then you stand a chance of grabbing some of that valuable front-page real estate.

Green Man
Figure 8. Sales promotion in action during the Green Man Gaming 2017 winter sale.

Another marketing model that comes into play between release of your game(s) and its/their various versions is the loyalty loop. When the game is a sequel and the experience with the previous game was a good one, the decision is already easier for the buyer. Once invested in the brand, they want more—so, when it comes to a share-of-wallet buying decision there’s a good chance they’ll pick your game.

Indie Game
Figure 9. Illustration of the “loyalty loop” where customers turn into repeat customers.

In practice, the loyalty loop takes customers from the bottom of the sales funnel and reinserts them back at the decision stage the next time they’re triggered to purchase. With gaming, that trigger can be any of your marketing activities, from receiving an email or reading a social-media post, either about the sequel or about a new game that you are releasing. Your ongoing relationship with each customer will influence how long they stay with you as a player and whether or not they become interested in your subsequent product releases.

Aside from the experience with your game, loyalty is encouraged by factors including after-sales service, availability of relevant information in a knowledge base, FAQs or forums, and sustained communication through social media and email.

Every customer relationship is a long-term commitment. Stay on the best possible terms with every one of your customers, fans and followers, so they’re always ready to invest in your next DLC or game. You never know when you’ll need them again.

Look after your customers, treat them with respect, and they will look after you.

Glossary

  • Email open rate: The percentage of recipients who open the email you’ve sent.
  • Email click-throughs: The percentage of recipients who click on a link in your email.
  • Email read times: The amount of time a recipient spends reading your email.

Resources

Intel® Developer Zone (Intel® DZ): https://software.intel.com/en-us

Get Noticed: Packaging Your Indie Game


Packaging your game in the age of downloads and streaming doesn’t end with a cardboard box on a retail shelf. Standing out in today’s competitive and crowded PC games market takes a carefully coordinated plan for telling your story and promoting your game, in effect packaging it. Whether that packaging is physical or digital, its role is to attract the attention, interest, and desire of potential buyers.

Because indie developers have access to the same channels used by established studios — the web, social media, YouTube*, and retail shelves — physical and digital packaging is neither difficult nor expensive. The key to getting the most out of each channel isn’t about how much money you spend. Rather, it’s the time and effort you put into knowing your intended audience and telling your game’s story in ways that clearly communicate what makes it worth playing.

Depending on where you are in the development cycle, you may already have many of the necessary elements — a logo, screenshots of gameplay, gameplay trailer videos, and a playable demo. And what you don’t have isn’t too difficult or expensive to create.

This guide describes how to get your game in front of its intended audience, including:

  • What elements your website should include.
  • What to put on the game’s cardboard box.
  • Tips on writing copy, producing trailers, and making screenshots.
  • Tips on designing logos.
  • How to prepare your text and graphics for printing on the cardboard box.

You won’t encounter any rules or one-size-fits-all solutions for promoting and packaging your game, so be creative, use your best judgment, and get feedback when in doubt. Your mileage can and will vary.

Telling Your Story

If you’ve been strategizing about how to start conversations and build relationships within the gaming community with the goal of enticing people to buy your new creation, congratulations. You’ve taken the first step toward marketing your game, and those activities will feed directly into your game’s packaging. Including pathways for players to give constructive feedback via your digital packaging can help you refine gameplay, making your game more appealing to potential and future players/customers.

If you haven’t already, start by identifying anything about your game that sets it apart from others. Whether it’s the way you coaxed an off-the-shelf game engine to do something it wasn’t supposed to be able to do, your visual style, the soundtrack, sound effects, or gameplay — if it stands out as different, work it into your story.

Reading the story out loud should take about 30 seconds. Introduce your story to people whose opinions matter to you, people who don’t pull punches. Ask other developers, or better yet, find your worst critics and get their feedback. Iterate and refine the story until you’re happy with the result. For marketing and packaging purposes, let it serve as the foundation of your game’s story.

When applied to your packaging online or on a box, your story must reinforce your game’s selling points with words and a tone informed by:

  • Your intended audience demographic:
    • Are they mostly male or female? A story can appeal more deeply to boys, for example, than to girls.
    • What are their age group(s)?
    • What kind of lingo will appeal to them?
    • How will their nationality or geo-location influence how you talk about your game? Jokes that play well in one locale may not go over in others.
  • The game’s genre.
  • The time period in which the game takes place (if applicable).

Make your game’s story personable by talking directly to your desired audience and referring to them as “you.”

All the other elements of your game — visual and sonic style, color palette, the website and in-game text font(s), and even the words that describe your game’s components — should be informed by the factors listed above.

Essentially, your game’s story is your brand. Be consistent in how you tell it. Use the same color palette, fonts, and wording and phraseology wherever you’re promoting and packaging your game. See the “Branding Checklist” sidebar.

Branding Checklist

Big companies go to great lengths to document where and how their trademarks and logos can be used, often defining the web and print colors, grammar style guides, and the spacing dos and don’ts.

You won’t need that level of detail, but you’ll save time by creating a brand and style guide that documents your color palette and fonts, the spelling of unusual character names, and game-centric jargon to ensure consistent use of those elements.

If more than one person is writing copy, create a style guide that covers grammar issues—how and when to use certain kinds of punctuation, game lingo, and so on. Consistency is key to building relationships and an instantly recognizable identity. The more people who are involved in writing your copy, the more important it is to maintain a consistent voice.

Here’s a sample style template:

Search the web to find free and inexpensive branding and style-guide creation tools to jumpstart the process.

Content Building Blocks

Like assets for your game engine, you need promotional assets that can be used to package and present your game. Create and store promotional assets so they can be used like building blocks. If you sign a new distribution deal to a streaming game service, you’ll be able to reach into your asset library and drop the content you need into their Content Management System (CMS) template.

The building blocks below, with the exception of music and sound clips, are must-haves.

A Logo

Once the game is named, it’ll need a logo—an image that instantly communicates what the game is … and isn’t. Logos should be readable and visually aligned with the game’s genre. Because the logo is the cornerstone of your game’s (brand) identity, once the logo is finalized, don’t change it unless you have an excellent reason. If the logo continually changes, who will know it’s the same game?


Crisp, clean, and easy to read, Vertigo Games’ Arizona Sunshine* logo communicates the name of the game and also hints at its content through the color-coded Z for zombies.

Practical matters: Create a vector version of your logo so it easily scales to any screen size. Also create a version with a transparent background so that background elements aren’t deleted when you place the logo onto an image, over text, or in a video.

Screenshots and Graphics

Take several gameplay screenshots. Focus on things like epic battle scenes, monsters, vehicles, puzzles, and anything else that will grab attention. A picture is worth a thousand words, so use screenshots to emphasize the best things in your game.

A hero graphic — an iconic image taken from or inspired by gameplay — is essential. Established studios often use illustrations or ultra-high-resolution renders from their graphics program rather than their game engine. Such images exemplify all that the game represents — action, fun, cool puzzles, and so forth. Use a hero graphic on your homepage, download or streaming landing pages, and on the cardboard box if you’re distributing via retail stores. Hero graphics should be powerful enough to draw your audience in and entice them to keep reading below the fold of your homepage, or to interest them enough to turn the box over to learn more about your game.

Practical matters: Computer screens and printed materials require different resolutions and different color spaces. For online use, 72-dpi resolution is plenty. For printed materials, use a minimum resolution of 300 dpi. On-screen graphics need to be in RGB (red, green, blue) color space. Printed materials need to be in CMYK (cyan, magenta, yellow, black) color space.

Treat your promotional assets like you treat your game assets—keep them organized by naming files in a consistent, easy-to-remember manner. Saving images named “Screenshot 2017-11-09 12.30 PM” won’t make them easy to find when uploading game graphics to a new streaming service site or updating your Facebook* page.
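A consistent scheme can even be applied retroactively with a short shell loop. The sketch below renames ad-hoc capture-tool screenshots into a predictable pattern; the game name "mygame", the /tmp directory, and the sample filenames are all hypothetical examples:

```shell
#!/bin/sh
# Sketch: bulk-rename ad-hoc screenshots into a consistent, easy-to-find
# naming scheme. "mygame" and the sample filenames are hypothetical.
set -e
rm -rf /tmp/assets && mkdir -p /tmp/assets && cd /tmp/assets

# Simulate the messy default names a capture tool produces.
touch "Screenshot 2017-11-09 12.30 PM.png" "Screenshot 2017-11-10 09.15 AM.png"

i=1
for f in Screenshot*.png; do
  mv "$f" "$(printf 'mygame_screenshot_%02d.png' "$i")"
  i=$((i + 1))
done

ls mygame_screenshot_*.png
```

The same idea applies to logos, trailers, and hero graphics: one predictable prefix per asset type makes uploads to a new storefront or CMS painless.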

Music and Sound

If your game has a soundtrack, consider adding it to your promotional toolkit. Upload audio clips to a streaming music service and embed its audio player in your website. Choose audio excerpts that emphasize the best your game has to offer. The audio clips don’t have to be bombastic. Ambient sound characterizes many successful AAA titles. If atmospheric audio sets the game’s mood, use it to package your game.

Practical matters: If you embed an audio player on your website, don’t set it to automatically play. Also, make it easy for the user to turn the volume down or off. You don’t want to get a potential customer in trouble because they visited your site at work and your ultra-cool soundtrack let everyone else in the room know what they were doing.

Text Blocks

Consider creating three versions of your game’s story:

  • Short—30 words or less
  • Medium—50 words or less
  • Long—85 words or less

These different lengths provide flexibility. Each version could consist of one concise sentence followed by bullet points that emphasize the game’s primary selling points. Longer versions could add more bullet points or include another sentence or two. The key is to create short, fun, and easy-to-read blocks of text that can be plugged into a content template or design.

Other important text blocks include:

  • The game’s vital statistics—the number of players, their age range, and playing time.
  • Information about your team, its history, and other games you’ve created. Keep the team’s story short—no more than 200 words—and break the story into sections using headers such as The Team and Our History.
  • Unless you created the game yourself, create a ready-to-use list of credits.
  • Legal indicia, including a copyright notice, trademark info, “created with” licensing info, and so on.
  • For boxed games on retail shelves, include minimum system requirements and the logos of the operating system(s) your game runs on.

Trailer Videos

Game trailers, like Hollywood movie trailers, are an extremely effective way to attract attention and pique the interest of potential players. Whether you produce one trailer or release new trailers throughout your development process, keep them focused on promoting what makes your game a blast to play. See the “Tips for Creating a Compelling Trailer Video” sidebar for details.

Post your trailer(s) on YouTube or go live on YouTube Gaming to showcase features that make your game stand out.

Post gameplay videos on your YouTube channel to showcase your game. Keep the videos focused on action that entices people to want to learn more about your game or—better yet—buy it to play it.

Some Assembly Required

Your growing collection of promotional assets—including the game’s retail packaging—can be used on your company website, a standalone website dedicated solely to your game, landing pages on streaming service sites that carry your game, your Facebook page, or your Twitter* feed.

For inspiration, we’ve included a few examples of how your promotional assets can be used.


Notice the calls-to-action in the navigation options along the top of the page and in the box beneath the hero graphic that dominates this portion of the Arizona Sunshine website.

Standalone Game Website

Your homepage ingredients can include:

  1. Hero graphic
  2. Logo
  3. Copy block(s) for your game’s story
  4. Call-to-action links to download/buy/play a demo and read reviews of your game as applicable
  5. Screenshots (presented as stills or in a slideshow)
  6. Trailer(s)
  7. News—Link to new builds, new reviews, new characters, new levels, or anything that will generate excitement, keep your game’s momentum, and create the impression that people are playing and enjoying your game
  8. Events—Places where people can try your game live and meet you or hear you speak
  9. Feedback—A way for people to offer suggestions and comment on your game, both while you’re still building it and after it’s released
  10. Links to your game and reviews of your game on third-party sites and forums
  11. Blog
  12. Links for others to share your site on social media sites

Your website’s homepage design objectives are to attract attention, generate interest, and make it easy for people to take action (read reviews, play a demo, watch trailers, and purchase the game).


The game’s story—located below the fold so people need to scroll down to see it—is told in a paragraph, followed by its selling points highlighted in short blocks of text.

For design inspiration, visit these sites:

Webdesign-Inspiration

Design Shack


On the Arizona Sunshine landing page on Steam*, the hero graphic is actually a window for playing the currently selected game trailer or for displaying gameplay screenshots. The game’s story is told in the copy block on the right beneath the logo and tagline for new content.

Landing Pages

Game publisher sites and streaming service sites use landing pages to promote individual games or games from a specific studio. If you plan to distribute your game through a publisher or streaming service, adapt your promotion materials to the publisher’s content guidelines and requirements.

Essential landing page ingredients typically include:

  1. Hero graphic
  2. Logo
  3. Copy block(s) of your game’s story
  4. Screenshots
  5. List of supported operating systems
  6. Call-to-action links (start/download/buy)


An assortment of gameplay screenshots available on Amazon.com* clearly communicates that Arizona Sunshine puts players in the middle of the zombie apocalypse. Fun!

Your Game in a Box

In-store packaging needs to physically conform to certain requirements dictated by retailers. The overall size of the box should match boxes for similar games so they can be displayed together on the same shelf. Consult your distributor or retailer for further guidance.

If you can afford it, hire an experienced package designer. In retail settings, your box may be the first and only thing a potential player sees before deciding whether to buy your game. If you design the box yourself, check out graphic design applications for pre-built templates that can assist in jumpstarting the process.

For printed packaging, you need:

  • Hero graphic
  • Logo
  • Copy block(s) of your game’s story
  • Screenshots that clearly communicate what gameplay is like
  • Number of players, their age range, and playing time
  • List of supported operating systems (or just include their logos)
  • Minimum system requirements
  • Universal Product Code (UPC)

A. Front of the box design objective: Entice potential buyers to pick up the box and read the details listed on the back of the box.

  1. Hero graphic
  2. Logo tagline
  3. List of supported operating systems (or just include their logos)
  4. Number of players, ages, time-to-play icons (optional)

B. Sides of the box design objective: Call attention to the game when it’s on a shelf.

  1. Logo
  2. Publisher info and logo
  3. Number of players, ages, time-to-play icons

C. Back of the box design objective: Communicate that what’s inside is worth buying.

  1. Logo
  2. Copy block for your game’s story
  3. Screenshots that clearly communicate what gameplay is like
  4. Number of players, their age range, and playing-time icons
  5. Credits
  6. List of supported operating systems (or just include their logos)
  7. Minimum system requirements

D. Top and bottom edge of the box design objective: Identify the game if the box is lying flat. This should include the logo or name of the game.

Summary

The key to getting your game noticed among the 4,000 new games being released every year is knowing your intended audience and using carefully crafted words, pictures, trailers, videos, and demos to clearly communicate what makes your game worth playing. And make it clear where they can buy your game!

Tips for Creating a Compelling Trailer Video


The launch trailer for Arizona Sunshine as seen on the game developer’s homepage. Notice the call-to-action, “Stay alive, get the updates” (by subscribing to the email list).

Armed with video-editing software, background music, and a set of video captures and gameplay screenshots, you should be able to assemble a compelling trailer video that gives viewers a lasting impression of how entertaining your game is to play.

When capturing gameplay video, if possible, mute the background music before capturing the video, but leave the sound effects and dialogue tracks turned on. The former will make creating smooth transitions easier; the latter will help propel the action.

When capturing your gameplay video, capture the highest resolution supported by your game engine to make the best possible impression. Video-editing software usually allows for several kinds of transitions. Most are great for wedding videos, but you don’t need them for game trailers. Stick with straight cuts (jumping from one scene to the next instantly) or sparingly use dissolves and other transitions.

For background sound, write music that’s tailored to match your trailer video, or use a portion of your game’s soundtrack and cut (edit) the trailer to it. Another option is to loop a portion of your soundtrack, assuming the music lends itself to being looped.

Make sure the tempo and mood of the chosen background music matches the game’s mood. Slow and atmospheric are fine for a moody mystery game but not for an action-packed, first-person shooter.

Finally, be sure to end your game’s trailer video with a call-to-action. Don’t simply say, “buy it now” and show your website link. Instead, announce the release date and state where more info can be found or how to download it.

Resources

Brand Consistency and Packaging Considerations

7 Branding Tools to Effectively Establish Your Brand

How to Make an Indie Game Trailer With No Budget

Packaging Your Game so Stores Can, Y’know, Sell It

Installing the Intel® Computer Vision SDK on Linux*



The installation package comes as an archive that contains components and installation scripts.

These instructions describe:

  • Installation prerequisites
  • Pre-installation steps
  • Two installation options: Using a graphical user interface or a command-line script
  • Caffe* Installation

Prerequisites

The following are required to install all components and enable the full set of the Intel® Computer Vision SDK (Intel® CV SDK) features:

  • CMake* version 2.8 or higher
  • GCC* version 4.8 or higher
  • OpenCV version 3.4
  • Python* version 3.5 or higher

In addition, install the Caffe framework if you will use advanced functions of the Deep Learning Deployment Toolkit Model Optimizer. 

The steps in the next section direct you to use a bash script to install:

  • Dependencies for Intel-optimized OpenCV 3.4, the Inference Engine, and Model Optimizer tools
  • The OpenCL™ 2.0 GPU/CPU driver package for Linux*, required for applications that offload computation to your Intel® GPU.

Pre-installation Steps

Use these steps to prepare your development machine for the Intel CV SDK software.

  1. Create a new directory. This example uses a directory called install:
    mkdir ~/install
  2. Go to the new directory. Move the downloaded archive package to this directory and unpack the archive. This example uses the Downloads directory:
    cd ~/install
    mv ~/Downloads/intel_cv_sdk_<version>.tgz .
    tar -xavf intel_cv_sdk_<version>.tgz
  3. Go to the directory with the unpacked files. This is intel_cv_sdk_<version>:
    cd intel_cv_sdk_<version>
  4. Run install_cv_sdk_dependencies.sh to install the external dependencies. These dependencies are the packages required for Intel-optimized OpenCV 3.4, the Inference Engine, and the Model Optimizer tools:
    ./install_cv_sdk_dependencies.sh
  5. For applications that offload computation to your Intel® GPU, the OpenCL™ 2.0 GPU/CPU driver package for Linux* is required. To install the driver, run the install_OCL_driver.sh script from the installation package directory. 
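The unpack logic in steps 1-3 can be collected into a single script. The sketch below substitutes a dummy archive so the commands can be followed end to end; in practice, use the real intel_cv_sdk_<version>.tgz package rather than the hypothetical 2018.0.0 version number and /tmp paths:

```shell
#!/bin/sh
# Sketch of pre-installation steps 1-3 with a stand-in archive.
# The version number 2018.0.0 and the /tmp paths are hypothetical.
set -e
rm -rf /tmp/downloads /tmp/install

# Stand-in for the package downloaded to ~/Downloads.
mkdir -p /tmp/downloads/intel_cv_sdk_2018.0.0
touch /tmp/downloads/intel_cv_sdk_2018.0.0/install_cv_sdk_dependencies.sh
tar -czf /tmp/downloads/intel_cv_sdk_2018.0.0.tgz -C /tmp/downloads intel_cv_sdk_2018.0.0

# Step 1: create a working directory.
mkdir -p /tmp/install
# Step 2: move the archive there and unpack it.
cd /tmp/install
mv /tmp/downloads/intel_cv_sdk_2018.0.0.tgz .
tar -xzf intel_cv_sdk_2018.0.0.tgz
# Step 3: enter the unpacked directory (steps 4-5 would run the
# dependency and driver scripts from here).
cd intel_cv_sdk_2018.0.0
ls
```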

Using the GUI Installation Wizard

  1. After you complete the pre-installation steps, run install_GUI.sh to start the GUI-based installation wizard:
    ./install_GUI.sh
  2. Select an elevation mode, and then follow the on-screen instructions.
  3. The Prerequisites screen tells you if you are missing any required or recommended components, and the effect the missing component has on installing or using the product. If you are missing a critical component, resolve the issue and then click Re-check.
  4. When all critical issues are resolved, you will be able to click Next to begin the installation, or make final changes to your component selections and choose your installation directory.
  5. The Installation summary screen shows you the options that will be installed if you make no changes.
  6. If you want to change the selected components and/or specify an installation destination directory, click Customize.
  7. A Complete screen indicates the software is installed. Click Finish to close the wizard and open the Getting Started page, which contains information to get you started quickly with Intel CV SDK.

Installing from an Installation Script

If you want to install the Intel CV SDK from a command line after you're done with the pre-installation steps, run ./install.sh script and follow the on-screen instructions.

The install.sh options are:

-h, --help - display help message

-v, --version - display version information

-s, --silent [FILE] - run setup silently using settings in the specified configuration file instead of from screen prompts

-d, --duplicate [FILE] - run setup interactively, and record the user input in the configuration file

-l, --lang - set user interface language

-t, --tmp-dir [DIRECTORY] - specify a temporary directory

-D, --download-dir [DIRECTORY] - specify a download directory

--user-mode - run setup with the current user privileges

--ignore-signature - skip signature validation

--ignore-cpu - skip the CPU model check

--nonrpm-db-dir [DIRECTORY] - set a directory for the installation database

--SHARED_INSTALL - install to a network-mounted drive or shared file system. This is typically for multiple users

During the installation you can:

  • Choose the elevation mode for installation:
    • root
    • sudo
    • current user
  • Chose the installation directory:
    • Default directory for root user is /opt/intel/computer_vision_sdk_<version>
    • Default directory for non-root user is ~/intel/computer_vision_sdk_<version>
  • Customize components to install.

Your installation is complete. Continue with Getting Started for information to help you get started quickly with the Intel CV SDK.

What You Installed

The installation created this directory structure:

install_dir: Intel CV SDK root
/bin
/deployment_tools: Deep Learning Deployment Toolkit
  /inference_engine: Inference Engine
  /model_optimizer: Model Optimizer
  /model_optimizer_caffe: Model Optimizer for Caffe* (executable binary and required dependencies)
  /model_optimizer_mxnet: Model Optimizer for MXNet* (Python source files)
  /model_optimizer_tensorflow: Model Optimizer for TensorFlow* (Python source files)
/documentation: Documentation
/install_dependencies
/opencv: OpenCV
/openvx: OpenVX
/uninstall

In addition to these directories, a symlink to the installation directory is added to the /opt/intel/ directory. If you install multiple versions of the Intel CV SDK, the versions are installed side-by-side and the symlink /opt/intel/computer_vision_sdk points to the latest version.
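The symlink arrangement can be illustrated with plain shell commands. This sketch recreates it under /tmp; the version numbers and paths are hypothetical:

```shell
#!/bin/sh
# Sketch: side-by-side versioned installs plus an unversioned symlink,
# mirroring /opt/intel/computer_vision_sdk. All paths are hypothetical.
set -e
rm -rf /tmp/intel_demo
mkdir -p /tmp/intel_demo/computer_vision_sdk_2017.1.0
mkdir -p /tmp/intel_demo/computer_vision_sdk_2018.0.0

# ln -sfn replaces any existing link, so after each new install the
# symlink can simply be repointed at the latest version.
ln -sfn computer_vision_sdk_2018.0.0 /tmp/intel_demo/computer_vision_sdk

readlink /tmp/intel_demo/computer_vision_sdk   # prints: computer_vision_sdk_2018.0.0
```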

Installing the Caffe* Framework

The Caffe* deep learning framework is required if your deployed Caffe topology has layers that are not implemented in the Model Optimizer and you do not register them as "Custom Operations".

To build a Caffe* framework with Python 3.5:

  1. Use the commands:
    export CAFFE_HOME=PATH_TO_CAFFE
    cd $CAFFE_HOME
    rm -rf  ./build
    mkdir ./build
    cd ./build
    cmake -DCPU_ONLY=ON -DOpenCV_DIR=<your_opencv_install_dir> -DPYTHON_EXECUTABLE=/usr/bin/python3.5 ..
    make all # also builds pycaffe
    make install
    make runtest # optional
  2.  Add the Caffe Python directory to PYTHONPATH to import it from the Python program:
    export PYTHONPATH=$CAFFE_HOME/python:$PYTHONPATH
  3. Check the Caffe installation:
    python3 -c "import caffe"

If Caffe is installed correctly, the Caffe module is imported without errors.
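The export in step 2 works because Python searches each colon-separated PYTHONPATH entry in order. The sketch below demonstrates the mechanism with a throwaway module standing in for the pycaffe bindings (the module name and /tmp paths are hypothetical):

```shell
#!/bin/sh
# Sketch: how the PYTHONPATH export makes a module importable.
# "fakecaffe" is a dummy stand-in for the caffe Python bindings.
set -e
rm -rf /tmp/caffe_demo
mkdir -p /tmp/caffe_demo/python
printf 'VERSION = "demo"\n' > /tmp/caffe_demo/python/fakecaffe.py

# Entries must be colon-separated; a semicolon would end the export
# command and leave the directory out of Python's search path.
export PYTHONPATH=/tmp/caffe_demo/python:$PYTHONPATH

python3 -c 'import fakecaffe; print(fakecaffe.VERSION)'   # prints: demo
```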

Next Steps

Your installation is complete. Return to the Getting Started page for information to get started quickly.

Intel® Computer Vision SDK Overview


The Intel® Computer Vision SDK (Intel® CV SDK) is a comprehensive toolkit that you can use to develop and deploy vision-oriented solutions on Intel platforms. You can use the Intel® CV SDK in things like autonomous vehicles, digital surveillance cameras, robotics, and mixed-reality headsets.

The figure below shows the end-to-end computer vision application process. Some of the components in the figure are not part of the Intel® CV SDK, but are included in the diagram to illustrate the full end-to-end computer vision process. 

Boxes that display the deep learning workflow

After your model is trained, you are ready to use the Intel® CV SDK in a developer environment to write an application and optimize its performance on Intel® hardware. In this part of the development, you use the Model Optimizer.

In addition to optimizing your model, in this development phase, you integrate the components within your application. Depending on your specific needs, you might use other tools in your application, such as the Intel® Media SDK.

If you need to retrain your model after you optimize it with the Intel® CV SDK, return to the tasks listed in the first two columns of the figure, retrain the model, and then run the Model Optimizer again.

Once you have used the Model Optimizer and integrated the applicable components, you are ready to deploy your application at the edge. This piece of the process uses the Inference component of the Intel® CV SDK. In this tutorial, deployment is represented by using the video provided with this tutorial instead of using data directly from a video camera or on a separate piece of hardware.

Product Contents

The Deep Learning Model Optimizer tool helps you deploy Convolutional Neural Networks. It's a cross-platform command-line tool that performs static model analysis and adjusts deep learning models for optimal execution on end-point target devices.

  • Deep Learning Inference Engine
  • Deep Learning Inference Engine Samples
  • Deep Learning Deployment Toolkit provides the Inference Engine and Model Optimizer tools as a separate package
  • Intel-optimized implementation of the Khronos* OpenVX* 1.1 API, extended with additional APIs and kernels, including support for Convolutional Neural Networks (CNN)
  • Pre-built and fully-validated community OpenCV* 3.3 binaries with additional fast and accurate Face capabilities

Intel® CV SDK Components

Model Optimizer

The Model Optimizer is the first of two key components of the Intel® CV SDK. The second key component is the Inference Engine.

The Model Optimizer is a cross-platform command-line tool that converts trained models into Intermediate Representations (IRs). In the process of optimization, the Model Optimizer:

  • Performs horizontal fusion of the network layers
  • Merges the network layers
  • Prunes unused branches in the network
  • Applies weight compression methods

Note: Layers require the use of a deep learning framework, such as Caffe*.
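As an illustration of weight compression, here is a minimal linear-quantization sketch (hypothetical; the Model Optimizer's actual compression methods may differ): float weights are mapped to 8-bit integers with a scale derived from their dynamic range.

```python
def quantize_int8(weights):
    """Map float weights to int8 using a symmetric scale (illustrative only)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored value is within one quantization step of the original.
```

The dynamic-range estimate that drives the scale is exactly what the learning stage described below collects per layer.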

Three-stage model processing

The Model Optimizer uses three-stage model processing to transform a deep learning network:

Learning stage:

  • Iteratively runs the networks on a set of input samples (typically, the validation set) and collects the network statistics on each layer output. This allows the Model Optimizer to estimate the dynamic range of all layer activations, weights and biases. This is required only if the target data type differs from the original data type that the network was trained with.
  • Reports the collected statistics for offline analysis, including the following metrics: min, max, standard deviation, mean, and percentiles (99%, 99.5%, 99.95%).
  • Builds an optimal configuration for the target precision network and creates an inference network converted to run in the target data type.
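The per-layer statistics listed above can be sketched in a few lines (a simplified stand-in, not the tool's implementation; the nearest-rank percentile used here may differ from the tool's exact method):

```python
import statistics

def layer_stats(activations, percentiles=(99.0, 99.5, 99.95)):
    """Summarize one layer's output samples: min, max, mean, stdev, percentiles."""
    data = sorted(activations)
    n = len(data)

    def pct(p):
        # Nearest-rank percentile (illustrative).
        k = max(0, min(n - 1, round(p / 100.0 * n) - 1))
        return data[k]

    return {
        "min": data[0],
        "max": data[-1],
        "mean": statistics.mean(data),
        "stdev": statistics.stdev(data),
        **{f"p{p}": pct(p) for p in percentiles},
    }

samples = [0.1 * i for i in range(1000)]  # stand-in for collected layer outputs
stats = layer_stats(samples)
```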

Feedback stage:

  • Simulates the inference process to estimate any potential accuracy loss. Each layer in the produced network has a bit accurate implementation for a specific Intel® platform, which simulates both the mode of operation of the hardware, and the required data type precision.
  • Reports the network performance in terms of accuracy and loss. These metrics are identical to those that would have been reported using the dedicated scoring API (for example, OpenVX*, Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN), and so on).

Deployment stage: The output provided by the Model Optimizer is an Intermediate Representation (IR) of the network. The IR is an input into the Inference Engine. The IR consists of two files:

  • A topology file - an XML file that describes the network topology
  • A trained data file - a .bin file that contains the weights and biases binary data
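To make the two-file layout concrete, the sketch below builds and reads a hypothetical miniature IR (the XML layout and values here are invented for illustration): the topology is parsed with the standard XML parser, and the weights blob is decoded as little-endian float32 values.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical minimal topology in the spirit of an IR XML file.
topology_xml = """
<net name="example">
  <layers>
    <layer id="0" name="input" type="Input"/>
    <layer id="1" name="fc1" type="FullyConnected"/>
  </layers>
</net>
"""

# Hypothetical weights blob: four little-endian float32 values for "fc1".
weights_bin = struct.pack("<4f", 0.5, -1.0, 0.25, 2.0)

net = ET.fromstring(topology_xml)
layer_names = [layer.get("name") for layer in net.iter("layer")]
weights = struct.unpack("<4f", weights_bin)
```

In a real IR the .bin file holds all layers' weights and biases at offsets described by the topology file.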

Workflow

Use the following typical workflow diagram to see where the Model Optimizer fits in when performing inference on a trained deep neural network model.

[Figure: Intel Computer Vision basic workflow]

A brief description of using the Model Optimizer workflow is:

  1. Optional (required only if your model uses framework-specific layers): Configure the Model Optimizer for the deep learning framework that you used to train the model.
  2. Provide as input a trained network that contains the network topology, parameters, and the adjusted weights and biases.
  3. Run the Model Optimizer to perform specific model optimizations, such as horizontal fusion of specific network layers.

Inference Engine

The Inference Engine is the second of the two key components of the Intel® CV SDK.

The Inference Engine uses the Intermediate Representation files produced by the Model Optimizer and provides an optimized inference runtime for C++ applications on embedded platforms. The engine supports application execution with computational graph analysis, scheduling, and model compression.

The Inference Engine has:

  • A core library
  • Hardware-specific plugin libraries:
    • A plugin for Intel® Xeon® and Intel® Core™ processors with Intel® AVX2, and for Intel Atom® processors ("CPU Plugin")
    • A plugin for Intel® HD Graphics ("GPU Plugin")
    • A plugin for Intel® Arria® 10 discrete cards ("FPGA Plugin")
  • Third-party libraries

Use the following typical workflow diagram to see where the Inference Engine fits in when performing inference on a trained deep neural network model:

[Figure: Intel Computer Vision basic workflow]

Inference Engine Workflow

A brief description of the Inference Engine workflow is:

  1. Use the model as input. The model is in the form of Intermediate Representation (IR) that was produced by Model Optimizer.
  2. Optimize inference execution for target hardware.
  3. Deliver the inference solution to one or more embedded inference platforms.
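The plugin structure described earlier can be sketched as a simple device-to-plugin dispatch (all names here are illustrative and do not reflect the actual Inference Engine API):

```python
# Illustrative mapping of target devices to hardware-specific plugins.
PLUGINS = {
    "CPU": "cpu_plugin",    # Intel Xeon/Core with AVX2, Intel Atom
    "GPU": "gpu_plugin",    # Intel HD Graphics
    "FPGA": "fpga_plugin",  # Intel Arria 10 discrete cards
}

def load_plugin(device):
    """Return the plugin name for a device, or raise for unsupported targets."""
    try:
        return PLUGINS[device]
    except KeyError:
        raise ValueError(f"No plugin for device: {device}")

plugin = load_plugin("CPU")
```

The same application code can then target different hardware by changing only the device name it passes in.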

Intel® Trace Analyzer and Collector Release Notes for Windows* OS


Overview

Intel® Trace Analyzer and Collector is a powerful tool for analyzing MPI applications, which essentially consists of two parts:

  • Intel® Trace Collector is a low-overhead tracing library that performs event-based tracing in applications at runtime. It collects data about the application's MPI and serial or OpenMP* regions, and can trace custom sets of functions. The product is completely thread-safe and integrates with C/C++, Fortran, and multithreaded processes with and without MPI. Additionally, it can check for MPI programming and system errors.
  • Intel® Trace Analyzer is a GUI-based tool that provides a convenient way to monitor application activities gathered by the Intel Trace Collector. You can view the desired level of detail, quickly identify performance hotspots and bottlenecks, and analyze their causes.

To receive technical support and updates, you need to register your product copy. See Technical Support below.

What's New

Intel® Trace Analyzer and Collector 2018 Update 2

  • User Interface improvements.
  • Deprecated ITC static libraries on Windows.

Intel® Trace Analyzer and Collector 2018 Update 1

  • Fix for the --summary CLI option.
  • Performance improvements in Imbalance Diagram building.

Intel® Trace Analyzer and Collector 2018

  • MPI Performance Snapshot is no longer a part of Intel Trace Analyzer and Collector and is available as a separate product. See http://www.intel.com/performance-snapshot for details.
  • Removed the macOS* support.
  • Documentation is now removed from the product package and is available online.

Intel® Trace Analyzer and Collector 2017 Update 4

  • Bug fixes.

Intel® Trace Analyzer and Collector 2017 Update 3

  • Bug fixes.

Intel® Trace Analyzer and Collector 2017 Update 2

  • Enhancements in function color selection on timelines.

Intel® Trace Analyzer and Collector 2017 Update 1

  • Added zooming support with a mouse wheel on timelines.
  • Deprecated support for the ITF format.

Intel® Trace Analyzer and Collector 2017

Key Features

  • Advanced GUI: user-friendly interface, high-level scalability, support of STF trace data
  • Aggregation and Filtering: detailed views of runtime behavior grouped by functions or processes
  • Fail-Safe Tracing: improved functionality on prematurely terminated applications with deadlock detection
  • Intel® MPI Library Interface: support of tracing on internal MPI states, support of MPI-IO
  • Correctness Checking: check for MPI and system errors at runtime (including distributed memory checking)
  • ROMIO*: extended support of MPI-2 standard parallel file I/O
  • Comparison feature: compare two trace files and/or two regions (in one or two trace files)
  • Command line interface for the Intel Trace Analyzer

System Requirements

Hardware Requirements

  • Systems based on the Intel® 64 architecture, in particular:
    • Intel® Core™ processor family
    • Intel® Xeon® E5 v4 processor family recommended
    • Intel® Xeon® E7 v3 processor family recommended
    • 2nd Generation Intel® Xeon Phi™ Processor (formerly code named Knights Landing)
  • 1 GB of RAM per core (2 GB recommended)
  • 1 GB of free hard disk space

Software Requirements

  • Operating systems:
    • Microsoft* Windows Server* 2008, 2008 R2, 2012, 2012 R2, 2016
    • Microsoft* Windows* 7, 8.x, 10
  • MPI implementations:
    • Intel® MPI Library 5.0 or newer
  • Compilers:
    • Intel® C++/Fortran Compiler 15.0 or newer (required for OpenMP* support)
    • Microsoft* Visual Studio* Compilers 2013, 2015, 2017

Known Issues and Limitations

  • Tracing of MPI applications, which include the MPI_Comm_spawn function calls, is not supported.
  • Intel® Trace Analyzer may get into an undefined state if too many files are opened at the same time.
  • In some cases symbols information may appear incorrectly in the Intel® Trace Analyzer if you discarded symbols information from object files.
  • MPI Correctness Checking is available with the Intel® MPI Library only.

Technical Support

Every purchase of an Intel® Software Development Product includes a year of support services, which provides Priority Support at our Online Service Center web site.

In order to get support you need to register your product in the Intel® Registration Center. If your product is not registered, you will not receive Priority Support.

Intel® Trace Analyzer and Collector Release Notes for Linux* OS


Overview

Intel® Trace Analyzer and Collector is a powerful tool for analyzing MPI applications, which essentially consists of two parts:

  • Intel® Trace Collector is a low-overhead tracing library that performs event-based tracing in applications at runtime. It collects data about the application's MPI and serial or OpenMP* regions, and can trace custom sets of functions. The product is completely thread-safe and integrates with C/C++, Fortran, and multithreaded processes with and without MPI. Additionally, it can check for MPI programming and system errors.
  • Intel® Trace Analyzer is a GUI-based tool that provides a convenient way to monitor application activities gathered by the Intel Trace Collector. You can view the desired level of detail, quickly identify performance hotspots and bottlenecks, and analyze their causes.

To receive technical support and updates, you need to register your product copy. See Technical Support below.

What's New

Intel® Trace Analyzer and Collector 2018 Update 2

  • User Interface improvements.

Intel® Trace Analyzer and Collector 2018 Update 1

  • Fix for the --summary CLI option.
  • Performance improvements in Imbalance Diagram building.

Intel® Trace Analyzer and Collector 2018

  • Added support for OpenSHMEM* applications.
  • MPI Performance Snapshot is no longer a part of Intel Trace Analyzer and Collector and is available as a separate product. See http://www.intel.com/performance-snapshot for details.
  • Removed the macOS* support.
  • Removed support for the Intel® Xeon Phi™ coprocessor (code named Knights Corner).
  • Removed support for the indexed trace file (ITF) format.
  • Documentation is now removed from the product package and is available online.

Intel® Trace Analyzer and Collector 2017 Update 4

  • GStreamer* dependencies removal.

Intel® Trace Analyzer and Collector 2017 Update 3

  • Bug fixes.

Intel® Trace Analyzer and Collector 2017 Update 2

  • Enhancements in function color selection on timelines.

Intel® Trace Analyzer and Collector 2017 Update 1

  • Added zooming support with a mouse wheel on timelines.
  • Deprecated support for the ITF format.

Intel® Trace Analyzer and Collector 2017

  • Introduced an OTF2 to STF converter otf2-to-stf (preview feature).
  • Introduced a new library for collecting MPI load imbalance (libVTim).
  • Introduced a new API function VT_registerprefixed.
  • Custom plug-in framework is now removed.
  • All product samples are moved online to https://software.intel.com/en-us/product-code-samples.

Key Features

  • Advanced GUI: user-friendly interface, high-level scalability, support of STF and OTF2 trace data
  • Aggregation and Filtering: detailed views of runtime behavior grouped by functions or processes
  • Fail-Safe Tracing: improved functionality on prematurely terminated applications with deadlock detection
  • Intel® MPI Library Interface: support of tracing on internal MPI states, support of MPI-IO
  • Correctness Checking: check for MPI and system errors at runtime (including distributed memory checking)
  • ROMIO*: extended support of MPI-2 standard parallel file I/O
  • Comparison feature: compare two trace files and/or two regions (in one or two trace files)
  • Command line interface for the Intel Trace Analyzer

System Requirements

Hardware Requirements

  • Systems based on the Intel® 64 architecture, in particular:
    • Intel® Core™ processor family
    • Intel® Xeon® E5 v4 processor family recommended
    • Intel® Xeon® E7 v3 processor family recommended
    • 2nd Generation Intel® Xeon Phi™ Processor (formerly code named Knights Landing)
  • 1 GB of RAM per core (2 GB recommended)
  • 1 GB of free hard disk space

Software Requirements

  • Operating systems:
    • Red Hat* Enterprise Linux* 6, 7
    • Fedora* 23, 24
    • CentOS* 6, 7
    • SUSE* Linux Enterprise Server* 11, 12
    • Ubuntu* LTS 14.04, 16.04
    • Debian* 7, 8
  • MPI implementations:
    • Intel® MPI Library 5.0 or newer
  • Compilers:
    • Intel® C++/Fortran Compiler 15.0 or newer (required for OpenMP* support)
    • GNU*: C, C++, Fortran 77 3.3 or newer, Fortran 95 4.4.0 or newer

Known Issues and Limitations

  • Static Intel® Trace Collector libraries require Intel® MPI Library 5.0 or newer.
  • Tracing of MPI applications, which include the MPI_Comm_spawn function calls, is not supported.
  • Intel® Trace Analyzer may get into an undefined state if too many files are opened at the same time.
  • In some cases symbols information may appear incorrectly in the Intel® Trace Analyzer if you discarded symbols information from object files.
  • MPI Correctness Checking is available with the Intel® MPI Library only.
  • Intel® Trace Analyzer requires libpng 1.2.x (libpng12.so), otherwise the Intel Trace Analyzer GUI cannot be started.
  • Intel® Trace Analyzer and Collector does not support Fortran applications or libraries compiled with the -nounderscore option. Only functions with one or two underscores at the end of the name are supported. See details on Fortran naming conventions at https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gfortran/Naming-conventions.html

Technical Support

Every purchase of an Intel® Software Development Product includes a year of support services, which provides Priority Support at our Online Service Center web site.

In order to get support you need to register your product in the Intel® Registration Center. If your product is not registered, you will not receive Priority Support.

Intel® MPI Library Release Notes for Windows* OS


Overview

Intel® MPI Library is a multi-fabric message passing library based on ANL* MPICH3* and OSU* MVAPICH2*.

Intel® MPI Library implements the Message Passing Interface, version 3.1 (MPI-3) specification. The library is thread-safe and provides the MPI standard compliant multi-threading support.

To receive technical support and updates, you need to register your product copy. See Technical Support below.

Product Contents

  • The Intel® MPI Library Runtime Environment (RTO) contains the tools you need to run programs, including the scalable process management system (Hydra), supporting utilities, and dynamic libraries.
  • The Intel® MPI Library Development Kit (SDK) includes all of the Runtime Environment components and compilation tools: compiler wrapper scripts (mpicc, mpiicc, etc.), include files and modules, static libraries, debug libraries, and test codes.

You can redistribute the library under conditions specified in the License.

What's New

Intel® MPI Library 2018 Update 2

  • Improved shm performance with collective operations (I_MPI_SCHED_YIELD, I_MPI_SCHED_YIELD_MT_OPTIMIZATION).

Intel® MPI Library 2018 Update 1

  • Bug fixes.

Intel® MPI Library 2018

  • Deprecated support for the IPM statistics format.
  • Hard finalization is now the default.
  • Documentation has been removed from the product and is now available online.

Intel® MPI Library 2017 Update 4

  • Minor changes.

Intel® MPI Library 2017 Update 3

  • Minor changes.

Intel® MPI Library 2017 Update 2

  • Added an environment variable I_MPI_HARD_FINALIZE.

Intel® MPI Library 2017 Update 1

  • Support for topology-aware collective communication algorithms (I_MPI_ADJUST family).
  • Deprecated support for cross-OS launches.

Intel® MPI Library 2017

  • Support for the MPI-3.1 standard.
  • Removed the SMPD process manager.
  • Removed the SSHM support.
  • Deprecated support for the Intel® microarchitectures older than the generation codenamed Sandy Bridge.
  • Bug fixes and performance improvements.
  • Documentation improvements.

Key Features

  • MPI-1, MPI-2.2 and MPI-3.1 specification conformance.
  • MPICH ABI compatibility.
  • Support for any combination of the following network fabrics:
    • RDMA-capable network fabrics through DAPL*, such as InfiniBand* and Myrinet*.
    • Sockets, for example, TCP/IP over Ethernet*, Gigabit Ethernet*, and other interconnects.
  • (SDK only) Support for Intel® 64 architecture clusters using:
    • Intel® C++/Fortran Compiler 14.0 and newer.
    • Microsoft* Visual C++* Compilers.
  • (SDK only) C, C++, Fortran 77, and Fortran 90 language bindings.
  • (SDK only) Dynamic linking.

System Requirements

Hardware Requirements

  • Systems based on the Intel® 64 architecture, in particular:
    • Intel® Core™ processor family
    • Intel® Xeon® E5 v4 processor family recommended
    • Intel® Xeon® E7 v3 processor family recommended
  • 1 GB of RAM per core (2 GB recommended)
  • 1 GB of free hard disk space

Software Requirements

  • Operating systems:
    • Microsoft* Windows Server* 2008, 2008 R2, 2012, 2012 R2, 2016
    • Microsoft* Windows* 7, 8.x, 10
  • (SDK only) Compilers:
    • Intel® C++/Fortran Compiler 15.0 or newer
    • Microsoft* Visual Studio* Compilers 2013, 2015, 2017
  • Batch systems:
    • Microsoft* Job Scheduler
    • Altair* PBS Pro* 9.2 or newer
  • Recommended InfiniBand* software:
    • Windows* OpenFabrics* (WinOF*) 2.0 or newer
    • Windows* OpenFabrics* Enterprise Distribution (winOFED*) 3.2 RC1 or newer for Microsoft* Network Direct support
    • Mellanox* WinOF* Rev 4.40 or newer
  • Additional software:
  • The memory placement functionality for NUMA nodes requires the libnuma.so library and the numactl utility to be installed. The numactl installation should include numactl, numactl-devel, and numactl-libs.

Known Issues and Limitations

  • Cross-OS runs using ssh from a Windows* host fail. Two workarounds exist:
    • Create a symlink on the Linux* host that looks identical to the Windows* path to pmi_proxy.
    • Start hydra_persist on the Linux* host in the background (hydra_persist &) and use -bootstrap service from the Windows* host. This requires that the Hydra service also be installed and started on the Windows* host.
  • Support for Fortran 2008 is not implemented in Intel® MPI Library for Windows*.
  • Enabling statistics gathering may result in increased time in MPI_Finalize.
  • In order to run a mixed OS job (Linux* and Windows*), all binaries must link to the same single- or multithreaded MPI library. The single- and multithreaded libraries are incompatible with each other and should not be mixed. Note that the pre-compiled binaries for the Intel® MPI Benchmarks are inconsistent (the Linux* version links to the multithreaded library, the Windows* version to the single-threaded one), so at least one must be rebuilt to match the other.
  • If a communication between two existing MPI applications is established using the process attachment mechanism, the library does not control whether the same fabric has been selected for each application. This situation may cause unexpected applications behavior. Set the I_MPI_FABRICS variable to the same values for each application to avoid this issue.
  • If your product redistributes the mpitune utility, provide the msvcr71.dll library to the end user.
  • The Hydra process manager has some known limitations such as:
    • stdin redirection is not supported for the -bootstrap service option.
    • Signal handling support is restricted. It could result in hanging processes in memory in case of incorrect MPI job termination.
    • Cleaning up the environment after an abnormal MPI job termination by means of mpicleanup utility is not supported.
  • ILP64 is not supported by MPI modules for Fortran 2008.
  • When using the -mapall option, if some of the network drives require a password and it is different from the user password, the application launch may fail.
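The advice above to set I_MPI_FABRICS to the same value for applications that attach to each other can be sketched as follows (application names are hypothetical; the commands are constructed but not executed here):

```python
import os

def mpi_launch_command(app, nprocs, fabrics="shm:dapl"):
    """Build an mpiexec command line and environment with a fixed fabric selection."""
    env = dict(os.environ, I_MPI_FABRICS=fabrics)
    cmd = ["mpiexec", "-n", str(nprocs), app]
    return cmd, env

# Both applications that will attach to each other get the same fabric.
cmd_a, env_a = mpi_launch_command("app_a.exe", 4)
cmd_b, env_b = mpi_launch_command("app_b.exe", 2)
```

Deriving both launches from one helper makes it impossible for the two applications to disagree on the fabric selection.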

Technical Support

Every purchase of an Intel® Software Development Product includes a year of support services, which provides Priority Support at our Online Service Center web site.

In order to get support you need to register your product in the Intel® Registration Center. If your product is not registered, you will not receive Priority Support.


Intel® MPI Library Release Notes for Linux* OS


Overview

Intel® MPI Library is a multi-fabric message passing library based on ANL* MPICH3* and OSU* MVAPICH2*.

Intel® MPI Library implements the Message Passing Interface, version 3.1 (MPI-3) specification. The library is thread-safe and provides the MPI standard compliant multi-threading support.

To receive technical support and updates, you need to register your product copy. See Technical Support below.

Product Contents

  • The Intel® MPI Library Runtime Environment (RTO) contains the tools you need to run programs, including the scalable process management system (Hydra), supporting utilities, and shared (.so) libraries.
  • The Intel® MPI Library Development Kit (SDK) includes all of the Runtime Environment components and compilation tools: compiler wrapper scripts (mpicc, mpiicc, etc.), include files and modules, static (.a) libraries, debug libraries, and test codes.

You can redistribute the library under conditions specified in the License.

What's New

Intel® MPI Library 2018 Update 2

  • Improved shm performance with collective operations (I_MPI_SCHED_YIELD, I_MPI_SCHED_YIELD_MT_OPTIMIZATION).
  • Intel® MPI Library is now available to install in YUM and APT repositories.

Intel® MPI Library 2018 Update 1

  • Improved startup performance on many/multicore systems (I_MPI_STARTUP_MODE).
  • Bug fixes.

Intel® MPI Library 2018

  • Improved startup times for Hydra when using shm:ofi or shm:tmi.
  • Hard finalization is now the default.
  • The default fabric list is changed when Intel® Omni-Path Architecture is detected.
  • Added environment variables: I_MPI_OFI_ENABLE_LMT, I_MPI_OFI_MAX_MSG_SIZE, I_MPI_{C,CXX,FC,F}FLAGS, I_MPI_LDFLAGS, I_MPI_FORT_BIND.
  • Removed support for the Intel® Xeon Phi™ coprocessor (code named Knights Corner).
  • I_MPI_DAPL_TRANSLATION_CACHE, I_MPI_DAPL_UD_TRANSLATION_CACHE and I_MPI_OFA_TRANSLATION_CACHE are now disabled by default.
  • Deprecated support for the IPM statistics format.
  • Documentation is now online.

Intel® MPI Library 2017 Update 4

  • Performance tuning for processors based on Intel® microarchitecture codenamed Skylake and for Intel® Omni-Path Architecture.

Intel® MPI Library 2017 Update 3

  • Hydra startup improvements (I_MPI_JOB_FAST_STARTUP).
  • Default value change for I_MPI_FABRICS_LIST.

Intel® MPI Library 2017 Update 2

  • Added environment variables I_MPI_HARD_FINALIZE and I_MPI_MEMORY_SWAP_LOCK.

Intel® MPI Library 2017 Update 1

  • PMI-2 support for SLURM*, improved SLURM support by default.
  • Improved mini help and diagnostic messages, man1 pages for mpiexec.hydra, hydra_persist, and hydra_nameserver.
  • Deprecations:
    • Intel® Xeon Phi™ coprocessor (code named Knights Corner) support.
    • Cross-OS launches support.
    • DAPL, TMI, and OFA fabrics support.

Intel® MPI Library 2017

  • Support for the MPI-3.1 standard.
  • New topology-aware collective communication algorithms (I_MPI_ADJUST family).
  • Effective MCDRAM (NUMA memory) support. See the Developer Reference, section Tuning Reference > Memory Placement Policy Control for more information.
  • Controls for asynchronous progress thread pinning (I_MPI_ASYNC_PROGRESS).
  • Direct receive functionality for the OFI* fabric (I_MPI_OFI_DRECV).
  • PMI2 protocol support (I_MPI_PMI2).
  • New process startup method (I_MPI_HYDRA_PREFORK).
  • Startup improvements for the SLURM* job manager (I_MPI_SLURM_EXT).
  • New algorithm for MPI-IO collective read operation on the Lustre* file system (I_MPI_LUSTRE_STRIPE_AWARE).
  • Debian Almquist (dash) shell support in compiler wrapper scripts and mpitune.
  • Performance tuning for processors based on Intel® microarchitecture codenamed Broadwell and for Intel® Omni-Path Architecture (Intel® OPA).
  • Performance tuning for Intel® Xeon Phi™ Processor and Coprocessor (code named Knights Landing) and Intel® OPA.
  • OFI latency and message rate improvements.
  • OFI is now the default fabric for Intel® OPA and Intel® True Scale Fabric.
  • MPD process manager is removed.
  • Dedicated pvfs2 ADIO driver is disabled.
  • SSHM support is removed.
  • Support for the Intel® microarchitectures older than the generation codenamed Sandy Bridge is deprecated.
  • Documentation improvements.

Key Features

  • MPI-1, MPI-2.2 and MPI-3.1 specification conformance.
  • Support for Intel® Xeon Phi™ processors (formerly code named Knights Landing).
  • MPICH ABI compatibility.
  • Support for any combination of the following network fabrics:
    • Network fabrics supporting Intel® Omni-Path Architecture (Intel® OPA) devices, through either Tag Matching Interface (TMI) or OpenFabrics Interface* (OFI*).
    • Network fabrics with tag matching capabilities through Tag Matching Interface (TMI), such as Intel® True Scale Fabric, Infiniband*, Myrinet* and other interconnects.
    • Native InfiniBand* interface through OFED* verbs provided by Open Fabrics Alliance* (OFA*).
    • Open Fabrics Interface* (OFI*).
    • RDMA-capable network fabrics through DAPL*, such as InfiniBand* and Myrinet*.
    • Sockets, for example, TCP/IP over Ethernet*, Gigabit Ethernet*, and other interconnects.
  • (SDK only) Support for Intel® 64 architecture and Intel® MIC Architecture clusters using:
    • Intel® C++/Fortran Compiler 14.0 and newer.
    • GNU* C, C++ and Fortran 95 compilers.
  • (SDK only) C, C++, Fortran 77, Fortran 90, and Fortran 2008 language bindings.
  • (SDK only) Dynamic or static linking.

System Requirements

Hardware Requirements

  • Systems based on the Intel® 64 architecture, in particular:
    • Intel® Core™ processor family
    • Intel® Xeon® E5 v4 processor family recommended
    • Intel® Xeon® E7 v3 processor family recommended
    • 2nd Generation Intel® Xeon Phi™ Processor (formerly code named Knights Landing)
  • 1 GB of RAM per core (2 GB recommended)
  • 1 GB of free hard disk space

Software Requirements

  • Operating systems:
    • Red Hat* Enterprise Linux* 6, 7
    • Fedora* 23, 24
    • CentOS* 6, 7
    • SUSE* Linux Enterprise Server* 11, 12
    • Ubuntu* LTS 14.04, 16.04
    • Debian* 7, 8
  • (SDK only) Compilers:
    • GNU*: C, C++, Fortran 77 3.3 or newer, Fortran 95 4.4.0 or newer
    • Intel® C++/Fortran Compiler 15.0 or newer
  • Debuggers:
    • Rogue Wave* Software TotalView* 6.8 or newer
    • Allinea* DDT* 1.9.2 or newer
    • GNU* Debuggers 7.4 or newer
  • Batch systems:
    • Platform* LSF* 6.1 or newer
    • Altair* PBS Pro* 7.1 or newer
    • Torque* 1.2.0 or newer
    • Parallelnavi* NQS* V2.0L10 or newer
    • NetBatch* v6.x or newer
    • SLURM* 1.2.21 or newer
    • Univa* Grid Engine* 6.1 or newer
    • IBM* LoadLeveler* 4.1.1.5 or newer
    • Platform* Lava* 1.0
  • Recommended InfiniBand* software:
    • OpenFabrics* Enterprise Distribution (OFED*) 1.5.4.1 or newer
    • Intel® True Scale Fabric Host Channel Adapter Host Drivers & Software (OFED) v7.2.0 or newer
    • Mellanox* OFED* 1.5.3 or newer
  • Virtual environments:
    • Docker* 1.13.0
  • Additional software:
  • The memory placement functionality for NUMA nodes requires the libnuma.so library and the numactl utility to be installed. The numactl installation should include numactl, numactl-devel, and numactl-libs.

Notes for Cluster Installation

When installing the Intel® MPI Library on all the nodes of your cluster without using a shared file system, you need to establish passwordless SSH connection between the cluster nodes. This process is described in detail in the Intel® Parallel Studio XE Installation Guide (see section 2.1).

Known Issues and Limitations

  • Performance degradation may be observed with the MPI_Gatherv and MPI_Igatherv collective operations on small messages. Use I_MPI_ADJUST_GATHERV_USE_SSEND=off to get a performance gain.
  • Performance degradation may be observed with an MPI application utilizing the RMA functionality. Use I_MPI_SCALABLE_OPTIMIZATION=0 to get a performance gain.
  • The I_MPI_JOB_FAST_STARTUP variable takes effect only when shm is selected as the intra-node fabric.
  • ILP64 is not supported by MPI modules for Fortran* 2008.
  • If the program is terminated abnormally (for example, by a signal), manually remove leftover files in the /dev/shm/ directory with:
    rm -r /dev/shm/shm-col-space-*
  • If a large number of communicators (more than 10,000) is used simultaneously per node, it is recommended to increase the maximum number of memory mappings with one of the following methods:
    • echo 1048576 > /proc/sys/vm/max_map_count
    • sysctl -w vm.max_map_count=1048576
    • disable shared memory collectives by setting the variable: I_MPI_COLL_INTRANODE=pt2pt
  • On some Linux* distributions Intel® MPI Library may fail for non-root users due to security limitations. This was observed on Ubuntu* 12.04, and could impact other distributions and versions as well. Two workarounds exist:
    • Enable ptrace for non-root users with:
      echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
    • Revert the Intel® MPI Library to an earlier shared memory mechanism, which is not impacted, by setting: I_MPI_SHM_LMT=shm
  • Ubuntu* does not allow attaching a debugger to a non-child process. In order to use -gdb, this behavior must be disabled by setting the sysctl value in /proc/sys/kernel/yama/ptrace_scope to 0.
  • Cross-OS runs using ssh from a Windows* host fail. Two workarounds exist:
    • Create a symlink on the Linux* host that looks identical to the Windows* path to pmi_proxy.
    • Start hydra_persist on the Linux* host in the background (hydra_persist &) and use -bootstrap service from the Windows* host. This requires that the Hydra service also be installed and started on the Windows* host.
  • The OFA fabric and certain DAPL providers may not work or provide worthwhile performance with the Intel® Omni-Path Fabric. For better performance, try choosing the OFI or TMI fabric.
  • Enabling statistics gathering may result in increased time in MPI_Finalize.
  • In systems where some nodes have only Intel® True Scale Fabric or Intel® Omni-Path Fabric available, while others have both Intel® True Scale and e.g. Mellanox* HCAs, automatic fabric detection will lead to a hang or failure, as the first type of nodes will select ofi/tmi, and the second type will select dapl as the internode fabric. To avoid this, explicitly specify a fabric that is available on all the nodes.
  • In order to run a mixed OS job (Linux* and Windows*), all binaries must link to the same single- or multithreaded MPI library. The single- and multithreaded libraries are incompatible with each other and should not be mixed. Note that the pre-compiled binaries for the Intel® MPI Benchmarks are inconsistent (the Linux* version links to the multithreaded library, the Windows* version to the single-threaded one), so at least one must be rebuilt to match the other.
  • Intel® MPI Library does not support using the OFA fabric over an Intel® Symmetric Communications Interface (Intel® SCI) adapter. If you are using an Intel SCI adapter, such as with Intel® Many Integrated Core Architecture, you will need to select a different fabric.
  • The TMI and OFI fabrics over PSM do not support messages larger than 2^32 - 1 bytes. If you have messages larger than this limit, select a different fabric.
  • If a communication between two existing MPI applications is established using the process attachment mechanism, the library does not control whether the same fabric has been selected for each application. This situation may cause unexpected applications behavior. Set the I_MPI_FABRICS variable to the same values for each application to avoid this issue.
  • Do not load thread-safe libraries through dlopen(3).
  • Certain DAPL providers may not function properly if your application uses the system(3), fork(2), vfork(2), or clone(2) system calls; do not use these calls or functions based upon them. For example, system(3) may fail with the OFED* DAPL provider on Linux* kernels earlier than official version 2.6.16. Set the RDMAV_FORK_SAFE environment variable to enable the OFED workaround on compatible kernel versions.
  • MPI_Mprobe, MPI_Improbe, and MPI_Cancel are not supported by the TMI and OFI fabrics.
  • You may get an error message at the end of a checkpoint-restart enabled application, if some of the application processes exit in the middle of taking a checkpoint image. Such an error does not impact the application and can be ignored. To avoid this error, set a larger number than before for the -checkpoint-interval option. The error message may look as follows:
    [proxy:0:0@hostname] HYDT_ckpoint_blcr_checkpoint (./tools/ckpoint/blcr/
    ckpoint_blcr.c:313): cr_poll_checkpoint failed: No such process
    [proxy:0:0@hostname] ckpoint_thread (./tools/ckpoint/ckpoint.c:559):
    blcr checkpoint returned error
    [proxy:0:0@hostname] HYDT_ckpoint_finalize (./tools/ckpoint/ckpoint.c:878)
     : Error in checkpoint thread 0x7
  • Intel® MPI Library requires the presence of the /dev/shm device in the system. To avoid failures related to the inability to create a shared memory segment, make sure the /dev/shm device is set up correctly.
  • Intel® MPI Library uses TCP sockets to pass the stdin stream to the application. If you redirect a large file, the transfer can take a long time and cause the communication to hang on the remote side. To avoid this issue, pass large files to the application as command line options.
  • DAPL auto provider selection mechanism and improved NUMA support require dapl-2.0.37 or newer.
  • If you set I_MPI_SHM_LMT=direct, the setting has no effect if the Linux* kernel version is lower than 3.2.
  • When using the Linux boot parameter isolcpus with an Intel® Xeon Phi™ processor using default MPI settings, an application launch may fail. If possible, change or remove the isolcpus Linux boot parameter. If it is not possible, you can try setting I_MPI_PIN to off.
  • In some cases, collective calls over the OFA fabric may provide incorrect results. Try setting I_MPI_ADJUST_ALLGATHER to a value between 1 and 4 to resolve the issue.

Technical Support

Every purchase of an Intel® Software Development Product includes a year of support services, which provides Priority Support at our Online Service Center web site.

In order to get support you need to register your product in the Intel® Registration Center. If your product is not registered, you will not receive Priority Support.

Intel® Trace Analyzer and Collector Release Notes


This page provides the current Release Notes for Intel® Trace Analyzer and Collector. The notes are categorized by year, from newest to oldest, with individual releases listed within each year.

Click a version to expand it into a summary of new features and changes in that version since the last release, and access the download buttons for the detailed release notes, which include important information, such as pre-requisites, software compatibility, installation instructions, and known issues.

You can copy a link to a specific version's section by clicking the chain icon next to its name.

All files are in PDF format - Adobe Reader* (or compatible) required.
To get product updates, log in to the Intel® Software Development Products Registration Center.
For questions or technical support, visit Intel® Software Developer Support.

2018

Update 2

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • User Interface improvements.
  • Deprecation of Intel® Trace Collector static libraries on Windows.
Update 1

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • Fix for the --summary CLI option.
  • Performance improvements in Imbalance Diagram building.
Initial Release

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • Support for OpenSHMEM* applications.
  • MPI Performance Snapshot distribution model change.
  • Feature removals.

2017

Update 4

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • Bug fixes.
Update 3

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • Bug fixes.
Update 2

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • Enhancements in function color selection on timelines.
Update 1

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • Zooming support with a mouse wheel on timelines.
Initial Release

Release Notes for Linux* | Release Notes for Windows*

Overview:

  • New OTF2 to STF converter.
  • New library for collecting MPI load imbalance.

Intel® MPI Library Release Notes


This page provides the current Release Notes for Intel® MPI Library. The notes are categorized by year, from newest to oldest, with individual releases listed within each year.

Click a version to expand it into a summary of new features and changes in that version since the last release, and access the download buttons for the detailed release notes, which include important information, such as pre-requisites, software compatibility, installation instructions, and known issues.

You can copy a link to a specific version's section by clicking the chain icon next to its name.

All files are in PDF format - Adobe Reader* (or compatible) required.
To get product updates, log in to the Intel® Software Development Products Registration Center.
For questions or technical support, visit Intel® Software Developer Support.

2018

Update 2

Linux* Release Notes | Windows* Release Notes

  • Improved shm performance with collective operations (I_MPI_SCHED_YIELD, I_MPI_SCHED_YIELD_MT_OPTIMIZATION).
  • Intel® MPI Library is now available to install in YUM and APT repositories.
Update 1

Linux* Release Notes | Windows* Release Notes

  • Startup performance improvements.
Initial Release

Linux* Release Notes | Windows* Release Notes

  • Hydra startup improvements.
  • Improved support for Intel® Omni-Path Architecture.
  • Support removal for the Intel® Xeon Phi™ coprocessor (code named Knights Corner).
  • New deprecations.

2017

Update 4

Linux* Release Notes | Windows* Release Notes

  • Performance tuning for processors based on Intel® microarchitecture codenamed Skylake and for Intel® Omni-Path Architecture.
  • Deprecated support for the IPM statistics format.
Update 3

Linux* Release Notes | Windows* Release Notes

  • Hydra startup improvements.
  • Default fabrics list change.
Update 2

Linux* Release Notes | Windows* Release Notes

  • New environment variables: I_MPI_HARD_FINALIZE and I_MPI_MEMORY_SWAP_LOCK.
Update 1

Linux* Release Notes | Windows* Release Notes

  • PMI-2 support for SLURM*, improved SLURM support by default.
  • Improved mini-help and diagnostic messages, man1 pages for mpiexec.hydra, hydra_persist, and hydra_nameserver.
  • New deprecations.
Initial Release

Linux* Release Notes | Windows* Release Notes

  • Support for the MPI-3.1 standard.
  • New topology-aware collective communication algorithms.
  • Effective MCDRAM (NUMA memory) support.
  • Controls for asynchronous progress thread pinning.
  • Performance tuning.
  • New deprecations.

5.1

Update 3 Build 223

Linux* Release Notes

  • Fix for issue with MPI_Abort call on threaded applications (Linux* only)
Update 3

Linux* Release Notes | Windows* Release Notes

  • Fixed shared memory problem on Intel® Xeon Phi™ processor (codename: Knights Landing)
  • Added new algorithms and selection mechanism for nonblocking collectives
  • Added new psm2 option for Intel® Omni-Path fabric
  • Added I_MPI_BCAST_ADJUST_SEGMENT variable to control MPI_Bcast
  • Fixed long count support for some collective messages
  • Reworked the binding kit to add support for Intel® Many Integrated Core Architecture and support for ILP64 on third party compilers
  • The following features are deprecated in this version of the Intel MPI Library. For complete list of all deprecated and removed features, visit our deprecation page.
    • SSHM
    • MPD (Linux*)/SMPD (Windows*)
    • Epoll
    • JMI
    • PVFS2
Update 2

Linux* Release Notes | Windows* Release Notes

  • Intel® MPI Library now supports YARN* cluster manager (Linux* only)
  • DAPL library UCM settings are automatically adjusted for MPI jobs of more than 1024 ranks, resulting in more stable job start-up (Linux* only)
  • ILP64 support enhancements, support for MPI modules in Fortran 90
  • Added the direct receive functionality for the TMI fabric (Linux* only)
  • Single copy intra-node communication using Linux* supported cross memory attach (CMA) is now default (Linux* only)
Update 1

Linux* Release Notes | Windows* Release Notes

  • Changes to the named-user licensing scheme. See more details in the Installation Instructions section of the Intel® MPI Library Installation Guide.
  • Various bug fixes for general stability and performance.
Initial Release

Linux* Release Notes | Windows* Release Notes

  • Added support for OpenFabrics Interface* (OFI*) v1.0 API
  • Added support for Fortran* 2008
  • Updated the default value for I_MPI_FABRICS_LIST
  • Added brand new Troubleshooting chapter to the Intel® MPI Library User's Guide
  • Added new application-specific features in the Automatic Tuner and Hydra process manager
  • Added support for the MPI_Pcontrol feature for improved internal statistics
  • Increased the possible space for MPI_TAG
  • Changed the default product installation directories
  • Various bug fixes for general stability and performance

Building FreeFEM++ with Intel Software tools for developers


FreeFEM++ is a package targeted at researchers who need a powerful tool for solving partial differential equations.

It can be found at http://www.freefem.org/

The default build instructions for FreeFEM++ use open-source products: http://www.freefem.org/ff++/linux.php .

For software developers, Intel provides Intel(R) Parallel Studio XE (https://software.intel.com/en-us/parallel-studio-xe). It can be used to get additional optimizations and better performance for FreeFEM++ on Intel platforms.

Prework

Let's download Intel(R) Parallel Studio XE from https://software.intel.com/en-us/parallel-studio-xe and install it to <IPSXE_install_dir>.

Currently version 2018 Update 1 is released and will be used.

Prepare build directories structure:

	$ mkdir FreeFEM++
	$ cd FreeFEM++

Download FreeFEM++:

	$ wget http://www.freefem.org/ff++/ftp/freefem++-3.59.tar.gz 

And extract it:

	$ tar -xzvf ./freefem++-3.59.tar.gz
	$ cd freefem++-3.59/ 

Building

We will use Intel(R) Composer, Intel(R) Math Kernel Library(Intel(R) MKL) and Intel(R) MPI Library (Intel(R) MPI):

$ source <IPSXE_install_dir>/bin/psxevars.sh intel64

FreeFEM++ uses the traditional autotools build flow: configure, then make. Let's set it up with the Intel tools (a single-line version is given below):

$ export OPTF=-xCOMMON-AVX512 
$ export CC=icc 
$ export CFLAGS=$OPTF 
$ export CXX=icpc 
$ export CXXFLAGS=$OPTF 
$ export FC=ifort 
$ export FCFLAGS=$OPTF 
$ export F77=ifort 
$ export FFLAGS=$OPTF 
$ ./configure --enable-download --with-mpiinc=-I${I_MPI_ROOT}/intel64/include --with-mpilibs="-L${I_MPI_ROOT}/intel64/lib/release_mt -L${I_MPI_ROOT}/intel64/lib -lmpicxx -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread" --with-mpilibsc="-L${I_MPI_ROOT}/intel64/lib/release_mt -L${I_MPI_ROOT}/intel64/lib -lmpicxx -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"

Here icc is the Intel(R) C compiler, icpc the Intel(R) C++ compiler, and ifort the Intel(R) Fortran compiler.

We are using --with-mpiinc and --with-mpilibs to tell FreeFEM++ how to build packages with the Intel(R) MPI Library. Unfortunately, the traditional mpiicc, mpiicpc, and mpiifort wrappers cause compatibility issues.

The same with single line is:

	$ CC=icc CFLAGS=-axCOMMON-AVX512 CXX=icpc CXXFLAGS=-axCOMMON-AVX512 FC=ifort FCFLAGS=-axCOMMON-AVX512 F77=ifort FFLAGS=-axCOMMON-AVX512 ./configure --enable-download --with-mpiinc=-I${I_MPI_ROOT}/intel64/include --with-mpilibs="-L${I_MPI_ROOT}/intel64/lib/release_mt -L${I_MPI_ROOT}/intel64/lib -lmpicxx -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread" --with-mpilibsc="-L${I_MPI_ROOT}/intel64/lib/release_mt -L${I_MPI_ROOT}/intel64/lib -lmpicxx -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread"

Also, because we are using -xCOMMON-AVX512, we need to point the system default C++ compiler to the Intel compiler:

	$ alias cpp=icpc

And now we can build it all

	$ make

Unfortunately, parallel build (-j<n>) is not supported, so the build will take some time.

Conclusion

Enjoy your FreeFEM++ with Intel(R) Software Products for developers!

New Case Study: GeoVision Gets a 24x Deep Learning Algorithm Performance Boost


There’s never been a more urgent need for comprehensive security and surveillance solutions. GeoVision Inc. has built its business on helping meet this need, providing digital and networked video surveillance solutions to customers in 110 countries. Headquartered in Taiwan, it’s one of the top 30 security companies in the world, manufacturing professional-grade digital video recorder (DVR) and network video recorder (NVR) systems, IP cameras, and an in-house developed video management system (VMS).

To succeed in its highly competitive and fast-changing industry, GeoVision must always be on the lookout for ways to give its customers leading-edge performance. For the latest version of its GV-VMS* comprehensive video management system, that meant finding new ways to get the most out of its Intel® architecture-based hardware.

GeoVision is working closely with Intel to maximize the performance of the hardware using the tools in Intel® System Studio, a comprehensive development tool suite for optimizing computer vision and deep learning workloads. The result has been an impressive 24x performance gain for its deep learning algorithm, which translates to a huge efficiency advantage for GeoVision’s customers.

Learn all about it in our new case study.

Developer Success Stories Library


Intel® Parallel Studio XE | Intel® System Studio | Intel® Media Server Studio

Intel® Advisor | Intel® Computer Vision SDK | Intel® Data Analytics Acceleration Library

Intel® Distribution for Python* | Intel® Inspector XE | Intel® Integrated Performance Primitives

Intel® Math Kernel Library | Intel® Media SDK | Intel® MPI Library | Intel® Threading Building Blocks

Intel® VTune™ Amplifier

 


Intel® Parallel Studio XE


Altair Creates a New Standard in Virtual Crash Testing

Altair advances frontal crash simulation with help from Intel® Software Development products.


CADEX Resolves the Challenges of CAD Format Conversion

Parallelism Brings CAD Exchanger* software dramatic gains in performance and user satisfaction, plus a competitive advantage.


Envivio Helps Ensure the Best Video Quality and Performance

Intel® Parallel Studio XE helps Envivio create safe and secured code.


ESI Group Designs Quiet Products Faster

ESI Group achieves up to 450 percent faster performance on quad-core processors with help from Intel® Parallel Studio.


F5 Networks Profiles for Success

F5 Networks amps up its BIG-IP DNS* solution for developers with help from
Intel® Parallel Studio and Intel® VTune™ Amplifier.


Fixstars Uses Intel® Parallel Studio XE for High-speed Renderer

As a developer of services that use multi-core processors, Fixstars has selected Intel® Parallel Studio XE as the development platform for its lucille* high-speed renderer.


Golaem Drives Virtual Population Growth

Crowd simulation is one of the most challenging tasks in computer animation―made easier with Intel® Parallel Studio XE.


Lab7 Systems Helps Manage an Ocean of Information

Lab7 Systems optimizes BioBuilds™ tools for superior performance using Intel® Parallel Studio XE and Intel® C++ Compiler.


Massachusetts General Hospital Achieves 20X Faster Colonoscopy Screening

Intel® Parallel Studio helps optimize key image processing libraries, reducing compute-intensive colon screening processing time from 60 minutes to 3 minutes.


Moscow Institute of Physics and Technology Rockets the Development of Hypersonic Vehicles

Moscow Institute of Physics and Technology creates faster and more accurate computational fluid dynamics software with help from Intel® Math Kernel Library and Intel® C++ Compiler.


NERSC Optimizes Application Performance with Roofline Analysis

NERSC boosts the performance of its scientific applications on Intel® Xeon Phi™ processors up to 35% using Intel® Advisor.


Nik Software Increases Rendering Speed of HDR by 1.3x

By optimizing its software for Advanced Vector Extensions (AVX), Nik Software used Intel® Parallel Studio XE to identify hotspots 10x faster and enabled end users to render high dynamic range (HDR) imagery 1.3x faster.


Novosibirsk State University Gets More Efficient Numerical Simulation

Novosibirsk State University boosts a simulation tool’s performance by 3X with Intel® Parallel Studio, Intel® Advisor, and Intel® Trace Analyzer and Collector.


Pexip Speeds Enterprise-Grade Videoconferencing

Intel® analysis tools enable a 2.5x improvement in video encoding performance for videoconferencing technology company Pexip.


Schlumberger Parallelizes Oil and Gas Software

Schlumberger increases performance for its PIPESIM* software by up to 10 times while streamlining the development process.


Ural Federal University Boosts High-Performance Computing Education and Research

Intel® Developer Tools and online courseware enrich the high-performance computing curriculum at Ural Federal University.


Walker Molecular Dynamics Laboratory Optimizes for Advanced HPC Computer Architectures

Intel® Software Development tools increase application performance and productivity for a San Diego-based supercomputer center.


Intel® System Studio


CID Wireless Shanghai Boosts Long-Term Evolution (LTE) Application Performance

CID Wireless boosts performance for its LTE reference design code by 6x compared to the plain C code implementation.


GeoVision Gets a 24x Deep Learning Algorithm Performance Boost

GeoVision turbo-charges its deep learning facial recognition solution using Intel® System Studio and Intel® Computer Vision SDK.


NERSC Optimizes Application Performance with Roofline Analysis

NERSC boosts the performance of its scientific applications on Intel® Xeon Phi™ processors up to 35% using Intel® Advisor.


Daresbury Laboratory Speeds Computational Chemistry Software 

Scientists get a speedup to their computational chemistry algorithm from Intel® Advisor’s vectorization advisor.


Novosibirsk State University Gets More Efficient Numerical Simulation

Novosibirsk State University boosts a simulation tool’s performance by 3X with Intel® Parallel Studio, Intel® Advisor, and Intel® Trace Analyzer and Collector.


Pexip Speeds Enterprise-Grade Videoconferencing

Intel® analysis tools enable a 2.5x improvement in video encoding performance for videoconferencing technology company Pexip.


Schlumberger Parallelizes Oil and Gas Software

Schlumberger increases performance for its PIPESIM* software by up to 10 times while streamlining the development process.


Intel® Computer Vision SDK


GeoVision Gets a 24x Deep Learning Algorithm Performance Boost

GeoVision turbo-charges its deep learning facial recognition solution using Intel® System Studio and Intel® Computer Vision SDK.


Intel® Data Analytics Acceleration Library


MeritData Speeds Up a Big Data Platform

MeritData Inc. improves performance—and the potential for big data algorithms and visualization.


Intel® Distribution for Python*


DATADVANCE Gets Optimal Design with 5x Performance Boost

DATADVANCE discovers that Intel® Distribution for Python* outpaces standard Python.
 


Intel® Inspector XE


CADEX Resolves the Challenges of CAD Format Conversion

Parallelism Brings CAD Exchanger* software dramatic gains in performance and user satisfaction, plus a competitive advantage.


Envivio Helps Ensure the Best Video Quality and Performance

Intel® Parallel Studio XE helps Envivio create safe and secured code.


ESI Group Designs Quiet Products Faster

ESI Group achieves up to 450 percent faster performance on quad-core processors with help from Intel® Parallel Studio.


Fixstars Uses Intel® Parallel Studio XE for High-speed Renderer

As a developer of services that use multi-core processors, Fixstars has selected Intel® Parallel Studio XE as the development platform for its lucille* high-speed renderer.


Golaem Drives Virtual Population Growth

Crowd simulation is one of the most challenging tasks in computer animation―made easier with Intel® Parallel Studio XE.


Schlumberger Parallelizes Oil and Gas Software

Schlumberger increases performance for its PIPESIM* software by up to 10 times while streamlining the development process.


Intel® Integrated Performance Primitives


JD.com Optimizes Image Processing

JD.com Speeds Image Processing 17x, handling 300,000 images in 162 seconds instead of 2,800 seconds, with Intel® C++ Compiler and Intel® Integrated Performance Primitives.


Tencent Optimizes an Illegal Image Filtering System

Tencent doubles the speed of its illegal image filtering system using SIMD Instruction Set and Intel® Integrated Performance Primitives.


Tencent Speeds MD5 Image Identification by 2x

Intel worked with Tencent engineers to optimize the way the company processes millions of images each day, using Intel® Integrated Performance Primitives to achieve a 2x performance improvement.


Walker Molecular Dynamics Laboratory Optimizes for Advanced HPC Computer Architectures

Intel® Software Development tools increase application performance and productivity for a San Diego-based supercomputer center.


Intel® Math Kernel Library


GeoVision Gets a 24x Deep Learning Algorithm Performance Boost

GeoVision turbo-charges its deep learning facial recognition solution using Intel® System Studio and Intel® Computer Vision SDK.

 


MeritData Speeds Up a Big Data Platform

MeritData Inc. improves performance―and the potential for big data algorithms and visualization.


Qihoo360 Technology Co. Ltd. Optimizes Speech Recognition

Qihoo360 optimizes the speech recognition module of the Euler platform using Intel® Math Kernel Library (Intel® MKL), speeding up performance by 5x.


Intel® Media SDK


NetUP Gets Blazing Fast Media Transcoding

NetUP uses Intel® Media SDK to help bring the Rio Olympic Games to a worldwide audience of millions.


Intel® Media Server Studio


ActiveVideo Enhances Efficiency

ActiveVideo boosts the scalability and efficiency of its cloud-based virtual set-top box solutions for TV guides, online video, and interactive TV advertising using Intel® Media Server Studio.


Kraftway: Video Analytics at the Edge of the Network

Today’s sensing, processing, storage, and connectivity technologies enable the next step in distributed video analytics, where each camera itself is a server. With Kraftway*, video software platforms can encode up to three 1080p60 streams at different bit rates with close to zero CPU load.


Slomo.tv Delivers Game-Changing Video

Slomo.tv's new video replay solutions, built with the latest Intel® technologies, can help resolve challenging game calls.


SoftLab-NSK Builds a Universal, Ultra HD Broadcast Solution

SoftLab-NSK combines the functionality of a 4K HEVC video encoder and a playout server in one box using technologies from Intel.


Vantrix Delivers on Media Transcoding Performance

HP Moonshot* with HP ProLiant* m710p server cartridges and Vantrix Media Platform software, with help from Intel® Media Server Studio, deliver a cost-effective solution that delivers more streams per rack unit while consuming less power and space.


Intel® MPI Library


Moscow Institute of Physics and Technology Rockets the Development of Hypersonic Vehicles

Moscow Institute of Physics and Technology creates faster and more accurate computational fluid dynamics software with help from Intel® Math Kernel Library and Intel® C++ Compiler.


Walker Molecular Dynamics Laboratory Optimizes for Advanced HPC Computer Architectures

Intel® Software Development tools increase application performance and productivity for a San Diego-based supercomputer center.


Intel® Threading Building Blocks


CADEX Resolves the Challenges of CAD Format Conversion

Parallelism Brings CAD Exchanger* software dramatic gains in performance and user satisfaction, plus a competitive advantage.


Johns Hopkins University Prepares for a Many-Core Future

Johns Hopkins University increases the performance of its open-source Bowtie 2* application by adding multi-core parallelism.


Pexip Speeds Enterprise-Grade Videoconferencing

Intel® analysis tools enable a 2.5x improvement in video encoding performance for videoconferencing technology company Pexip.


Quasardb Streamlines Development for a Real-Time Analytics Database

To deliver first-class performance for its distributed, transactional database, Quasardb uses Intel® Threading Building Blocks (Intel® TBB), Intel’s C++ threading library for creating high-performance, scalable parallel applications.


University of Bristol Accelerates Rational Drug Design

Using Intel® Threading Building Blocks, the University of Bristol helps slash calculation time for drug development—enabling a calculation that once took 25 days to complete to run in just one day.


Walker Molecular Dynamics Laboratory Optimizes for Advanced HPC Computer Architectures

Intel® Software Development tools increase application performance and productivity for a San Diego-based supercomputer center.


Intel® VTune™ Amplifier


CADEX Resolves the Challenges of CAD Format Conversion

Parallelism Brings CAD Exchanger* software dramatic gains in performance and user satisfaction, plus a competitive advantage.


F5 Networks Profiles for Success

F5 Networks amps up its BIG-IP DNS* solution for developers with help from
Intel® Parallel Studio and Intel® VTune™ Amplifier.


GeoVision Gets a 24x Deep Learning Algorithm Performance Boost

GeoVision turbo-charges its deep learning facial recognition solution using Intel® System Studio and Intel® Computer Vision SDK.

 


Nik Software Increases Rendering Speed of HDR by 1.3x

By optimizing its software for Advanced Vector Extensions (AVX), Nik Software used Intel® Parallel Studio XE to identify hotspots 10x faster and enabled end users to render high dynamic range (HDR) imagery 1.3x faster.


Walker Molecular Dynamics Laboratory Optimizes for Advanced HPC Computer Architectures

Intel® Software Development tools increase application performance and productivity for a San Diego-based supercomputer center.


 

An Approach to Parallel Processing with Unity*


Idea Behind This Project

The idea behind this project was to provide a demonstration of parallel processing in gaming with Unity* and how to perform gaming-related physics using this game engine. In this domain, realism is important as an indicator of success. In order to mimic the actual world, many things need to happen at the same time, which requires parallel processing. Two different applications were created and then compared to a single-threaded application run on a single core.

The first application was developed to run on a multi-threaded CPU, and the second to perform physics calculations on the GPU. To demonstrate the results of these techniques, the developed application presented schools of fish which were created utilizing a flocking algorithm.

Flocking Algorithm

Most flocking algorithms rely on three rules:


Figure 1. Description of the three flocking rules (source: http://www.red3d.com/cwr/boids/).

What is a Flock?

In this case, a flock was defined as a school of fish. Each fish was calculated to “swim” within a school if it was within a certain distance from any other fish in the school. Members of a school will not act as individuals, but only as members of a flock, sharing the same parameters such as speed and direction.


Figure 2. A flock containing four fish.

Complexity

The complexity of this algorithm is O(n²), where n is the number of fish. To update the movement of a single fish, the algorithm needs to look at every other fish in the environment (n of them) in order to know whether the fish can: 1) remain in a school; 2) leave a school; or 3) join a new school. It is possible that a single fish could “swim” by itself for a time, until it has an opportunity to join a new school. This check must be executed for each of the n fish.

The algorithm is as follows:

For each fish (n):
    Look at every other fish (n):
        If this fish is close enough:
            Apply rules: Cohesion, Alignment, Separation.
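The nested loop above can be sketched as follows. Python is used here for brevity (the article's implementation is in C#), and the names `update_flock`, `apply_rules`, and the `NEIGHBOR_DIST` threshold are illustrative assumptions, not the article's identifiers:

```python
import math

NEIGHBOR_DIST = 2.0  # illustrative threshold, not the article's value

def distance(a, b):
    return math.dist(a["position"], b["position"])

def apply_rules(f, neighbors):
    # Cohesion only, as a minimal illustration: nudge the fish 10% of the
    # way toward the average position of its neighbors. (Alignment and
    # Separation would adjust heading and spacing in a similar fashion.)
    n = len(neighbors)
    avg = [sum(g["position"][k] for g in neighbors) / n for k in range(3)]
    f["position"] = [p + 0.1 * (a - p) for p, a in zip(f["position"], avg)]

def update_flock(fish):
    # O(n^2): every fish examines every other fish.
    # Note: this naive version updates in place, so the result depends on
    # iteration order -- the double-buffer scheme used by the article's
    # implementation avoids that.
    for i, f in enumerate(fish):
        neighbors = [g for j, g in enumerate(fish)
                     if j != i and distance(f, g) < NEIGHBOR_DIST]
        if neighbors:
            apply_rules(f, neighbors)
```

With two fish one unit apart, a single call to `update_flock` pulls each toward the other.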

Implementation of the Flocking Algorithm Using C#

To apply the rules for each fish, a Calc function was created, which needed one parameter: the index of the actual fish inside the environment.

Data is stored inside two buffers that represent the state of each fish. The two buffers are used alternately for reading and writing; they are needed to keep the previous state of each fish in memory, and that information is then used to calculate the next state of each fish. Before every frame, the current Read buffer is read in order to update the scene.
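The double-buffer idea can be sketched as follows (Python for illustration; the toy two-field state and motion rule are assumptions, not the article's actual fields):

```python
def make_states(n):
    return [{"speed": 1.0, "position": [0.0, 0.0, 0.0]} for _ in range(n)]

read_buf = make_states(4)   # previous frame: read-only during the update
write_buf = make_states(4)  # next frame: write-only during the update

def step(read_buf, write_buf):
    # All reads come from read_buf and all writes go to write_buf, so the
    # order in which fish are updated cannot affect the result.
    for i, prev in enumerate(read_buf):
        write_buf[i]["speed"] = prev["speed"]
        # Toy motion rule for the sketch: advance each coordinate by speed.
        write_buf[i]["position"] = [p + prev["speed"]
                                    for p in prev["position"]]

# Once per frame: compute into the write buffer, then swap the roles.
step(read_buf, write_buf)
read_buf, write_buf = write_buf, read_buf
```

After the swap, the buffer just written becomes the Read buffer for rendering and for the next frame's calculation.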


Figure 3. Functional flow block diagram.

State of Fish

The state of each fish contains the following:

struct fishState {
	public float speed;
	public Vector3 position, forward;
	public Quaternion rotation;
}


Figure 4. A fish with its fishState.

The variable forward contains the direction the fish is facing.

The variable rotation is a quaternion, representing a 3D rotation, which allows the fish to rotate to face the direction it is aiming.

Flocking Algorithm

The complete flocking algorithm used was:

Neighbor State

Once the flocking algorithm was executed, each fish was identified as either being a member of a school or not.

The Neighbor function was created using as parameters the distance between two fish and the forward direction of each fish. The idea was to make the behavior more realistic: if the distance between any two fish was small enough and the two fish were traveling in the same direction, there was a possibility that they could merge into one school; if they were not traveling in the same direction, they were less likely to merge. This merging behavior was created using a piecewise quadratic function and a dot product of the forward vectors.


Figure 5. Representation of the mathematical function.

The distance between two fish must be smaller than the maximum distance, which is dynamically changed based on the dot product of the forward vectors.

Before calling the Neighbor function, which is computationally expensive, another function is called first: Call. The Call function tells the algorithm whether the Neighbor function is needed at all, that is, whether two fish are even close enough to have a chance of being in the same flock. The Call function compares only the x-positions of the two fish. The x-position is used because x is the widest dimension of the environment, the one along which fish can spread furthest apart.
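The two-stage test described above might look like the following sketch (Python; the `MAX_DIST` constant and the linear scaling used in place of the article's piecewise quadratic are assumptions):

```python
import math

MAX_DIST = 2.0  # illustrative base threshold

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cheap_precheck(a, b):
    # The Call function's idea: compare only x-positions first, since x is
    # the widest dimension and fish spread furthest along it.
    return abs(a["position"][0] - b["position"][0]) < MAX_DIST

def neighbor(a, b):
    # Full test: the allowed distance grows with heading agreement.
    if not cheap_precheck(a, b):
        return False
    d = math.dist(a["position"], b["position"])
    # Dot product of unit forward vectors: 1 = same heading, -1 = opposite.
    agreement = dot(a["forward"], b["forward"])
    # Stand-in for the piecewise quadratic: scale the threshold linearly
    # from MAX_DIST (same heading) down to 0 (opposite headings).
    allowed = MAX_DIST * max(0.0, (agreement + 1.0) / 2.0)
    return d < allowed
```

Two fish one unit apart qualify as neighbors when heading the same way, but not when heading in opposite directions.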

Update of the State

If a fish is alone, it moves forward in a certain direction and speed. However, if a fish has neighbors, it will need to adapt its direction and speed to the flock's direction and speed.

Speed always changes linearly, for smoothness; it never jumps to a new value without a transition.

There is a defined environment. Fish are not permitted to swim beyond the dimensional limits of that environment. If a fish collides with a boundary, the fish is deflected back inside the defined environment.


Figure 6. Flocking behavior.

If a fish is about to swim out of bounds, the fish is given a new random direction and speed in order to remain inside the defined environment.


Figure 7. Boundary behavior.
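The boundary behavior can be sketched like this (a Python sketch of the logic described above; the tank extents, jitter range, and speed range are illustrative assumptions, not values from the project):

```python
import math
import random

BOUNDS_MIN = (-10.0, -3.0, -10.0)   # hypothetical tank extents
BOUNDS_MAX = ( 10.0,  3.0,  10.0)

def next_position(pos, direction, speed):
    """A fish 'swims' by adding its direction, scaled by speed, to its position."""
    return tuple(p + d * speed for p, d in zip(pos, direction))

def out_of_bounds(pos):
    return any(not (lo <= p <= hi)
               for p, lo, hi in zip(pos, BOUNDS_MIN, BOUNDS_MAX))

def update_fish(pos, direction, speed, max_speed=0.2):
    """If the next step would leave the tank, pick a new random direction
    biased back toward the tank center, and a new random speed."""
    if out_of_bounds(next_position(pos, direction, speed)):
        center_pull = tuple((lo + hi) / 2 - p
                            for p, lo, hi in zip(pos, BOUNDS_MIN, BOUNDS_MAX))
        jitter = tuple(random.uniform(-0.5, 0.5) for _ in range(3))
        direction = tuple(c + j for c, j in zip(center_pull, jitter))
        length = math.sqrt(sum(d * d for d in direction)) or 1.0
        direction = tuple(d / length for d in direction)
        speed = random.uniform(0.05, max_speed)
    return next_position(pos, direction, speed), direction, speed
```

The same next-position test can be reused for rock avoidance by checking whether the projected position falls inside a rock's collider instead of outside the tank.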

It is also necessary to check if a fish is about to collide with a rock. The algorithm must calculate if a fish’s next position will be inside a rock. If so, the fish will avoid the rock in a similar fashion as avoiding a boundary.

Once the states have been calculated, all of the fish can be made to “swim,” along with making any required rotations. The next state of each fish is then updated with new direction, speed, position, and rotation variables (for n fish). This occurs for every new frame update.

For example, any fish has its direction added to its position, in order to “swim” inside the environment.

Integration Inside Unity*

The main programming component in Unity is a GameObject. Inside GameObjects you can add different things such as scripts to be executed, colliders, textures, or materials to customize the objects in order to have them behave as desired. It is then convenient to access these objects inside a C# script. Each public object in a script will create a field in the editor that allows you to drop any object which matches the desired requirements.

C# scripts were used to create flocking behavior.

Import Assets Into Unity

  1. Click Assets, click Import package, and then click Custom package.
  2. Click on All.
  3. Click on Import.

Next, drag and drop the Main scene from the Project tab to the Hierarchy tab. Right click on the default scene and select “Remove Scene.”

All game objects needed to run the application, along with attached scripts, are ready to run. The only missing part is the rock model, which must be added manually.

Download “Yughues Free Rocks” from the Unity Asset Store. The Asset Store can be accessed within Unity (or by using this link: http://u3d.as/64P). Once downloaded, the Import Unity Package window appears (Figure 8). Choose “Rock 01” and import it.

Before using the rock model, an adjustment needs to be made because the scale of the default model is too large. The scale factor of the import settings of its mesh should be resized to 0.0058. Once a rock is added to the scene, if it has a scale of 1, it will match the scale of a 3D sphere of 1, which will be used as a collider for the object.


Figure 8. Pop-up window: Import Unity Package.

Drag and drop the Rock 01 prefab to the Rock field, inside the main script contained by the fish Manager object.


Figure 9. Inspector tab of an imported object – change of the scale.

Initializing the Application

The initialization of the application is done inside the “Start” function of the main script. This function is called once at start-up. Inside this initialization the following steps are executed:

  1. Creating a 2-dimensional array of fishState and creating as many arrays of 4x4 matrices as are needed.
  2. Every fish is given a random position, forward direction, and speed. Their rotation is calculated considering their other properties.
  3. The fishState image of each fish and the corresponding TRS matrix are initialized.

The application can be built as an .exe file. In this case, some parameters can be changed if launched from the command line prompt using the following arguments:

  • “-t”: Height of the allowed area for the fish (the tank).
  • “-f”: Number of fish to display.
  • “-r”: Number of rocks to add to the scene.
  • “-n”: Maximum distance for two fish to be neighbors (i.e., interact together).
  • “-s”: Displays a simpler scene with a visible tank (see images below).
  • “-m”: Mode to launch the application. 0: CPU Single-thread; 1: CPU Multi-thread; 2: GPU.

The tank size and the neighbor distance are unit-less parameters. For comparison, the smallest distance two fish may swim side by side without colliding is 0.3.

For example, the command “-t 6 -f 1200 -n 1.5 -m 1 -s” will launch a no-rocks, multi-threaded CPU run of 1,200 fish, with a maximum neighbor distance of 1.5, inside a tank having a height of 6.

The depth and length of the environment depend on the height. These coefficients are stored and can be changed in the code to alter the look of the scene.
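The option handling above could be mirrored like this (a Python argparse sketch; the real application parses these arguments in C#, and the default values here are assumptions):

```python
import argparse

def build_parser():
    """Mirror of the command-line options described in the article."""
    p = argparse.ArgumentParser(description="Fish flocking demo options")
    p.add_argument("-t", type=float, default=6.0, help="tank height")
    p.add_argument("-f", type=int, default=500, help="number of fish")
    p.add_argument("-r", type=int, default=0, help="number of rocks")
    p.add_argument("-n", type=float, default=1.5,
                   help="maximum neighbor distance")
    p.add_argument("-s", action="store_true",
                   help="simpler scene with a visible tank")
    p.add_argument("-m", type=int, default=0, choices=[0, 1, 2],
                   help="0: CPU single-thread, 1: CPU multi-thread, 2: GPU")
    return p

# Parse the example command line from the article.
args = build_parser().parse_args(
    ["-t", "6", "-f", "1200", "-n", "1.5", "-m", "1", "-s"])
```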


Figure 10. Simpler Scene.


Figure 11. Fish area to interact with another fish.


Figure 12. Scene with underwater effects.

Constants which can be changed to tweak the behavior of the fish:

  • Speed: Maximum speed for a fish, rotation speed, speed to add to a fish when it nears a boundary.
  • Velocity: Parameters related to the three flocking rules: cohesion, alignment, and separation. (Cohesion is set to 1 by default. It is not recommended that this setting be changed.)
  • Dimension: Determines the depth and length of the environment. These parameters are calculated based on the given height.

How to Draw the Instances

The DrawMeshInstanced function of Unity is used to display the fish. It draws N instances of the same mesh in one call. Each fish is represented by a mesh and a material, so the function requires three parameters: a mesh, a material, and an array of 4x4 matrices.

Each 4x4 matrix is configured as a Translation Rotation and Scaling (TRS) matrix. The TRS matrix is reconfigured after each update of all of the fish's states, using their updated positions and rotations as parameters. The global variable scale is the same for each fish in order to resize them if needed (in this case, the factor is 1).

The mesh has been previously resized and rotated in Blender to avoid any mismatch.

Inside each script, Unity’s Update function is used to refresh the states for each frame.

DrawMeshInstanced provides good performance, but has a limit of 1,023 instances drawn per call. This means that in the case of more than 1,023 fish in an environment, this function has to be called more than once. The array containing the matrices to display the fish must be split to create chunks no larger than 1,023 items. Each chunk is then updated in the Calc function, which is called several times.

Several calls will be made to both the DrawMeshInstanced and Calc functions to update and then display all the fish for each frame.


Figure 13. Splitting the array of matrices – calculation of variables.

The variables are calculated: nbmat represents the number of arrays of matrices that the application is using; rest represents the number of fish in the last matrix.

Each fish is updated: i represents the index of the matrix; j represents a fish's index inside the current chunk. This is needed in order to update the right matrix inside the array shown above.
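The chunking arithmetic works out as follows (a Python sketch; nbmat and rest mirror the variables named above):

```python
LIMIT = 1023  # DrawMeshInstanced draws at most 1,023 instances per call

def split_counts(nb_fish):
    """Return (nbmat, rest): the number of matrix arrays needed, and how
    many fish fall into the last (possibly partial) one."""
    nbmat = (nb_fish + LIMIT - 1) // LIMIT   # ceiling division
    rest = nb_fish % LIMIT or LIMIT          # LIMIT when it divides evenly
    return nbmat, rest

def chunk_index(fish_index):
    """Map a global fish index to (i, j): which matrix array, and the
    fish's index inside that chunk."""
    return divmod(fish_index, LIMIT)
```

For 1,200 fish this gives two matrix arrays, with 177 fish in the second one, so DrawMeshInstanced and Calc are each called twice per frame.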

Additional Features

Underwater effects

There are different assets that were added to this Unity project in order to create a number of underwater effects. Unity provides built-in packages, including water models, for example, that were used for this project. There are also many textures and associated materials (“skins”), which may be applied to any object. All of these (and more) can be found in the Unity Asset Store.

Caustics - Light reflections and shadows

Caustics lighting effects were added to the underwater scene of this project. A “projector” (a Unity object) is required for caustics. The projector displays caustic effects in a scene. The projected caustic is changed by assuming a certain frequency (Hz), which provides an effect that the caustics are moving.

Blur

A blur was added to the underwater scene. If the camera is below the surface of the water, a progressive blur is enabled and the background of the scene changes to blue (the default background is a skybox). Additionally, the fog setting was enabled inside Unity (Window, Lighting, Other Settings, Fog box checked).

Moving the camera

A script was added to the camera object in order to move inside the scene using the keyboard and the mouse. This provides controls similar to a first-person shooter game. The directional keys may be used to move forward/backward or strafe left/right. The mouse allows for moving up/down, along with turning the camera to point left/right.

transform.Rotate (0, rotX, 0);

The move variables represent the directional keys input, while rot* represents the mouse orientation. Modifying the transform of an object, which holds a script, in this case the camera, makes it rotate and translate in the scene.

Building an .exe file

As previously mentioned, there is the possibility of building an .exe file to change the parameters of the application without changing the source code. To do so, follow these steps:

  1. Click Edit, then Project Settings, then Quality.
  2. In the Quality tab, scroll down to Other and find V Sync Count.
  3. Change the V Sync Count setting to “Don’t Sync.” This lets the application display more than 60 fps, if possible.
  4. Click File, then Build and Run to build the .exe file.

Note: instead of using Build and Run, you may go to the Build Settings in order to choose a specific platform (i.e., Microsoft Windows, Linux*, Mac*, etc.)

Coding Differences: CPU vs. GPU

CPU

There is only one difference between coding for a single-threaded and multi-threaded application: How the Calc function is called. In this case, the Calc function is critical to the execution time, as it is being called n times for each frame.

Single-threaded

Coding for a single-threaded application is accomplished in the classic way: a for loop that calls Calc once per fish.

Multi-threaded

Coding for a multi-threaded application is accomplished by utilizing the “Parallel.For” class. Parallel.For splits multiple calls of a function and executes them in parallel on different threads; each thread handles a chunk of the calls. Application performance depends, of course, on the number of available CPU cores.
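The same pattern can be approximated in Python with a thread pool (a sketch of the structure only; note that CPython's GIL means plain threads will not actually speed up pure-Python math the way Parallel.For speeds up C#):

```python
from concurrent.futures import ThreadPoolExecutor

def calc(i, states):
    # Stand-in for the per-fish Calc work (neighbor search, state update).
    states[i] = states[i] * 2

def update_all(states):
    # Rough equivalent of C#'s Parallel.For(0, n, i => Calc(i, states)):
    # the pool partitions the index range across worker threads.
    with ThreadPoolExecutor() as pool:
        list(pool.map(lambda i: calc(i, states), range(len(states))))

states = [1, 2, 3, 4]
update_all(states)
```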

GPU

Compute shader

GPU processing is accomplished in a similar way to CPU multi-threading. By moving the process-heavy Calc function to the GPU, which has many more cores than a CPU, faster results may be expected. To do so, a “shader” is executed on the GPU. Shaders are commonly used to add graphical effects to a scene, but this project uses a “compute shader,” which runs general-purpose computation on the GPU. The compute shader was coded in HLSL (High-Level Shading Language). It reproduces the behavior of the Calc function (speed, position, direction, etc.), but without the rotation calculations.

The CPU, using the Parallel.For function, calls the UpdateStates function for each fish to calculate its rotation and create the TRS matrices before drawing each fish. The rotation of the fish is calculated using the Unity function Slerp, of the “Quaternion” class.

Adaptation of the code to the compute shader

Although the main idea is to move the Calc function loop to the GPU, there are some additional points to consider: random number generation and the need for data to be exchanged with the GPU.

The biggest difference between the Calc function for the CPU and the compute shader for the GPU is random number generation. In the CPU, an object from the Unity Random class is used to generate random numbers. In the compute shader, NVidia* DX10 SDK functions are used.

Data needs to be exchanged between the CPU and GPU.

Some parameters of the application, like the number of fish or rocks, are wrapped either inside vectors of floats or in a single float. For example, a Vector3 from C# in the CPU will match the memory mapping of a float3 in HLSL on the GPU.

Fish-state data (fishState) in the Read/Write buffers and rock-state data (s_Rock) in a third buffer on the CPU must be defined as three distinct ComputeBuffers for the compute shader on the GPU. For example, a Quaternion on the CPU (a structure containing four floats) matches the memory mapping of a float4 on the GPU. The Read/Write buffers are declared as RWStructuredBuffer&lt;State&gt; in the compute shader on the GPU. The same is true for the structure describing rocks on the CPU, with a float for each rock's size and a vector of three floats for its position.

On the CPU, the RunShader function creates ComputeBuffer states and calls the GPU to execute its compute shader at the beginning of each frame.

Once the ComputeBuffer states are created on the CPU, they are set to match their associated buffers states on the GPU (for example, the Read Buffer on the CPU is associated with “readState” on the GPU). The two empty buffers are then initialized with fish-state data, the compute shader is executed, and the Write buffer is updated with the data from its associated ComputeBuffer.

On the CPU, the Dispatch function sets up the threads on the GPU and launches them. nbGroups represents the number of groups of threads to be executed on the GPU. In this case, each group contains 256 threads (a group of threads cannot contain more than 1,024 threads).

On the GPU, the “numthreads” property must correspond with the number of threads established on the CPU; the declared dimensions work out to 16*8*8/4 = 256 threads per group. The index of each thread needs to be set to the corresponding index of a fish, so that each thread updates the state of exactly one fish.
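The dispatch arithmetic can be sketched as (Python; 256 threads per group and the 1,024-thread group limit as stated above):

```python
THREADS_PER_GROUP = 256   # matches 16*8*8/4 in the shader's numthreads

def nb_groups(nb_fish):
    """Number of thread groups to dispatch so every fish gets one thread.

    Groups are capped at 1,024 threads by the API, so the group size of
    256 is well within the limit; any extra threads in the last group
    simply have no fish to update.
    """
    return (nb_fish + THREADS_PER_GROUP - 1) // THREADS_PER_GROUP
```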

Results


Figure 14. Results for a smaller number of fish.

The results show that for fewer than 500 fish, both the single-threaded and multi-threaded CPU versions performed better than the GPU. This is likely because of the data exchanges between the CPU and GPU that occur every frame.

When the number of fish reached 500, the performance for the single-threaded CPU diminished compared to the multi-threaded CPU and GPU (CPU ST = 164fps vs. CPU MT = 295fps and GPU = 200fps). When the number of fish reached 1,500, the performance of the multi-threaded CPU diminished (CPU ST = 23fps and CPU MT = 88fps vs. GPU = 116fps). This may be because of the larger number of cores inside the GPU.

For 1,500 fish and above, in all cases the GPU outperformed both the single-threaded and multi-threaded CPUs.


Figure 15. Results for a larger number of fish.

Although in all cases the GPU performed better than both instances of the CPU, the results show that 1,500 fish provided for the best overall GPU performance (116fps). As more fish were added, GPU performance degraded. Even so, at 2,000 fish, only the GPU performed better than 60fps, and at 2,500 fish, better than 30fps. The GPU finally degraded below 30fps at approximately 6,500 fish.

The most likely reason the GPU’s performance degraded with larger numbers of fish is the complexity of the algorithm: for 10,000 fish there were 10,000², or 100 million, pairwise interactions to evaluate every frame.

Profiling the application highlighted several critical points. The function that calculated the distance between each pair of fish was expensive, and the Neighbor function was slow due to the dot product. Replacing the Neighbor call with a plain distance test (distance smaller than the maximum) would improve performance somewhat. This would mean, however, that any two fish swimming within proximity of each other would be made to swim in the same direction.

Another way to possibly improve performance would be to focus on the O(n²) complexity of the algorithm. It is possible that an alternate sorting algorithm for the fish could yield an improvement.

(Consider two fish: f1 and f2. When the Calc function is called for f1, f1’s Neighbor state will be calculated for f2. This Neighbor state value could be stored and used later when the Calc function is called for f2.)
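That caching idea can be sketched as follows (a Python sketch; neighbor_fn stands in for the Neighbor computation):

```python
def pairwise_neighbor_states(n, neighbor_fn):
    """Evaluate each symmetric Neighbor state once per unordered pair and
    reuse it for both fish, halving the n*n evaluations."""
    cache = {}
    for i in range(n):
        for j in range(i + 1, n):
            cache[(i, j)] = neighbor_fn(i, j)
    return cache

def lookup(cache, i, j):
    """Fetch the shared state regardless of argument order."""
    return cache[(i, j) if i < j else (j, i)]
```

For n fish this performs n*(n-1)/2 Neighbor evaluations instead of n², at the cost of storing the results.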

Hardware Used for this Benchmark


Figure 16. Hardware used to run the tests.


Get Big: Solving the Distribution Dilemma for Indie Gamers


Independent game developers face an important decision when selecting a distribution channel. While many devs plan to simply push their game to Steam*, a multichannel distribution model makes more sense. Using more than one channel takes a bit of additional work, but you could reach a far bigger audience and potentially make a lot more money.

Steam*
Figure 1: Steam* is the #1 choice for most indie gamers, but it's not the only one.

Physical boxes and shareware days

For independent game developers, distributing physical, boxed games to brick-and-mortar retailers was often prohibitively expensive. One early workaround was the concept of shareware titles, such as Doom* from id Software, which launched the first-person-shooter genre, courtesy of a free executable with a small footprint. Players could download the first 10 levels before purchasing the entire title, and demand was so intense on the first day that servers were overwhelmed. Players were encouraged to distribute the shareware version freely, and customers eventually bought over one million copies of the full game.

 Doom*
Figure 2: Doom* from id Software was released in 1993 as downloadable shareware.

Early online distribution services, such as GameLine*, for the Atari* 2600, and Stardock Central*, lacked any kind of marketing assistance or title curation, and had other distribution issues. In 2004, the Valve Corporation launched the Steam platform—and a revolution.

Steam soon became the largest digital distributor of games for PCs. The advantages are obvious, as Gabe Newell, creator of Steam, explained to RockPaperShotgun.com. “The worst days . . . were the cartridge days for the NES (Nintendo Entertainment System*). It was a huge risk—you had all this money tied up in silicon in a warehouse somewhere, and so you’d be conservative in the decisions you felt you could make, very conservative in the IPs you signed, your art direction would not change, and so on. Now it’s the opposite extreme . . . there’s no shelf-space restriction.”

Distribution platforms

Figure 3: Distribution platforms to choose from or to include in an all-of-the-above approach.

Multiple sites now centralize purchasing and downloading digital content. Some platforms also serve as digital rights management systems (DRMs) to control the use, modification, and distribution of games and to handle in-game purchases, the keys to unlock content, and more. The three main models are:

  • Proprietary systems run by large publishers (such as Electronic Arts Inc.*, Ubisoft*, and Tencent*), which allow them to sell direct and to aggregate user information.
  • Retail systems that sell third-party titles and third-party DRMs. Examples: Green Man Gaming*, Humble Bundle*, and GamersGate*.
  • Digital distribution platforms selling third-party titles and proprietary DRMs. Table 1 shows the page visits of leading platforms.

Table 1: Top distribution platforms ranked by page visits (Source: Newzoo Q2’17: Global Games Market).

Platform | Web Address | Number of Games | Total Monthly Visits
Steam* | www.steampowered.com | 14,000 | 163,000,000
Humble Bundle* | www.humblebundle.com | 5,000 | 41,600,000
GOG | www.gog.com | 2,000 | 19,000,000
itch.io | www.itch.io | 63,000 | 10,100,000
Green Man Gaming* | www.greenmangaming.com | 7,500 | 6,200,000
GamersGate* | www.gamersgate.com | 6,000 | 2,000,000
OnePlay* | www.oneplay.com | 2,000 | 127,000

Make it a true partnership

With so many options to choose from, it can be difficult to choose a distribution partner. Some key questions to answer before settling on a partner are:

  • What is your business model? If you are relying on in-game purchases, you’ll need a strong DRM system to manage those microtransactions.
  • Is your game free or fee-based? Pricing is a tough choice you should make early in your developer’s journey—find more information in our separate article “Get Ready: Pricing Your Indie Game”.
  • Who is your target audience? If you’re focused on a narrow niche, you may not want to risk getting lost on the largest distribution platforms. Look for sites dedicated to your target audience.
  • What devices are your potential customers using? If you are releasing a mobile puzzle-game, focus on sites that distribute such titles.
  • What channels are your potential customers using? Find out what site(s) your target audience relies on.

Direct distribution can still work

Alarmed at the thought of losing revenue to an online distributor, some indie devs might be thinking about distributing their game’s installation package by themselves. The average split for selling through a retailer is 70/30, but can vary depending on the platform and the leverage of the developer. Some sites even offer a Pay What You Want option or allow customers to direct some of the money to charity.

To keep more than 70 percent of the revenue for yourself, you can hook up with a full-stack digital commerce platform, such as Fastspring*, which enables global subscriptions and payments for online, mobile, and in-app experiences. Or you can set up your own digital store using the tools at Binpress*.

“Don’t just rely on distributors to sell your game for you . . . There is still significant money to be made from direct sales,” writes Paul Kilduff-Taylor, part of the team at Mode 7 in Oxford, United Kingdom. When setting up your own distribution channel, you’ll need a reliable payment provider, a clear, optimized website, and you’ll have to work hard to drive potential customers to your site with a good marketing plan, Kilduff-Taylor advises.

Use your own efforts to augment a complete, multichannel distribution strategy. “To have a decent success on the PC with a downloadable game, you’ll need to be on every major portal,” Kilduff-Taylor said.

Don’t stop with Steam

Steam controls a significant portion of the PC game market space, and by late 2017 the service had over 275 million active users—growing at 100,000 per week, according to SteamSpy.com. In 2017 an estimated 6,000+ games were released on Steam.

One of its key attractions is the Humble Indie Bundle, which curates indie titles, giving smaller games a chance to shine. The Steam Direct FAQ lays out some issues you should be aware of. Be sure to emphasize the points that make your game unique and be organized with your marketing efforts—with a plan, collateral pieces, press information, and a compelling trailer all ready to go. Trailers are a key ingredient, and producers such as Kert Gartner are highly sought after. Make sure your game stands out, or you could get lost in the daily release avalanche.

Patrick DeFreitas, software partner marketing manager at Intel, advises indies to consider distributing through multiple channels. “Many indie developers on the PC gaming side see Steam as the be-all and end-all for distribution. They believe that if they have their title on Steam, they’re good to go. But it’s important to consider additional digital and retail distribution channels to get your title out there.”

Secondary retailers and channels focus on curating high-quality games that are compelling to their base and may be able to perfectly match your title to their followers. DeFreitas also points out that some retailers may do a better job in a single region. “They’re all looking for a portfolio of titles that they can sell through their channels,” he said. “At the end of the day, you could potentially end up as an indie developer with a dozen different channels where you are selling your titles directly to consumers worldwide.”

Data gathering aids decision making

Investigate the statistics the various distribution channels can gather for you. Over time, you should have plenty of data to analyze as you determine sales trends, response to promotions, geographical strengths, and buyer personas. Steam is so big that third-party sites, such as SteamCharts.com, have sprung up providing data snapshots. At SteamAnalyst.com you can find out what in-game purchases are trending. Google Analytics* can be paired with Steam data to analyze your Steam Store page or Community Hub for anonymized data about your traffic sources and visitor behavior.

Steam*
Figure 4: Steam* conducts monthly surveys to help guide your decision making (source: Steampowered.com).

Be sure to take advantage of data collection opportunities, so you can develop and perfect the player personas in your target audience. The more you know about your sales and your customers the better are the decisions you can make about additional distribution choices.

SteamCharts.com
Figure 5: Third-party sites such as SteamCharts.com offer continual snapshots of Steam* data (source: Steamcharts.com)

Multiplatform releases boost incomes, headaches

Releasing a title across mobile devices, consoles, and PC operating systems is a good way to boost your income flow but probably not a good choice for most beginners with a single title. Learning the ropes for so many different systems all at once is a big challenge. Game engines such as Unity* software and Unreal* offer ways to reach multiple platforms from the same code base, but be prepared to make a big investment in testing and quality control. You might want to concentrate on making the very best PC game you are capable of, rather than extending yourself across every available platform.

Bundling for fun

Getting into an original equipment manufacturer bundle is a great way to jump-start distribution; you develop more of a business-to-business model, and the bundler handles much of the promotion. Instead of trying to stand out from dozens of titles released around the same time, you only compete with the handful of titles in the bundle. Reddit* maintains a good overview of the current bundles for sale, and a list of sites that offer game bundles. IndieGameBundles.com* keeps a similar list completely devoted to indies.

Writing at venturebeat.com*, Joe Hubert says, “You don’t enter into a bundle in hopes of retiring on a nice island. You enter a bundle for the residual influence it has on your game. (The exposure outweighs the low price-point of the sale.) Your game will get eyeballs, lots and lots of eyeballs, to look at your game, see what it’s about, and recognize it in the future,” Hubert wrote.

HumbleBundle.com offers an FAQ to help guide you through their submission steps. Fanatical.com* starts their process with an email, while Green Man Gaming has an online form.

You can also contact publishers and bundlers at shows and events as part of your own networking. Major gaming sites and magazines can steer you toward hot, new platforms. Chat up fellow devs for their takes on distribution trends as well.

Once you get into a bundle, you may be expected to participate with your own marketing efforts. The good news is that you’ll have more news and information to fill up your social networking feeds. For more information about promotion strategies and marketing deliverables, check out the companion articles on attending your first event as an indie game developer and on approaching industry influencers.

Several organizations host annual contests for indie game developers. The Independent Games Festival offers cash prizes and publicity, while the Game Development World Championship offers trips to Finland and Sweden, and visits to top game studios. These are also great marketing bullets for your promotional materials. Also be on the lookout for contests that offer help in distributing your game in a bundle, or as a stand-alone title. The Intel® Level Up Game Developer Contest, for example, puts your game in front of Green Man Gaming. Check the PixelProspector.com* site for its updated list of contests to enter.

The power of good distribution

Bastion*, an action role-playing game from indie developer Supergiant Games, was nearly sunk by a troubled preview version at the Game Developers Conference (GDC). When the team brought a playable version to the Penny Arcade Expo, however, it started picking up awards. Crucially, this led to Warner Bros. Interactive Entertainment publishing it on Microsoft Xbox*. It was next ported to Microsoft Windows* PC on Steam, and a browser game was created for Google Chrome*. It sold at least 500,000 copies in one year.

Bastion*
Figure 6: Bastion* overcame early obstacles to become available in multiple versions.

When Dustforce* was included in Humble Indie Bundle 6, they witnessed an enormous boost in sales. In a short two-week period after the bundle was rolled out, the game sold 138,725 copies and pulled in USD 178,235.

Promotions often provide a spur to plateauing sales. Alexis Santos, editor at Binpress*, said that Pocketwatch Games’ Monaco* made USD 215,000 by participating in Humble Indie Bundle 11. Monaco was included in 370,000 of the 493,000 bundles sold; buyers had to beat the average price of USD 4.71 per bundle to receive Monaco. Pocketwatch therefore didn’t receive a big payout per copy, but it distributed hundreds of thousands of copies of a game that had been on the market for 10 months, and there was no major impact on Steam sales of the full-priced title outside the bundle.

Aki Järvinen, founder of GameFutures.com*, recently wrote on 10 trends shaping the gaming industry and pointed to the evolution of business models that benefit from new distribution schemes. “Companies like Playfield* and Itch.io are building services that try to tackle the indie discoverability issue,” he said, “both for the player community and the developers.” His guess is that with distribution platforms providing more support for marketing, public relations, and data analytics, in the future we may be seeing more of what Morgan Jaffit calls “triple-I” titles and studios.

Rather than the indiepocalypse that pundits worried about in 2015—a super-saturated indie market leading to smaller slices of a slow-growing pie—there will always be room for creative, unique games. The trick will be in making them easy to find and buy. With multiple evolving distribution channels, you’ll have to work hard to distribute your intellectual property through appropriate channels, in order to maximize reach, audience, and revenue. Don’t be afraid to ask for help, either. Intel and Green Man Gaming just teamed up to form a new digital content distribution site for publishers, retailers, and channel partners. To learn more about getting involved with the Intel® Software Distribution Hub, visit https://isdh.greenmangaming.com.

Resources

Intel® Developer Zone

Intel® Level Up Game Developer Contest

Indie Games on Steam

Get Big: Approach Industry Influencers to Build Awareness for Your Indie Game


Man in front of a positive urban landscape

Getting noticed in the vast digital world, with its myriad social networks and other channels of influence, might appear to require mountains of money and resources. This could be a problem for indie game developers with limited budgets. Expensive PR agencies might have once been the only option, but today's internet-based marketing channels are free for the asking. The networks and people who can provide the exposure you need often have as much to gain from your success as you do—it's your content that keeps them in business. Influencers endorse and attract more than they create, and they need a constant flow of new and visionary material to keep viewers interested. Indie game developers can feed that appetite for content as well as any major game studio, but how do you make that connection?

Getting the word out today means more than sprinkling seeds to the four winds of the web and nurturing the ones that take root. Today's influencers—the streamers, YouTube* gamers and bloggers—can multiply your exposure many times over, and it's important to identify and target the ones that play your type of game. You also can gain exposure from your indie peers, traditional media outlets, gaming conferences, and from the consumers themselves.

The trick to approaching these disparate groups is in knowing how to identify the influencers of highest value to you, designing a plan of attack for each, and implementing and tracking the results of that plan.

This article covers strategies for approaching:

  • Social network communities
  • Streamers on Twitch*, YouTube*, and others
  • Game retailers
  • Gaming event attendees

It leans heavily on the know-how of Patrick DeFreitas and Dan Fineberg, marketing experts at Intel, who share time-tested techniques for publicizing and distributing titles on a budget that indies can afford.

Social Networks: Start with What You Know

Social media channels play heavily into an overall brand-building strategy. Identify the social networks you're already familiar with, and start promoting your game there.

During the early stages of development, indies can already identify the aspects of their game that make it unique and fun. Before a game is even playable, screenshots or renderings of game scenery can be used to promote the game on sites such as Facebook*, where it's easy for others to help spread the word, generate interest, and perhaps even spawn a community. "Even if you don't have anything to show but a single screenshot, if you have a good story, and something to share with the gaming community that they feel would be a value to their own work, then that's another way of bringing visibility to your game in the very, very early stages," says Patrick DeFreitas, Intel marketing manager for software, user experience, and media.

What is different about your game—its narrative, characters, or flow? Identify the characteristics of your game that will capture people's interest and post about them on social sites. One post could cover how the game riffs on a popular storyline, another its original setting, and the next how it augments reality in a way that's never been done before. It could be anything, but it should be something that's yours and yours alone. DeFreitas says that outreach should begin at an early stage. Dan Fineberg—a software marketing and planning consultant at Intel—points out that new social channels, such as Medium*, are being launched frequently and are getting a lot of attention. "It's a relatively new medium when you think about it. There's a lot of change, and you just have to stay abreast of it."

Dan also said that for generalized social media channels such as Facebook, Twitter*, Instagram*, and YouTube, your strategy must be carefully tailored. "There is a lot of nuance in terms of what each platform is best at doing." Different social media channels have differing value to gamers and the game developer. "Not just in gaming, but in general software-related areas, you might find that you can get a lot more engagement on one medium such as Facebook, but that for creating more awareness of your game another channel such as Instagram or Twitter might be better—but your results are unique to you."

The social aspect for some specialized sites might be relatively small, but having a presence can pay off later. "Places like Green Man Gaming and others like it have obvious relevance," says Fineberg. They are good places to get early visibility and make inroads to the opportunities the sites offer to increase the distribution of your title. He added that Twitch also has become a powerful social platform, and can lead to engagements with influencers and others that might be interested in promoting your title.

Figure 1. Identify and spread the word about the unique aspects of your process and game.

"You've also got Reddit*," notes DeFreitas. "There are so many different groups within those channels that really cater to developers and individual developer programs, and I think they'll continue to use those social channels." The trick is to balance your game development time with the time you spend on social networks. DeFreitas added: "You only have so much time in the day to dedicate to exchanging content and information with your digital community online, versus focusing on creating your game." He advises that selecting communities and channels that yield the best return-per-engagement might involve some trial and error. Make sure you're providing value and tapping into a channel that sees what you're doing is innovative, progressive or unique, and parallel to what that channel or community is all about. The community itself will let you know, either by silence or by storm.

Figure 2. As the pieces of your game come together, put them in the public eye to generate interest and build a community.

An Agenda for Events

A great place to make direct contact with influencers, industry figures, and game enthusiasts is at developer events, such as the Intel® Buzz Workshop series, where the focus isn't necessarily on showing people a game that's ready for market. "You could be talking about your development techniques, the challenges, and things that you've overcome," says DeFreitas. Also on the agenda could be some of the different solutions you've implemented that other developers may find interesting or valuable.

The most important element of the indie's marketing campaign is to approach influencers who can spread the word about your game, and get people interested in buying and playing it. Schedule appointments ahead of time, and take advantage of events that draw together personalities that you otherwise would have to spend a vast amount of time and money tracking down individually. Think ahead and plan your campaign.

Once your game has reached playability, offering a closed beta is a great way to give gamers a sneak preview. DeFreitas described how Polish game developers Destructive Creations are using this strategy for the upcoming release of Ancestors Legacy. "They created videos on YouTube. They created a product page on Steam* and, to get the word out for the game, they created all of this content—and obviously the game's not even ready for market."

Figure 3. To create buzz, Destructive Creations created prerelease content for Ancestors Legacy.

"The closed beta means getting people to hammer on the game itself, and you're still capturing all the feedback," says DeFreitas. "That feedback won't necessarily affect your ratings on Steam, because everyone understands that this is a closed environment." Testers are made to feel like insiders, which likely means they'll talk about the game more, and it gives them a stake in your game's success—and a say in what goes into it. "It's a clever way of doing that, when you think about it. Besides them playing your game and giving feedback, if they love it they'll get a free copy, or a couple of copies to give to friends and family when everything is ready." You offer something interesting, exclusive, and unique to your audience, as well as a reward for participating.

Strategies for Streamers

Is your game a first-person shooter, side-scroller, or immersive adventure? Does it take place in space or a fantasy land? One-on-one or online multiplayer? Identify streamers who play your type of game. Be brave, aim high, and list them all, regardless of their status. "Don't be afraid to start with the mid-tier or upper-tier influencers and see if they'll be willing to stream," says DeFreitas. After all, streamers need content to fill their pipelines, gain new viewers, and increase their influence. Their inclination will be to listen, but your time with them will be limited. So, develop and practice your pitch—you might only get one shot.

"If you can engage with people like that, and implement some of their ideas in your game, they will likely feel really positive toward what you're doing and help promote it," Fineberg added. This can be a critical strategy in any market—luminaries and influencers get on your side to champion your cause because you've realized their vision. "That's important, because they're opinion leaders, and they have ideas that lots of people care about. You can help them build equity in their value to their audience, and they'll be inclined to help you, in return."

Of course, there could be roadblocks. "Once you start reaching the celebrity influencers and streamers that are out there, often they're committed to a specific title or genre, or under commitments made to a sponsor," says DeFreitas. This makes it a greater challenge to pull someone in to stream your game, especially if it doesn't have the level of success of other games they're currently streaming. There may be, however, opportunities to partner with the sponsors themselves. "Channel folks can help," Fineberg says, "because as you develop relationships in distribution, that can become an entrée into their joint go-to-market activities, including engaging with influencers."

DeFreitas agrees. He says that developers should also look to the hardware companies producing the kit used by gamers. "Some of the independent software vendor developers we're working with today wouldn't have reached out to some of the streamers that we work with, if we didn't insert ourselves as part of the equation."

Tap Existing Contacts

Identify the sponsors of the streamers you plan to approach and exploit any existing relationship or connection you might have with them. "Those influencers are already getting paid through sponsorship, so if they have a channel and they need to fill that channel with content they may be open to opportunities to insert your game into that channel, which is already being paid for and covered by the bigger partner or brand sponsoring it," says DeFreitas.

Intel, your game engine maker, and others, also might maintain influencer networks as part of their developer programs. Some might even have their own streaming channels. "It takes considerable energy, time, and resources to keep one of those up and running and filled with content," says DeFreitas. "So, they are probably always looking for opportunities to pull in new content, especially if it's a title that's related to their technology."

Talk to Game Retailers

"Companies like Green Man Gaming and Humble Bundle want to increase their revenues, so they engage the developers of games they distribute in go-to-market promotional activities to build interest and demand for the titles," says Fineberg. Retailers have affiliates, influencer channels, and networks, too, and all are aimed at generating revenue. Green Man Gaming maintains a network of about 3,000 influencers, explains DeFreitas, but access doesn't come for free. Retailers usually expect you to contribute time, effort, and possibly money, to the go-to-market program.

Some of that time should include putting together an influencer kit that describes the game in positive terms. Include artwork and other relevant game assets in your kit. And make it easy for retailers and influencers to understand, help promote, and sell your product. After signing a retail contract, you'll be working with either an account manager or with a marketing team. According to DeFreitas, your proposal might be to set aside 50 influencers and give each of them three keys to give out to their audience. "You're most likely going to get some visibility on their channels."

Repeat this process for other distribution channels. "Now you're taking their networks, and leveraging their audiences on your behalf, without really doing a lot of work," says DeFreitas. "Essentially, you're giving them the keys, you're giving them the artwork, you're giving them some interesting facts about the game itself, you're packaging it up, and you're pushing it out. Ultimately what they are trying to do is bring visibility to your game, drive audiences back to their respective retail channels, and convert those sales."

Figure 4. A successful publicity strategy will include many interrelated components, working together.

Take a Holistic Approach

As an independent game developer your business strategy needs to begin early in your design process, evolve as the game does, and continue through release and distribution. You must find the right balance between coding and marketing, learn from mistakes, focus on the strategies that succeed, identify the most efficient influencers, and prioritize your contact and engagement with them.

Figure 5. Be brave. No matter how big the influencer, it's a mutually beneficial partnership.

Awareness marketing has historically been thought of as separate from lead generation. "That's all changed. Social media really combines both awareness and lead generation in one fell swoop," says DeFreitas. The reason is simply the sheer mass of sites such as Facebook, YouTube, and Twitter. "When viral content exists on one, or there's a controversy or what have you, it creates a ripple effect throughout the entire mass communication media spectrum," he explained.

Influencers need you as much as you need them. So, remember to be brave, and try to avoid being eaten.

Resources

Unreal Engine*: Blueprint CPU Optimizations for Cloth Simulations


Cloth Simulations

Realistic cloth movement can bring a great amount of visual immersion to a game. Using PhysX Clothing is one way to achieve this without hand animating. Incorporating these simulations into Unreal Engine* 4 is easy but, because the process taxes the CPU, it's good to understand their performance characteristics and how to optimize them.

Disabling cloth simulation

Cloth simulations in Unreal* run for every cloth mesh in the level, whether it can be seen or not. Optimization can prevent this wasted work. Do not rely on the Disable Cloth setting for optimizing simulated cloth; it only works in the Construction Script and has no effect while the game is in play.

Unreal* Physics Stats

To get a better understanding of cloth simulation and its effect on a game and system, we can use a console command, Stat PHYSICS, in Unreal.

After entering Stat PHYSICS at the command line, the physics table overlay appears (Figure 1). To remove it, just enter the same command into the console.

Figure 1. Physics overlay table.

While there is a lot of information available, we need only worry about the first two (Cloth Total and Cloth Sim) for the purposes of this paper.

Cloth Total represents the total number of cloth draws within the scene, and Cloth Sim (simulation) represents the number of active cloth meshes currently being simulated. Keeping these two numbers at a level reasonable for your target platform helps prevent a loss of frame rate caused by the CPU being loaded down with cloth processing. By adding an increasing number of cloth meshes to the level, the number of simulations the CPU can handle at once becomes apparent.

Suspending Cloth Simulation

The Unreal Engine Blueprint system now includes the ability to suspend and resume cloth simulation on a skeletal mesh. These nodes solve a drawback of the level-of-detail method of cloth optimization: the simulation is no longer reset every time it is switched back on.

Figure 2. Resume and Suspend nodes on a Was Recently Rendered function switch.

For the purpose of this document all of the methods discussed below in the Level of Detail section still apply, but now you can exchange the Set Min LOD nodes with the Resume and Suspend Clothing Simulation nodes.

Time delay switch

With cloth simulation suspension, we can be more dynamic with cloth while still optimizing performance. However, using only an occlusion switch can lead to a "dropping banner" problem, wherein a cloth simulation is in dynamic movement when the player turns away (which pauses the simulation), and on turning back some time later the player sees the cloth hovering in mid-air before it continues its movement.

To solve this issue, we can use an occlusion switch and add a Boolean check to our suspension; in this way a delay before suspending the simulation can be used, giving the cloth enough time to finish its movement before coming to a final rest and remaining suspended.

Figure 3. Time delay switch.
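The delay logic above can be sketched as a small per-frame controller (a hypothetical, engine-agnostic Python sketch; `tick` stands in for a Blueprint tick event, and its return value for the branch feeding the Suspend and Resume Clothing Simulation nodes):

```python
class ClothSuspendController:
    """Suspend cloth only after it has been unseen for `delay` seconds."""

    def __init__(self, delay=2.0):
        self.delay = delay        # seconds to wait before suspending
        self.unseen_time = 0.0    # accumulated time since last render
        self.suspended = False

    def tick(self, was_recently_rendered, delta_time):
        """Call once per frame; returns True while the cloth should simulate."""
        if was_recently_rendered:
            # Visible again: resume immediately and reset the timer.
            self.unseen_time = 0.0
            self.suspended = False
        else:
            self.unseen_time += delta_time
            if self.unseen_time >= self.delay:
                # Enough settle time has passed; safe to suspend.
                self.suspended = True
        return not self.suspended
```

With `delay=2.0`, a flag that leaves view keeps simulating for two more seconds, long enough to come to rest, before it is frozen; the moment it is rendered again, it resumes immediately.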

Level of Detail

When creating a skeletal mesh and attaching an apex cloth file to it, that cloth simulation will always be tied to the zero value of the level of detail (LOD) of that mesh. If the mesh is ever switched off of LOD 0, the cloth simulation will no longer take place. Using this to our advantage, we can create an LOD 1 that is the same in every way as our LOD 0 (minus the cloth apex file), and use it as a switch whenever we want to use the cloth simulation (Figure 4).

Figure 4. Level of detail information.

Boolean switch

Now that we have a switch, we can set up a simple blueprint to control it. By creating an event (or function), we can branch using a Boolean switch between simulating the cloth (LOD 0) and not simulating the cloth (LOD 1). This event could be called when a trigger is entered, to begin simulating the cloth meshes in the next area, and again when the player leaves that area, to stop those simulations; any number of other triggering methods could be used, depending on the game level.

Figure 5. Switch blueprint.
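The branch above reduces to a simple decision (hypothetical Python sketch; `cloth_lod` models the value fed to the Set Min LOD node, with LOD 0 carrying the apex cloth file):

```python
def cloth_lod(simulate_cloth: bool) -> int:
    """Minimum LOD for the skeletal mesh: LOD 0 carries the apex cloth
    file and simulates; the otherwise-identical LOD 1 does not."""
    return 0 if simulate_cloth else 1

# A trigger volume could drive the switch, for example:
#   on enter area -> Set Min LOD(cloth_lod(True))   # start simulating
#   on leave area -> Set Min LOD(cloth_lod(False))  # stop simulating
```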

Occlusion culling switch

If a more automated approach is desired, occlusion culling can be used as the switching variable. To do this, call the Was Recently Rendered function, and attach its return to the switch branch (Figure 6). This will stop the cloth simulation when the actor is no longer rendered.

Figure 6. The was recently rendered function in the switch blueprint.

The problem with this method comes from the simulation reset that occurs when the simulation is switched back on. If the cloth mesh is drastically different when it is simulated, the player will always see this transition. To mitigate the chance of this happening, the bounds of the mesh can be increased with import settings. However, this also means intentionally rendering objects that cannot be seen by the player, so make sure it is worthwhile in terms of the game’s rendering demands.

A level design approach to solving this issue would include making sure all dynamically capable cloth meshes (such as flags) are placed in the same direction as the wind.

It may be possible to program a method in C++ that saves the position data of every vertex of the cloth simulation and translates the mesh back into that position when the simulation is turned back on. That could be a very taxing method, depending on the data structure used and the number of cloth simulations in the level.

Figure 7. Cloth simulations without occlusion culling switch.

Figure 8. Cloth simulations with occlusion culling switch.

Combination/Set Piece Switch

If the level happens to have a very dynamic set piece that is important enough to always look its best, an additional branch that uses a Boolean switch can be attached to the actor; in Figure 9 we call it Optimize Cloth?

Figure 9. Set piece switch.

With this new switch, importance can be given to certain cloth meshes that should always be simulated by switching their Optimize Cloth? value to false.
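The combined switch behaves like this (hypothetical Python sketch of the two branches; in the Blueprint, the Optimize Cloth? value gates the Was Recently Rendered check):

```python
def should_simulate(optimize_cloth: bool, was_recently_rendered: bool) -> bool:
    """Decide whether a cloth mesh should simulate this frame."""
    if not optimize_cloth:
        # Set piece: always simulate, regardless of visibility.
        return True
    # Ordinary cloth: simulate only while recently rendered.
    return was_recently_rendered
```

A set piece (Optimize Cloth? false) keeps simulating even when off-screen, while every other cloth mesh is suspended as soon as the occlusion check fails.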

Using a set piece switch

In Figure 10 below, three cloth meshes are flags that turn away and point backward relative to their starting position. It takes a few seconds for this to look natural, but because these flags really sell the fact that they are not hand animated, I set them as set pieces (Optimize Cloth? false), so they are always simulated.

Figure 10. Complex flags used with set piece switches.

Step Up Your 3D Asset Pipeline



As computer hardware continues to advance, with software to match, we have entered an age where creating amazing-looking digital media is easier than ever, even at an entry level. However, good design goes beyond looks. The asset also must function well for the user, as well as for those down the pipeline who create and set up its functionality. A solid creative pipeline saves you time and frustration and helps you develop a more interactive and playable asset.

When creating 3D assets, I break the process into four main parts: preplanning and research, the first pass, implementing and testing, and the final pass. This upfront work might seem like a lot of extra effort, but it saves time in the long run by identifying and resolving problems earlier in the pipeline.

Let’s dive in and see how these four steps can enhance your creative pipeline.

Understand What You Need to Make Before You Start to Make It

Novice digital artists often make the mistake of not taking the time to fully understand what they are going to make before they hop into their program of choice and start creating. The problem is that if you’re creating an asset for a game or commercial purpose, most likely some level of functionality and interactivity will need to be accounted for. By taking the time at the start to understand how the asset at hand is expected to work, you can break it down into components and piece it together as a whole. Otherwise, you run the risk of having to break apart an already completed asset and patch up the seams or, worse, start over.

Here are some tips to help you better understand what you need to make before jumping in to make it:

  • Find plenty of references and study them. As obvious as it sounds, many digital artists don’t spend enough time finding quality references to influence their design. I always ask the client to provide references, if possible, to get a better idea of what they want. I also allow myself a reasonable amount of time to find and study references to help me better understand the subject matter. Believe it or not, Pinterest* is a great tool for finding references and creating reference boards on a per-project basis.
  • Concept: quantity over quality. Normally you would think the opposite — quality over quantity — is true, but during the concept phase having numerous ideas gives you more options for selecting the best one than settling on a single idea that is just okay. Make sure you have your references handy. A great practice is to take timed sprints of 25 minutes to an hour, spending at least 5 minutes per concept but no more than 10. During this process, make sure to zoom out and focus on the overall form, and don't get caught up in details, which you'll do after you review the first few rounds of concepts.

Different versions of the main 3D figure


  • A good model sheet removes a lot of the guesswork. A standard model sheet has a front, side, and top view of the asset you plan to make, acting as a great guide for the modeling process. To get the most from your model sheets, preplan any functioning parts to see whether they work mechanically and aesthetically in all views. This also lets you see the various functioning parts, so you have a better idea of how many pieces your asset may need. If something seems off, you can address it before the modeling process, saving time and frustration.

Graphics of the main 3D figure

Taking adequate time to preplan results in a more well-thought-out product, ultimately saving time for you and your teammates.

Now that we have a good understanding of what we want to make and a fancy model sheet as guide, let’s model it.

First Passes for First Impressions

The first pass is similar to the concept phase, in that we want to focus on the main forms and functionality. As mentioned before, the easy part of creating a 3D asset is making it look good, so at this stage we want to ensure our asset’s interactivity is center stage and developed properly. Once again, keeping it simple at this stage allows for more wiggle room if we need to address any issues after testing. Details can be added easily later, but having to work around them can become problematic and frustrating.

Here are some tips to speed up the modeling process as well as optimization:

  • Set the scene scale before modeling. It's easy to want to start creating as soon as we open our program of choice, but we need to ensure that what we're making matches the specifications given. Although scaling up or down after the fact isn't that hard to do, not having to do it at all is much easier.
  • Not every asset needs to start as a single mesh. It's much easier to combine meshes together and clean up loose ends rather than rack your brain on how a single mesh can be built outward. This is especially important for functioning parts, because having a separate mesh that can be altered on its own without affecting other meshes is easier to deal with.
  • Mind your edge flow and density. Having a smooth mesh is appealing to the eye, but at this phase having a higher poly density increases the amount of detail we have to maintain, sometimes for even the smallest changes. Keep it simple for now, because we can always add extra subdivisions once we're happy with the form as a whole.

3D figure set to Merge Center

  • For symmetrical pieces, focus on one side and mirror the appropriate axis. This approach guarantees that the mesh will be symmetrical, saving you a lot of time reviewing and cleaning up. Expect to do this multiple times as you develop the mesh to get a better sense of the object as a whole. If you end up with any gaps down the seams when mirroring, you can either sew the vertexes together with Merge To set to Center, or you can select all the vertices on the seam, move the pivot point to the central origin, and then scale them toward each other.
  • If duplicating a mesh, UV it first. UV unwrapping is already time intensive, so why spend extra time when you can do it once and have it carry over to the duplicates? Although you can copy and paste UVs for duplicated meshes, you may sometimes end up with undesirable results, which require extra time to fix.
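The mirror-and-weld step from the first tip can be sketched in plain Python (hypothetical; in practice your DCC's mirror and merge tools do this for you, but the underlying logic is the same):

```python
def mirror_and_weld(verts, axis=0, tol=1e-6):
    """Mirror vertices across one axis, welding seam vertices.

    Vertices lying on the mirror plane (within `tol`) map onto
    themselves, so they are welded rather than duplicated.
    """
    result = list(verts)
    for v in verts:
        if abs(v[axis]) <= tol:
            continue  # seam vertex: its mirror is itself
        mirrored = list(v)
        mirrored[axis] = -mirrored[axis]  # reflect across the chosen axis
        result.append(tuple(mirrored))
    return result
```

Mirroring the two vertices (0, 1, 0) and (1, 1, 0) across the X axis yields three vertices rather than four, because the seam vertex at x = 0 is welded instead of duplicated.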

After the mesh resembles a rough form of what we're aiming for, I recommend setting up some base materials with simple colors to mock up the details. If we've managed our edge flow well enough, we should have approximate areas for a base level of textures that we'll apply per face. Doing this saves a lot of time because we're not committing to the arduous task of perfecting our UV unwrapping, which will come during the polish phase. We'll also have more control updating colors on a per-material basis when we test in-scene.

Colored base materials to mock up the details

Now all we need to do is export our asset so we can test it in the engine and ensure it looks and functions as intended. First, however, we need to do some important prep work to avoid having to re-export and import multiple times.

Make sure to address the following details before exporting:

  • Is your hierarchy concise and are your meshes properly named? With any asset that's going to have various functioning parts, you will most likely be dealing with multiple meshes. Taking the time to have a concise hierarchy will make sense of what meshes interact together or independently, and properly naming them will avoid confusion as to what each part in the hierarchy is.

Example of properly named meshes

  • Are the origins of your points of interaction set up properly and locked in? Not every mesh you create will need its origin point at 0,0,0. This is especially true when you're working with multiple meshes and moving them about based on the hierarchy. So we want to be sure to set the pivot points to where they make sense and freeze the transformations when we have them where we want them. This will make it easier to manipulate any aspect of the asset in a scene.

Steps to set Freeze Transformations

  • Are your materials set up and managed well? Try to avoid using default materials and shaders, because it will overload the project with duplicate materials in various places, causing confusion when you need to assign them in the editor. When dealing with any faces that may have been missed and left as default material, I recommend going to the Hypershade menu, and then holding down right-click to select all the faces with that material. If there aren't any, we're good to go. If there are, they are now selected and we can assign them to what we want them to be.


With our asset set up and prepped, we can export without having to make major changes and re-export. When using Maya*, I recommend the Game Exporter because its export settings are easy to understand and adjust. It's also good practice to set the exporter to Export Selection for more control over what you're exporting, so you don't end up with stray meshes, materials, cameras, and so on. Once we export our asset, we can test it in-scene and see how it works and feels.

Steps to export selection

Time to Test Your Mettle and See How It Came Out

With all the preparations we took to ensure a solid design before jumping into modeling and making a rough version of the model focusing on functionality instead of minor details, it's time to see where we stand. We'll start by adding our exported model to the project in an appropriate folder. Metadata files are automatically created, as well as a folder with any materials we assigned. Because we chose to make multiple materials as opposed to a single UV layout/material to save time by not overcommitting to details, we will have to clean up this folder once we have the final version of our asset.

Now, let's drag our asset into the scene to review and test it to ensure it works and feels as intended.

Here's what we'll want to watch for:

  • Is it in scale with the rest of the assets in the scene? This is the most obvious thing to check, and it will stand out the most if it's drastically off. If the scaling is off, it is important NOT to scale the in-scene asset. Doing so has no effect on the scale of the base asset, and it will cause frustration down the road because we'll have to manually rescale the asset each time it is brought into the scene. To adjust the scale in-engine, use the asset's Import Settings under Scale Factor. However, I recommend noting the scaling difference, adjusting it when we make a final pass on our asset, and then re-exporting, because updates to the Scale Factor are stored in the metadata file, which may change on reimport and lose the scaling changes we made.
  • Do any areas of the mesh not read as intended? Even though we made a stellar model sheet to work out any odd details before we began modeling, sometimes once the mesh is in-scene to scale among other assets, some areas and features may not read as well when in context. Rather than reviewing the asset in Scene View, we want to view it from Game View to get a better idea of how it will look to the user and avoid being overly critical of areas that may not even be noticeable when in use. Our goal is to achieve overall readability. If the asset doesn't quite seem to be what we intended, we need to note why and figure out ways to make corrections for our final pass.
  • Are the pivot points where they need to be and zeroed out? We took the time before exporting to ensure our pivot points were where they needed to be and locked in, so now we want to ensure this carried over properly in-engine. Before pondering what might have gone wrong, check to make sure Tool Handle mode is set to Pivot and not Center. As we double-check each mesh within the asset's hierarchy, we also want to verify that the transforms are zeroed out as well. This way if any part accidentally gets moved out of place, anyone can zero it out and it'll be right where it needs to be.

Checking minor details like this will ensure things are done right the first time and let us progress more quickly to other tasks. That said, we can't always expect perfection on the first try, which is why it's important to keep things rough at first, and then refine them as we go along.

Ideally, with all the precautions we took to ensure high quality, minimal, if any, alterations will be necessary after we've reviewed our work, in which case we can get cracking at a final pass and make it nice and shiny.

Polish It up and Put a Bow on It

This is the moment we've been working toward. First, we need to address any shortcomings we noted when reviewing in-scene, keeping in mind the techniques we used during our first pass. After we have made any alterations to the base form, we can start cleaning up the mesh and adding those sweet deets we've wanted since the beginning.

Here are some pointers to keep in mind as we’re cleaning up and polishing:

  • Be selective when adding subdivisions. When smoothing out a mesh, avoid simply running the Smooth tool on the entire mesh, because it doubles the polygon count. Yes, it will look nice and smooth, but at the cost of performance in the long run. Instead, be more selective about the areas you subdivide. Start with larger areas and smooth the edges to see if that does the trick. If not, select only the faces of the area that needs smoothing and run the Smooth tool on those, which gives much better control over poly count.
  • Bevel hard edges to make them seem less sharp. This is similar to adding subdivisions to faces, but instead you’re adding to edges. This makes them appear softer where softening the edge doesn't quite do the trick (usually edges with faces at 135 degrees or less). It also has a nice effect when baking ambient occlusion. As with smoothing faces, we want to be selective when choosing which edges to bevel so as not to drastically increase poly count.

Example of beveled hard edges

  • Be mindful of edge flow and density. There's no need to have multiple edge loops on large flat surfaces. Minimize poly density when you’re able to as long as it does not disrupt edge flow. This will make UV unwrapping easier as well.

Example of minimized poly density

  • Don’t be afraid of triangles. There’s a common misconception with 3D modelers of various skill levels that a mesh needs to consist primarily of quads. Although we don’t want anything higher than a quad, there’s nothing wrong with tris. In fact, the majority of game engines, such as Unity3D* and Unreal*, will convert quads into tris for optimization.

Example of mesh made with triangles

  • Delete deformer history from time to time. The more we clean up our mesh, the more deformer history accumulates, crowding the Attribute Editor and possibly interfering with other deformer tools. When we're happy with the results after using deformers such as Bevel and Smooth, we can delete deformer history by selecting the mesh and then going to Edit > Delete All by Type > History, or by pressing Alt+Shift+D. This keeps the Attribute Editor clean and prevents other deformer tools from misbehaving down the line.

Steps to delete deformer history

Although we normally aim for a low poly count on our assets, creating assets for virtual reality (VR) doesn't give us that luxury. Because the user can get up close and personal with many of the assets in a VR environment, hard edges can be glaring, so slightly higher poly counts are often required.

Now that our mesh is cleaned up and polished, it's time to move on to one of the more tedious parts of 3D modeling: UV unwrapping. Here are a few pointers to make the most of your UVs:

  • Start by planar mapping. Automatic UV unwrapping may seem like a good idea, and for simple meshes it can be, but for more complex meshes it ends up slicing UVs into smaller components than you want, and then you have to spend time stitching them together. On the other hand, planar mapping projects the UVs on a plane silhouetting the mesh similar to a top, side, or front view, and makes a single shell that you can break into components of your choosing. I find it best to choose the plane with the largest surface area when planar mapping.

Steps to set planar mapping

  • Cut the shell into manageable sections. After planar mapping, you can create seams that will break the shell into smaller pieces by selecting edges and using the Cut UV tool. This makes it easier to manage sections as opposed to trying to unfold a larger shell and having to spend time making minor adjustments. You can always sew shells together for fewer seams after the fact, saving time and frustration.
  • Utilize UV image and texture tools for less confusion. UVs can be confusing at times, because a shell may look the way you want but will be flipped, giving you undesired results. To ensure you know which way your UVs are facing, enable the Shade UVs tool (blue=normal, red=reversed). Another tool worth enabling is the Texture Borders toggle. This clearly defines the edges of your UV shells in the UV editor, as well as on the mesh in the main scene, making it easier to see where your UV seams are.

Example of Shade UVs tool usage

Example of Texture Borders tool enabled

  • Areas that will have more detail should get more UV space. Although it's nice to keep all the UVs at the same relative scale, there will often be areas where we want more detail than in others. By giving the areas that require more detail more UV space, we ensure those sections stay crisp.
  • Think of UV unwrapping as putting together a puzzle. Framing it this way can make the process feel more like a challenge and less like a dreaded chore.

Once we have our UVs laid out and reduced to as few materials as possible, we can export our mesh (using the same prep guidelines from the rough phase) and bring it into Substance Painter*.

Substance Painter is a great tool, because it gives you 3D navigation tools similar to that of many 3D modeling programs, layer and brush customizations of digital art programs, and the ability to paint on either the UV layout or mesh itself. I recommend starting with a few fill layers of the material of your choice to recreate the base materials from the rough-out phase. By using layer masks, we can quickly add or remove our selected materials per UV or UV shells. Custom brushes with height settings can add details such as mud, scratches, fabrics, and so on that can be baked into normal maps, adding a lot of life with a few simple strokes.

Before exporting our textures and materials, we need to do some prep work in order to get the most out of what Substance Painter has to offer:

  • Increase or decrease resolution. One of the advantages of Substance Painter is that it can increase or decrease resolution, and go back again, without forfeiting detail. Not all assets need a high resolution. If your asset reads well with little to no noise, pixelation, or distortion at a lower resolution, lower it; with Substance Painter you can always go back up in resolution without losing the original amount of detail. If your asset is going to be used in VR, it's best to increase resolution to keep all details as crisp as possible.

Steps to increase resolution

  • Bake maps. Baking makes the most of any height details you created by converting them to a normal map, and baked ambient occlusion maps add the subtle shadows that give assets that little extra pop and boost their readability. When baking maps, I usually set their resolution one step below my material and texture maps, as their effect tends to be subtler.

Steps to bake maps

  • Choose an export setting based on the designated engine to be used. Another great feature of Substance Painter is the export presets based on the various engines that can be used. This helps ensure you don’t get any strange effects when adding your maps to the asset in the engine.

Options to choose export settings

Watch exporting in painter video

We did it! We took the time to plan out our asset to its fullest, roughed out the major forms with functionality in mind, tested our asset in-engine, and detailed and cleaned up our asset in a final pass. Now we can hand off our hard work with the confidence that not only does our asset look great, but it’s also set up in a way that works efficiently and is easy to understand, so that someone down the pipeline can add interactivity and playability. And with all the time we saved and frustration we avoided, our level of creativity remains high for the next project.

Final image of the 3D figure

Artificial Intelligence and Healthcare Data


Introduction

Health professionals and researchers have access to plenty of healthcare data. However, the implementation of artificial intelligence (AI) technology in healthcare is very limited, primarily due to lack of awareness: AI remains unfamiliar territory for most healthcare professionals. The purpose of this article is to introduce AI to the healthcare professional and survey its application to different types of healthcare data.

IT (information technology) professionals such as data scientists, AI developers, and data engineers are also facing challenges in the healthcare domain; for example, finding the right problem,1 lack of data availability for training of AI models, and various issues with the validation of AI models. This article highlights the various potential areas of healthcare where IT professionals can collaborate with healthcare experts to build teams of doctors, scientists, and developers, and translate ideas into healthcare products and services.

Intel provides educational software and hardware support to health professionals, data scientists, and AI developers. Organized by dataset type, we highlight below a few use cases in the healthcare domain where AI has been applied to various medical datasets.

Artificial Intelligence


AI is a set of techniques that enables computers to mimic human behavior. In healthcare, AI uses algorithms and software to analyze complex medical data and find relationships between patient outcomes and prevention/treatment techniques.2 Machine learning (ML) is a subset of AI; it uses various statistical methods and algorithms and enables a machine to improve with experience. Deep learning (DL) is a subset of ML.3 It takes machine learning to the next level with multilayer neural network architectures, identifying patterns and performing other complex tasks much as the human brain does. DL has been applied in many fields such as computer vision, speech recognition, natural language processing (NLP), object detection, and audio recognition.4 Deep neural networks (DNNs) and recurrent neural networks (RNNs), examples of deep learning architectures, are used to improve drug discovery and disease diagnosis.5

Relationship of AI, machine learning, and deep learning.

Figure 1. Relationship of artificial intelligence, machine learning, and deep learning.

AI Health Market

According to Frost & Sullivan (a growth partnership company), the AI market in healthcare may reach USD 6.6 billion by 2021, a 40 percent growth rate. AI has the potential to reduce the cost of treatment by up to 50 percent.6 AI applications in healthcare may generate USD 150 billion in annual savings by 2026, according to Accenture analysis. AI-based smart workforces, cultures, and solutions are consistently evolving to support the healthcare industry in multiple ways, such as:7

  • Alleviating the burden on clinicians and giving medical professionals the tools to do their jobs more effectively.
  • Filling in gaps during the rising labor shortage in healthcare.
  • Enhancing efficiency, quality, and outcomes for patients.
  • Magnifying the reach of care by integrating health data across platforms.
  • Delivering benefits of greater efficiency, transparency, and interoperability.
  • Maintaining information security.

Healthcare Data

Hospitals, clinics, and medical and research institutes generate a large volume of data on a daily basis, including lab reports, imaging data, pathology reports, diagnostic reports, and drug information. Such data is expected to grow greatly in the next few years as people expand their use of smartphones, tablets, IoT (Internet of Things) devices, and fitness gadgets that generate information.8 Digital data is expected to reach 44 zettabytes by 2020, doubling every year.9 The rapid expansion of healthcare data is one of the greatest challenges for clinicians and physicians. Current literature suggests that the big data ecosystem and AI are solutions for processing this massive data explosion while meeting the social, financial, and technological demands of healthcare. Analyzing such big, complicated data is often difficult and requires a high level of data-analysis skill. The most challenging part, moreover, is interpreting the results and making recommendations based on the outcome, which draws on many years of medical involvement, knowledge, and specialized skill sets.

In healthcare, data is generated, collected, and stored in multiple formats, including numbers, text, images, scans, audio, and video. To apply AI to a dataset, we first need to understand the nature of the data and the questions we want it to answer. The data type helps us choose the neural network, algorithm, and architecture for AI modeling. Here we introduce a few AI-based cases as examples to demonstrate the application of AI in healthcare in general; the approach can be customized to the project and area of interest (that is, oncology, cardiology, pharmacology, internet medicine, primary care, urgent care, emergency, and radiology). Below is a list of AI applications, organized by dataset format, that are gaining momentum in the real world.

Healthcare Dataset: Pictures, Scans, Drawings

One of the most common forms of healthcare data is the image, such as a scan (PET scan image credit: Susan Landau and William Jagust, UC Berkeley)10, tissue section11, drawing12, or organ image13 (Figure 2A). In this scenario, specialists look for particular features in an image. A pathologist, for example, collects such images under the microscope from tissue sections (fat, muscle, bone, brain, liver biopsy, and so on). Recently, Kaggle organized the Intel and MobileODT Cervical Cancer Screening competition to improve the precision and accuracy of cervical cancer screening using a large image dataset (training, testing, and additional data).14 The participants used different deep learning models, such as the faster region-based convolutional neural network (R-CNN) detection framework with VGG16,15 supervised semantics-preserving deep hashing (SSDH) (Figure 2B), and U-Net for convolutional networks.16 Dr. Silva achieved 81 percent accuracy on the validation set using the Intel® Deep Learning SDK and GoogLeNet* with Caffe*.16

Similarly, Xu et al. investigated a dataset of over 7,000 images of single red blood cells (RBCs) from eight patients with sickle cell disease, selecting a DNN classifier to classify the different RBC types.17 Gulshan et al. applied a deep convolutional neural network (DCNN) to more than 10,000 retinal images collected from 874 patients to detect moderate-and-worse referable diabetic retinopathy with about 90 percent sensitivity and specificity.18
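The operation these image models share is the convolution of a small learned kernel across the image. The sketch below is purely illustrative: the 8x8 "scan" and the hand-built edge kernel are invented stand-ins, whereas a real CNN learns its kernels from labeled images.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation -- the core operation of a CNN layer."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Tiny synthetic 8x8 "scan" with a bright vertical edge down the middle.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

# A hand-crafted vertical-edge detector (a trained CNN learns such kernels).
kernel = np.array([[-1.0, 1.0]])
feature_map = conv2d(image, kernel)

print(feature_map.shape)   # (8, 7)
print(feature_map.max())   # 1.0 -- strongest response at the edge
```

Stacking many such learned filters, interleaved with pooling layers and nonlinearities, is what lets a DCNN pick out diagnostic features such as cell boundaries or lesions.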

Various types of healthcare image data

Figure 2. A) Various types of healthcare image data. B) Supervised semantics-preserving deep hashing (SSDH), a deep learning model, used in the Intel and MobileODT Cervical Cancer Screening Competition for image classification. Source: 10-13,16

Positron emission tomography (PET), computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound images (Figure 2A) are another source of healthcare data, in which images of internal organs and tissues (such as the brain or a tumor) are collected noninvasively. Deep learning models can be used to measure tumor growth over time in cancer patients on medication. Jaeger et al. applied a convolutional neural network (CNN) architecture to diffusion-weighted MRI. Based on an estimation of the properties of the tumor tissue, this architecture reduced false-positive findings and thereby decreased the number of unnecessary invasive biopsies. The researchers noticed that deep learning reduced motion and vision error and thus provided more stable results than manual segmentation.19 A study conducted in China showed that deep learning helped achieve 93 percent accuracy in distinguishing malignant from benign cancer on elastograms of ultrasound shear-wave elastography from 200 patients.20,21

Healthcare Dataset: Numerical

Example of numerical data

Figure 3. Example of numerical data.

Healthcare industries collect a lot of patient/research-related information such as age, height, weight, blood profile, lipid profile, sugar, blood pressure, and heart rate. Similarly, gene expression data (for example, fold change) and metabolic information (for example, level of metabolites) are also expressed by the numbers.

The literature shows several cases where neural networks have been successfully applied in healthcare. For instance, Danaee and Ghaeini from Oregon State University (2017) used a deep architecture, a stacked denoising autoencoder (SDAE) model, to extract meaningful features from gene expression data of 1,097 breast cancer and 113 healthy samples. This model enables the classification of breast cancer cells and the identification of genes useful for cancer prediction (as biomarkers) or as potential therapeutic targets.22 Kaggle shared the University of Wisconsin breast cancer dataset, containing the radius, texture, perimeter, area, smoothness, compactness, concavity, symmetry, and fractal dimension of cancer cell nuclei. In the Kaggle competition, participants successfully built DNN classifiers to predict breast cancer type (malignant or benign).23
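A toy version of such a numeric classifier fits in a few lines of numpy. The two synthetic features below merely stand in for measurements like radius and texture (this is not the actual Wisconsin dataset), and the model is plain logistic regression trained by gradient descent rather than the competitors' DNNs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for tabular tumor features (e.g., radius, texture):
# class 0 clustered around one mean, class 1 around another.
X0 = rng.normal(loc=[10.0, 15.0], scale=1.0, size=(100, 2))   # "benign"
X1 = rng.normal(loc=[18.0, 25.0], scale=1.5, size=(100, 2))   # "malignant"
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Standardize features, as one would before training a neural classifier.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Minimal logistic-regression classifier trained with gradient descent.
w, b = np.zeros(2), 0.0
lr = 0.5
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(malignant)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
accuracy = np.mean(pred == y)
print(f"training accuracy: {accuracy:.2f}")
```

Real tabular-health models follow the same recipe with more features, hidden layers, and a held-out test split to measure generalization.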

Healthcare Dataset: Textual

Example of textual data

Figure 4. Example of textual data.

Plenty of medical information is recorded as text; for instance, clinical data (cough, vomiting, drowsiness, and diagnoses), social, economic, and behavioral data (such as poor, rich, depressed, happy), social media reviews (Twitter, Facebook, Telegram*, and so on), and drug history. NLP translates free text into standardized data. It enhances the completeness and accuracy of electronic health records (EHRs), and NLP algorithms can extract risk factors from notes in the EHR. For example, NLP was applied to 21 million medical records and identified 8,500 patients who were at risk of developing congestive heart failure, with 85 percent accuracy.24 The Department of Veterans Affairs used NLP techniques to review more than two billion EHR documents for indications of post-traumatic stress disorder (PTSD), depression, and potential self-harm in veteran patients.25 Similarly, NLP was used to identify psychosis with 100 percent accuracy in schizophrenic patients based on speech patterns.26 IBM Watson* analyzed 140,000 academic articles, far more than any human can read, understand, or remember, and suggested recommendations about courses of therapy for cancer patients.24
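At its simplest, extracting risk factors from free-text notes amounts to mapping phrases to structured flags. The sketch below uses invented notes and an invented risk-term list; production clinical NLP relies on trained language models rather than keyword matching:

```python
# Hypothetical clinical notes; the risk terms below are illustrative only.
notes = {
    "patient_1": "Complains of shortness of breath and ankle swelling.",
    "patient_2": "Routine checkup. No complaints. Blood pressure normal.",
    "patient_3": "Reports fatigue, orthopnea, and shortness of breath.",
}

RISK_TERMS = {"shortness of breath", "orthopnea", "ankle swelling", "fatigue"}

def risk_score(note: str) -> int:
    """Count how many risk terms appear in a free-text note."""
    text = note.lower()
    return sum(term in text for term in RISK_TERMS)

# Flag patients whose notes mention two or more risk terms.
flagged = {pid for pid, note in notes.items() if risk_score(note) >= 2}
print(flagged)
```

The value of real NLP systems is handling what this sketch cannot: negation ("denies chest pain"), misspellings, abbreviations, and synonyms.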


Figure 5. Examples of electrogram data. Source:27,31

Healthcare Dataset: Electrogram

Architecture of deep learning with convolutional neural network model

Figure 6. Architecture of deep learning with convolutional neural network model useful in classification of EEG data. (Source: 28-29)

Electrocardiograms (ECG)27, electroencephalograms (EEG), electrooculograms (EOG), electromyograms (EMG), and sleep tests are some examples of graphical healthcare data. An electrogram records the electrical activity of a target organ (such as the heart, brain, or muscle) over a period of time using electrodes placed on the skin.

Schirrmeister et al. from the University of Freiburg designed and trained deep ConvNets (deep learning with convolutional networks) to decode raw EEG data, which is useful for EEG-based brain mapping.28,29 Pourbabaee et al. from Concordia University, Canada used a large volume of raw ECG time-series data to build a DCNN model. Interestingly, this model learned key features of paroxysmal atrial fibrillation (PAF), a life-threatening heart disease, and is thereby useful in PAF patient screening. The method can be a good alternative to traditional ad hoc, time-consuming, handcrafted feature engineering.30 Sleep stage classification is an important preliminary exam for sleep disorders. Using 61 polysomnography (PSG) time series, Chambon et al. built a deep learning model for sleep stage classification. The model performed better than traditional methods, with little run time and computational cost.31
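The building block of these time-series models is a 1D convolution slid along the raw signal. Here is a small numpy sketch on a synthetic spike train loosely imitating ECG R-peaks; the kernel is hand-crafted for illustration, whereas a DCNN learns its kernels during training:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "ECG-like" signal: periodic spikes plus noise (illustrative only).
t = np.arange(1000)
signal = np.where(t % 100 == 0, 5.0, 0.0) + rng.normal(0, 0.3, size=1000)

# One filter of a 1D CNN layer is a cross-correlation along the signal;
# np.convolve flips its kernel, so we pre-flip to get cross-correlation.
kernel = np.array([-1.0, 2.0, -1.0])          # responds to sharp spikes
feature_map = np.convolve(signal, kernel[::-1], mode="valid")

# A stride-2 max-pooling step, as used between convolutional layers.
pooled = feature_map[: len(feature_map) // 2 * 2].reshape(-1, 2).max(axis=1)

print(feature_map.shape, pooled.shape)   # (998,) (499,)
```

A real ECG model stacks many such learned filters and pooling stages, then feeds the result to dense layers that output a diagnosis class.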

Healthcare Dataset: Audio and Video

Example of audio data

Figure 7. Example of audio data.

Sound event detection (SED) deals with detecting the onset and offset times of each sound event in an audio recording and associating a textual descriptor with it. SED has recently drawn great interest in the healthcare domain for health monitoring. Cakir et al. combined CNNs and RNNs into a convolutional recurrent neural network (CRNN) and applied it to a polyphonic sound event detection task, observing a considerable improvement with the CRNN model.32
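Before neural SED models, onset/offset detection was often done with frame-wise energy thresholding, which is a useful mental model for what a CRNN learns to do far more robustly. A sketch on a synthetic recording (the sample rate, frame size, and threshold are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic mono recording: background noise with one loud event in the middle.
sr = 1000                                    # samples per second (assumed)
audio = rng.normal(0, 0.01, size=3 * sr)     # 3 s of quiet background
audio[sr:2 * sr] += np.sin(2 * np.pi * 5 * np.arange(sr) / sr)  # 1 s event

# Frame-wise energy: the simplest feature a SED front end computes.
frame = 100
frames = audio[: len(audio) // frame * frame].reshape(-1, frame)
energy = (frames ** 2).mean(axis=1)

# Crude activity threshold; assumes at least one frame is active.
active = energy > 0.01
onset = np.argmax(active) * frame / sr               # first active frame (s)
offset = (len(active) - np.argmax(active[::-1])) * frame / sr  # last + 1 (s)

print(f"event from {onset:.1f}s to {offset:.1f}s")
```

A CRNN replaces the fixed threshold with learned convolutional features and a recurrent layer that smooths decisions over time, which is what lets it separate overlapping (polyphonic) events.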

Videos are sequences of images; in some cases they can be treated as time series, and in very particular cases as dynamical systems. Deep learning techniques help researchers in both the computer vision and multimedia communities boost the performance of video analysis significantly and open new research directions for analyzing video content. Microsoft started a research project called InnerEye* that uses machine learning to build innovative tools for the automatic, quantitative analysis of three-dimensional radiological images. Project InnerEye employs algorithms such as deep decision forests as well as CNNs for automatic, voxel-wise analysis of medical images.33 Khorrami et al. built a model on videos from the Audio/Visual Emotion Challenge (AVEC 2015) using both RNNs and CNNs and performed emotion recognition on video data.34

Healthcare Dataset: Molecular Structure

Molecular structure of 4CDG

Figure 8. Molecular structure of 4CDG (Source: rcsb.org)

Figure 8 shows a typical example of the molecular structure of a drug molecule. Generally, the design of a new molecule builds on historical datasets of old molecules. In quantitative structure-activity relationship (QSAR) analysis, scientists try to find known and novel patterns between structures and activity. At Merck Research Laboratories, Ma et al. used datasets of thousands of compounds (about 5,000) and built a model based on a DNN (deep neural net) architecture.35 In another QSAR study, Dahl et al. built neural network models on 19 datasets of 2,000–14,000 compounds each to predict the activity of new compounds.36 Aliper and colleagues built a deep neural network–support vector machine (DNN–SVM) model that was trained on a large transcriptional response dataset and classified various drugs into therapeutic categories.37 Tavanaei developed a convolutional neural network model that classifies tumor suppressor genes and proto-oncogenes with 82.57 percent accuracy; this model was trained on tertiary protein structures obtained from the Protein Data Bank.38 AtomNet* is the first structure-based DCNN. It incorporates structural target information and consequently predicts the bioactivity of small molecules. It has successfully predicted new active molecules for targets with no previously known modulators.39
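A QSAR model in miniature: binary "fingerprints" standing in for molecular substructure descriptors, and a one-hidden-layer network trained to predict an invented activity rule. This is a toy stand-in for the deep models cited above, not their actual architectures or data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic binary "fingerprints" (bit vectors standing in for substructure
# descriptors). Invented activity rule: active if bit 3 or bit 17 is set.
n, bits = 400, 32
X = rng.integers(0, 2, size=(n, bits)).astype(float)
y = ((X[:, 3] + X[:, 17]) >= 1).astype(float)

# One-hidden-layer network trained with plain gradient descent.
W1 = rng.normal(0, 0.1, size=(bits, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.1, size=(8, 1));    b2 = np.zeros(1)
lr = 1.0

for _ in range(3000):
    h = np.tanh(X @ W1 + b1)                      # hidden activations
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))      # predicted activity
    d2 = (p - y[:, None]) / n                     # sigmoid + BCE gradient
    d1 = (d2 @ W2.T) * (1.0 - h ** 2)             # backprop through tanh
    W2 -= lr * (h.T @ d2); b2 -= lr * d2.sum(0)
    W1 -= lr * (X.T @ d1); b1 -= lr * d1.sum(0)

h = np.tanh(X @ W1 + b1)
p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
accuracy = np.mean((p[:, 0] > 0.5) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Real QSAR pipelines differ mainly in scale: thousands of fingerprint bits computed by cheminformatics toolkits, much deeper networks, and measured assay activities instead of a synthetic rule.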

AI: Solving Healthcare Problems

Here are a few practical examples where AI developers, startups, and institutes are building and testing AI models:

  • As emotional intelligence indicators that detect subtle cues in speech, inflection, or gesture to assess a person’s mood and feelings
  • Help in tuberculosis detection
  • Help in the treatment of PTSD
  • AI chatbots (Florence*, SafedrugBot*, Babylon Health*, SimSensei*)
  • Virtual assistants in helping patients and clinicians
  • Verifying insurance
  • Smart robots that explain lab reports
  • Aging-based AI centers
  • Improving clinical documentation
  • Personalized medicine

Data Science and Health Professionals: A Combined Approach

Deep learning has great potential to help medical and paramedical practitioners by:

  • Reducing the human error rate40 and workload
  • Helping in diagnosis and the prognosis of disease
  • Analyzing complex data and building a report

The examination of thousands of images is complex, time consuming, and labor intensive. How can AI help?

A team from Harvard Medical School's Beth Israel Deaconess Medical Center observed a 2.9 percent error rate with an AI model and a 3.5 percent error rate with pathologists for breast cancer diagnosis. Interestingly, pairing deep learning with a pathologist showed a 0.5 percent error rate, an 85 percent drop.40 Litjens et al. suggest that deep learning holds great promise for improving the efficacy of prostate cancer diagnosis and breast cancer staging.41,42

Intel® AI Academy

Intel provides educational software and hardware support to health professionals, data scientists, and AI developers, and makes free AI training and tools available through the Intel® AI Academy.

Intel recently published a series of AI hands-on tutorials, walking through the process of AI project development, step-by-step. Here you will learn:

  • Ideation and planning
  • Technology and infrastructure
  • How to build an AI model (data and modeling)
  • How to build and deploy an app (app development and deployment)

Intel is committed to providing solutions for your healthcare project. Please read the article on the Intel AI Academy to learn more about solutions using Intel® architecture (Intel® Processors for Deep Learning Training). In the next article, we explore examples of healthcare datasets and show how to apply deep learning to them. Intel is committed to helping you achieve your project goals.

References

  1. Faggella, D. Machine Learning Healthcare Applications – 2018 and Beyond. Techemergence.
  2. Artificial intelligence in healthcare - Wikipedia. (Accessed: 12th February 2018)
  3. Intel® Math Kernel Library for Deep Learning Networks: Part 1–Overview and Installation | Intel® Software. (Accessed: 14th February 2018)
  4. Lecun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
  5. Mamoshina, P., Vieira, A., Putin, E. & Zhavoronkov, A. Applications of Deep Learning in Biomedicine. Molecular Pharmaceutics 13, 1445–1454 (2016).
  6. From $600 M to $6 Billion, Artificial Intelligence Systems Poised for Dramatic Market Expansion in Healthcare. (Accessed: 12th February 2018)
  7. Accenture. Artificial Intelligence in Healthcare | Accenture.
  8. Marr, B. How AI And Deep Learning Are Now Used To Diagnose Cancer. Forbes.
  9. Executive Summary: Data Growth, Business Opportunities, and the IT Imperatives | The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things. Available at: . (Accessed: 12th February 2018)
  10. Lifelong brain-stimulating habits linked to lower Alzheimer’s protein levels | Berkeley News. (Accessed: 21st February 2018)
  11. Emphysema H and E.jpg - Wikimedia Commons (Accessed : 23rd February 2018). https://commons.wikimedia.org/wiki/File:Emphysema_H_and_E.jpg
  12. Superficie_ustioni.jpg (696×780). (Accessed: 23rd February 2018). https://upload.wikimedia.org/wikipedia/commons/1/1b/Superficie_ustioni.jpg
  13. Heart_frontally_PDA.jpg (1351×1593). (Accessed: 27th February 2018).  https://upload.wikimedia.org/wikipedia/commons/5/57/Heart_frontally_PDA.jpg
  14. Kaggle competition-Intel & MobileODT Cervical Cancer Screening. Intel & MobileODT Cervical Cancer Screening. Which cancer treatment will be most effective? (2017).
  15. Intel and MobileODT* Competition on Kaggle*. Faster Convolutional Neural Network Models Improve the Screening of Cervical Cancer. December 22 (2017).
  16. Kaggle*, I. and M. C. on. Deep Learning Improves Cervical Cancer Accuracy by 81%, using Intel Technology. December 22 (2017).
  17. Xu, M. et al. A deep convolutional neural network for classification of red blood cells in sickle cell anemia. PLoS Comput. Biol.13, 1–27 (2017).
  18. Gulshan, V. et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA316, 2402 (2016).
  19. Jäger, P. F. et al. Revealing hidden potentials of the q-space signal in breast cancer. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics) 10433 LNCS, 664–671 (2017).
  20. Ali, A.-R. Deep Learning in Oncology – Applications in Fighting Cancer. September 14 (2017).
  21. Zhang, Q. et al. Sonoelastomics for Breast Tumor Classification: A Radiomics Approach with Clustering-Based Feature Selection on Sonoelastography. Ultrasound Med. Biol.43, 1058–1069 (2017).
  22. Danaee, P., Ghaeini, R. & Hendrix, D. A. A deep learning approach for cancer detection and relevant gene identification. Pac. Symp. Biocomput. 22, 219–229 (2017).
  23. Kaggle: Breast Cancer Diagnosis Wisconsin. Breast Cancer Wisconsin (Diagnostic) Data Set: Predict whether the cancer is benign or malignant.
  24. What is the Role of Natural Language Processing in Healthcare? (Accessed: 1st February 2018)
  25. VA uses EHRs, natural language processing to spot suicide risks. (Accessed: 1st February 2018)
  26. Predictive Analytics, NLP Flag Psychosis with 100% Accuracy. (Accessed: 1st February 2018)
  27. Heart_block.png (450×651). (Accessed: 23rd February 2018)
  28. Schirrmeister, R. T. et al. Deep learning with convolutional neural networks for brain mapping and decoding of movement-related information from the human EEG Short title: Convolutional neural networks in EEG analysis. (2017).
  29. Schirrmeister, R. T. et al. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp.38, 5391–5420 (2017).
  30. Pourbabaee, B., Roshtkhari, M. J. & Khorasani, K. Deep Convolutional Neural Networks and Learning ECG Features for Screening Paroxysmal Atrial Fibrillation Patients. IEEE Trans. Syst. Man, Cybern. Syst. 1–10 (2017). doi:10.1109/TSMC.2017.2705582
  31. Chambon, S., Galtier, M. N., Arnal, P. J., Wainrib, G. & Gramfort, A. A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series. arXiv:1707.0332v2 (2017).
  32. Cakir, E., Parascandolo, G., Heittola, T., Huttunen, H. & Virtanen, T. Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection. IEEE/ACM Trans. Audio, Speech, Lang. Process.25, 1291–1303 (2017).
  33. Project InnerEye – Medical Imaging AI to Empower Clinicians. Microsoft
  34. Khorrami, P., Le Paine, T., Brady, K., Dagli, C. & Huang, T. S. HOW DEEP NEURAL NETWORKS CAN IMPROVE EMOTION RECOGNITION ON VIDEO DATA.
  35. Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E. & Svetnik, V. Deep neural nets as a method for quantitative structure-activity relationships. J. Chem. Inf. Model.55, 263–274 (2015).
  36. Dahl, G. E., Jaitly, N. & Salakhutdinov, R. Multi-task Neural Networks for QSAR Predictions. (University of Toronto, Canada. Retrieved from http://arxiv.org/abs/1406.1231, 2014).
  37. Aliper, A. et al. Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol. Pharm.13, 2524–2530 (2016).
  38. Tavanaei, A., Anandanadarajah, N., Maida, A. & Loganantharaj, R. A Deep Learning Model for Predicting Tumor Suppressor Genes and Oncogenes from PDB Structure. bioRxiv  October 22, 1–10 (2017).
  39. Wallach, I., Dzamba, M. & Heifets, A. AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery. 1–11 (2015). doi:10.1007/s10618-010-0175-9
  40. Kontzer, T. Deep Learning Drops Error Rate for Breast Cancer Diagnoses by 85%. September 19 (2016).
  41. Litjens, G. et al. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci. Rep.6, (2016).
  42. Litjens, G. et al. A survey on deep learning in medical image analysis. Med. Image Anal.42, 60–88 (2017).