By Shwetha Doss, Sr. Application Engineer, Intel Corporation
Chethan Raj, Developer, CodeCraft Technologies/Focus Medica
Abstract
Perceptual computing is reshaping the way we interact with our devices, making that interaction more natural, intuitive, and immersive. Devices will be able to perceive our actions through hand gestures, finger articulation, speech recognition, face tracking, augmented reality, and more. To support perceptual computing, Intel introduced the Intel® Perceptual Computing SDK, a library of pattern detection and recognition algorithms.
Focus Medica’s Anatomy of Heart application was developed using Intel Perceptual Computing technology. Finger tracking and gesture recognition make the application user-friendly and flexible: the user can navigate to any section at any time, making this a unique, first-of-its-kind learning tool ideal for understanding the anatomy of the heart. Gestures allow the user to explore the clinical aspects virtually with the flick of a finger. Anatomy of Heart offers a vast amount of information in small chunks, which helps retention, and gesture support makes navigating this content easy.
1. Introduction
Intel introduced Ultrabook™ systems such as tablets, convertibles, and hybrids with touch functionality, making 2D interaction with applications easy. As a next step, we need multimodal interactions that combine voice, gesture, and facial expressions, so that devices can see us, hear us, perceive our intentions, and interact with us. This man-machine interface should be natural, intuitive, and immersive, and it presents a great opportunity for developers to create interesting new use cases and applications. Perceptual computing is not limited to Ultrabook™ systems and phones; it is the future of computing in automobiles, refrigerators, home automation, and beyond.
To help developers create man-machine interactions that are natural, intuitive, and immersive, Intel has introduced the Intel Perceptual Computing SDK.
2. Intel Perceptual Computing SDK
The Intel Perceptual Computing SDK is a library of pattern detection and recognition algorithm implementations exposed through standardized interfaces. The library aims to lower the barriers to using these algorithms and shift application developers’ focus from coding algorithm details to innovating on the usage of these algorithms for next-generation human-computer experiences. The Intel Perceptual Computing SDK provides modules for speech, facial tracking, and close-range tracking, which includes finger tracking, hand tracking, gestures, and 2D/3D object tracking.
The Intel Perceptual Computing SDK requires a Creative* Senz3D camera, a small, lightweight, USB-powered camera optimized for close-range interactivity. It plugs into a USB port and is designed for easy setup and portability. It includes an HD webcam, an infrared depth sensor, and built-in dual-array microphones for capturing and recognizing voice, gestures, and images.
Figure 1: Creative* Senz3D camera
2.1 Finger tracking and gesture recognition
The Intel Perceptual Computing SDK finger tracking module tracks hand and finger locations and performs pose/gesture recognition. The module produces four types of processing results: blob information, geometric node tracking results, pose/gesture notifications, and alert notifications.
Geometric nodes are skeleton joints of the human body or of a localized body part, such as the hand.
Figure 2: Hand Labels
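For example, once a pipeline is processing frames, the application can query a tracked node’s position. The following is a minimal sketch against the SDK’s C# interface, where pipeline stands for an initialized instance of the SDK’s UtilMPipeline utility class (introduced in section 2.3.1); the exact QueryNodeData signature and GeoNode field names may vary slightly between SDK versions:

    // Sketch: query the geometric node for the user's primary hand
    // (e.g., from a frame callback while the pipeline is running).
    PXCMGesture gesture = pipeline.QueryGesture();
    PXCMGesture.GeoNode node;
    pxcmStatus sts = gesture.QueryNodeData(0,
        PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY, out node);
    if (sts >= pxcmStatus.PXCM_STATUS_NO_ERROR)
    {
        // positionImage holds 2D image coordinates of the hand;
        // positionWorld holds its 3D position relative to the camera.
        float x = node.positionImage.x;
        float y = node.positionImage.y;
    }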
The SDK module recognizes a set of predefined poses and gestures and returns the recognition results. Poses are static hand and finger positions defined to deliver certain meanings, while gestures involve hand or finger movement over time.
Navigation gestures | Swipe left/right/up/down
Hand gestures       | Wave, circle
Pose gestures       | Thumbs up/down, peace, big 5
Figure 3: Gesture Labels
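In the SDK’s C# interface, these predefined poses and gestures surface as values of the PXCMGesture.Gesture.Label enumeration. A hedged sketch of the grouping above (the GestureLabels class is illustrative; the label names follow the SDK documentation but may vary by version):

    // Sketch: the predefined labels behind Figure 3, grouped as in the
    // table above.
    static class GestureLabels
    {
        public static readonly PXCMGesture.Gesture.Label[] Navigation =
        {
            PXCMGesture.Gesture.Label.LABEL_NAV_SWIPE_LEFT,
            PXCMGesture.Gesture.Label.LABEL_NAV_SWIPE_RIGHT,
            PXCMGesture.Gesture.Label.LABEL_NAV_SWIPE_UP,
            PXCMGesture.Gesture.Label.LABEL_NAV_SWIPE_DOWN,
        };

        public static readonly PXCMGesture.Gesture.Label[] Hand =
        {
            PXCMGesture.Gesture.Label.LABEL_HAND_WAVE,
            PXCMGesture.Gesture.Label.LABEL_HAND_CIRCLE,
        };

        public static readonly PXCMGesture.Gesture.Label[] Pose =
        {
            PXCMGesture.Gesture.Label.LABEL_POSE_THUMB_UP,
            PXCMGesture.Gesture.Label.LABEL_POSE_THUMB_DOWN,
            PXCMGesture.Gesture.Label.LABEL_POSE_PEACE,
            PXCMGesture.Gesture.Label.LABEL_POSE_BIG5,
        };
    }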
2.2 Anatomy of Heart application
Focus Medica, a subsidiary of Panther Publishers, develops high-end medical reference material in the form of medical animations and printed content with images and illustrations. Its apps are among the top 100 applications in the education segment of the iTunes store, with an expanding footprint on Android and a recent entry into the Windows* Store in partnership with Intel.
Focus Medica developed the Anatomy of Heart application, which uses technology, especially visual media, in teaching and learning about anatomical aspects of the human body. The powerful 3D animations used in the app help showcase the anatomical aspects of the heart and its functions. Accompanied by audio, the app creates a unique and very real experience for the viewer.
Figure 4: Anatomy of Heart application
2.3 Controlling the application using gestures
The Anatomy of Heart application was developed on the Microsoft Windows 8 32-bit operating system, using C# and Visual Studio* 2012.
The Intel Perceptual Computing SDK exposes its C# interfaces as a dynamic-link library (DLL) that supports the Microsoft .NET Framework 4.0. The project/solution build configuration must target the x86 or x64 platform and reference the matching DLL:
- $(PCSDK_DIR)/bin/Win32/libpxcclr.dll for the 32-bit OS
- $(PCSDK_DIR)/bin/x64/libpxcclr.dll for the 64-bit OS
We need to add the DLL as a reference in the Visual Studio project. If we reference the wrong version of the DLL, pxcmStatus.PXCM_STATUS_ITEM_UNAVAILABLE is returned.
2.3.1 Programming using the utility class
The application uses the UtilMPipeline utility class to initiate gesture recognition. This is the simplest approach, because UtilMPipeline implements all the required steps, such as locating an I/O device and streaming data between the I/O device and the gesture recognition module. The tradeoff is that customization of the configuration and data streaming process is limited to the filtering functions and event callbacks that UtilMPipeline provides.
The application calls EnableGesture() to enable gesture recognition and EnableImage() to display the user’s hand gestures. Both calls are made in the pipeline’s constructor.
Figure 5: Calling EnableGesture()

The application overrides OnGesture() to receive pose/gesture notifications. GestureValueCallback() calls the bw_DoWork() function, which implements the actions based on the gestures.
Figure 6: Calling the Gesture Labels
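A minimal sketch of this setup, assuming the SDK’s documented UtilMPipeline C# interface (the HeartPipeline class name and the callback wiring are illustrative, not taken from the application’s source):

    using System;

    // Sketch: a pipeline subclass that enables gesture recognition and
    // image streaming, and forwards gesture notifications to the app.
    public class HeartPipeline : UtilMPipeline
    {
        private readonly Action<PXCMGesture.Gesture> gestureCallback;

        public HeartPipeline(Action<PXCMGesture.Gesture> gestureCallback)
        {
            this.gestureCallback = gestureCallback;
            EnableGesture();  // turn on pose/gesture recognition
            EnableImage(PXCMImage.ColorFormat.COLOR_FORMAT_RGB32);  // stream color frames
        }

        // Invoked by the pipeline whenever a pose/gesture is recognized.
        public override void OnGesture(ref PXCMGesture.Gesture gesture)
        {
            // 'active' marks an in-progress gesture (field name per SDK docs).
            if (gesture.active) gestureCallback(gesture);
        }
    }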
2.3.2 Creating a background worker thread
The application calls the LoopFrames function to initialize the pipeline and pass data among the pipeline components. LoopFrames() runs in synchronous mode: it loops, capturing and processing new frames until the pipeline is closed. Because the call blocks, the UI thread cannot handle the user’s gestures/inputs, so we need to create a background worker thread.
Figure 7: Calling the background worker thread
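A sketch of the worker-thread setup using .NET’s BackgroundWorker (the bw and bw_DoWork names mirror those in the text; StartPipeline is an illustrative helper, HeartPipeline is the sketch from section 2.3.1, and DoHeartFunction is shown after Figure 9):

    using System.ComponentModel;

    // Sketch: run the blocking LoopFrames() call on a worker thread so
    // the UI thread stays free to handle the user's input.
    public partial class MainWindow
    {
        private BackgroundWorker bw;
        private HeartPipeline pipeline;

        private void StartPipeline()
        {
            bw = new BackgroundWorker();
            bw.DoWork += bw_DoWork;
            bw.RunWorkerAsync();    // runs bw_DoWork on a thread-pool thread
        }

        private void bw_DoWork(object sender, DoWorkEventArgs e)
        {
            pipeline = new HeartPipeline(DoHeartFunction);
            pipeline.LoopFrames();  // blocks until the pipeline is closed
            pipeline.Dispose();     // release the camera when the loop ends
        }
    }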
The background worker thread calls bw_DoWork(), which calls DoHeartFunction() to perform specific actions based on the user’s gestures.
Figure 8: Calling the function from the background thread
Figure 9: Calling the gesture labels from DoHeartFunction()
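A hedged sketch of such a dispatch routine. The gesture-to-action mapping shown here is illustrative; NextSection, PreviousSection, and PauseAnimation are hypothetical helpers, and a WPF Dispatcher is assumed for marshaling work back to the UI thread, since gesture notifications arrive on the pipeline’s worker thread:

    // Sketch: map recognized gesture labels to application actions.
    private void DoHeartFunction(PXCMGesture.Gesture gesture)
    {
        switch (gesture.label)
        {
            case PXCMGesture.Gesture.Label.LABEL_NAV_SWIPE_LEFT:
                Dispatcher.BeginInvoke((Action)NextSection);      // advance a section
                break;
            case PXCMGesture.Gesture.Label.LABEL_NAV_SWIPE_RIGHT:
                Dispatcher.BeginInvoke((Action)PreviousSection);  // go back a section
                break;
            case PXCMGesture.Gesture.Label.LABEL_POSE_BIG5:
                Dispatcher.BeginInvoke((Action)PauseAnimation);   // open hand pauses
                break;
        }
    }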
3. Summary
With the Intel Perceptual Computing SDK, developers can add natural, immersive user experiences to their applications. With the new usage models and algorithms, they can differentiate their applications and accelerate their adoption in the market.