Investigation into the capabilities of GANs, conducted by Intel Student Ambassador Prajjwal Bhargava, provides insights into using Intel® architecture-based frameworks to understand and create practical applications using this technology.
"GANs' (generative adversarial networks) potential is huge, because they can learn to mimic any distribution of data. That is, GANs can be taught to create worlds eerily similar to our own in any domain: images, music, speech, prose. They are robot artists in a sense, and their output is impressive—poignant even."1
Excerpt from "GAN: A Beginner's Guide to Generative Adversarial Networks"
Challenge
Past efforts at building unsupervised learning capabilities into deep neural networks have been largely unsuccessful. A new modeling approach that uses opposing neural networks, one functioning as a generator and the other as a discriminator, has opened innovative avenues for research and practical applications.
Solution
The possibilities of using GANs to accelerate deep learning in an unsupervised training environment are progressively being revealed through ongoing exploration and experimentation. Prajjwal's work in this area promises to uncover paths likely to yield positive results as applications move from speculative to real-world implementations.
Background and Project History
An increasingly important area of generative modeling, known as generative adversarial networks (GANs), offers a means to endow computers with a better understanding of the surrounding world through unsupervised learning techniques. This field of inquiry has been the focus of Prajjwal Bhargava in his work for the Intel® AI Academy.
Prior to becoming a Student Ambassador for the Intel AI Academy, Prajjwal sharpened his expertise in convolutional neural networks for image recognition, data structures and algorithms, deep-learning coding techniques, and machine learning. These topics have been useful in his research on GANs. "Initially, I started off with computer vision," Prajjwal said. "Back then I was learning how convolutional neural networks worked and how they do what they do. That required going deeper into the architectures." After getting into them, he started working with recurrent neural networks (RNNs) and complex architectures like long short-term memory (LSTM) networks.
"I later learned more about GANs," he continued, "and it was quite fascinating to me. I knew there were some significant challenges. For example, training a GAN— with the generator and discriminator getting updated independently—can have a serious impact on reaching convergence."
Prajjwal observed that the original GAN paper didn't fully address this issue. It became clear that a different mechanism was needed for effectively resolving this problem. He looked into the issue further and found the paper describing this approach, "Wasserstein GAN", to be very influential and revolutionary.
"The theory was explained brilliantly and it supported their experiment well," Prajjwal said. From this perspective, he started working on implementations using a variety of architectures to see which approaches could yield the greatest results.
"Since Ian Goodfellow presented his landmark paper at the NIPS [Neural Information Processing Systems] conference in 2014, I've always felt that this architecture [GANs] is quite revolutionary by itself. I feel that these networks have changed the way we look at deep learning compared to a few years back. It has enabled us to visualize data in ways that couldn't have been accomplished through other techniques."
Prajjwal Bhargava, Student Ambassador for Artificial Intelligence, Intel AI Academy
Prajjwal has been working on GANs for over a year, and he doesn't see an end to his research. Each new research paper that is published offers fresh ideas and different perspectives. His own paper, "Better Generative Modeling through Wasserstein GANs," provides the insights he has gained over the course of his work with Intel AI Academy.
"I want to try all possible variants of GANs," Prajjwal said. "There are so many, each one performing a new task in the best possible manner. However, I think the future calls for something universal and I think this applies to GANs as well. The more we are able to let our network generalize, the better it is. There's so much more to do and hopefully I will continue to contribute towards this research."
"Training and sampling from generative models is an excellent test of our ability to represent and manipulate high-dimensional probability distributions. High-dimensional probability distributions are important objects in a wide variety of applied math and engineering domains."2
Ian Goodfellow, Staff Research Scientist, Google Brain
Key Findings of the Experimentation
As Prajjwal continues to research GAN variants, the work that he has accomplished so far has led him to a key conclusion. In summary, he noted, "GANs are essentially models that try to learn the distribution of real data by minimizing divergence (the difference in probability distributions) through the generation of adversarial data. In the original [Goodfellow] paper, convergence of the minimax objective is interpreted as minimizing the Jensen-Shannon divergence. The Wasserstein distance is a better alternative to the Jensen-Shannon divergence. It gives a smooth representation in between."
"If we have two probability distributions—P and Q—there is no overlap when they are not equal, but when they are equal, the two distributions just overlap," Prajjwal continued. "If we calculate D(kl), we get infinity if two distributions are disjoint. So, the value of D(js) jumps off and the curve isn't differentiable: Ɵ is 0."
"The Wasserstein metric provides a smooth measure. This helps ensure a stable learning process using gradient descents," he added.
Figure 1. Double feedback loop used for a generative adversarial network (GAN).
The research being done on GANs suggests a wide variety of use cases across multiple industries, Prajjwal believes. Some of the promising possibilities include the following:
- Accelerating drug discovery and finding cures for previously incurable diseases. The Generator could propose a drug for treatment and the Discriminator could determine whether the drug would be likely to produce a positive outcome.
- Advancing molecule development in oncology, generating new anti-cancer molecules within a defined set of parameters.
- Performing text-to-image translation, generating images that accurately depict the content of a written description.
- Generating super-resolved images from downsampled originals to improve perceptual quality.
- Boosting creativity in fields where variety and innovation are important, such as fashion or design.
"Unsupervised learning is the next frontier in artificial intelligence," Prajjwal said, "and we are moving rapidly in that direction, even though we still have a long way to go."
Enabling Technologies
The primary enabling technologies that were used for research during this project include:
- PyTorch*, a Python*-based deep-learning library that takes advantage of the Intel® Math Kernel Library (Intel® MKL), was used to build the GAN architectures for this research (a minimal model sketch follows this list).
- Intel® AI DevCloud powered by Intel® Xeon Phi™ processors (current versions of the Intel AI DevCloud use Intel® Xeon® Scalable processors).
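As a rough illustration of the kind of model PyTorch makes easy to express, the sketch below defines a minimal fully connected generator and critic pair. The layer sizes and dimensions are assumed placeholder values, not Prajjwal's actual architecture.

```python
import torch.nn as nn

LATENT_DIM = 100      # size of the noise vector fed to the generator (assumed value)
IMAGE_DIM = 28 * 28   # e.g., flattened 28x28 grayscale images (assumed value)

# The generator maps random noise to a synthetic sample.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256),
    nn.ReLU(),
    nn.Linear(256, IMAGE_DIM),
    nn.Tanh(),
)

# The critic (discriminator) maps a sample to a single score.
# For a WGAN critic the output is an unbounded real number, so no sigmoid.
critic = nn.Sequential(
    nn.Linear(IMAGE_DIM, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)
```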
"Intel MKL was really useful for optimizing matrix calculations and vector operations on my platform," Prajjwal commented, "and I have gone through technical articles on the Intel® Developer Zone (Intel® DZ) to better understand how to improve optimization on the architecture that I was using. A number of tutorials targeting Intel architecture were also quite useful."
One of the key challenges that Prajjwal encountered was training GANs efficiently on Intel architecture-based systems. The difficulties included managing updates for the Generator and Discriminator concurrently, rather than independently. As it stands, reaching convergence can be a challenge. Part of the solution will require optimizing the training models so that the workflow proceeds more efficiently, taking better advantage of Intel architecture capabilities and built-in features.
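A simplified training loop shows how the two networks' updates are interleaved. This is a sketch of the standard WGAN recipe, reusing the hypothetical generator, critic, and loss helpers defined above; dataloader is assumed to exist, and n_critic = 5 is the value from the Wasserstein GAN paper rather than a setting confirmed for this project.

```python
import torch

opt_g = torch.optim.RMSprop(generator.parameters(), lr=5e-5)
opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
n_critic = 5  # critic updates per generator update (WGAN paper default)

for step, (real, _) in enumerate(dataloader):  # dataloader: assumed image batches
    real = real.view(real.size(0), -1)

    # --- Critic update ---
    noise = torch.randn(real.size(0), LATENT_DIM)
    fake = generator(noise).detach()   # don't backprop into the generator here
    loss_c = critic_loss(critic, real, fake)
    opt_c.zero_grad()
    loss_c.backward()
    opt_c.step()
    clip_critic_weights(critic)

    # --- Generator update, every n_critic steps ---
    if step % n_critic == 0:
        noise = torch.randn(real.size(0), LATENT_DIM)
        loss_g = generator_loss(critic, generator(noise))
        opt_g.zero_grad()
        loss_g.backward()
        opt_g.step()
```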
"It's been a year since I started working with Intel in the Intel AI Academy," Prajjwal noted. "And over this time, I've learned a lot. I've received much help and gained expertise working with Intel architecture-based hardware. It's great to see so many other Student Ambassadors working across the world in the same field. I've gotten to know so many people through conferences and online communities. Intel goes a long way to share the projects that we've done so that Student Ambassadors get recognition. Also, Intel provides a really good platform to publish our research and findings. I am really grateful that I got to become part of this AI Academy program and hope to do some more great work in the future."
AI is Expanding the Boundaries of Generative Modeling
Through the design and development of specialized chips, sponsored research, educational outreach, and industry partnerships, Intel is firmly committed to advancing the state of artificial intelligence (AI) to solve difficult challenges in medicine, manufacturing, agriculture, scientific research, and other industry sectors. Intel works closely with government organizations, non-government organizations, educational institutions, and corporations to advance solutions that address major challenges in the sciences.
In terms of real-world applications of GAN techniques, the collaborative work accomplished by the NASA Frontier Development Lab (FDL) offers a striking example. FDL brings together companies, including Intel, to share resources and expertise in a cooperative effort to solve space exploration challenges.
During the Planetary Defense segment of the 2016 session, a GAN was developed to help detect potentially hazardous asteroids and determine the shape and the spin axis of the asteroid.
One of the participants on this project, Adam Cobb, described the challenge of handling the input data: "Our predominant form of input data consisted of a series of delay-Doppler images. These are radar images that are defined in both time delay and frequency. Although to the untrained eye...these images might look like they are optical images, they actually have a non-unique mapping to the true asteroid shape. This many-to-one relationship added an extra level of complexity to the already difficult challenge of going from 2D to 3D representations. In order to go about solving this task we applied deep-learning architectures such as autoencoders, variational autoencoders, and generative adversarial networks to generate asteroid shapes and achieved promising results."
Beyond the challenge of asteroid shape modeling, another challenge in the Planetary Defense area, Asteroid "Deflector Selector" Decision Support, used machine learning to determine the most effective deflection strategies to prevent an asteroid from colliding with Earth (see Figure 2 for an artist's rendering of this scenario).
Figure 2. NASA rendering of an asteroid in proximity to Earth.
The NASA FDL is hosted by the SETI Institute in Mountain View, California, with support from the NASA Ames Research Center. Intel provided hardware, software, training, and expertise to the endeavor. Other corporate participants included NVIDIA Corporation, IBM*, Lockheed Martin, ESA, SpaceResources Luxembourg, USC MASCLE, Kx Systems*, and Miso Technologies.
In these early stages of AI, at a time when commercial GAN implementations haven’t been widely released to the field, some of the best examples of the potential of this technique come from research papers and student implementations exploring the mechanisms to discover how GANs can be applied to real-world scenarios.
One of the more interesting examples along this line is image-to-image translation with CycleGANs. A collection of resources on this topic, including code, interactive demos, videos, and a research paper, has been compiled by members of the University of California, Berkeley, research team and can be found here: https://phillipi.github.io/pix2pix/.
In image-to-image translation, the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. In practice, however, paired training data is usually not available, so the network must learn the mapping from domain X to domain Y without it. The objective in this approach is to learn a mapping G: X → Y such that the distribution of images G(X) is indistinguishable from the distribution Y, using an adversarial loss.
Paired images that maintain a strong correspondence between the two domains would normally be required, but collecting such data can be time consuming and impractical. CycleGANs build on the pix2pix architecture but support modeling of unpaired collections of images; in the process, the network learns to translate images between two aesthetics without requiring tightly matched X/Y training pairs.
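In code, the heart of that idea is the combination of an adversarial loss with a cycle-consistency loss. The rough PyTorch sketch below covers one translation direction; the modules G_xy, G_yx, and D_y are hypothetical placeholders, and the least-squares adversarial term is one of several variants the CycleGAN authors describe.

```python
import torch
import torch.nn.functional as F

def cyclegan_generator_loss(G_xy, G_yx, D_y, real_x, lambda_cyc=10.0):
    # Adversarial term: translated images G_xy(x) should fool the domain-Y
    # discriminator (least-squares GAN formulation).
    fake_y = G_xy(real_x)
    score = D_y(fake_y)
    adv = F.mse_loss(score, torch.ones_like(score))

    # Cycle-consistency term: X -> Y -> X should reconstruct the input.
    # This constraint is what removes the need for paired training images.
    recon_x = G_yx(fake_y)
    cyc = F.l1_loss(recon_x, real_x)

    return adv + lambda_cyc * cyc
```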
Figure 3 shows some specific image-to-image translation processes that highlight the capabilities of a CycleGAN.4
The Intel® AI technologies used in this implementation included:
- Intel Xeon Scalable processors: Tackle AI challenges with a compute architecture optimized for a broad range of AI workloads, including deep learning.
- Framework optimization: Achieve faster training of deep neural networks on a robust, scalable infrastructure.
For Intel AI Academy members, the Intel AI DevCloud provides a cloud platform and framework for machine-learning and deep-learning training. Powered by Intel Xeon Scalable processors, the Intel AI DevCloud is available for up to 30 days of free remote access to support projects by academy members.
Join today at: https://software.intel.com/ai/sign-up
For a complete look at our AI portfolio, visit https://ai.intel.com/technology.
Figure 3. Image-to-image translation examples (courtesy of Berkeley AI researchers).
"At Intel, we're encouraged by the impact that AI is having, driven by its rich community of developers. AI is mapping the brain in real time, discovering ancient lost cities, identifying resources for lunar exploration, helping to protect Earth's oceans, and fighting fraud that costs the world billions of dollars per year, to name just a few projects. It is our privilege to support this community as it delivers world-changing AI across verticals, use cases, and geographies."5
Naveen Rao, Vice President and General Manager, Artificial Intelligence Products Group, Intel
Resources
- Intel® AI Academy
- Inside Artificial Intelligence – Next-level computing powered by Intel AI
- Intel® Math Kernel Library
- Intel® AI DevCloud
- Intel® Developer Mesh
- Better Generative Modeling through Wasserstein GANs
- Generative Models
- Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
- Improved Techniques for Training GANs. Ian Goodfellow, et al.
- GAN: A Beginner's Guide to Generative Adversarial Networks
- Innovation in AI success stories
- OpenVINO* toolkit
- "GAN: A Beginner's Guide to Generative Adversarial Networks." DL4J. 2017.
- Goodfellow, Ian. "NIPS 2016 Tutorial." 2016.
- Cobb, Adam. "3D Shape Modelling of Asteroids." 2017.
- Isola, Phillip, Jun-Yan Zhu, Tinghui Zhou, Alexei Efros. "Image-to-Image Translation with Conditional Adversarial Networks." Berkeley AI Research Laboratory. 2017.
- Rao, Naveen. "Helping Developers Make AI Real." Intel. May 16, 2018.