By Benjamin A. Lieberman, PhD
One of the biggest challenges in a data-rich world is finding information of relevance, particularly when the information you’re seeking is visual in nature. Imagine looking for a specific image file but being limited to a text search of descriptions of the image or laboriously scanning thumbnail pictures one by one. How can you know if the image file was properly categorized? Or categorized at all? What if you need to pick out a single image from tens of thousands of other, similar images?
The engineers at Intel have developed the Intel® Magnifying Lens Tool, an innovative and exciting approach to solving this problem, and led the effort to develop the first web app based on this approach, MagLens. The MagLens technology shows great promise for changing the way individuals approach mass storage of personal information, including images, text, and video. The technology is part of an ongoing effort at Intel to change the way information is captured, explored, and used.
Images and Files Everywhere
Today, most people carry a camera multiple hours a day in the form of a smartphone. More and more of our personal content is online—text, video, pictures, books, movies, social media, and more. The list seems to grow with each passing day.
Users are increasingly storing their files and content in the cloud. Service providers like Amazon, Apple, and Google are all making this migration easier, safer, and less expensive. Personal content is available 24 hours a day, 7 days a week, 365 days a year from practically any mobile device.
Unfortunately, old-style storage structures based on hierarchical folders present a serious barrier to the optimized use of data. Specifically, the classical file storage structures are prone to poor design, mismanagement, neglect, and misuse. Unless users are well organized, these file shares rapidly become a dumping ground, much like books stacked in a pile (Figure 1). The books in the image are clearly organized but not terribly usable. As a result, it becomes increasingly difficult to locate relevant information in these “stacks,” particularly when the current techniques of textual search are applied to visual content.
Figure 1. I know it must be in here somewhere...
In the exploding cloud-storage space, it’s now possible to store tens of thousands of images and related files online. These files typically accumulate over time and are often as haphazardly stored as they are created. Unlike a well-regulated research library, little or no metadata (for example, date, location, subject matter, or topic) is created with the images, making a textual search all but impossible. How can you find the file you want when it could be anywhere, surrounded by virtually anything?
Rapidly scanning this vast forest of files for a single image is a daunting task. Consider the steps involved. First is the desire to locate information of interest: What differentiates that information from other information like it? Next, you attempt to remember where that information was stored. Was it on a local drive or a cloud server; if it’s in the cloud, which service? What was the folder structure? Or is there just a collection of multiple files all stored in one place? Now, how do you recognize the file of interest? It’s doubtful that the file name (which may be automatically generated and cryptic, such as 73940-a-200.jpg) will be helpful, and thumbnail images are difficult to see clearly, even on a high-definition display. What’s required is some method to rapidly scan the stored images for specific patterns of shape and color by leveraging our highly evolved visual sense.
MagLens Expands the Options for Discovery
Many years of research in neurobiology and cognitive science have shown that human thinking and pattern recognition are geared toward the visual space. Our brains have evolved to handle the complex (and necessary-to-survival) problem of determining whether the fuzzy shape behind the next clump of bushes is a large, carnivorous animal that would like to invite us over for a quick bite. These days, we’re faced with choices that have somewhat less terminal outcomes, such as finding a set of photos from our most recent vacation. Nevertheless, we would be aided in our task if we could put our well-honed pattern-discovery and pattern-matching abilities to efficient use.
The Intel® Magnifying Lens Tool is both simple and profound. Like many successful metaphors in computing, the idea is that you should be able to rapidly scan across a visual field, with the focus of attention (that is, the focal point) receiving the greatest magnification and detail (Figure 2). Images around the focal point are magnified as well, but to a decreasing degree the farther they are from it, similar to the way icon magnification on Mac OS X* works as you pass the cursor over the application Dock. With MagLens, the visual field is the full screen space (rather than a linear bar), allowing a rapid scan across thousands of images in a few seconds. All the images remain in view at all times, varying in size as the focus of attention scans across the field of vision.
Figure 2. MagLens* technology allows dynamic exploration of a visual space.
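To make the metaphor concrete, the following sketch shows one way such a distance-based magnification falloff could be computed. It is an illustration only: the falloff curve, function names, and parameter values are assumptions chosen for clarity, not details of the MagLens implementation.

// Illustrative sketch only: a simple distance-based magnification falloff.
interface Point {
  x: number;
  y: number;
}

// Returns a scale factor for a thumbnail centered at `item`, given the current
// focal point. Items at the focal point get `maxScale`; the scale decays smoothly
// toward 1.0 (no magnification) with distance, so every image stays visible.
function magnification(
  item: Point,
  focus: Point,
  maxScale = 4.0,  // magnification at the focal point (assumed value)
  radius = 300     // distance, in pixels, over which the effect fades (assumed value)
): number {
  const dx = item.x - focus.x;
  const dy = item.y - focus.y;
  const distance = Math.sqrt(dx * dx + dy * dy);
  if (distance >= radius) {
    return 1.0; // beyond the lens, thumbnails stay at their base size
  }
  // Cosine falloff: smooth at both the center and the edge of the lens.
  const t = distance / radius;
  const falloff = 0.5 * (1 + Math.cos(Math.PI * t)); // 1 at center, 0 at edge
  return 1.0 + (maxScale - 1.0) * falloff;
}

// Example: recompute every thumbnail's scale as the pointer moves.
// const scales = thumbnails.map((thumb) => magnification(thumb, pointerPosition));

Because the falloff never drops below 1.0, the surrounding thumbnails shrink but never disappear, which is what keeps the whole field available for pattern matching.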
Contrast this technology with previous attempts at magnification, where the magnified portion of the image blocks out other vital information in the view (Figure 3). Even if the magnification moves around the screen, only the area directly under the cursor is visible: the remainder of the image is blocked from view, hampering the ability of your visual system to recognize patterns. You have to engage your short-term memory to temporarily store one screen view, and then mentally compare it with the next. In contrast, the MagLens approach makes the entire view available for pattern matching.
Figure 3. Zooming a section of the image obscures large areas of the non-highlighted image.
This “elastic presentation space” approach differs from previous attempts at rapid scanning, such as “flipping” pages of thumbnails or dragging a scroll bar, in that it gives you a natural scan of the information field (much like the way your eyes normally scan a complex visual field) while dynamically increasing the level of detail at the point of visual attention. Combined with the natural gesture recognition that 3D recognition technology (such as Intel® RealSense™ technology) provides, this technique opens the visual computation space to a wide range of applications. To explore this option, the development team integrated Intel RealSense technology into the prototype to optimize the application for a wide range of content.
Where We Have Been
The research on what would become MagLens began as four years of Intel-sponsored research (1997–2001) by Sheelagh Carpendale, who was then working on her doctoral dissertation at Simon Fraser University (see “For More Information”). Although the approach she devised has been discussed and written about extensively, it has to date seen no widespread technological adoption. John Light at Intel began to pursue a prototype using Dr. Carpendale’s “elastic presentation space” idea.
Light’s team created a prototype that took advantage of modern computing power to allow users to view hundreds of images at the same time.
Shardul Golwalkar, an Intel intern at the time, expanded this prototype into a more usable proof of concept. The expansion project began with text-heavy content such as textbooks and was later expanded to more visual content exploration through magazines and news publications. Employing a user-centric development technique called Design Thinking (see the sidebar “Design Thinking Overview”), Shardul broadened the prototype into a functional web-enabled space, where it would be possible to perform the visual scan through a standard web browser interface.
At the conclusion of his internship, Shardul continued with the idea as an undergraduate at Arizona State University. During this time, he continued to explore the optimization of the technology and supported project development. Together, Shardul and Light demonstrated that it was possible to display 40,000 images simultaneously in the “discovery space” and to enable a multimodal understanding of the data space using both gestures and vision. At the end of this effort, the team had succeeded in creating an interface that was intuitive, powerful, and empowering for user-driven self-discovery of new capabilities—a delight for the user.
Where We Are
When the initial development was complete, there was interest at Intel in moving to a more commercial product. Intel sponsored the company Empirical to develop the Intel® Magnifying Lens Tool and move it toward a 2015 product release. Developers at Empirical reworked the original development, building new workflows and making the overall experience more polished and performant. See “For More Information” for a link to the current product.
A major goal of the initial development was to allow users connected to Internet file shares (such as Google Drive*) to view cloud-based files and ultimately enjoy integration across multiple cloud file stores. The product was optimized for web use, especially for touch screen display devices such as All-in-One desktops, 2-in-1s, notebooks, and tablets. Using MagLens, users no longer need to know the file storage hierarchy to find materials. The MagLens site collects all the identified file stores and “flattens” the visualization to a single, scalable 2D space. Now, it’s possible to locate a file of interest regardless of where it resides.
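As an illustration of the “flattening” idea, the sketch below walks a set of connected file stores and collects every image into one flat list that a single 2D view could then lay out. The FileStore interface and listFolder method are hypothetical, invented for this example; real cloud services such as Google Drive* expose their own APIs.

// Illustrative sketch only: collapsing folder hierarchies from several
// (hypothetical) file stores into one flat list of images.
interface Entry {
  name: string;
  path: string;
  isFolder: boolean;
  store: string; // which cloud service the entry came from
}

interface FileStore {
  name: string;
  listFolder(path: string): Promise<Entry[]>; // hypothetical listing method
}

// Walk one store's folder tree and collect every image file it contains.
async function collectImages(store: FileStore, path = "/"): Promise<Entry[]> {
  const images: Entry[] = [];
  for (const entry of await store.listFolder(path)) {
    if (entry.isFolder) {
      images.push(...(await collectImages(store, entry.path)));
    } else if (/\.(jpe?g|png|gif)$/i.test(entry.name)) {
      images.push(entry);
    }
  }
  return images;
}

// Merge all connected stores into one flat collection, ignoring hierarchy.
async function flatten(stores: FileStore[]): Promise<Entry[]> {
  const all = await Promise.all(stores.map((s) => collectImages(s)));
  return all.flat();
}

Once the hierarchy is flattened this way, the layout and magnification of the resulting collection is purely a presentation concern, independent of where each file actually lives.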
Imagine the Possibilities
Intel selected Empirical to develop the MagLens concept into a viable product based on its years of experience with product design and development. The Intel collaboration with Empirical has discovered many possible applications for MagLens. Indeed, Intel is open to licensing the MagLens code to software vendors, original equipment manufacturers, cloud services providers, and operating system developers—to expand the concept beyond photos and images to applications involving magazines, photography, films, and visualization of complex multimedia information. The Intel contact for licensing inquiries is Mike.Premi@intel.com.
Research is also continuing on the core concept of browsing through additional metadata to enable exploration and sorting for likely conceptual matches, such as a filter or clustering algorithm that gathers similar images (for example, in a digital photo library). Other techniques include using algorithms for facial recognition and integrating the tool as a core operating system utility.
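For instance, a clustering pass might group images whose feature vectors are sufficiently similar. The sketch below uses a greedy cosine-similarity grouping purely as an illustration; the threshold, the source of the feature vectors, and the greedy strategy are all assumptions, not the approach used by MagLens.

// Illustrative sketch only: grouping images by feature-vector similarity,
// assuming each image already has a numeric feature vector from some
// upstream image-analysis step.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Greedily place each image into the first cluster whose representative
// (its first member) is similar enough, or start a new cluster.
function clusterImages(features: number[][], threshold = 0.9): number[][] {
  const clusters: number[][] = []; // each cluster holds image indices
  features.forEach((vec, idx) => {
    const home = clusters.find(
      (c) => cosineSimilarity(features[c[0]], vec) >= threshold
    );
    if (home) home.push(idx);
    else clusters.push([idx]);
  });
  return clusters;
}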
MagLens shows great promise for changing the ideas around information discovery, organization, and integration. The future of this technology is limited only by our ability to see the possibilities.
For More Information
Marianne Sheelagh Therese Carpendale, “A Framework for Elastic Presentation Space” (doctoral dissertation, Simon Fraser University, 1999), http://pages.cpsc.ucalgary.ca/~sheelagh/wiki/pmwiki.php?n=Main.Thesis
Roger Chandler, “Adventures in Design Thinking” (2015), https://software.intel.com/en-us/blogs/2015/06/09/adventures-in-design-thinking
Learn more about the MagLens application from Empirical and Intel at mag-lens.com.
About the Author
Ben Lieberman holds a PhD in biophysics and genetics from the University of Colorado, Health Sciences Center. Dr. Lieberman serves as principal architect for BioLogic Software Consulting, bringing more than 20 years of software architecture and IT experience in various fields, including telecommunications, rocket engineering, airline travel, e-commerce, government, financial services, and the life sciences. Dr. Lieberman bases his consulting services on the best practices of software development, with specialization in object-oriented architectures and distributed computing—in particular, Java*-based systems and distributed website development, XML/XSLT, Perl, and C++-based client–server systems. He is also an accomplished professional writer with a book (The Art of Software Modeling, Auerbach Publications, 2007), numerous software-related articles, and a series of IBM corporate technology newsletters to his credit.