Computer vision is a rapidly evolving area. This guide provides a starting point for understanding some of the terminology used in computer vision and the OpenVINO™ project.
I hope to make this a useful reference for people who are learning to develop, sell, train or otherwise understand the concepts and vocabulary in this fascinating area of research.
This first article is for people who are in sales, marketing, project management or who would otherwise like to be knowledgeable about OpenVINO™, but are not specialists, researchers or engineers.
If you have words, abbreviations or other concepts that you think should be included in this cheat sheet, feel free to email them to me at daniel.w.holmlund@intel.com.
Glossary
Introductory Terms for Non-Developers
This section contains foundational terms that any non-expert should know to speak knowledgeably about OpenVINO™.
- Caffe*
- Caffe is a deep learning framework developed by Berkeley AI Research (BAIR) and by community contributors and released under the BSD 2-Clause license.
- http://caffe.berkeleyvision.org/
- Computer Vision
- An interdisciplinary field that deals with how computers can be made to gain a high-level understanding of digital images or videos.
- https://en.wikipedia.org/wiki/Computer_vision
- Convolutional Neural Network (CNN)
- Convolutional Neural Networks are neural networks that make the explicit assumption that the inputs are 1D, 2D or multi-dimensional arrays, such as images. This assumption allows the network architecture to be simplified and made more efficient for computer vision applications that use images or video.
- CPU
- A central processing unit (CPU) is the electronic circuitry within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logical, control and input/output (I/O) operations specified by the instructions.
- https://en.wikipedia.org/wiki/Central_processing_unit
- Deep Learning Inference Engine
- An inference engine is a component of a system that applies logical rules to a set of inputs to deduce new information.
- The Intel® deep learning inference engine is a piece of software that runs trained neural network models. It receives input, runs it through the trained neural network, and delivers the output.
- FPGA
- A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing. It can be specialized to accelerate the highly parallel processing tasks required in computer vision.
- https://en.wikipedia.org/wiki/Field-programmable_gate_array
- FPGA Inference Accelerator
- An FPGA that is specialized for running the Intel® deep learning inference engine at high speeds.
- GPU
- A graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly manipulate data associated with computer vision and computer graphics.
- Hardware Heterogeneity
- Hardware heterogeneity refers to the idea that computer software should be able to identify and run on a combination of different hardware. For example, if a computer vision program has access to a CPU, a GPU and an FPGA, then it should be able to use all three in an optimal manner.
- Intel® Arria® 10 FPGA GX
- An FPGA board designed for computer vision applications.
- https://www.altera.com/products/fpga/arria-series/arria-10/overview.html
- Intel® Media SDK
- A software development kit (SDK) that enables hardware acceleration of video encoding, decoding, and processing on Microsoft Windows* and Linux*.
- https://software.intel.com/en-us/media-sdk
- Intel® Movidius™ brand
- The trademark name given to products developed by a computer vision company named Movidius™ that was acquired by Intel in September 2016.
- Intel® Movidius™ Neural Compute Stick
- The Intel® Movidius™ Neural Compute Stick (NCS) is a tiny, fanless deep learning device that you can use to learn AI programming at the edge. The NCS is powered by the same low-power, high-performance Intel® Movidius™ Vision Processing Unit (VPU) that can be found in millions of smart security cameras, gesture-controlled drones, industrial machine vision equipment, and more.
- Model
- A model is a trained neural network that specializes in a particular activity. More formally, it is a neural network that has been trained to approximate a particular function.
- Model Optimizer
- The Model Optimizer is a cross-platform command-line tool that takes pre-trained deep learning models from Caffe*, TensorFlow* and MXNet* and converts them to an intermediate representation for use with the inference engine. It performs static model analysis and adjusts deep learning models for optimal execution on end-point target devices.
- MXNet
- An open source deep learning framework developed by the Apache MXNet™ project. Models saved in its file format can be converted by the Model Optimizer.
- https://mxnet.incubator.apache.org/
- Neural Network Topology
- The total number of neurons in a network, together with all of their connections and weights, is referred to as the neural network topology.
- Neural Network
- A neural network is a computing system loosely modeled on the brain. It is made of layers of simple connected units (neurons), and the weights of the connections are adjusted during training so that the network learns to map inputs to outputs.
- Object Detection
- Object detection is the computer vision task of locating objects in an image or video and identifying what each one is, typically by drawing a labeled bounding box around each object.
- OpenCL™
- OpenCL™ (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. - Wikipedia
- OpenCV
- OpenCV (Open Source Computer Vision Library) is a software library that specializes in real-time computer vision algorithms. Originally started by Intel, OpenCV is open source and one of the most widely used computer vision libraries in the world. https://opencv.org/
- OpenVINO™
- The Open Visual Inference & Neural Network Optimization (OpenVINO™) toolkit provides computer vision libraries as well as deep neural network and convolutional neural network (CNN) libraries. The toolkit extends workloads across Intel® hardware and maximizes performance. - https://software.intel.com/en-us/openvino-toolkit
- OpenVX*
- OpenVX* is an open, royalty-free standard for cross platform acceleration of computer vision applications. OpenVX enables performance and power-optimized computer vision processing, especially important in embedded and real-time use cases such as face, body and gesture tracking, smart video surveillance, advanced driver assistance systems (ADAS), object and scene reconstruction, augmented reality, visual inspection, robotics and more. - https://www.khronos.org/openvx/
- TensorFlow*
- TensorFlow* is an open source software library for high-performance numerical computation. It comes with strong support for machine learning and deep learning, and its flexible numerical computation core is used across many other scientific domains.
- The Edge (of the Network)
- Networks located on the periphery of a centralized network. Devices attached at the edge are often user-facing.
- VPU
- A Vision Processing Unit (VPU) is dedicated silicon designed for processing computer vision media, including images and video. It is often used in conjunction with Intel® Movidius™ technology.
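
For readers who want to see a few of these terms in action, the Convolutional Neural Network, Model and Deep Learning Inference Engine entries above can be illustrated with a small Python sketch. This is not OpenVINO™ code and does not use any OpenVINO™ API. The names `convolve2d`, `KERNEL` and `infer` are hypothetical, and the "trained" weights are simply hard-coded for illustration. It only shows the general idea: a small set of weights slides across an image, and inference means running an input through those fixed weights to get an output.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` and sum the element-wise products.

    This is the sliding-window operation used in CNN layers (strictly a
    cross-correlation: the kernel is not flipped). Only positions where
    the kernel fits entirely inside the image are computed.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Assumption: stand-in for trained weights. A real model learns these
# values during training; here they are a hand-written detector that
# responds where brightness changes from left to right.
KERNEL = [[-1, 1]]


def infer(image):
    """Toy 'inference': run the input through the fixed weights and
    turn the result into a label."""
    feature_map = convolve2d(image, KERNEL)
    strongest = max(abs(value) for row in feature_map for value in row)
    return "edge" if strongest > 0 else "no edge"


if __name__ == "__main__":
    half_dark = [[0, 0, 1, 1],
                 [0, 0, 1, 1]]
    flat = [[5, 5, 5],
            [5, 5, 5]]
    print(convolve2d(half_dark, KERNEL))  # [[0, 1, 0], [0, 1, 0]]
    print(infer(half_dark))               # edge
    print(infer(flat))                    # no edge
```

The same two weights are reused at every position in the image, which is why the array-input assumption lets a CNN get by with far fewer weights than a network that connects every pixel to every neuron.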