By Gareth Noyes
Artificial intelligence (AI) is positioned to disrupt businesses, either by enabling new approaches to solving complex problems or by threatening the status quo for whole business sectors or types of jobs. Whether you already understand what the excitement is about and how AI will be applied to your market, or you struggle to see how you might take advantage of the technology, a basic understanding of artificial intelligence and its potential applications has to be part of your strategic planning process.
Despite the hype, it is sobering to remember that artificial intelligence is not a magic trick that can do anything. It’s a tool with which a magician can do a few tricks. One area that is gaining interest is how artificial intelligence may be applied to embedded systems, with a focus on how to plan for deployment in these more constrained environments.
Definitions and Basic Principles
To be sure we are all on the same page, let’s start with some background about the different technologies and their compute requirements.
AI is the computer science discipline concerned with how computers can be used to mimic human intelligence. The idea dates back to the dawn of computing in the 20th century, when pioneers such as Alan Turing foresaw the possibility of computers solving problems in ways similar to those a human might use.
Classical computer programming solves problems by encoding algorithms explicitly in code, guiding computers to execute logic that processes data and computes an output. In contrast, Machine Learning (ML) is an AI approach that seeks to find patterns in data, effectively learning from the data itself. There are many ways to implement this: training on labeled or unlabeled data, using reinforcement learning to guide algorithm development, and extracting features through statistical analysis or other means. New input data is then classified against the trained model to produce an output with a stated degree of confidence.
Deep Learning (DL) is a subset of ML that uses neural networks with multiple layers to train a model iteratively from large data sets. Once trained, a model can examine new data and make an inference about it. This approach has gained a lot of attention recently and has been applied to problems as varied as image processing, speech recognition, and financial asset modeling. We also expect it to have a significant impact on future critical infrastructure and devices.
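The contrast between a hand-written rule and a learned one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the temperature readings, the 80-degree threshold, the function names, and the nearest-centroid method with its crude confidence score are all hypothetical choices made for this example, not a recommended technique.

```python
# Classical approach: the programmer encodes the decision rule directly.
def is_overheating_rule(temp_c):
    # Hypothetical threshold picked by hand for this example.
    return temp_c > 80.0


# ML approach: the decision is derived from labeled example data.
def train_centroids(samples):
    """samples: list of (value, label). Returns the mean value per label."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centroids, value):
    """Return (label, confidence) for the centroid nearest to value."""
    distances = {label: abs(value - c) for label, c in centroids.items()}
    label = min(distances, key=distances.get)
    total = sum(distances.values())
    # Crude confidence heuristic (an assumption of this sketch): the closer
    # the value is to the winning centroid relative to the others, the
    # closer the score is to 1.0.
    confidence = 1.0 if total == 0 else 1.0 - distances[label] / total
    return label, confidence


# Hypothetical labeled training data: (temperature in Celsius, label).
training_data = [(40.0, "normal"), (50.0, "normal"), (60.0, "normal"),
                 (85.0, "hot"), (90.0, "hot"), (95.0, "hot")]

centroids = train_centroids(training_data)
label, confidence = classify(centroids, 88.0)
print(label, round(confidence, 2))
```

In the first function the boundary lives in the source code. In the second it lives in the data, so changing the training set changes the behavior without touching the logic.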
Applying ML/DL in Embedded Systems
Because accurate models require large data sets, and training them requires a large amount of computing power, training is usually performed in the cloud or in high-performance computing environments. In contrast, inference is often applied in devices close to the source of data. Although distributed or edge training is a topic of great interest, it is not how most ML systems are deployed today. For the sake of simplicity, let’s assume that training takes place in the cloud, and inference takes place at the edge or in-device.
As we’ve described, ML and DL are data-centric disciplines. Creating and training models therefore requires access to large data sets, along with tools and environments that make data manipulation easy. Developers typically use languages and libraries that simplify data handling and provide complex math and statistical analysis functions. Often the foundation is a language such as Python, on which ML frameworks are then built. There are many such frameworks; common ones include TensorFlow, Caffe, and PyTorch.
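The split between training and inference can be illustrated with a deliberately tiny model. The sketch below assumes a one-feature logistic regression trained by plain gradient descent; the data, the function names, and the JSON export format are hypothetical and stand in for whatever model format a real toolchain would use.

```python
import json
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# "Cloud" side: iterative training over a labeled data set.
def train(samples, epochs=2000, lr=0.1):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = sigmoid(w * x + b) - y  # gradient of the log loss
            w -= lr * error * x
            b -= lr * error
    return {"w": w, "b": b}


def export_model(model):
    # Only the learned parameters leave the training environment.
    return json.dumps(model)


# "Device" side: needs the exported parameters and a few arithmetic
# operations, not the training data or the training loop.
def infer(model_blob, x):
    model = json.loads(model_blob)
    return sigmoid(model["w"] * x + model["b"])


# Hypothetical labeled data: (sensor reading, class 0 or 1).
samples = [(0.0, 0), (1.0, 0), (2.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]

blob = export_model(train(samples))
p_low = infer(blob, 0.5)   # probability of class 1 for a low reading
p_high = infer(blob, 5.5)  # probability of class 1 for a high reading
print(p_low < 0.5 < p_high)
```

The expensive part (the nested training loop) runs once where compute is plentiful, while the device-side function is a single multiply, add, and exponential per input.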
ML frameworks can be used for model development and training, and can also be used to run inference engines using trained models at the edge. A simple deployment scenario is, therefore, to deploy a framework such as TensorFlow in a device. Because these frameworks require rich runtime environments, such as Python, they are best suited to general-purpose compute workloads on Linux. Driven by the need to run ML in mobile devices, a number of lighter-weight inference engines (TensorFlow Lite, PyTorch Mobile) are starting to be developed that will require fewer resources, but these are not yet widely available or as mature as their full-featured parents.
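As a rough sketch of what in-device inference with a lighter-weight engine can look like, the function below uses the TensorFlow Lite interpreter from the Python `tensorflow` package. It assumes that package is installed on the target and that a converted `.tflite` model already exists; the model path, the input array, and the `top_k` helper are hypothetical placeholders for this example.

```python
def run_tflite_inference(model_path, input_array):
    """Run a single inference with a TensorFlow Lite model.

    model_path and input_array come from the caller; input_array is
    assumed to already match the model's expected shape and dtype.
    """
    # Imported lazily so the rest of this module loads without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()

    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]

    interpreter.set_tensor(input_info["index"], input_array)
    interpreter.invoke()
    return interpreter.get_tensor(output_info["index"])


def top_k(scores, k=3):
    """Return the k highest (index, score) pairs, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Post-processing needs no framework at all:
print(top_k([0.1, 0.7, 0.2], k=2))
```

On very small targets the full `tensorflow` package may be too heavy; the interpreter is also distributed separately, but which package fits a given device is something to verify against the platform in question.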
Some models can be interpreted and run without needing the complete ML framework. For example, OpenCV, a computer vision framework that includes Deep Neural Network and Convolutional Neural Network (DNN/CNN) libraries, can read models from TensorFlow and other frameworks. OpenCV and its DNN libraries are available on many compact operating environments that don’t support a more complex or full-featured ML framework, so a second deployment option is to deploy an inference engine using a framework such as OpenCV.
ML is highly computationally intensive, and early deployments (such as in autonomous vehicles) rely on specialized hardware accelerators such as GPUs, FPGAs, or dedicated neural network processors. As these accelerators become more prevalent in SoCs, we can anticipate seeing highly efficient engines that run DL models in constrained devices. When that happens, another deployment option will be to compile trained models for optimized deployment on DNN accelerators. Some such tools already exist; they rely on modern compiler frameworks such as LLVM to connect the model front-ends to the hardware accelerator back-ends.
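The second deployment option might look roughly like the following, using OpenCV's `dnn` module to load a frozen TensorFlow graph. The file paths, the 224x224 input size, the pixel scaling, and the assumption that the model outputs raw class scores (hence the softmax step) are all hypothetical and would need to match the actual model.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_opencv(model_path, image_path, input_size=(224, 224)):
    """Classify one image using a frozen TensorFlow graph via OpenCV DNN.

    Returns (class_index, probability). Assumes the network emits one raw
    score per class; adjust the post-processing if the model differs.
    """
    # Imported lazily so the helper above works without OpenCV installed.
    import cv2

    net = cv2.dnn.readNetFromTensorflow(model_path)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Resize, scale pixel values to [0, 1], and convert BGR to RGB.
    blob = cv2.dnn.blobFromImage(image, scalefactor=1.0 / 255.0,
                                 size=input_size, swapRB=True, crop=False)
    net.setInput(blob)
    scores = net.forward().flatten().tolist()

    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


print(softmax([1.0, 2.0, 3.0]))
```

Note that no TensorFlow runtime appears anywhere in this sketch: the trained model is just a file that OpenCV parses and executes.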
Implications for Embedded Development
Embedded development is often driven by the need to deploy highly optimized and efficient systems. The classical development approach is to start with very constrained hardware and software environments and add capability only as needed. This has been the typical realm of RTOS applications.
With rapidly changing technologies, we see a different development approach: start by making complex systems work, then optimize for deployment at a later stage. As with many major advances in software, open-source communities are a large driver of the pace and scale of innovation that we see in ML. Embracing tools and frameworks that originate in open source, and often start with development in Linux, is rapidly becoming the primary innovation path. Using both a real-time operating system (RTOS) and Linux, or migrating open-source software from Linux to an RTOS, are therefore important developer journeys that must be supported.
Whether a company is just beginning its journey or is ready to deploy optimized machine learning solutions, it must build the fundamental technologies and rich development environments needed to abstract complexity and enable heterogeneous run-time environments.