The deep learning revolution is in full swing, bringing powerful predictive analytics to a slew of industries with unique datasets and accelerating innovation, supported by a growing ecosystem of tools and platforms. On the software side, new frameworks for training neural networks, like Caffe, PyTorch, and TensorFlow, have reached the market. But now the spotlight is on hardware. Dedicated deep learning chips are beginning to enter the market for both cloud processing and edge environments, from startups like Graphcore, Horizon.ai, Wave Computing, and Cerebras Systems, competing with giants like Nvidia, Intel, Google, Qualcomm, Xilinx, AMD, and CEVA, all producing impressive results, yet all within an envelope of tradeoffs. Cloud deep learning chips are designed for high processing throughput, while edge deep learning chips must address power efficiency, latency, bandwidth, and form factors small enough to fit into edge devices. Israeli startup Hailo is doubling down on a fundamental market inefficiency with a new deep learning chip, built on a new processing architecture, that enables efficient deep learning computation at the edge.
Hailo, founded in 2017, is a fabless semiconductor company developing deep learning chips designed to deliver data-center performance on edge devices. Their technology distributes compute problems across the chip’s sections, allowing it to consume very little power for deep learning computations.
Their first product, the Hailo-8, launched in May 2019, is arguably the world’s fastest and most efficient deep learning processor for edge devices. It delivers up to 26 TOPS (tera operations per second), with a measured efficiency of 3 TOPS per watt, and promises 20 times better efficiency than Nvidia’s Xavier AGX in a significantly smaller form factor. Hailo measured power consumption of only 1.7 watts running ResNet-50 on low-resolution video (224 x 224, 8-bit precision, batch size of 1) at 672 fps.
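Those figures are roughly self-consistent. A back-of-envelope check in Python, assuming the commonly cited estimate of about 3.9 billion multiply-accumulates (~7.7 billion operations, counting each multiply-accumulate as two operations) per ResNet-50 inference at 224 x 224 — an estimate from the literature, not a Hailo-published number:

```python
# Sanity check of the reported Hailo-8 benchmark figures.
# RESNET50_OPS_PER_FRAME is a commonly cited estimate for
# ResNet-50 at 224x224, not a Hailo-published number.

RESNET50_OPS_PER_FRAME = 7.7e9   # ~3.9 GMACs * 2 ops per MAC
fps = 672                        # reported frames per second
power_w = 1.7                    # reported power draw in watts

throughput_tops = fps * RESNET50_OPS_PER_FRAME / 1e12
efficiency_tops_per_watt = throughput_tops / power_w

print(f"useful throughput: {throughput_tops:.1f} TOPS")
print(f"efficiency: {efficiency_tops_per_watt:.1f} TOPS/W")
```

At 672 fps that works out to roughly 5.2 TOPS of useful work, and dividing by 1.7 watts lands close to the claimed 3 TOPS per watt.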
The novelty of the Hailo solution lies in its micro-architecture, which sets it apart from the AI chipset pack. According to Hailo’s CTO Avi Baum, “in the last 70 years, the von Neumann approach has been the dominant micro-architecture for the vast majority of compute platforms and serves well for all rule-based compute systems. In a rule-based program, the execution flow is highly dependent on the data that flows through it. Contrary to that, a neural network’s execution is determined by its structure.” Hailo’s home-grown architecture leverages this and other fundamental properties intrinsic to neural networks to define a compute structure suited to them. The result is a structure-defined dataflow architecture, which is much better suited to running neural network workloads. “To live up to the promise of a processor architecture, the solution is fully programmable and delivered along with a complete tool chain to allow machine learning application developers simple and easy on-boarding to the Hailo platform.”
“When you want to optimize power efficiency, low-power distribution becomes a complex paradigm,” explained Hailo CEO Orr Danon. “If I have a customer that wants a solution running a specific algorithm at 40 frames per second, and it consumes two watts, they want to know that if they run 20 frames per second, it will consume one watt. It’s linear, and they need to know the exact specifications. In a cloud environment, these aren’t concerns: you handle overload, or you turn machines off. You can be very general purpose in the cloud. On the edge, that’s not possible. We need to work on specific workloads, and that’s the reason behind our new solution.”
The startup was founded by Orr Danon (CEO), Hadar Zeitlin (Chief Business Development Officer), and Avi Baum (CTO), all alumni of elite IDF intelligence units like Talpiot, with deep technical expertise in hardware engineering, logic design, and system engineering, and recipients of numerous military technology awards. Baum previously served as CTO of Texas Instruments Israel.
“Data transfer and processing is moving from cloud to edge environments,” explained Danon. The startup has banked on that bet since its inception, predicting that edge computing would proliferate but remain limited by power consumption constraints.
Hailo is working on the first productization of the Hailo-8, certifying it for automotive grade (ASIL-B at the chip level, ASIL-D at the system level, and AEC-Q100 qualified) and readying its software for production. “But we’re mainly a processor company at the end of the day,” said Danon. “What’s driving the hardware innovation is the software.”
“We’re coming from an architecture play, and it revolves around how the software works. We don’t need to change the properties of silicon or deep learning. We changed the logical design, both for software and hardware.” Over a year of design iterations, the co-founders created their first concept. Its purpose was to close the full loop of a deep learning processor, “which is why we had machine learning people implementing it from day one, working iteratively. The key was to understand what’s different about deep learning compared to general workloads, and realize what’s unique about this problem space. You might enter a function that needs tons of compute power but very little memory, or one that needs very high compute and memory bandwidth, and you have to design a general-purpose execution unit that does well on average. That’s the only strategy to tackle this currently. But with neural networks, the processor is given the structure of the network and knows in advance that it will run multiple times. If you’re distributing the process properly in such a highly parallelized problem, you don’t have to distribute it symmetrically, and you can perform unbalanced processing at the task level. General-purpose computing would struggle to achieve this and would be extremely inefficient. But for this particular fit, you can gain a significant efficiency factor just by using specific parts of the hardware for the specific parts of the problem they fit best.”
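Danon’s point about unbalanced, structure-driven distribution can be illustrated with a toy allocator. The sketch below is hypothetical code, not Hailo’s actual scheduler: it assigns a fixed pool of processing elements to layers in proportion to each layer’s known operation count, the kind of compile-time decision a structure-defined dataflow architecture can make because the network graph is known before execution.

```python
# Toy illustration (not Hailo's actual allocator): because a neural
# network's structure and per-layer cost are known ahead of time,
# compute resources can be partitioned unevenly, once, at compile time.

def allocate(layer_ops, total_pes):
    """Split `total_pes` processing elements across layers in
    proportion to each layer's operation count (minimum of 1 each)."""
    total_ops = sum(layer_ops.values())
    return {name: max(1, round(total_pes * ops / total_ops))
            for name, ops in layer_ops.items()}

# Hypothetical per-layer costs (operations per frame) for a small net.
layers = {"conv1": 236e6, "conv2_x": 670e6, "conv3_x": 520e6, "fc": 4e6}
print(allocate(layers, total_pes=64))
```

The heavy convolution stages receive most of the pool while the tiny fully connected layer gets the minimum — decided once, before execution, rather than time-sliced on one general-purpose core. (Naive rounding can slightly over-subscribe the pool; a real allocator would rebalance.)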
Hailo has raised a total of $26.5 million in private funding from venture capital investors including OurCrowd, Maniv Mobility, Zohar Zisapel (chairman of Hailo), and Glory Ventures. The startup has over ten patents pending on its technology and novel architecture. Its team totals 60 people in its Tel Aviv offices, over 80% of them in R&D, of whom 80% are dedicated to deep learning.
They offer an SDK providing an end-to-end solution for converting a TensorFlow neural network into a binary that executes on the Hailo-8.
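The 8-bit precision cited in the benchmark implies that such a toolchain quantizes the network’s float weights on the way to that binary. Here is a minimal sketch of symmetric 8-bit weight quantization, a standard technique for targeting integer accelerators — hypothetical code, not Hailo’s actual SDK API:

```python
# Symmetric 8-bit weight quantization, as toolchains commonly apply
# before deploying to an integer accelerator. Illustrative only;
# this is not Hailo's SDK API.

def quantize_weights(weights, bits=8):
    """Map float weights onto a symmetric signed integer grid.
    Returns the integer weights and the dequantization scale."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

weights = [0.52, -1.27, 0.003, 0.98]
q, scale = quantize_weights(weights)
print(q)       # integers in [-127, 127]
print(scale)   # multiply back by this to recover approximate floats
```

The accelerator then runs entirely on the integer weights, with the scale applied to recover real-valued outputs.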
By 2024, the edge AI chipset market is forecast to grow to $8 billion, according to ABI Research, while the broader IoT devices market is expected to reach $158 billion over the same period. Industries with IoT integrations, like agriculture, industrial production, smartphones, video security, AR/VR platforms, and smart cities, can all benefit from the Hailo-8’s edge computing capabilities, bringing the power of deep learning to their devices.
With the hardware wars already in play among Intel, Nvidia, Xilinx, and AMD, representing a combined $400 billion in market capitalization, there’s little room for a seat at the table. But as Danon explained, “when you need to close a gap that’s an evolutionary change, the big players will win. When you need a revolutionary change, in the core architecture, core business or core technology, then new players have a chance and resources aren’t a factor.” With the Hailo-8 set for mass production in 2020, the startup is blazing its own path in a new revolution in AI, truly bringing deep learning close to the edge.