This video belongs to the openHPI course clean-IT: Towards Sustainable Digital Technologies. Do you want to see more?
- 00:00 Here I want to introduce our research work on binary neural networks,
- 00:05 and the goal is to make AI training much more energy efficient.
- 00:12 I start with this slide from MIT Technology Review, which compares
- 00:19 the carbon footprint of a round-trip flight
- 00:23 from New York to San Francisco, a human life over one year,
- 00:29 an American life over one year, a US car including fuel
- 00:35 over its average lifetime, and
- 00:38 the energy that is needed to train a large transformer model
- 00:42 with neural architecture search.
- 00:47 Such an AI system needs a huge amount of energy and
- 00:54 produces a very large carbon footprint, even compared to a flight, which also
- 01:02 has a considerable carbon footprint.
- 01:06 So when we imagine that such AI systems
- 01:11 will be applied and used around the world,
- 01:15 then it will not work for our climate. We definitely need
- 01:21 much more energy-efficient AI systems.
- 01:25 Here is an idea we have followed over the last years:
- 01:30 we look at deep learning architectures,
- 01:35 and instead of working with 32-bit architectures
- 01:40 we try to use binary neural networks, that is, networks at the
- 01:48 1-bit level where the computations are done on the bit level.
- 01:53 The state of the art of such AI networks is 32-bit
- 01:59 models where, in the convolutions,
- 02:04 32-bit numbers are operated with each other. We
- 02:09 try to design and train deep neural networks on a binary level
- 02:16 and show that this is possible. And of course it produces large energy savings.
- 02:22 So what is the idea behind such low-bit neural networks?
- 02:27 Here I show it at the 1-bit level, the extreme case: binary
- 02:33 neural networks use only two values, +1 and -1,
- 02:41 for the weights as well as for the inputs,
- 02:45 instead of the 32-bit floating-point numbers that are
- 02:51 used in the state-of-the-art models.
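The binarization just described can be sketched in a few lines of Python (NumPy; the weight values below are made-up illustrations):

```python
import numpy as np

# A hypothetical 32-bit float weight tensor, as produced by normal training.
weights_fp32 = np.array([0.73, -0.12, 0.05, -1.40, 0.00], dtype=np.float32)

def binarize(w):
    """Map each 32-bit weight to one of only two values, +1 or -1.
    By convention, zero is mapped to +1."""
    return np.where(w >= 0, 1.0, -1.0).astype(np.float32)

weights_1bit = binarize(weights_fp32)
print(weights_1bit)  # [ 1. -1.  1. -1.  1.]
```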
- 02:55 Up to 32x model compression and a 58x speed-up
- 03:01 are possible during inference, and more than 1000x energy savings
- 03:08 on dedicated hardware. This has been shown by
- 03:13 the work of colleagues
- 03:16 around the world. The challenge of these low-bit networks: they are wonderful
- 03:23 in using only a small amount of energy compared to the
- 03:27 state-of-the-art networks, but they lose accuracy compared to the 32-bit networks.
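The bit-level computation mentioned above works because, with all values restricted to +1 and -1, multiplying two values reduces to an XNOR of their sign bits, and summing the products reduces to a population count. A minimal sketch (NumPy; the vectors are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
# Binary weights and inputs: every value is +1 or -1.
w = rng.choice([-1, 1], size=n)
x = rng.choice([-1, 1], size=n)

# Encode +1 as bit 1 and -1 as bit 0.
w_bits = (w > 0).astype(np.uint8)
x_bits = (x > 0).astype(np.uint8)

# XNOR of the bit encodings: 1 wherever the signs agree,
# i.e. wherever the product w_i * x_i would be +1.
agree = 1 - (w_bits ^ x_bits)
popcount = int(agree.sum())

# The dot product is recovered from the popcount alone:
# (#agreements) - (#disagreements) = 2 * popcount - n.
dot_from_bits = 2 * popcount - n
assert dot_from_bits == int(np.dot(w, x))
```

On dedicated hardware, XNOR and popcount are far cheaper than 32-bit multiply-accumulate, which is where the reported speed-up and energy savings come from.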
- 03:34 For example, direct binarization of a
- 03:39 network trained on ImageNet (a huge
- 03:45 database of image data) leads to a loss of accuracy of about 10 percent.
- 03:54 With our research work we try to contribute to
- 03:59 improving this accuracy and to
- 04:04 making the loss in accuracy smaller, at best reaching the same accuracy level
- 04:10 as with the 32-bit networks.
- 04:12 So the goal of our ongoing research is to achieve the same accuracy
- 04:17 with binary networks as with traditional convolutional networks.
- 04:25 What would be the result if this research
- 04:29 succeeded? If we were able
- 04:33 to close the gap between the 32-bit convolutional networks and the binary
- 04:40 neural networks, then we could deploy the dedicated hardware
- 04:46 on servers and achieve huge energy savings. But even more,
- 04:53 the networks could run on mobile and embedded devices without
- 04:58 a loss of accuracy, and such mobile and embedded devices need a lot less
- 05:04 energy compared with what is needed by
- 05:09 server infrastructures.
- 05:12 Here are some insights from our research.
- 05:17 What we found in our experiments is that the clipping threshold should be
- 05:23 considered a hyperparameter, and
- 05:26 values between 1.2 and 1.3 lead to better results
- 05:31 than the value 1 that was used in most previous work.
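The talk does not spell out where the clipping threshold enters. A common formulation in binary-network training is the straight-through estimator: the sign function is used in the forward pass, and in the backward pass the gradient is let through only where the input magnitude stays below the threshold. A sketch under that assumption (the function names are ours):

```python
import numpy as np

def binarize_forward(x):
    """Forward pass: the non-differentiable sign function."""
    return np.where(x >= 0, 1.0, -1.0)

def ste_backward(x, grad_out, threshold=1.3):
    """Straight-through estimator backward pass: gradients pass
    only where |x| <= threshold; elsewhere they are clipped to 0.
    The threshold is the hyperparameter discussed above."""
    return grad_out * (np.abs(x) <= threshold)

x = np.array([-2.0, -1.25, -0.5, 0.5, 1.25, 2.0])
g = np.ones_like(x)
print(ste_backward(x, g, threshold=1.0))  # [0. 0. 1. 1. 0. 0.]
print(ste_backward(x, g, threshold=1.3))  # [0. 1. 1. 1. 1. 0.]
```

With the wider threshold of 1.3, inputs such as 1.25 still receive a gradient, which is one plausible reading of why values between 1.2 and 1.3 helped.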
- 05:36 Here are the sources where you can follow and see the
- 05:41 details of this research work.
- 05:44 The scaling of channels after a binary convolution according to
- 05:49 Rastegari et al. can be handled by BatchNorm layers, and we could show
- 05:55 that a tighter approximation of the sign function does not necessarily achieve better results.
- 06:02 And these are the models we are using and experimenting with.
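The claim above about the scaling factors of Rastegari et al. can be checked algebraically: a per-channel scaling applied before an inference-mode BatchNorm can be folded into BatchNorm's own parameters, so a separate scaling step is unnecessary. A small numerical check with made-up numbers:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(size=1000)   # hypothetical binary-convolution output (one channel)
alpha = 0.37                # per-channel scaling factor (XNOR-Net style)
gamma, beta = 1.5, -0.2     # learned BatchNorm affine parameters
mu, sigma = 0.1, 2.0        # fixed (inference-mode) BatchNorm statistics

def bn(x, g, b, m, s):
    """Inference-mode BatchNorm with fixed statistics."""
    return g * (x - m) / s + b

scaled_then_bn = bn(alpha * y, gamma, beta, mu, sigma)
# The same result with the scaling absorbed into BatchNorm's parameters:
folded = bn(y, gamma * alpha, beta, mu / alpha, sigma)
assert np.allclose(scaled_then_bn, folded)
```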
- 06:09 There is BinaryDenseNet, a DenseNet adapted for binary networks
- 06:14 that removes the bottlenecks. Here is the DenseNet with
- 06:19 a bottleneck, here without the bottleneck, and here is the BinaryDenseNet,
- 06:26 our suggestion, which we have been working on and investigating.
- 06:34 If we look at and compare the
- 06:38 accuracy of the different models and consider our BinaryDenseNet
- 06:45 model, then you can see here that we
- 06:51 achieve the best results at the different model-size levels
- 06:59 with the BinaryDenseNet
- 07:02 with different parameters. So it makes sense to work on and to modify
- 07:09 the networks that are used. Another binary
- 07:13 neural network model we are experimenting with is
- 07:18 our MeliusNet.
- 07:21 It uses 1 bit for the weights and inputs, which leads to lower feature quality and capacity,
- 07:29 but the number of bit operations is reduced drastically.
- 07:35 The number of possible values for the weights is reduced from
- 07:41 2^32 to 2.
- 07:44 With 32-bit representations, the values of the weights
- 07:49 can vary over this huge range of different values. When you
- 07:56 work with binary weights, as in MeliusNet, then we have only 2
- 08:02 possible weights. Of course this leads to a quantization error
- 08:08 as well as a lower feature quality.
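The reduction from 2^32 representable values to just 2, and the quantization error it causes, can be illustrated directly (random weights for illustration only):

```python
import numpy as np

# A 32-bit weight can take 2**32 distinct bit patterns; a binary weight only 2.
print(2**32)  # 4294967296 representable values per 32-bit weight
print(2)      # representable values per binary weight (+1 and -1)

# Collapsing a full-precision weight to the nearest of the two values
# introduces a quantization error:
rng = np.random.default_rng(2)
w = rng.normal(scale=0.5, size=10000)           # hypothetical trained weights
w_bin = np.where(w >= 0, 1.0, -1.0)             # binarized weights
print(np.mean((w - w_bin) ** 2))                # mean squared quantization error
```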
- 08:12 The question is, what can we do? How can we deal with this to improve
- 08:18 the quality and to
- 08:22 lower the quantization error?
- 08:25 The value range of the inputs is
- 08:29 similarly reduced, so fine-granular differences can no longer
- 08:33 be expressed in only 1 bit, for example -1 and +1.
- 08:37 And here as well, we have a lower feature capacity.
- 08:41 The idea we are experimenting with is to solve both challenges
- 08:48 through a specific architecture design.
- 08:53 This is exactly the proposal of our MeliusNet,
- 08:58 which is shown here in an overview sketch. We have the dense block
- 09:06 and we have the improvement block, and they are connected and
- 09:12 designed in this way. In the dense block the feature capacity is increased,
- 09:19 and in the improvement block the feature quality is increased.
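The division of labor between the two blocks can be sketched at the level of tensor shapes (NumPy; the channel counts are hypothetical, and the real blocks use binary convolutions rather than the placeholder values here):

```python
import numpy as np

def dense_block(features, new_channels=64):
    """Concatenate freshly computed channels -> feature CAPACITY grows.
    (A real dense block computes them with a binary convolution;
    here we use placeholder values just to track the shapes.)"""
    n, c, h, w = features.shape
    new = np.ones((n, new_channels, h, w))
    return np.concatenate([features, new], axis=1)

def improvement_block(features, last_channels=64):
    """Add a residual onto the newest channels -> feature QUALITY
    improves while the channel count stays the same."""
    out = features.copy()
    out[:, -last_channels:] += np.ones_like(out[:, -last_channels:])
    return out

x = np.zeros((1, 128, 14, 14))  # batch of 1, 128 channels, 14x14 feature map
x = dense_block(x)              # 128 -> 192 channels (capacity up)
x = improvement_block(x)        # still 192 channels (quality up)
print(x.shape)                  # (1, 192, 14, 14)
```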
- 09:26 If we make these models more fine-granular, you can see here
- 09:32 how they are designed and how we work with
- 09:37 different blocks and different transition fields
- 09:41 to make such networks work with
- 09:47 only the two values, and to make them more
- 09:52 accurate and provide more quality.
- 09:57 If we compare our network, our architecture, with others,
- 10:04 then we see that MeliusNet performs
- 10:11 very well with respect to the number of operations and the
- 10:15 model size, and it shows
- 10:19 good results. So it makes sense to vary, to modify,
- 10:26 to think about and to design new network models,
- 10:31 new binary network models, for these AI applications.
- 10:36 It is not the case that the
- 10:40 32-bit networks are so much better in quality than such kinds
- 10:47 of networks. If you are interested in the details, here I could
- 10:50 only give an overview. Finally, there is
- 10:55 BMXNet 2, an open-source framework we designed
- 11:03 for binary neural networks. BMXNet 2 is based on MXNet,
- 11:10 which was introduced before. It contains reusable models and demos
- 11:17 to provide a strong basis for research and industry, and it
- 11:20 can be used to find new network architectures and test new ideas.
- 11:26 Here is a link where you can also find
- 11:32 showcases and demo applications for this BMXNet tool.
- 11:37 There is a demo app
- 11:40 for image classification, working on the ImageNet database,
- 11:48 which is based on ResNet, and a human pose detection demo
- 11:54 for a Raspberry Pi; this is a small device which only
- 12:01 needs a small amount of energy, and also here there
- 12:07 is a link to find this. We are interested in interacting with you
- 12:13 about such ideas, and if you have further improvements,
- 12:19 you are invited to bring them into our
- 12:24 clean-IT forum to share this knowledge with other research groups, with the goal
- 12:31 of decreasing the carbon footprint of such AI models
- 12:38 and applications.
About this video
While the best AI systems train neural networks based on 32-bit algorithms, the procedure can also be carried out with “binary neural networks” (1-bit algorithms). This drastically reduces the effort in the individual calculation steps and immediately leads to energy savings by a factor of 20. Although binary neural networks are currently about 5 % less accurate than those of the AI systems of global players, the reduction can cut electricity usage by 95 %. With AI applications being used millions of times a day, total emissions can be decreased significantly. More information...
Prof. Dr. Christoph Meinel is CEO and Scientific Director of the Hasso Plattner Institute for Digital Engineering (HPI) as well as Dean of the Digital Engineering Faculty at the University of Potsdam. He holds the chair of Internet Technologies and Systems and teaches courses on IT Systems Engineering on the MOOC platform openHPI and at the HPI School of Design Thinking, where he is also scientifically active in research. He is engaged in the fields of cybersecurity and digital education, has developed the MOOC platform openHPI.de, and supervises numerous Ph.D. students.