This video belongs to the openHPI course clean-IT: Towards Sustainable Digital Technologies. Do you want to see more?
- 00:05 Welcome, as I said already, to the last openXchange of 2022,
- 00:10 and today we are very pleased to have the wonderful Christian Plessl here,
- 00:16 who is going to talk about the topic of approximate computing.
- 00:20 But first I'm going to tell you a little bit more about his work.
- 00:25 He received a doctorate in computer engineering at ETH Zurich;
- 00:32 after that he became full professor for high-performance computing
- 00:37 at the University of Paderborn,
- 00:40 and he is also currently the director of the Paderborn Center for Parallel Computing.
- 00:47 In his career he also set a world record this year:
- 00:54 his was the first research group worldwide to achieve exaflop performance
- 00:58 by simulating the SARS-CoV-2 spike protein.
- 01:02 Today he will talk, as I already said, about the
- 01:05 beauty of approximate computing and how it can help us
- 01:10 go forward in this world, and how that is connected to sustainability as well.
- 01:15 With that I give the stage to you,
- 01:19 Christian, and please
- 01:22 share your slides.
- 01:26 Thanks for having me here today. I'm very happy to be
- 01:32 invited to this symposium, because
- 01:36 this is a topic that is close to my heart:
- 01:38 the combination of computing and sustainability and efficiency
- 01:43 is exactly what the topic of approximate computing is about, at
- 01:47 least in my interpretation. In the next
- 01:51 15 minutes I will give you a broad overview of my take
- 01:55 on the topic, and then I will be happy to share
- 02:00 additional comments and discuss the topic with you.
- 02:06 Let me start by outlining the challenge that we are currently facing
- 02:10 in the world of computing. As you have seen from the introduction,
- 02:15 I'm working in high-performance computing. I
- 02:18 have not always worked in high-performance computing; I
- 02:20 also have a background in embedded systems, where energy
- 02:24 efficiency and power consumption have been very important topics
- 02:28 of concern for a very, very long time. But in recent years we
- 02:31 have also seen these topics becoming more relevant and
- 02:35 of utmost importance, for economic and ecological reasons, also in
- 02:40 high-performance computing and data centers.
- 02:43 This is a chart that you frequently see in talks
- 02:47 when someone talks about computing. What you see here is
- 02:50 the performance development for high-performance computing centers, or
- 02:55 supercomputers more generally. On the axis you see a
- 02:59 time scale from 1994 to the year 2021.
- 03:03 What is shown here is the development of the performance
- 03:06 in the so-called TOP500 list,
- 03:10 which is a very well known
- 03:13 list, a world ranking of the fastest supercomputers.
- 03:19 For comparing the supercomputers, a specific benchmark
- 03:23 application is executed, which is the High-Performance Linpack
- 03:27 application. This is a way of solving a linear equation system,
- 03:32 a very large one, with a supercomputer,
- 03:36 and this requires a certain number of arithmetic operations.
- 03:43 Dividing the number of operations by the time in which they have
- 03:46 been executed, by the time used for solving this special problem,
- 03:50 gives you the number of floating-point operations, flops, per second.
- 03:56 What you see here is the development in this list. In the
- 04:00 first iteration of this list, which was published in 1994
- 04:03 to rank supercomputers, the systems all together
- 04:08 achieved about 1 teraflops per second,
- 04:12 this is the aggregated sum, while the system that achieved
- 04:17 422 megaflops, or 422
- 04:21 million floating-point operations per second,
- 04:24 was the last system on the list,
- 04:29 number 500, not number 1. The list ranks the supercomputers according
- 04:33 to the benchmark and gives them ranks from 1 to 500.
- 04:36 On this list you see three things: the blue line is the aggregated
- 04:40 sum of the performance of all of the computers, of these five
- 04:42 hundred computers on the list; the red one is the performance
- 04:46 of the most powerful system on the list; and the number 500,
- 04:51 the orange line, is the one that just made it onto the list of
- 04:56 the top 500 most capable supercomputers.
- 04:59 So in 1994 the lowest-ranked system
- 05:04 had 422 megaflops, the highest-ranked had
- 05:08 about 60 gigaflops, and in aggregate, summing up all of
- 05:12 the performance of these five hundred systems, you reach 1.17
- 05:16 teraflops. Now, when you follow this development
- 05:20 over time, you see on this logarithmic scale here on the left
- 05:25 the expected exponential increase of performance that is also
- 05:30 predicted by Moore's law.
- 05:32 This applies to all of these curves: the number one system
- 05:37 followed this trend of exponential growth, and so did the
- 05:40 number five hundred system, and also the sum
- 05:44 of the aggregated performance follows roughly this trend. But
- 05:47 you also see here that around 2013
- 05:51 there was a change in the trend. You see that
- 05:55 it is slowing down; we are still on an exponential growth path,
- 05:59 but the rate of the growth has slowed down, around
- 06:05 ten years ago.
- 06:10 But this growth really comes at a cost. If you look
- 06:17 at this slide, you think everything is going well except for
- 06:19 a slight slowdown, and the performance is still
- 06:22 increasing at an exponential rate. However, we also need to consider the power
- 06:29 that was used to achieve these results, and this is what
- 06:34 you see here: for the top ten systems, how much power these systems
- 06:39 are consuming. In 2008 such a system consumed about two megawatts
- 06:46 of power, and today these top ten systems require roughly
- 06:52 ten megawatts of
- 06:56 power. This is for the fifty
- 06:58 most powerful systems of the world, and this is for the five
- 07:02 hundred, so you see they follow similar trends. But what you see here,
- 07:06 in particular at the top, is that over
- 07:09 ten years the power consumption increased by a factor of eight,
- 07:13 for the top fifty by four, and for the top five hundred by two. So you have an exponential
- 07:18 increase in performance but also a linear growth in the power
- 07:21 consumption. That means our systems have become much more
- 07:25 powerful, but we are also investing more and more power to operate the systems.
- 07:30 What we also do not see in these charts is
- 07:33 that we are investing more and more money to install these systems,
- 07:37 so they are also very expensive, because as Moore's law is slowing down
- 07:42 the systems become larger and larger, and that requires a lot
- 07:45 of investment.
- 07:48 The problem of energy consumption, or increasing power, is not
- 07:54 only of relevance for high-performance computing; the
- 07:58 same story is also true for commercial data centers. This
- 08:01 is data from a few years ago
- 08:05 about Frankfurt. The data centres that are located in Frankfurt
- 08:10 take about nineteen percent of the total power consumption
- 08:13 in Frankfurt, and this is about the same quantity as the airport
- 08:17 in Frankfurt, which is one of the largest passenger airports;
- 08:21 I believe it handles millions of passengers per year.
- 08:26 There are about eighty thousand people working
- 08:31 at Frankfurt airport, so it's a really massive operation.
- 08:35 So it's very clear that the energy consumption of data centers has
- 08:39 attracted a lot of concern, and there are also some
- 08:43 very strong movements to improve energy efficiency, and
- 08:46 recently also more legislation to make sure that the energy
- 08:51 is used in a more appropriate way.
- 08:56 I mentioned already Moore's law, which essentially
- 09:02 says that semiconductor technology was for a long time driving
- 09:07 this development, and we have become used to it. Moore's law predicted that the
- 09:12 transistor density at minimum cost, so it's not a technological
- 09:17 but more an economical law, doubles about every twenty-four months.
- 09:22 Moore's law means we get more transistors, but
- 09:27 it went hand in hand with so-called Dennard scaling, which says that
- 09:31 we do not only get more transistors, but they also become more efficient:
- 09:37 the electrical power dissipation remains constant
- 09:40 when we double the transistor density, and they even run
- 09:45 a bit faster, by forty percent. So that means we do not only
- 09:48 get more transistors, but they also become more efficient and faster,
- 09:53 and this was fueling this whole industry
- 09:56 of computing and was a guiding principle driving this
- 10:02 exponential increase in performance. However, as we know,
- 10:06 every exponential growth must come to an end eventually. There are two main
- 10:11 kinds of reasons why Moore's law must come to an end. One is technological
- 10:18 reasons, for instance lithography, reliability, leakage and power
- 10:21 density: we are getting to such small structure sizes
- 10:25 that the size of the atoms is becoming an important factor
- 10:29 that will limit us at some point. And there are also economical
- 10:33 factors: if we produce these very advanced semiconductors
- 10:37 that become smaller and smaller, the
- 10:40 production yield is reduced and the cost of semiconductor production increases.
- 10:46 Moore's law has been predicted many times to die, for either
- 10:51 economical or technological reasons. You see here a
- 10:54 very nice chart from The Economist
- 10:57 from 2016 that
- 11:01 shows these projections. In this chart, this axis is when
- 11:05 the prediction was made, and this is when the end of Moore's law was
- 11:08 predicted. So Gordon Moore in 1995 predicted that
- 11:12 Moore's law would end around 2005, for economical
- 11:16 reasons, and later he made new projections: in
- 11:20 2005 he made another prediction, that it would continue
- 11:28 into the range of 2015 to 2025. So you see, it has been
- 11:33 predicted to end many times, by different people who are also
- 11:38 very smart. So far it is still going
- 11:42 strong; we see some slowdown, but it's not at its end.
- 11:45 But at some point we will reach the end of Moore's law.
- 11:53 If we look now at how semiconductor technology is developing, what
- 11:58 we see is that
- 12:01 we can no longer rely on the circuit technology alone.
- 12:05 Historically, the semiconductor technology
- 12:11 was a reliable driver for performance and efficiency, with
- 12:14 more transistors that were more efficient, and this is what
- 12:16 we can call bottom-up innovation. But still, we don't
- 12:21 see any post-silicon technologies available on the horizon that really work.
- 12:26 So what we need to do, until we get to post-silicon
- 12:30 devices, is innovation at the top. We need to
- 12:33 live with current CMOS semiconductor technology and try to
- 12:37 get more efficiency and performance until these devices are available.
- 12:43 There are two approaches which I personally find very
- 12:45 promising. One is approximate computing, and this goes very well
- 12:50 hand in hand with cross-layer optimization and co-design.
- 12:54 In the next couple of slides I will give you some intuition
- 12:58 of what this approximate computing is and then show some perspectives
- 13:01 on where it can lead us.
- 13:04 So, what is approximate computing?
- 13:06 If you look at the design of computing systems, you will recognize
- 13:10 that there are some implicit assumptions and design goals when
- 13:14 we design computing systems. One of these assumptions is
- 13:18 correctness: as illustrated here by this pocket calculator,
- 13:21 we have here the calculation of three times three, and we expect
- 13:25 this to be nine and not seven.
- 13:29 The other assumption that we have inherently is reliability:
- 13:32 we want a system that always works in a reliable
- 13:38 way, not one that sometimes works and sometimes doesn't. This also
- 13:42 requires us to make certain compromises or safeguards to ensure this reliable operation.
- 13:49 We also want our systems to be precise. Of course, by nature,
- 13:54 any digital representation of numbers must be an
- 13:58 approximation, be it with integer numbers or floating-point numbers,
- 14:01 but still the implicit assumption is that we want to have the computation
- 14:06 as precise as possible, as here for this number pi.
- 14:10 And finally we want to have determinism: if we do the
- 14:13 same computation for pi again, we want the same number
- 14:16 as the result; we don't want a variance in results.
- 14:20 The question is now: what can we gain
- 14:23 if we relax these assumptions or give them up?
- 14:27 Is there anything to be gained in performance and efficiency,
- 14:31 and if so, what are the conditions and situations in which we can afford to
- 14:36 relax these assumptions?
- 14:39 This brings us straight to the topic of approximate computing. The
- 14:43 basic idea of approximate computing is to say: well,
- 14:47 if we sacrifice precision or correctness or reliability,
- 14:53 can we gain something in performance and energy efficiency?
- 14:56 And the answer obviously is yes.
- 14:59 The question that we need to find out is: in what circumstances
- 15:03 can we give up on precision, correctness and/or reliability,
- 15:06 and how can we exploit this fact for creating systems
- 15:11 that are more energy efficient?
- 15:14 Now, you would assume that there are certain applications for
- 15:17 which these parameters, precision, correctness and reliability, are of utmost
- 15:22 importance, and this is why we design our systems to have them.
- 15:25 But there are many applications and application domains where
- 15:29 we don't really need
- 15:31 this, for instance when we do audio, video and graphics processing.
- 15:36 You see here on the top right a video that has been compressed;
- 15:39 you see some compression artifacts, to exaggerate the effect,
- 15:43 but what you see is just a visual impression, and our human
- 15:46 perception is quite inaccurate anyway, so we are tolerant to certain
- 15:52 fluctuations and approximations.
- 15:56 Similarly, when we process external signals that come from sensors,
- 16:00 there is also some noise in the signals, so there is probably no need
- 16:03 to compute on the signals with double-precision floating-point accuracy.
- 16:07 And finally there are some application areas, in particular
- 16:11 in the area of machine learning, where we don't have a true
- 16:14 golden answer available, but we are working on, for instance, classification
- 16:19 problems like image classification, and since the methods
- 16:23 that we are implementing are themselves an approximation,
- 16:26 for instance of human image recognition, we can also use approximation
- 16:31 in the implementation, of neural networks for instance.
- 16:35 This idea of approximation can be implemented at many different
- 16:38 levels. One level at which it can be implemented is the application level,
- 16:43 or the algorithm or library level: we modify the application
- 16:46 or libraries so that they are inherently approximate
- 16:50 and exploit the fact that we don't need precise computation.
- 16:58 Another possibility is to implement this at the computer architecture level.
- 17:02 For instance, instead of a functional unit computing an accurate multiplication,
- 17:08 we could, if the ranges of the numbers are very constrained,
- 17:12 have a multiplier that computes an approximation of
- 17:15 the multiplication. Or we can even go down to the circuit level,
- 17:18 where we work with voltage overscaling
- 17:22 or frequency overscaling and operate the system at a level
- 17:25 where we have reduced guard bands and introduce some stochastic errors, but the
- 17:29 computation is much more efficient.
- 17:32 The focus of my research is to leverage approximate
- 17:35 computing to accelerate high-performance computing applications,
- 17:38 in particular with an eye on making efficient use of accelerators
- 17:42 such as FPGAs or GPUs.
- 17:45 Now, here is a slide that shows you
- 17:50 why we want to work with reduced precision and what we can
- 17:54 gain in area and in energy. For instance, what you see here is: if we
- 17:58 reduce the precision from a 32-bit adder to an 8-bit adder,
- 18:03 we gain about a factor of three in relative energy cost.
- 18:07 And, more interesting, when we move from a 32-bit floating-point
- 18:12 multiplication, which in this technology requires 3.1 picojoules,
- 18:17 to an 8-bit integer multiplication,
- 18:22 we see that we achieve a
- 18:25 reduction of nineteen in terms of energy and of twenty-seven
- 18:31 in terms of area. That means we can have massive
- 18:35 reductions in chip size and also in energy consumption if
- 18:39 we can move to lower-precision computing, and here we only
- 18:43 talk about different data types, not even about stronger forms of approximation.
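To make these savings concrete, here is a minimal Python sketch of the kind of reduced-precision arithmetic the slide refers to: when the value range is known and narrow, a 32-bit floating-point multiplication can be replaced by a cheap 8-bit integer multiplication plus a scale factor. The values and the scale below are made up purely for illustration; they are not from the talk.

```python
import numpy as np

def quantize_int8(x, scale):
    """Map a float array onto an int8 grid with a fixed scale (assumes |x| <= 127 * scale)."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

# Two operands with a known, narrow value range (hypothetical numbers)
a = np.array([0.31, -0.82, 0.05], dtype=np.float32)
b = np.array([0.44, 0.12, -0.91], dtype=np.float32)

scale = 0.01                                   # chosen from the known value range
qa, qb = quantize_int8(a, scale), quantize_int8(b, scale)

# Multiply in cheap integer arithmetic (widened to avoid overflow), rescale afterwards
approx = qa.astype(np.int32) * qb.astype(np.int32) * (scale * scale)

exact = a * b                                  # reference 32-bit floating-point result
print(exact, approx, np.abs(exact - approx))   # the gap is the quantization error
```

The integer product itself is exact; the approximation error comes entirely from rounding the operands onto the 8-bit grid, which is acceptable whenever the application tolerates that quantization.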
- 18:49 One example from our research where we exploit this fact is
- 18:54 using lower, mixed precision in scientific computing, for computing operations on matrices.
- 19:01 In this case, what we studied in this work is: we compute the
- 19:05 inverse p-th root of a matrix. For instance, this could be
- 19:11 the inverse of the matrix, but also other
- 19:15 inverse p-th roots.
- 19:18 For computing this operation, for the matrix inversion,
- 19:23 if p equals one then this is just a matrix inversion, there is
- 19:26 a method that has been proposed by Bini et al., which is an iterative
- 19:30 method: we just take the matrix and use this iterative equation here
- 19:37 to compute the inverse of the matrix step by step. And then,
- 19:40 in each step, we can compute the error
- 19:45 between the precise result and the approximate result, and
- 19:48 this is shown here on the right.
- 19:50 What we did here is: if we compute this for a matrix A with
- 19:56 single-precision floating point, with 22 bits of mantissa here,
- 20:00 then we see here on the left-hand side the error of the computation,
- 20:04 and in every step of this iteration equation we see, with
- 20:09 increasing iterations, the error being reduced, until, after
- 20:14 approximately seven iterations, we converge to a minimum
- 20:17 error, which is then an approximation of the inverse of the matrix.
- 20:22 Now it is interesting: if we compute with reduced precision,
- 20:25 for instance with an 18-bit mantissa, or fourteen or ten bits,
- 20:30 we see that initially the convergence behavior of this equation
- 20:33 is completely the same, and then the convergence stops when
- 20:37 we reach a limit that is given by the available mantissa,
- 20:42 so the floating-point representation doesn't
- 20:45 give us more precision to converge to a better degree.
- 20:49 So what we can see here is that this iterative method allows us to
- 20:54 approximate a value, here of the inverse matrix, and what
- 20:59 we could do, for instance, for approximate computing, you
- 21:02 see here: we could do the first approximations, say until
- 21:07 iteration four or five, with a low precision,
- 21:12 for instance here with 14-bit precision, and then change
- 21:16 to 22 bits of precision and still converge to the
- 21:22 very low error margin, if this is required for the application,
- 21:25 or stop earlier and stick with this error
- 21:28 margin if this is acceptable for our application.
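As an illustration of this precision-switching idea, here is a small Python/NumPy sketch. It uses the classical Newton-Schulz iteration for the plain inverse (the p = 1 case); the talk's actual work uses the inverse p-th root iteration of Bini et al. with custom 10- to 22-bit mantissa formats, whereas this sketch simply lets float16 stand in for the cheap low-precision phase and float64 for the final refinement.

```python
import numpy as np

def mixed_precision_inverse(a, steps_low=5, steps_high=3, low=np.float16, high=np.float64):
    """Approximate inv(a) with the Newton-Schulz iteration X <- X (2I - A X).
    The first iterations run in cheap low precision; once the error stalls at the
    low-precision mantissa limit, we switch to high precision and keep iterating."""
    n = a.shape[0]
    # Safe starting guess X0 = A^T / (||A||_1 * ||A||_inf) guarantees convergence
    x = (a.T / (np.linalg.norm(a, 1) * np.linalg.norm(a, np.inf))).astype(low)
    a_low, eye_low = a.astype(low), np.eye(n, dtype=low)
    for _ in range(steps_low):                     # low-precision phase
        x = x @ (2 * eye_low - a_low @ x)
    x = x.astype(high)                             # precision switch
    a_high, eye = a.astype(high), np.eye(n, dtype=high)
    for _ in range(steps_high):                    # refinement phase
        x = x @ (2 * eye - a_high @ x)
    return x

rng = np.random.default_rng(0)
a = np.eye(50) + 0.01 * rng.standard_normal((50, 50))   # well-conditioned test matrix
x = mixed_precision_inverse(a)
print(np.linalg.norm(a @ x - np.eye(50)))               # residual near double precision
```

In the same spirit as the plot in the talk, the cheap iterations do most of the convergence work, and only the last few iterations pay for full precision.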
- 21:34 With the second and last application I would like to give you an
- 21:37 idea of a very different way of approximation that is not related to
- 21:43 just numerical approximation; in this case we make an algorithmic approximation.
- 21:50 What we do here is: we compute a so-called matrix function,
- 21:53 so it is a function that we compute of a matrix;
- 21:58 that could be for instance the inverse, or a root, or another
- 22:02 function that takes a matrix and computes
- 22:06 another matrix.
- 22:10 What we have developed is a method which we denote
- 22:12 as the submatrix method, and I will explain in a bit how it works.
- 22:17 The main idea is that we have this matrix A here, and then
- 22:21 we apply a function to the matrix and compute an approximation of the function.
- 22:27 How the method works is: it looks at this matrix A,
- 22:31 which, and this is now important, is a sparse matrix, and what we do is
- 22:36 we convert the evaluation of the function on a very large sparse matrix into
- 22:43 a very parallel execution, applying the same function
- 22:48 to very small dense matrices. How we do this is: from this very large
- 22:54 matrix A we go column by column, and in each column we take the
- 22:59 non-zero elements of the column and the corresponding non-zero
- 23:04 rows of this matrix, which is symmetric.
- 23:07 So from the first column we build the submatrix here;
- 23:11 from the second column we take the non-zero entries, so we take
- 23:16 here the first, the second and the sixth value, and take the first,
- 23:23 the second and the sixth column of the matrix to build the submatrix.
- 23:28 Then we can individually apply the function to the submatrix,
- 23:34 which is now a dense, or much denser, matrix,
- 23:38 and then we can take the corresponding column of this result
- 23:41 and fill it in, in the same structure as the original matrix.
- 23:46 The benefit of this method is that we can work here on the very
- 23:51 large sparse matrix and treat this problem as a massively parallel
- 23:56 problem that works on a very large number of smaller matrices which are dense.
- 24:01 This allows us, on one hand, to have a very parallel problem,
- 24:07 and we can see that the subproblems are completely independent from each other,
- 24:12 and we have shown by theory and practical evaluation that
- 24:17 computing the matrix function in this way leads to acceptable
- 24:22 errors for algorithms in computational chemistry.
- 24:30 Now, this is a form of approximation that works very well for
- 24:36 being included in ab initio molecular dynamics applications,
- 24:40 and additionally, through this kind of submatrix approach,
- 24:42 we can also do similar operations as I have shown before:
- 24:47 on each of the submatrices we can also use these iterative methods to gradually
- 24:53 compute these functions.
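A compact Python/SciPy sketch of the column-by-column construction described above follows. It is only an illustration of the idea, not the production implementation from the talk: it assumes a symmetric sparse matrix with a non-zero diagonal, processes the columns serially (in practice each column is an independent, massively parallel task), and uses the matrix exponential as an arbitrary stand-in for the matrix function.

```python
import numpy as np
from scipy.sparse import csc_matrix, identity, random as sparse_random
from scipy.linalg import expm   # stand-in dense matrix function

def submatrix_apply(a_sparse, dense_func):
    """Approximate f(A) for a sparse symmetric A: for every column, gather the rows
    with non-zero entries, evaluate f on the small dense principal submatrix, and
    copy the relevant column of that result back into the sparsity pattern of A."""
    a = csc_matrix(a_sparse)
    n = a.shape[0]
    result = np.zeros((n, n))
    for i in range(n):                              # each column is an independent task
        idx = np.sort(a[:, i].nonzero()[0])         # non-zero rows of column i
        sub = a[idx, :][:, idx].toarray()           # small dense principal submatrix
        block = dense_func(sub)                     # evaluate f on the dense block
        pos = np.searchsorted(idx, i)               # position of column i inside the block
        result[idx, i] = block[:, pos]              # scatter that column back
    return result

# Tiny demo on a random sparse symmetric matrix (purely illustrative)
m = sparse_random(30, 30, density=0.1, random_state=0)
a = csc_matrix(0.05 * (m + m.T) + identity(30))     # symmetric, non-zero diagonal
approx = submatrix_apply(a, expm)
exact = expm(a.toarray())
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))   # relative error of the method
```

Each loop iteration touches only one small dense block, which is exactly the shape of work that GPUs with low-precision units handle well; the error relative to the exact matrix function is what the method trades for this parallelism.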
- 24:55 I wanted to show an animation, which unfortunately doesn't
- 24:58 work here, but the punch line of this algorithm is that
- 25:03 we are able to treat the errors that we introduce
- 25:09 as noise on our accurate
- 25:13 simulation results, and we have found a way, using this
- 25:18 Langevin-type equation, to perfectly compensate this noise when we compute
- 25:24 average values of observables from these molecular simulations.
- 25:33 Here you see this also in this molecular simulation: you
- 25:36 see the atoms, and they are moving slightly left and right,
- 25:39 and the idea is, if you are interested in finding the average distance
- 25:43 between two molecules, which is shown here with this value,
- 25:47 you don't need to know where each molecule is moving exactly,
- 25:50 and you don't need to have the trajectories of each of the
- 25:52 molecules, but you only want to have an ensemble average
- 25:56 of the distance between two atoms. So since we introduce some
- 26:00 approximations in the computations that can be treated as white noise,
- 26:04 these errors, or this white noise, can be
- 26:06 perfectly cancelled out, and although we are doing approximate computations
- 26:11 we can still accurately compute, for instance here, the distance between
- 26:17 two atoms.
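The statistical intuition can be illustrated in a few lines: if the approximation error on each sample of an observable behaves like zero-mean white noise, it largely cancels in the ensemble average. The numbers below are made up for illustration and are unrelated to the actual simulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical exact values of an observable (e.g. an interatomic distance) along a trajectory
exact = 3.0 + 0.2 * np.sin(np.linspace(0.0, 20.0, 100_000))

# The approximate computation perturbs every sample with zero-mean, white-noise-like error
approx = exact + rng.normal(0.0, 0.05, size=exact.shape)

print(exact.mean())                          # reference ensemble average
print(approx.mean())                         # average over the noisy, approximate samples
print(abs(exact.mean() - approx.mean()))     # the gap shrinks roughly like 1/sqrt(N)
```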
- 26:20 This brings me to the conclusion of my talk.
- 26:23 Approximate computing is not one specific
- 26:28 technique; it is more a paradigm or concept where we say:
- 26:32 we willingly accept inaccuracies in the evaluation of scientific algorithms,
- 26:38 and by accepting these inaccuracies we get instead an improved
- 26:44 energy efficiency. And we can apply this concept at many different
- 26:47 levels of our systems. We can apply it at the application level,
- 26:52 by using problem-specific quality requirements on the result;
- 26:56 we can apply it at the algorithm level, for instance by using
- 26:58 approximation-tolerant algorithms or lower, mixed precision, which is what
- 27:02 I just showed in this talk.
- 27:04 There are possibilities to apply approximate computing at the software
- 27:08 level; there has been work on approximation-aware compilers and runtime systems.
- 27:13 We can build approximate arithmetic units; in this talk I only talked about
- 27:18 using
- 27:22 precise arithmetic with different bit widths and data representations,
- 27:26 but we can have more crude
- 27:28 approximations of arithmetic computations. And finally we
- 27:32 can go down to the circuit level and work with techniques
- 27:36 like dynamic voltage scaling and frequency overscaling, where
- 27:39 we reduce the operating voltage of the circuit so low
- 27:44 that we introduce some stochastic errors. This gives us much
- 27:48 improved energy efficiency, but we introduce some errors that
- 27:52 then also show up as noise in the results.
- 27:56 With that, I hope I gave you some idea of how approximate computing
- 27:59 can be seen as a paradigm for improving efficiency and performance.
- 28:02 It is very broadly applicable, and it allows us to do this
- 28:05 innovation at the top: we don't need any disruptive technology
- 28:09 or post-silicon technology, but essentially what we do is we
- 28:13 take the existing algorithms, or maybe take
- 28:18 the existing technologies and apply new algorithms, to make
- 28:22 better use of what we currently have in the hardware. Of course,
- 28:25 we can also tailor the hardware, and this brings it back
- 28:28 to my research in field-programmable gate arrays, where you
- 28:31 can tailor the execution architecture. You can even go beyond
- 28:35 that and build custom arithmetic units and highly tailored
- 28:39 architectures that work cross-layer.
- 28:42 With that I would like to conclude, and I will be very happy to
- 28:45 take any questions that you may have.
- 28:50 Thank you, Christian, that was a very interesting talk in my opinion, and
- 28:56 yeah, we're really happy to take any questions right now from the audience,
- 29:02 so you can write them in the chat or raise your hand to show
- 29:06 that you want to speak up, and other than that I can also
- 29:11 add some questions of my own. And I think we already had the
- 29:14 first question, from Sebastian Heckler,
- 29:19 which I will just read out:
- 29:21 is there an investigation to bring the submatrix method into hardware,
- 29:26 or something
- 29:28 that provides guaranteed accuracy?
- 29:34 I'm
- 29:43 not sure what is meant by bringing the submatrix method to hardware. Okay, there are
- 29:46 two things: can we bring it to the hardware, and the other is, is there guaranteed accuracy?
- 29:51 I haven't talked about
- 29:53 this in so much detail, but in the introduction it was mentioned
- 29:56 that we had this work on a very large scale simulation of
- 30:00 a molecular system. We
- 30:03 worked on the coronavirus, the COVID-19 virus, and also on the HIV virus, to
- 30:08 show the limits of what can be done with molecular, or
- 30:11 atomistic, simulations of viruses. Here our target from
- 30:16 the beginning was GPUs, because GPUs today are very
- 30:22 highly tailored for machine learning. That means instead
- 30:25 of having double-precision arithmetic there is single-precision
- 30:28 and half-precision and even 8-bit arithmetic in today's
- 30:32 GPUs, so they are optimized for dense linear algebra.
- 30:37 So the whole idea behind the submatrix method, as we used it
- 30:41 for this application, is to
- 30:43 translate this problem that works on super large sparse matrices
- 30:49 into a problem that works on dense matrices which are much
- 30:53 smaller, in the sweet spot of the GPU. This is exactly what we do: we
- 30:58 have sparse matrices that are many millions by millions
- 31:02 of entries, and we cut them down into pieces that are like a thousand by a thousand,
- 31:07 and we can operate on them in mixed-precision computing, with
- 31:11 mixed 16- and 32-bit floating-point precision. This
- 31:14 is exactly the sweet spot of the GPU,
- 31:17 and all of these computations are completely independent from
- 31:20 each other, so we can scale our problem to massive sizes,
- 31:24 and this is what we did, on the most powerful GPU
- 31:27 system at that time,
- 31:29 at NERSC. So it's completely tailored to the GPU, because we know for
- 31:33 sure deep learning is driving low-precision, mixed-
- 31:36 precision hardware, and we try to make good use of this hardware
- 31:39 for other purposes than machine learning.
- 31:42 The second part of the question, if I understood it correctly, is: is
- 31:46 there guaranteed accuracy? In this case, no.
- 31:50 We cannot guarantee an accuracy. Of course it depends on
- 31:53 the operations, how we compute the operations on the
- 31:58 submatrices. There we have seen, for instance for this inverse root problem, that
- 32:03 we have some convergence behavior, and this is also mathematically proven.
- 32:07 The difficult part is the method per se, which
- 32:11 cuts down the problem of working on a very large sparse matrix
- 32:15 into a large sequence of independent submatrices. Of course we
- 32:19 introduce an error by that, because
- 32:23 you cannot just decompose the problem like that, right? So there is some
- 32:27 error that you introduce. For the area that we are working on, we are
- 32:31 working on specific matrices called Hamiltonians, in computational chemistry,
- 32:36 and there this error doesn't affect us so badly, and we can also
- 32:40 perfectly compensate it in our sample averages. But this is not
- 32:44 a technique that works for any use case; it is really application
- 32:47 dependent.
- 32:53 That answers the question, and we got another one, which is:
- 32:58 you mentioned sacrificing determinism; without further drilling into examples,
- 33:03 how much potential do you see in non-deterministic computing? Yeah, there is that.
- 33:08 Yeah, so what I meant, where I see potential, is in this
- 33:12 voltage overscaling or frequency overscaling, meaning
- 33:16 reducing the supply voltage to a level where you start introducing statistical
- 33:23 malfunctions or errors in the computation. This is one way how
- 33:28 you can have non-deterministic results, because the guard bands become lower.
- 33:32 There is also another
- 33:35 field of approximate computing that has been proposed, which is
- 33:39 probabilistic computing, where you compute on, for instance,
- 33:43 binary bit streams, and there is a whole field of how you can
- 33:47 do arithmetic on binary bit streams,
- 33:50 which is very convenient because, if
- 33:52 you have just two bit streams and you need only very simple logic operations,
- 33:57 the operations can be extremely efficient.
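As a side note, the textbook example of this bit-stream arithmetic (often called stochastic computing) is that multiplying two values encoded as random bit streams reduces to a bitwise AND. The following Python sketch is a generic illustration of that idea, not something taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, n):
    """Encode a value p in [0, 1] as a random bit stream with P(bit = 1) = p."""
    return (rng.random(n) < p).astype(np.uint8)

a, b, n = 0.3, 0.6, 100_000
stream_a, stream_b = to_stream(a, n), to_stream(b, n)

product_stream = stream_a & stream_b   # multiplying the encoded values is a bitwise AND
print(product_stream.mean(), a * b)    # ~0.18 up to statistical noise, versus the exact 0.18
```

The price is exactly the overhead mentioned next in the discussion: a very long bit stream is needed to reach even modest accuracy.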
- 34:00 I'm not an expert in that field. I have talked to people that
- 34:02 have worked more on that, and they were not so convinced; they
- 34:05 said the overhead that you introduce is quite drastic, and
- 34:10 they thought it's not worth the effort. So I don't have much personal experience;
- 34:15 I've seen the idea, but I have not seen convincing results yet, and I also didn't
- 34:20 investigate this so much, so I don't see it
- 34:24 as a reasonable architecture for the scientific computing
- 34:28 field where I'm working. Maybe for embedded systems or signal
- 34:33 processing this is more appropriate.
- 34:40 Right now we don't have any further questions from the chat,
- 34:44 so, uh,
- 34:47 yeah, I can just ask my question in person. Thanks a lot
- 34:50 for the talk, and I think it's quite interesting to see
- 34:53 there are a lot of applications from this field which can also
- 34:56 save energy, of course. So my question would go in this direction:
- 35:00 what do you think, how many researchers or engineers in
- 35:03 the approximate computing field already focus on saving energy
- 35:07 by what they're doing? Is it like,
- 35:09 are you kind of an outlier if you think about energy
- 35:12 consumption and saving energy with these techniques, or
- 35:16 is it generally very widely
- 35:20 known that you can save energy, and is it a top goal for many researchers?
- 35:28 Yeah, it's a good question. I don't know. I
- 35:31 think the field is developing from two areas. One
- 35:36 is people coming from computer architecture or
- 35:40 embedded systems and circuit design who are trying to
- 35:45 go beyond what they are currently doing, and there are also people
- 35:49 who are practitioners, coming from
- 35:52 numerics, applied math, high-performance computing, linear algebra
- 35:58 library development and so forth. And I think
- 36:03 the high-performance computing people are also now
- 36:08 developing mixed-precision linear algebra libraries, for instance;
- 36:12 they also use iterative methods to converge to the correct solutions. So
- 36:16 they are mostly interested in exploiting the computer architectures
- 36:20 in an efficient way: they see there are lots of computing units
- 36:23 that can do low-precision, single- and half-precision computing, and think:
- 36:27 what can we do with it? And along the way, they
- 36:31 save energy. I think the energy saving part is not
- 36:36 their main concern; they are mostly interested in performance,
- 36:39 and of course they are also interested in the saving of energy. I think,
- 36:45 so far, the topic of
- 36:49 how much operational cost is involved in doing a computation is
- 36:54 not of much concern to most researchers in application domains,
- 36:58 because they don't pay for the computing time and also not for the energy.
- 37:02 I think there is some shift going on currently, and
- 37:06 with the increasing energy costs that will become even more pronounced,
- 37:10 and I think now even those who didn't care about the problem
- 37:13 are starting to care about the problem. So I think it will become
- 37:16 more relevant in the future.
- 37:19 But I think people in the application sciences
- 37:24 are mostly interested in the results,
- 37:27 whereas in computer science and computer engineering, like in our
- 37:31 research, it is also about the methods: we publish
- 37:34 about how these methods work and do some error analysis. For them this
- 37:37 doesn't even count;
- 37:40 for them what counts is just what the specific properties of material X are,
- 37:44 and this is completely out of their scope. But I think for
- 37:48 the compute centre operators and for the people working on
- 37:55 computer architecture and systems design, it is a topic of much concern
- 38:00 these days. Thanks a lot.
- 38:04 Thank you. So if there are not any other questions, maybe I can add
- 38:09 one myself as well,
- 38:12 as well as another one, which I'm not sure how to phrase technically:
- 38:16 chip production here is not cutting edge at the moment;
- 38:20 cutting edge would be at some few nanometers.
- 38:22 Do you have a prediction on further increasing
- 38:25 the integration density of transistors in chips, linked to energy efficiency?
- 38:31 Maybe you have some insights. I
- 38:34 don't know, and unfortunately I don't have many insights on
- 38:37 that. I'm only working, where,
- 38:42 as my main technologies, there are FPGAs and also GPUs and CPUs
- 38:48 a bit, and there we just take what is given to us by the vendors.
- 38:52 We do have a look at those trends, but I'm not so super familiar
- 38:57 with the current predictions for the semiconductor technologies.
- 39:01 I think there are still some integration density increases that
- 39:05 Intel claims, and
- 39:08 you see it in the charts,
- 39:09 but this is all quite far in the future, so I don't have
- 39:14 any specific insights on that.
- 39:19 Yeah, maybe one other question from me would be: you talked
- 39:23 about some applications of approximate computing, machine learning,
- 39:26 signal processing, audio, video. Do you see any other
- 39:31 future applications that you didn't mention, that aren't
- 39:34 using approximate computing right now but maybe will be using
- 39:37 it in the future? And are there any dangers in using approximate computing
- 39:43 in those areas?
- 39:45 Any dangers? Yeah, like
- 39:50 anything that you have to consider before
- 39:53 using it in your applications.
- 39:56 So, yes. As I was trying to say, in the fields of
- 40:02 audio and video processing and machine learning, I think it's obvious
- 40:06 that there is some potential for approximate computing.
- 40:09 What I've been trying, also with collaborators, is to find
- 40:14 areas where we can apply the idea of approximate computing
- 40:18 that are non-obvious, where you really care about precision
- 40:23 and you want to have controlled
- 40:25 results, right? And I mentioned that, for instance, one of
- 40:29 the areas is: whenever I have an iterative method that converges
- 40:32 to a result, the basic idea is, okay, let's do a coarse approximation
- 40:37 of the result and then do a couple of iterations at the very
- 40:40 end, do some iterative refinement, to converge to the perfect
- 40:43 result. I think this is
- 40:47 an area where there are some first works in scientific computing,
- 40:51 but it's not widely used. There are some first libraries coming up, and
- 40:54 people are really trying to exploit those capabilities that
- 40:57 you have in today's hardware. I think there are
- 41:01 probably many areas where you could apply this idea, but
- 41:05 it is not widely used. I think this will mostly be taken
- 41:09 up by libraries: people will develop some scientific
- 41:13 libraries, and then others just use it under the hood without
- 41:17 developing the methods themselves. So
- 41:20 I think there is still a lot of potential.
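The "coarse solve plus a few refinement steps" pattern is classical mixed-precision iterative refinement for linear systems. The sketch below is a generic Python illustration of that pattern (not a specific library): the system is solved cheaply in single precision, and a few residual-correction steps with residuals evaluated in double precision recover an accurate result. In a real implementation the low-precision factorization would be computed once and reused instead of calling the solver repeatedly.

```python
import numpy as np

def solve_with_refinement(a, b, refinements=3, low=np.float32):
    """Solve A x = b: cheap initial solve in low precision, then a few
    residual-correction steps with residuals computed in double precision."""
    a_low = a.astype(low)
    x = np.linalg.solve(a_low, b.astype(low)).astype(np.float64)        # coarse solution
    for _ in range(refinements):
        r = b - a @ x                                                    # residual, double precision
        x += np.linalg.solve(a_low, r.astype(low)).astype(np.float64)   # cheap correction
    return x

rng = np.random.default_rng(2)
a = rng.standard_normal((200, 200)) + 200.0 * np.eye(200)   # well-conditioned test system
b = rng.standard_normal(200)
x = solve_with_refinement(a, b)
print(np.linalg.norm(a @ x - b))    # residual close to double-precision level
```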
- 41:30 The dangers are, of course, that you
- 41:36 translate the technique to a field without knowing what the
- 41:41 preconditions were. Take for instance the submatrix method
- 41:44 I mentioned, and this is also going a little bit out of my comfort zone,
- 41:49 but the point is that you introduce a certain type of error in
- 41:56 the computation, and the question is what you do with the
- 41:59 matrix that you have at the end. Now, in quantum chemistry you
- 42:03 use this matrix for multiplications,
- 42:06 and what you are interested in is only the trace
- 42:09 of the matrix; the off-diagonal elements are not so super
- 42:12 relevant, so for this kind of operation it works very well. Now,
- 42:15 if you just take the method and say, well, Plessl et al. have this
- 42:18 nice submatrix method and we just apply it to some other arbitrary
- 42:22 sparse matrix problem, then you may end up with completely wrong results, because the
- 42:28 underlying assumption that we have, that we are mostly concerned about
- 42:33 the trace of the matrix,
- 42:36 is no longer true. And I think this
- 42:40 could be a problem, when people
- 42:45 do not understand the implicit assumptions that are in the work
- 42:48 and apply it to a field where these no longer hold; the result could be
- 42:52 algorithms that do not converge, or give random results, or just
- 42:56 garbage in, garbage out. So I think this is a concern.
- 43:01 I also wouldn't use it for super safety-critical operations,
- 43:06 but I think that's anyway clear.
- 43:11 That makes sense. Okay, then maybe let's end on a practical question.
- 43:17 At which scale do you think using
- 43:20 approximate computing makes sense on a practical level? You talked
- 43:24 a lot about supercomputers, where the scale is astronomical,
- 43:28 really hard to grasp. But is it also possible to use
- 43:33 it on your mobile phone, on your personal computer, and do you
- 43:37 think it would make sense in the future, or is it something that's
- 43:41 mostly geared towards researchers right now? No, no,
- 43:45 I think it's a necessity. I only talked about
- 43:49 approximate computing as one technique to have this
- 43:53 innovation at the top, but
- 43:56 we really need much more innovation at the top. There is a very
- 44:01 nice paper by Leiserson et al.,
- 44:05 I'd like to expand on this, called "There's plenty of room at the Top" or a similar title,
- 44:10 that says approximate computing is one technique, but there
- 44:14 are also many other things that we can do to still get more
- 44:17 benefit from current technology, right: reducing software bloat,
- 44:21 doing proper performance optimization, using better data structures,
- 44:25 and of course better algorithms; better algorithms always
- 44:27 beat a lot, right? So,
- 44:30 first, having good algorithms is the highest leverage
- 44:34 that you can get, but then there follow a lot of things across the computing layers
- 44:38 where you can do some improvements, and
- 44:41 approximate computing is one piece of the puzzle,
- 44:46 but you can reduce software bloat to improve performance,
- 44:51 or only compute what you really need:
- 44:55 if you want to have the weather forecast for
- 44:58 Potsdam, you maybe don't need to compute a whole model that
- 45:02 also computes the weather in Tokyo; you can narrow it down. And
- 45:06 I think there is a lot of potential there, without changing
- 45:10 anything in the semiconductor technology. This is all on
- 45:13 top, and maybe there are also exciting things coming
- 45:17 down the pipeline, but there is still a lot of potential today.
- 45:20 So this is why I think it's not at all only for supercomputing; I think
- 45:24 this needs to be applied to
- 45:27 the complete computing stack.
- 45:29 Of course, if you think about data centers and high-performance computing,
- 45:34 this is such a large scale, with such high operation costs,
- 45:38 that you are really driven to do this kind of optimization
- 45:42 there. But also for your cell phone, for instance: there it is maybe
- 45:46 not the operation cost but battery life, right? The
- 45:49 battery life is limited, it will not become significantly better in
- 45:52 the same space, so the only thing that you can
- 45:56 do is find new ways to conserve energy, and this is
- 46:01 one of the ways to go.
- 46:04 Thank you for that ending note,
- 46:08 a great message for everybody in the audience
- 46:11 to hear. And with that I want to thank you again, Professor Plessl,
- 46:16 for being with us today and
- 46:18 holding this last openXchange of the year.
- 46:22 Just one organizational note: there will be a new season of
- 46:28 openXchanges starting next year, and the first one will be
- 46:32 at the end of January, and we will share more information on the forum,
- 46:37 where we will provide further details.
- 46:43 Yeah, thanks everybody for participating today, and
- 46:47 have a nice Christmas, have a nice
- 46:50 start of the year, and yes,
- 46:54 thank you. Thanks for having me.
About this video
Read a blog-post version of this openXchange live talk here on Medium!
How can we sacrifice correctness, reliability, and precision of an algorithm to increase its efficiency, while still producing useful results? Approximate computing takes a different view of the role and effect of approximations in computations. Instead of considering approximations as a necessary evil of digital computers, one tries to allow application-specific admissible approximations in order to reap significant improvements in energy efficiency. A cross-layer view of applications, numerical methods and computer architecture opens up new perspectives and research questions.
Christian Plessl is professor (W3) for High-Performance Computing at the department of Computer Science at Paderborn University. He is also managing director of the Paderborn Center for Parallel Computing (PC²), which is a central scientific institute of Paderborn University and a National High-Performance Computing center in the NHR alliance. In the last years, he has been dealing with different aspects and application fields of Approximate Computing.