This video belongs to the openHPI course clean-IT: Towards Sustainable Digital Technologies. Do you want to see more?
- 00:05 Welcome, as I said already, to the last openXchange of 2022,
- 00:10 and today we are very pleased to have the wonderful Christian Plessl here,
- 00:16 who is going to talk about the topic of approximate computing.
- 00:20 But first I'm going to tell you a little bit more about his work.
- 00:25 He received a doctorate in computer engineering at ETH Zurich;
- 00:32 after that he became full professor for high-performance computing
- 00:37 at the University of Paderborn,
- 00:40 and he is also currently the director of the Paderborn Center for Parallel Computing.
- 00:47 In his career he also set a world record this year:
- 00:54 his was the first research group worldwide to achieve exaflop performance
- 00:58 by simulating the SARS-CoV-2 spike protein.
- 01:02 Today he will talk, as I already said, about the
- 01:05 beauty of approximate computing and how it can help us
- 01:10 go forward in this world, and how that is connected to sustainability as well.
- 01:15 With that I give the stage to you,
- 01:19 Christian, and please
- 01:22 share your slides.
- 01:26 Thanks for having me here today. I'm very happy to be
- 01:32 invited to this symposium, because
- 01:36 this is a topic that is close to my heart:
- 01:38 the combination of computing and sustainability and efficiency
- 01:43 is exactly what the topic of approximate computing is about, at
- 01:47 least in my interpretation. In the next
- 01:51 15 minutes I will give you a broad overview of my take
- 01:55 on the topic, and then I will be happy to share
- 02:00 additional comments and discuss the topic with you.
- 02:06 Let me start by outlining the challenge that we are currently facing
- 02:10 in the world of computing. As you have seen from the introduction,
- 02:15 I'm working in high-performance computing. I
- 02:18 have not always worked in high-performance computing; I
- 02:20 also have a background in embedded systems, where energy
- 02:24 efficiency and power consumption have been very important topics
- 02:28 of concern for a very, very long time. But in recent years we
- 02:31 have also seen these topics becoming more relevant and
- 02:35 of utmost importance, for economic and ecological reasons, also in
- 02:40 high-performance computing and data centers.
- 02:43 This is a chart that you frequently see in talks
- 02:47 when someone talks about computing. What you see here is
- 02:50 the performance development for high-performance computing centers, or
- 02:55 supercomputers more generally. On the axis you see a
- 02:59 time scale from 1994 to the year 2021.
- 03:03 What is shown here is the development of the performance
- 03:06 in the so-called TOP500 list,
- 03:10 which is a very well known
- 03:13 list, a world ranking of the fastest supercomputers.
- 03:19 For comparing the supercomputers, a specific benchmark
- 03:23 application is executed, which is the High-Performance Linpack
- 03:27 application. This is a way of solving a linear equation system,
- 03:32 a very large one, with a supercomputer,
- 03:36 and this requires a certain number of arithmetic operations.
- 03:43 Dividing the number of operations by the time in which they have
- 03:46 been executed, by the time used for solving this special problem,
- 03:50 gives you the number of floating-point operations, flops, per second.
- 03:56 What you see here is the development in this list. In the
- 04:00 first iteration of this list, which was published in 1994
- 04:03 to rank supercomputers, the systems all together
- 04:08 achieved about 1 teraflops per second,
- 04:12 this is the aggregated sum, while the system that achieved
- 04:17 422 megaflops, or 422
- 04:21 million floating-point operations per second,
- 04:24 was the last system on the list,
- 04:29 number 500, not number 1. The list ranks the supercomputers according
- 04:33 to the benchmark and gives them ranks from 1 to 500.
- 04:36 On this list you see three things: the blue line is the aggregated
- 04:40 sum of the performance of all of the computers, of these five
- 04:42 hundred computers on the list; the red one is the performance
- 04:46 of the most powerful system on the list; and the number 500,
- 04:51 the orange line, is the one that just made it onto the list of
- 04:56 the top 500 most capable supercomputers.
- 04:59 So in 1994 the lowest-ranked system
- 05:04 had 422 megaflops, the highest-ranked had
- 05:08 about 60 gigaflops, and in aggregate, summing up all of
- 05:12 the performance of these five hundred systems, you reach 1.17
- 05:16 teraflops. Now, when you follow this development
- 05:20 over time, you see on this logarithmic scale here on the left
- 05:25 the expected exponential increase of performance that is also
- 05:30 predicted by Moore's law.
- 05:32 This applies to all of these curves: the number one system
- 05:37 followed this trend of exponential growth, and so did the
- 05:40 number five hundred system, and also the sum
- 05:44 of the aggregated performance follows roughly this trend. But
- 05:47 you also see here that around 2013
- 05:51 there was a change in the trend. You see that
- 05:55 it is slowing down; we are still on an exponential growth path,
- 05:59 but the rate of the growth has slowed down, around
- 06:05 ten years ago.
- 06:10 But this growth really comes at a cost. If you look
- 06:17 at this slide, you think everything is going well except for
- 06:19 a slight slowdown, and the performance is still
- 06:22 increasing at an exponential rate. However, we also need to consider the power
- 06:29 that was used to achieve these results, and this is what
- 06:34 you see here: for the top ten systems, how much power these systems
- 06:39 are consuming. In 2008 such a system consumed about two megawatts
- 06:46 of power, and today these top ten systems require roughly
- 06:52 ten megawatts of
- 06:56 power. This is for the fifty
- 06:58 most powerful systems of the world, and this is for the five
- 07:02 hundred, so you see they follow similar trends. But what you see here,
- 07:06 in particular at the top, is that over
- 07:09 ten years the power consumption increased by a factor of eight,
- 07:13 for the top fifty by four, and for the top five hundred by two. So you have an exponential
- 07:18 increase in performance but also a linear growth in the power
- 07:21 consumption. That means our systems have become much more
- 07:25 powerful, but we are also investing more and more power to operate the systems.
- 07:30 What we also do not see in these charts is
- 07:33 that we are investing more and more money to install these systems,
- 07:37 so they are also very expensive, because as Moore's law is slowing down
- 07:42 the systems become larger and larger, and that requires a lot
- 07:45 of investment.
- 07:48 The problem of energy consumption, or increasing power, is not
- 07:54 only of relevance for high-performance computing; the
- 07:58 same story is also true for commercial data centers. This
- 08:01 is data from a few years ago
- 08:05 about Frankfurt. The data centres that are located in Frankfurt
- 08:10 take about nineteen percent of the total power consumption
- 08:13 in Frankfurt, and this is about the same quantity as the airport
- 08:17 in Frankfurt, which is one of the largest passenger airports;
- 08:21 I believe it handles millions of passengers per year.
- 08:26 There are about eighty thousand people working
- 08:31 at Frankfurt airport, so it's a really massive operation.
- 08:35 So it's very clear that the energy consumption of data centers has
- 08:39 attracted a lot of concern, and there are also some
- 08:43 very strong movements to improve energy efficiency, and
- 08:46 recently also more legislation to make sure that the energy
- 08:51 is used in a more appropriate way.
- 08:56 I mentioned already Moore's law, which essentially
- 09:02 says that semiconductor technology was for a long time driving
- 09:07 this development, and we have become used to it. Moore's law predicted that the
- 09:12 transistor density at minimum cost, so it's not a technological
- 09:17 but more an economical law, doubles about every twenty-four months.
- 09:22 Moore's law means we get more transistors, but
- 09:27 it went hand in hand with so-called Dennard scaling, which says that
- 09:31 we do not only get more transistors, but they also become more efficient:
- 09:37 the electrical power dissipation remains constant
- 09:40 when we double the transistor density, and they even run
- 09:45 a bit faster, by forty percent. So that means we do not only
- 09:48 get more transistors, but they also become more efficient and faster,
- 09:53 and this was fueling this whole industry
- 09:56 of computing and was a guiding principle driving this
- 10:02 exponential increase in performance. However, as we know,
- 10:06 every exponential growth must come to an end eventually. There are two main
- 10:11 kinds of reasons why Moore's law must come to an end. One is technological
- 10:18 reasons, for instance lithography, reliability, leakage and power
- 10:21 density: we are getting to such small structure sizes
- 10:25 that the size of the atoms is becoming an important factor
- 10:29 that will limit us at some point. And there are also economical
- 10:33 factors: if we produce these very advanced semiconductors
- 10:37 that become smaller and smaller, the
- 10:40 production yield is reduced and the cost of semiconductor production increases.
- 10:46 Moore's law has been predicted many times to die, for either
- 10:51 economical or technological reasons. You see here a
- 10:54 very nice chart from The Economist
- 10:57 from 2016 that
- 11:01 shows these projections. In this chart, this axis is when
- 11:05 the prediction was made, and this is when the end of Moore's law was
- 11:08 predicted. So Gordon Moore in 1995 predicted that
- 11:12 Moore's law would end around 2005, for economical
- 11:16 reasons, and later he made new projections: in
- 11:20 2005 he made another prediction, that it would continue
- 11:28 into the range of 2015 to 2025. So you see, it has been
- 11:33 predicted to end many times, by different people who are also
- 11:38 very smart. So far it is still going
- 11:42 strong; we see some slowdown, but it's not at its end.
- 11:45 But at some point we will reach the end of Moore's law.
- 11:53 If we look now at how semiconductor technology is developing, what
- 11:58 we see is that
- 12:01 we can no longer rely on the circuit technology alone.
- 12:05 Historically, the semiconductor technology
- 12:11 was a reliable driver for performance and efficiency, with
- 12:14 more transistors that were more efficient, and this is what
- 12:16 we can call bottom-up innovation. But still, we don't
- 12:21 see any post-silicon technologies available on the horizon that really work.
- 12:26 So what we need to do, until we get to post-silicon
- 12:30 devices, is innovation at the top. We need to
- 12:33 live with current CMOS semiconductor technology and try to
- 12:37 get more efficiency and performance until these devices are available.
- 12:43 There are two approaches which I personally find very
- 12:45 promising. One is approximate computing, and this goes very well
- 12:50 hand in hand with cross-layer optimization and co-design.
- 12:54 In the next couple of slides I will give you some intuition
- 12:58 of what this approximate computing is and then show some perspectives
- 13:01 on where it can lead us.
- 13:04 So, what is approximate computing?
- 13:06 If you look at the design of computing systems, you will recognize
- 13:10 that there are some implicit assumptions and design goals when
- 13:14 we design computing systems. One of these assumptions is
- 13:18 correctness: as illustrated here by this pocket calculator,
- 13:21 we have here the calculation of three times three, and we expect
- 13:25 this to be nine and not seven.
- 13:29 The other assumption that we have inherently is reliability:
- 13:32 we want a system that always works in a reliable
- 13:38 way, not one that sometimes works and sometimes doesn't. This also
- 13:42 requires us to make certain compromises or safeguards to ensure this reliable operation.
- 13:49 We also want our systems to be precise. Of course, by nature,
- 13:54 any digital representation of numbers must be an
- 13:58 approximation, be it with integer numbers or floating-point numbers,
- 14:01 but still the implicit assumption is that we want to have the computation
- 14:06 as precise as possible, as here for this number pi.
- 14:10 And finally we want to have determinism: if we do the
- 14:13 same computation for pi again, we want the same number
- 14:16 as the result; we don't want a variance in results.
- 14:20 The question is now: what can we gain
- 14:23 if we relax these assumptions or give them up?
- 14:27 Is there anything to be gained in performance and efficiency,
- 14:31 and if so, what are the conditions and situations in which we can afford to
- 14:36 relax these assumptions?
- 14:39 This brings us straight to the topic of approximate computing. The
- 14:43 basic idea of approximate computing is to say: well,
- 14:47 if we sacrifice precision or correctness or reliability,
- 14:53 can we gain something in performance and energy efficiency?
- 14:56 And the answer obviously is yes.
- 14:59 The question that we need to find out is: in what circumstances
- 15:03 can we give up on precision, correctness and/or reliability,
- 15:06 and how can we exploit this fact for creating systems
- 15:11 that are more energy efficient?
- 15:14 Now, you would assume that there are certain applications for
- 15:17 which these parameters, precision, correctness and reliability, are of utmost
- 15:22 importance, and this is why we design our systems to have them.
- 15:25 But there are many applications and application domains where
- 15:29 we don't really need
- 15:31 this, for instance when we do audio, video and graphics processing.
- 15:36 You see here on the top right a video that has been compressed;
- 15:39 you see some compression artifacts, to exaggerate the effect,
- 15:43 but what you see is just a visual impression, and our human
- 15:46 perception is quite inaccurate anyway, so we are tolerant to certain
- 15:52 fluctuations and approximations.
- 15:56 Similarly, when we process external signals that come from sensors,
- 16:00 there is also some noise in the signals, so there is probably no need
- 16:03 to compute on the signals with double-precision floating-point accuracy.
- 16:07 And finally there are some application areas, in particular
- 16:11 in the area of machine learning, where we don't have a true
- 16:14 golden answer available, but we are working on, for instance, classification
- 16:19 problems like image classification, and since the methods
- 16:23 that we are implementing are themselves an approximation,
- 16:26 for instance of human image recognition, we can also use approximation
- 16:31 in the implementation, of neural networks for instance.
- 16:35 This idea of approximation can be implemented at many different
- 16:38 levels. One level at which it can be implemented is the application level,
- 16:43 or the algorithm or library level: we modify the application
- 16:46 or libraries so that they are inherently approximate
- 16:50 and exploit the fact that we don't need precise computation.
- 16:58 Another possibility is to implement this at the computer architecture level.
- 17:02 For instance, instead of a functional unit computing an accurate multiplication,
- 17:08 we could, if the ranges of the numbers are very constrained,
- 17:12 have a multiplier that computes an approximation of
- 17:15 the multiplication. Or we can even go down to the circuit level,
- 17:18 where we work with voltage overscaling
- 17:22 or frequency overscaling and operate the system at a level
- 17:25 where we have reduced guard bands and introduce some stochastic errors, but the
- 17:29 computation is much more efficient.
- 17:32 The focus of my research is to leverage approximate
- 17:35 computing to accelerate high-performance computing applications,
- 17:38 in particular with an eye on making efficient use of accelerators
- 17:42 such as FPGAs or GPUs.
- 17:45 Now, here is a slide that shows you
- 17:50 why we want to work with reduced precision and what we can
- 17:54 gain in area and in energy. For instance, what you see here is: if we
- 17:58 reduce the precision from a 32-bit adder to an 8-bit adder,
- 18:03 we gain about a factor of three in relative energy cost.
- 18:07 And, more interesting, when we move from a 32-bit floating-point
- 18:12 multiplication, which in this technology requires 3.1 picojoules,
- 18:17 to an 8-bit integer multiplication,
- 18:22 we see that we achieve a
- 18:25 reduction of nineteen in terms of energy and of twenty-seven
- 18:31 in terms of area. That means we can have massive
- 18:35 reductions in chip size and also in energy consumption if
- 18:39 we can move to lower-precision computing, and here we only
- 18:43 talk about different data types, not even about stronger forms of approximation.
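To make these savings concrete, here is a minimal Python sketch of the kind of reduced-precision arithmetic the slide refers to: when the value range is known and narrow, a 32-bit floating-point multiplication can be replaced by a cheap 8-bit integer multiplication plus a scale factor. The values and the scale below are made up purely for illustration; they are not from the talk.

```python
import numpy as np

def quantize_int8(x, scale):
    """Map a float array onto an int8 grid with a fixed scale (assumes |x| <= 127 * scale)."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

# Two operands with a known, narrow value range (hypothetical numbers)
a = np.array([0.31, -0.82, 0.05], dtype=np.float32)
b = np.array([0.44, 0.12, -0.91], dtype=np.float32)

scale = 0.01                                   # chosen from the known value range
qa, qb = quantize_int8(a, scale), quantize_int8(b, scale)

# Multiply in cheap integer arithmetic (widened to avoid overflow), rescale afterwards
approx = qa.astype(np.int32) * qb.astype(np.int32) * (scale * scale)

exact = a * b                                  # reference 32-bit floating-point result
print(exact, approx, np.abs(exact - approx))   # the gap is the quantization error
```

The integer product itself is exact; the approximation error comes entirely from rounding the operands onto the 8-bit grid, which is acceptable whenever the application tolerates that quantization.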
- 18:49 One example from our research where we exploit this fact is
- 18:54 using lower, mixed precision in scientific computing, for computing operations on matrices.
- 19:01 In this case, what we studied in this work is: we compute the
- 19:05 inverse p-th root of a matrix. For instance, this could be
- 19:11 the inverse of the matrix, but also other
- 19:15 inverse p-th roots.
- 19:18 For computing this operation, for the matrix inversion,
- 19:23 if p equals one then this is just a matrix inversion, there is
- 19:26 a method that has been proposed by Bini et al., which is an iterative
- 19:30 method: we just take the matrix and use this iterative equation here
- 19:37 to compute the inverse of the matrix step by step. And then,
- 19:40 in each step, we can compute the error
- 19:45 between the precise result and the approximate result, and
- 19:48 this is shown here on the right.
- 19:50 What we did here is: if we compute this for a matrix A with
- 19:56 single-precision floating point, with 22 bits of mantissa here,
- 20:00 then we see here on the left-hand side the error of the computation,
- 20:04 and in every step of this iteration equation we see, with
- 20:09 increasing iterations, the error being reduced, until, after
- 20:14 approximately seven iterations, we converge to a minimum
- 20:17 error, which is then an approximation of the inverse of the matrix.
- 20:22 Now it is interesting: if we compute with reduced precision,
- 20:25 for instance with an 18-bit mantissa, or fourteen or ten bits,
- 20:30 we see that initially the convergence behavior of this equation
- 20:33 is completely the same, and then the convergence stops when
- 20:37 we reach a limit that is given by the available mantissa,
- 20:42 so the floating-point representation doesn't
- 20:45 give us more precision to converge to a better degree.
- 20:49 So what we can see here is that this iterative method allows us to
- 20:54 approximate a value, here of the inverse matrix, and what
- 20:59 we could do, for instance, for approximate computing, you
- 21:02 see here: we could do the first approximations, say until
- 21:07 iteration four or five, with a low precision,
- 21:12 for instance here with 14-bit precision, and then change
- 21:16 to 22 bits of precision and still converge to the
- 21:22 very low error margin, if this is required for the application,
- 21:25 or stop earlier and stick with this error
- 21:28 margin if this is acceptable for our application.
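As an illustration of this precision-switching idea, here is a small Python/NumPy sketch. It uses the classical Newton-Schulz iteration for the plain inverse (the p = 1 case); the talk's actual work uses the inverse p-th root iteration of Bini et al. with custom 10- to 22-bit mantissa formats, whereas this sketch simply lets float16 stand in for the cheap low-precision phase and float64 for the final refinement.

```python
import numpy as np

def mixed_precision_inverse(a, steps_low=5, steps_high=3, low=np.float16, high=np.float64):
    """Approximate inv(a) with the Newton-Schulz iteration X <- X (2I - A X).
    The first iterations run in cheap low precision; once the error stalls at the
    low-precision mantissa limit, we switch to high precision and keep iterating."""
    n = a.shape[0]
    # Safe starting guess X0 = A^T / (||A||_1 * ||A||_inf) guarantees convergence
    x = (a.T / (np.linalg.norm(a, 1) * np.linalg.norm(a, np.inf))).astype(low)
    a_low, eye_low = a.astype(low), np.eye(n, dtype=low)
    for _ in range(steps_low):                     # low-precision phase
        x = x @ (2 * eye_low - a_low @ x)
    x = x.astype(high)                             # precision switch
    a_high, eye = a.astype(high), np.eye(n, dtype=high)
    for _ in range(steps_high):                    # refinement phase
        x = x @ (2 * eye - a_high @ x)
    return x

rng = np.random.default_rng(0)
a = np.eye(50) + 0.01 * rng.standard_normal((50, 50))   # well-conditioned test matrix
x = mixed_precision_inverse(a)
print(np.linalg.norm(a @ x - np.eye(50)))               # residual near double precision
```

In the same spirit as the plot in the talk, the cheap iterations do most of the convergence work, and only the last few iterations pay for full precision.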
- 21:34 With the second and last application I would like to give you an
- 21:37 idea of a very different way of approximation that is not related to
- 21:43 just numerical approximation; in this case we make an algorithmic approximation.
- 21:50 What we do here is: we compute a so-called matrix function,
- 21:53 so it is a function that we compute of a matrix;
- 21:58 that could be for instance the inverse, or a root, or another
- 22:02 function that takes a matrix and computes
- 22:06 another matrix.
- 22:10 What we have developed is a method which we denote
- 22:12 as the submatrix method, and I will explain in a bit how it works.
- 22:17 The main idea is that we have this matrix A here, and then
- 22:21 we apply a function to the matrix and compute an approximation of the function.
- 22:27 How the method works is: it looks at this matrix A,
- 22:31 which, and this is now important, is a sparse matrix, and what we do is
- 22:36 we convert the evaluation of the function on a very large sparse matrix into
- 22:43 a very parallel execution, applying the same function
- 22:48 to very small dense matrices. How we do this is: from this very large
- 22:54 matrix A we go column by column, and in each column we take the
- 22:59 non-zero elements of the column and the corresponding non-zero
- 23:04 rows of this matrix, which is symmetric.
- 23:07 So from the first column we build the submatrix here;
- 23:11 from the second column we take the non-zero entries, so we take
- 23:16 here the first, the second and the sixth value, and take the first,
- 23:23 the second and the sixth column of the matrix to build the submatrix.
- 23:28 Then we can individually apply the function to the submatrix,
- 23:34 which is now a dense, or much denser, matrix,
- 23:38 and then we can take the corresponding column of this result
- 23:41 and fill it in, in the same structure as the original matrix.
- 23:46 The benefit of this method is that we can work here on the very
- 23:51 large sparse matrix and treat this problem as a massively parallel
- 23:56 problem that works on a very large number of smaller matrices which are dense.
- 24:01 This allows us, on one hand, to have a very parallel problem,
- 24:07 and we can see that the subproblems are completely independent from each other,
- 24:12 and we have shown by theory and practical evaluation that
- 24:17 computing the matrix function in this way leads to acceptable
- 24:22 errors for algorithms in computational chemistry.
- 24:30 Now, this is a form of approximation that works very well for
- 24:36 being included in ab initio molecular dynamics applications,
- 24:40 and additionally, through this kind of submatrix approach,
- 24:42 we can also do similar operations as I have shown before:
- 24:47 on each of the submatrices we can also use these iterative methods to gradually
- 24:53 compute these functions.
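A compact Python/SciPy sketch of the column-by-column construction described above follows. It is only an illustration of the idea, not the production implementation from the talk: it assumes a symmetric sparse matrix with a non-zero diagonal, processes the columns serially (in practice each column is an independent, massively parallel task), and uses the matrix exponential as an arbitrary stand-in for the matrix function.

```python
import numpy as np
from scipy.sparse import csc_matrix, identity, random as sparse_random
from scipy.linalg import expm   # stand-in dense matrix function

def submatrix_apply(a_sparse, dense_func):
    """Approximate f(A) for a sparse symmetric A: for every column, gather the rows
    with non-zero entries, evaluate f on the small dense principal submatrix, and
    copy the relevant column of that result back into the sparsity pattern of A."""
    a = csc_matrix(a_sparse)
    n = a.shape[0]
    result = np.zeros((n, n))
    for i in range(n):                              # each column is an independent task
        idx = np.sort(a[:, i].nonzero()[0])         # non-zero rows of column i
        sub = a[idx, :][:, idx].toarray()           # small dense principal submatrix
        block = dense_func(sub)                     # evaluate f on the dense block
        pos = np.searchsorted(idx, i)               # position of column i inside the block
        result[idx, i] = block[:, pos]              # scatter that column back
    return result

# Tiny demo on a random sparse symmetric matrix (purely illustrative)
m = sparse_random(30, 30, density=0.1, random_state=0)
a = csc_matrix(0.05 * (m + m.T) + identity(30))     # symmetric, non-zero diagonal
approx = submatrix_apply(a, expm)
exact = expm(a.toarray())
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))   # relative error of the method
```

Each loop iteration touches only one small dense block, which is exactly the shape of work that GPUs with low-precision units handle well; the error relative to the exact matrix function is what the method trades for this parallelism.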
- 24:55 I wanted to show an animation, which unfortunately doesn't
- 24:58 work here, but the punch line of this algorithm is that
- 25:03 we are able to treat the errors that we introduce
- 25:09 as noise on our accurate
- 25:13 simulation results, and we have found a way, using this
- 25:18 Langevin-type equation, to perfectly compensate this noise when we compute
- 25:24 average values of observables from these molecular simulations.
- 25:33 Here you see this also in this molecular simulation: you
- 25:36 see the atoms, and they are moving slightly left and right,
- 25:39 and the idea is, if you are interested in finding the average distance
- 25:43 between two molecules, which is shown here with this value,
- 25:47 you don't need to know where each molecule is moving exactly,
- 25:50 and you don't need to have the trajectories of each of the
- 25:52 molecules, but you only want to have an ensemble average
- 25:56 of the distance between two atoms. So since we introduce some
- 26:00 approximations in the computations that can be treated as white noise,
- 26:04 these errors, or this white noise, can be
- 26:06 perfectly cancelled out, and although we are doing approximate computations
- 26:11 we can still accurately compute, for instance here, the distance between
- 26:17 two atoms.
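The statistical intuition can be illustrated in a few lines: if the approximation error on each sample of an observable behaves like zero-mean white noise, it largely cancels in the ensemble average. The numbers below are made up for illustration and are unrelated to the actual simulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical exact values of an observable (e.g. an interatomic distance) along a trajectory
exact = 3.0 + 0.2 * np.sin(np.linspace(0.0, 20.0, 100_000))

# The approximate computation perturbs every sample with zero-mean, white-noise-like error
approx = exact + rng.normal(0.0, 0.05, size=exact.shape)

print(exact.mean())                          # reference ensemble average
print(approx.mean())                         # average over the noisy, approximate samples
print(abs(exact.mean() - approx.mean()))     # the gap shrinks roughly like 1/sqrt(N)
```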
- 26:20 This brings me to the conclusion of my talk.
- 26:23 Approximate computing is not one specific
- 26:28 technique; it is more a paradigm or concept where we say:
- 26:32 we willingly accept inaccuracies in the evaluation of scientific algorithms,
- 26:38 and by accepting these inaccuracies we get instead an improved
- 26:44 energy efficiency. And we can apply this concept at many different
- 26:47 levels of our systems. We can apply it at the application level,
- 26:52 by using problem-specific quality requirements on the result;
- 26:56 we can apply it at the algorithm level, for instance by using
- 26:58 approximation-tolerant algorithms or lower, mixed precision, which is what
- 27:02 I just showed in this talk.
- 27:04 There are possibilities to apply approximate computing at the software
- 27:08 level; there has been work on approximation-aware compilers and runtime systems.
- 27:13 We can build approximate arithmetic units; in this talk I only talked about
- 27:18 using
- 27:22 precise arithmetic with different bit widths and data representations,
- 27:26 but we can have more crude
- 27:28 approximations of arithmetic computations. And finally we
- 27:32 can go down to the circuit level and work with techniques
- 27:36 like dynamic voltage scaling and frequency overscaling, where
- 27:39 we reduce the operating voltage of the circuit so low
- 27:44 that we introduce some stochastic errors. This gives us much
- 27:48 improved energy efficiency, but we introduce some errors that
- 27:52 then also show up as noise in the results.
- 27:56 With that, I hope I gave you some idea of how approximate computing
- 27:59 can be seen as a paradigm for improving efficiency and performance.
- 28:02 It is very broadly applicable, and it allows us to do this
- 28:05 innovation at the top: we don't need any disruptive technology
- 28:09 or post-silicon technology, but essentially what we do is we
- 28:13 take the existing algorithms, or maybe take
- 28:18 the existing technologies and apply new algorithms, to make
- 28:22 better use of what we currently have in the hardware. Of course,
- 28:25 we can also tailor the hardware, and this brings it back
- 28:28 to my research in field-programmable gate arrays, where you
- 28:31 can tailor the execution architecture. You can even go beyond
- 28:35 that and build custom arithmetic units and highly tailored
- 28:39 architectures that work cross-layer.
- 28:42 With that I would like to conclude, and I will be very happy to
- 28:45 take any questions that you may have.
- 28:50 Thank you, Christian, that was a very interesting talk in my opinion, and
- 28:56 yeah, we're really happy to take any questions right now from the audience,
- 29:02 so you can write them in the chat or raise your hand to show
- 29:06 that you want to speak up, and other than that I can also
- 29:11 add some questions of my own. And I think we already had the
- 29:14 first question, from Sebastian Heckler,
- 29:19 which I will just read out:
- 29:21 is there an investigation to bring the submatrix method into hardware,
- 29:26 or something
- 29:28 that provides guaranteed accuracy?
- 29:34 I'm
- 29:43 not sure what is meant by bringing the submatrix method to hardware. Okay, there are
- 29:46 two things: can we bring it to the hardware, and the other is, is there guaranteed accuracy?
- 29:51 I haven't talked about
- 29:53 this in so much detail, but in the introduction it was mentioned
- 29:56 that we had this work on a very large scale simulation of
- 30:00 a molecular system. We
- 30:03 worked on the coronavirus, the COVID-19 virus, and also on the HIV virus, to
- 30:08 show the limits of what can be done with molecular, or
- 30:11 atomistic, simulations of viruses. Here our target from
- 30:16 the beginning was GPUs, because GPUs today are very
- 30:22 highly tailored for machine learning. That means instead
- 30:25 of having double-precision arithmetic there is single-precision
- 30:28 and half-precision and even 8-bit arithmetic in today's
- 30:32 GPUs, so they are optimized for dense linear algebra.
- 30:37 So the whole idea behind the submatrix method, as we used it
- 30:41 for this application, is to
- 30:43 translate this problem that works on super large sparse matrices
- 30:49 into a problem that works on dense matrices which are much
- 30:53 smaller, in the sweet spot of the GPU. This is exactly what we do: we
- 30:58 have sparse matrices that are many millions by millions
- 31:02 of entries, and we cut them down into pieces that are like a thousand by a thousand,
- 31:07 and we can operate on them in mixed-precision computing, with
- 31:11 mixed 16- and 32-bit floating-point precision. This
- 31:14 is exactly the sweet spot of the GPU,
- 31:17 and all of these computations are completely independent from
- 31:20 each other, so we can scale our problem to massive sizes,
- 31:24 and this is what we did, on the most powerful GPU
- 31:27 system at that time,
- 31:29 at NERSC. So it's completely tailored to the GPU, because we know for
- 31:33 sure deep learning is driving low-precision, mixed-
- 31:36 precision hardware, and we try to make good use of this hardware
- 31:39 for other purposes than machine learning.
- 31:42 The second part of the question, if I understood it correctly, is: is
- 31:46 there guaranteed accuracy? In this case, no.
- 31:50 We cannot guarantee an accuracy. Of course it depends on
- 31:53 the operations, how we compute the operations on the
- 31:58 submatrices. There we have seen, for instance for this inverse root problem, that
- 32:03 we have some convergence behavior, and this is also mathematically proven.
- 32:07 The difficult part is the method per se, which
- 32:11 cuts down the problem of working on a very large sparse matrix
- 32:15 into a large sequence of independent submatrices. Of course we
- 32:19 introduce an error by that, because
- 32:23 you cannot just decompose the problem like that, right? So there is some
- 32:27 error that you introduce. For the area that we are working on, we are
- 32:31 working on specific matrices called Hamiltonians, in computational chemistry,
- 32:36 and there this error doesn't affect us so badly, and we can also
- 32:40 perfectly compensate it in our sample averages. But this is not
- 32:44 a technique that works for any use case; it is really application
- 32:47 dependent.
- 32:53 That answers the question, and we got another one, which is:
- 32:58 you mentioned sacrificing determinism; without further drilling into examples,
- 33:03 how much potential do you see in non-deterministic computing? Yeah, there is that.
- 33:08 Yeah, so what I meant, where I see potential, is in this
- 33:12 voltage overscaling or frequency overscaling, meaning
- 33:16 reducing the supply voltage to a level where you start introducing statistical
- 33:23 malfunctions or errors in the computation. This is one way how
- 33:28 you can have non-deterministic results, because the guard bands become lower.
- 33:32 There is also another
- 33:35 field of approximate computing that has been proposed, which is
- 33:39 probabilistic computing, where you compute on, for instance,
- 33:43 binary bit streams, and there is a whole field of how you can
- 33:47 do arithmetic on binary bit streams,
- 33:50 which is very convenient because, if
- 33:52 you have just two bit streams and you need only very simple logic operations,
- 33:57 the operations can be extremely efficient.
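As a side note, the textbook example of this bit-stream arithmetic (often called stochastic computing) is that multiplying two values encoded as random bit streams reduces to a bitwise AND. The following Python sketch is a generic illustration of that idea, not something taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, n):
    """Encode a value p in [0, 1] as a random bit stream with P(bit = 1) = p."""
    return (rng.random(n) < p).astype(np.uint8)

a, b, n = 0.3, 0.6, 100_000
stream_a, stream_b = to_stream(a, n), to_stream(b, n)

product_stream = stream_a & stream_b   # multiplying the encoded values is a bitwise AND
print(product_stream.mean(), a * b)    # ~0.18 up to statistical noise, versus the exact 0.18
```

The price is exactly the overhead mentioned next in the discussion: a very long bit stream is needed to reach even modest accuracy.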
- 34:00 I'm not an expert in that field. I have talked to people that
- 34:02 have worked more on that, and they were not so convinced; they
- 34:05 said the overhead that you introduce is quite drastic, and
- 34:10 they thought it's not worth the effort. So I don't have much personal experience;
- 34:15 I've seen the idea, but I have not seen convincing results yet, and I also didn't
- 34:20 investigate this so much, so I don't see it
- 34:24 as a reasonable architecture for the scientific computing
- 34:28 field where I'm working. Maybe for embedded systems or signal
- 34:33 processing this is more appropriate.
- 34:40 Right now we don't have any further questions from the chat,
- 34:44 so, uh,
- 34:47 yeah, I can just ask my question in person. Thanks a lot
- 34:50 for the talk, and I think it's quite interesting to see
- 34:53 there are a lot of applications from this field which can also
- 34:56 save energy, of course. So my question would go in this direction:
- 35:00 what do you think, how many researchers or engineers in
- 35:03 the approximate computing field already focus on saving energy
- 35:07 by what they're doing? Is it like,
- 35:09 are you kind of an outlier if you think about energy
- 35:12 consumption and saving energy with these techniques, or
- 35:16 is it generally very widely
- 35:20 known that you can save energy, and is it a top goal for many researchers?
- 35:28 Yeah, it's a good question. I don't know. I
- 35:31 think the field is developing from two areas. One
- 35:36 is people coming from computer architecture or
- 35:40 embedded systems and circuit design who are trying to
- 35:45 go beyond what they are currently doing, and there are also people
- 35:49 who are practitioners, coming from
- 35:52 numerics, applied math, high-performance computing, linear algebra
- 35:58 library development and so forth. And I think
- 36:03 the high-performance computing people are also now
- 36:08 developing mixed-precision linear algebra libraries, for instance;
- 36:12 they also use iterative methods to converge to the correct solutions. So
- 36:16 they are mostly interested in exploiting the computer architectures
- 36:20 in an efficient way: they see there are lots of computing units
- 36:23 that can do low-precision, single- and half-precision computing, and think:
- 36:27 what can we do with it? And along the way, they
- 36:31 save energy. I think the energy saving part is not
- 36:36 their main concern; they are mostly interested in performance,
- 36:39 and of course they are also interested in the saving of energy. I think,
- 36:45 so far, the topic of
- 36:49 how much operational cost is involved in doing a computation is
- 36:54 not of much concern to most researchers in application domains,
- 36:58 because they don't pay for the computing time and also not for the energy.
- 37:02 I think there is some shift going on currently, and
- 37:06 with the increasing energy costs that will become even more pronounced,
- 37:10 and I think now even those who didn't care about the problem
- 37:13 are starting to care about the problem. So I think it will become
- 37:16 more relevant in the future.
- 37:19 But I think people in the application sciences
- 37:24 are mostly interested in the results,
- 37:27 whereas in computer science and computer engineering, like in our
- 37:31 research, it is also about the methods: we publish
- 37:34 about how these methods work and do some error analysis. For them this
- 37:37 doesn't even count;
- 37:40 for them what counts is just what the specific properties of material X are,
- 37:44 and this is completely out of their scope. But I think for
- 37:48 the compute centre operators and for the people working on
- 37:55 computer architecture and systems design, it is a topic of much concern
- 38:00 these days. Thanks a lot.
- 38:04 Thank you. So if there are not any other questions, maybe I can add
- 38:09 one myself as well,
- 38:12 as well as another one, which I'm not sure how to phrase technically:
- 38:16 chip production here is not cutting edge at the moment;
- 38:20 cutting edge would be at some few nanometers.
- 38:22 Do you have a prediction on further increasing
- 38:25 the integration density of transistors in chips, linked to energy efficiency?
- 38:31 Maybe you have some insights. I
- 38:34 don't know, and unfortunately I don't have many insights on
- 38:37 that. I'm only working, where,
- 38:42 as my main technologies, there are FPGAs and also GPUs and CPUs
- 38:48 a bit, and there we just take what is given to us by the vendors.
- 38:52 We do have a look at those trends, but I'm not so super familiar
- 38:57 with the current predictions for the semiconductor technologies.
- 39:01 I think there are still some integration density increases that
- 39:05 Intel claims, and
- 39:08 you see it in the charts,
- 39:09 but this is all quite far in the future, so I don't have
- 39:14 any specific insights on that.
- 39:19 Yeah, maybe one other question from me would be: you talked
- 39:23 about some applications of approximate computing, machine learning,
- 39:26 signal processing, audio, video. Do you see any other
- 39:31 future applications that you didn't mention, that aren't
- 39:34 using approximate computing right now but maybe will be using
- 39:37 it in the future? And are there any dangers in using approximate computing
- 39:43 in those areas?
- 39:45 Any dangers? Yeah, like
- 39:50 anything that you have to consider before
- 39:53 using it in your applications.
- 39:56 So, yes. As I was trying to say, in the fields of
- 40:02 audio and video processing and machine learning, I think it's obvious
- 40:06 that there is some potential for approximate computing.
- 40:09 What I've been trying, also with collaborators, is to find
- 40:14 areas where we can apply the idea of approximate computing
- 40:18 that are non-obvious, where you really care about precision
- 40:23 and you want to have controlled
- 40:25 results, right? And I mentioned that, for instance, one of
- 40:29 the areas is: whenever I have an iterative method that converges
- 40:32 to a result, the basic idea is, okay, let's do a coarse approximation
- 40:37 of the result and then do a couple of iterations at the very
- 40:40 end, do some iterative refinement, to converge to the perfect
- 40:43 result. I think this is
- 40:47 an area where there are some first works in scientific computing,
- 40:51 but it's not widely used. There are some first libraries coming up, and
- 40:54 people are really trying to exploit those capabilities that
- 40:57 you have in today's hardware. I think there are
- 41:01 probably many areas where you could apply this idea, but
- 41:05 it is not widely used. I think this will mostly be taken
- 41:09 up by libraries: people will develop some scientific
- 41:13 libraries, and then others just use it under the hood without
- 41:17 developing the methods themselves. So
- 41:20 I think there is still a lot of potential.
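The "coarse solve plus a few refinement steps" pattern is classical mixed-precision iterative refinement for linear systems. The sketch below is a generic Python illustration of that pattern (not a specific library): the system is solved cheaply in single precision, and a few residual-correction steps with residuals evaluated in double precision recover an accurate result. In a real implementation the low-precision factorization would be computed once and reused instead of calling the solver repeatedly.

```python
import numpy as np

def solve_with_refinement(a, b, refinements=3, low=np.float32):
    """Solve A x = b: cheap initial solve in low precision, then a few
    residual-correction steps with residuals computed in double precision."""
    a_low = a.astype(low)
    x = np.linalg.solve(a_low, b.astype(low)).astype(np.float64)        # coarse solution
    for _ in range(refinements):
        r = b - a @ x                                                    # residual, double precision
        x += np.linalg.solve(a_low, r.astype(low)).astype(np.float64)   # cheap correction
    return x

rng = np.random.default_rng(2)
a = rng.standard_normal((200, 200)) + 200.0 * np.eye(200)   # well-conditioned test system
b = rng.standard_normal(200)
x = solve_with_refinement(a, b)
print(np.linalg.norm(a @ x - b))    # residual close to double-precision level
```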
- 41:30 The dangers are, of course, that you
- 41:36 translate the technique to a field without knowing what the
- 41:41 preconditions were. Take for instance the submatrix method
- 41:44 I mentioned, and this is also going a little bit out of my comfort zone,
- 41:49 but the point is that you introduce a certain type of error in
- 41:56 the computation, and the question is what you do with the
- 41:59 matrix that you have at the end. Now, in quantum chemistry you
- 42:03 use this matrix for multiplications,
- 42:06 and what you are interested in is only the trace
- 42:09 of the matrix; the off-diagonal elements are not so super
- 42:12 relevant, so for this kind of operation it works very well. Now,
- 42:15 if you just take the method and say, well, Plessl et al. have this
- 42:18 nice submatrix method and we just apply it to some other arbitrary
- 42:22 sparse matrix problem, then you may end up with completely wrong results, because the
- 42:28 underlying assumption that we have, that we are mostly concerned about
- 42:33 the trace of the matrix,
- 42:36 is no longer true. And I think this
- 42:40 could be a problem, when people
- 42:45 do not understand the implicit assumptions that are in the work
- 42:48 and apply it to a field where these no longer hold; the result could be
- 42:52 algorithms that do not converge, or give random results, or just
- 42:56 garbage in, garbage out. So I think this is a concern.
- 43:01 I also wouldn't use it for super safety-critical operations,
- 43:06 but I think that's anyway clear.
- 43:11 That makes sense. Okay, then maybe let's end on a practical question.
- 43:17 At which scale do you think using
- 43:20 approximate computing makes sense on a practical level? You talked
- 43:24 a lot about supercomputers, where the scale is astronomical,
- 43:28 really hard to grasp. But is it also possible to use
- 43:33 it on your mobile phone, on your personal computer, and do you
- 43:37 think it would make sense in the future, or is it something that's
- 43:41 mostly geared towards researchers right now? No, no,
- 43:45 I think it's a necessity. I only talked about
- 43:49 approximate computing as one technique to have this
- 43:53 innovation at the top, but
- 43:56 we really need much more innovation at the top. There is a very
- 44:01 nice paper by Leiserson et al.,
- 44:05 I'd like to expand on this, called "There's plenty of room at the Top" or a similar title,
- 44:10 that says approximate computing is one technique, but there
- 44:14 are also many other things that we can do to still get more
- 44:17 benefit from current technology, right: reducing software bloat,
- 44:21 doing proper performance optimization, using better data structures,
- 44:25 and of course better algorithms; better algorithms always
- 44:27 beat a lot, right? So,
- 44:30 first, having good algorithms is the highest leverage
- 44:34 that you can get, but then there follow a lot of things across the computing layers
- 44:38 where you can do some improvements, and
- 44:41 approximate computing is one piece of the puzzle,
- 44:46 but you can reduce software bloat to improve performance,
- 44:51 or only compute what you really need:
- 44:55 if you want to have the weather forecast for
- 44:58 Potsdam, you maybe don't need to compute a whole model that
- 45:02 also computes the weather in Tokyo; you can narrow it down. And
- 45:06 I think there is a lot of potential there, without changing
- 45:10 anything in the semiconductor technology. This is all on
- 45:13 top, and maybe there are also exciting things coming
- 45:17 down the pipeline, but there is still a lot of potential today.
- 45:20 So this is why I think it's not at all only for supercomputing; I think
- 45:24 this needs to be applied to
- 45:27 the complete computing stack.
- 45:29 Of course, if you think about data centers and high-performance computing,
- 45:34 this is such a large scale, with such high operation costs,
- 45:38 that you are really driven to do this kind of optimization
- 45:42 there. But also for your cell phone, for instance: there it is maybe
- 45:46 not the operation cost but battery life, right? The
- 45:49 battery life is limited, it will not become significantly better in
- 45:52 the same space, so the only thing that you can
- 45:56 do is find new ways to conserve energy, and this is
- 46:01 one of the ways to go.
- 46:04 Thank you for that ending note,
- 46:08 a great message for everybody in the audience
- 46:11 to hear. And with that I want to thank you again, Professor Plessl,
- 46:16 for being with us today and
- 46:18 holding this last openXchange of the year.
- 46:22 Just one organizational note: there will be a new season of
- 46:28 openXchanges starting next year, and the first one will be
- 46:32 at the end of January, and we will share more information on the forum,
- 46:37 where we will provide further details.
- 46:43 Yeah, thanks everybody for participating today, and
- 46:47 have a nice Christmas, have a nice
- 46:50 start of the year, and yes,
- 46:54 thank you. Thanks for having me.
About this video
Read a blog-post version of this openXchange live talk here on Medium!
How can we sacrifice correctness, reliability, and precision of an algorithm to increase its efficiency, while still producing useful results? Approximate computing takes a different view of the role and effect of approximations in computations. Instead of considering approximations as a necessary evil of digital computers, one tries to allow application-specific admissible approximations in order to reap significant improvements in energy efficiency. A cross-layer view of applications, numerical methods and computer architecture opens up new perspectives and research questions.
Christian Plessl is professor (W3) for High-Performance Computing at the department of Computer Science at Paderborn University. He is also managing director of the Paderborn Center for Parallel Computing (PC²), which is a central scientific institute of Paderborn University and a National High-Performance Computing center in the NHR alliance. In the last years, he has been dealing with different aspects and application fields of Approximate Computing.