This video belongs to the openHPI course Künstliche Intelligenz und Maschinelles Lernen in der Praxis.
- 00:00In this unit, we want to see how programming libraries work,
- 00:04having just looked at how the fundamentals of Python are structured.
- 00:09Programming libraries are something very, very exciting.
- 00:13You can picture it like this: when we write Python code, for example,
- 00:18we can easily share it.
- 00:21And not only can we do that; completely different developers can too.
- 00:25And the great thing is that, through this sharing, we also get code from
- 00:30other programmers.
- 00:32For example, if someone has already faced the same challenges, solved them,
- 00:38and then made the code publicly available, we can
- 00:42look at it, study it, and then use it ourselves if necessary.
- 00:46Now, there are programming libraries, which are in the end a collection of code,
- 00:52perhaps not just from one person but from several developers.
- 00:56They are well known and widely used, and we would like to use them to solve our own
- 01:02challenges.
- 01:03So you can imagine that programming libraries
- 01:08are, after all, something like tools.
- 01:10That is, functions that we can simply reuse.
- 01:13And there are tools, for example,
- 01:17for doing data science.
- 01:19Let us look at an example of what such a library is,
- 01:24how it is built, and how to use it. Then, towards the end of the
- 01:30unit, we will take a very quick look at which libraries exist in the data
- 01:34science field. For this example, we will look at the pandas library,
- 01:41which is suitable, for example, for reading in and then processing data.
- 01:48And what we take for that is, for example, the following CSV file.
- 01:52CSV is just a data format that often looks like this: the
- 01:58first row holds the comma-separated column names, followed simply by the values,
- 02:03which are again comma-separated.
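To make the format concrete, here is a minimal sketch using Python's standard csv module. The column names and values are hypothetical sample data, not the contents of the file shown in the video.

```python
import csv
import io

# Hypothetical sample data: the first row holds the column names,
# every following row holds the comma-separated values.
csv_text = (
    "district,total_rooms,median_house_value\n"
    "A,880,452600\n"
    "B,7099,358500\n"
)

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # the first row: column names
rows = list(reader)    # the remaining rows: values (still plain strings)

print(header)
print(rows)
```

Note that the csv module returns every value as a string; converting them to numbers is left to the caller.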
- 02:04So it is a file that you can easily load into an Excel workbook.
- 02:09But now we don't want an Excel workbook; we want to write a Python program
- 02:13that reads this file and then perhaps processes it.
- 02:17And for exactly this purpose there is, for example, pandas.
- 02:22"pandas" is simply the name of this
- 02:26library. Just as you would call a tool a hammer,
- 02:30this library is called pandas.
- 02:32That is what the developers decided it should be called.
- 02:36Once pandas, this Python library, is installed on our system,
- 02:42we can import it in our Python code, that is, take this
- 02:48tool that we want to use, or, if you like, toolbox,
- 02:51and place it on our workbench so that we can work with it.
- 02:56In Python this is very easy via "import pandas as pd".
- 03:01The "pd" is just there because we say we don't always want to
- 03:04write out "pandas". That is too long; we prefer to write "pd".
- 03:08And when we execute this command, we have, so to speak, placed the tool
- 03:14on the table and can now use it.
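As a small illustration of the alias, the sketch below imports the library under both names. It assumes pandas is already installed on the system.

```python
import pandas        # the library under its full name
import pandas as pd  # the same library under the short alias "pd"

# Both names refer to the same module object; "pd" is only a shorter label.
print(pd is pandas)
print(pd.__name__)
```

The alias changes nothing about the library itself; it only saves typing in later calls such as pd.read_csv.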
- 03:16Now, for example, under this file path, /data/housing.csv, we have
- 03:23the CSV file that we want to import.
- 03:26And pandas offers functionality for that, namely read_csv.
- 03:31As in the previous video, it is simply a function that
- 03:37has been developed, not by us but by the pandas developers.
- 03:41What we can do is run this function now and
- 03:45store the result in the variable df, which is short for DataFrame, and then
- 03:51display it. As we can see, this CSV file has been read in successfully,
- 03:55and we can now work with it.
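The step just described might look roughly like the sketch below. So that it runs without the course file, it reads a small, hypothetical in-memory sample instead of /data/housing.csv; the column names are assumptions for illustration only.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of the CSV file.
csv_text = (
    "total_rooms,households,median_house_value\n"
    "880,126,452600\n"
    "7099,1138,358500\n"
    "1467,177,352100\n"
)

# read_csv accepts a file path or a file-like object.
# With the real file, the call would be pd.read_csv("/data/housing.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df)
```

Unlike the plain csv module, read_csv infers column types, so the numeric columns arrive as numbers rather than strings.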
- 04:01Now, I just said that we didn't write this code;
- 04:05the pandas developers did.
- 04:09We can simply jump into the source code behind it, because pandas
- 04:13is completely open.
- 04:15It is an open-source project. We quickly notice how much work is behind it.
- 04:19And what we are looking at here is a single file, and pandas consists of many
- 04:25files. Here alone we already see that it is 1497 lines of code, which is really quite
- 04:33a lot. Every single line of code holds some instruction, be it
- 04:38an import statement for another library or a function definition.
- 04:44And now we can simply use Ctrl+F to search for read_csv,
- 04:48find the function definition, skipping a few matches along the way,
- 04:56and then arrive at the function signature and see how the
- 05:00pandas developers wrote this function, with a whole lot of arguments.
- 05:05That is, things that we pass to the function, such as the file path.
- 05:10And if we keep scrolling down, we finally see the source code.
- 05:15That means we can now look behind the scenes at how this function is actually
- 05:21implemented. Now, it can be quite laborious to understand and apply
- 05:26such a library from the source code alone.
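Besides searching the source file by hand, the signature can also be inspected from Python itself. This is a sketch using the standard inspect module; the exact list of arguments depends on the installed pandas version.

```python
import inspect

import pandas as pd

# Ask Python for the signature of read_csv instead of scrolling through the file.
signature = inspect.signature(pd.read_csv)
param_names = list(signature.parameters)

print(len(param_names))  # a whole lot of arguments
print(param_names[:5])   # the first few, starting with the file path argument
```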
- 05:28And because pandas is established and used by many, many developers, there is
- 05:35very good documentation, that is, documentation written for humans.
- 05:40We can have a look at it too and see, for example, the documentation page for
- 05:47pandas.read_csv. Here we again see all the arguments,
- 05:50but on this documentation page we can now read very clearly,
- 05:55and in simple, understandable terms, how to use these arguments.
- 05:58Take, for example, filepath_or_buffer: this is a string, and we can now
- 06:05read here in the free-text description how
- 06:08best to use this argument.
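The same documentation text is also attached to the function as its docstring, so it can be read without leaving Python. A minimal sketch, assuming the installed pandas ships its docstrings (the wording varies between versions):

```python
import inspect

import pandas as pd

# Fetch the documentation text attached to read_csv.
doc = inspect.getdoc(pd.read_csv) or ""

# Print only the first few lines; the full text is long.
for line in doc.splitlines()[:5]:
    print(line)

# The argument discussed above is described in this text as well.
mentions_path_argument = "filepath_or_buffer" in doc
print(mentions_path_argument)
```

In an interactive session, help(pd.read_csv) shows the full text in one go.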
- 06:14Now, of course, you have to know that pandas offers even more functionality.
- 06:19There is not only read_csv but also, for example, options for filtering.
- 06:24For instance, I can tell my DataFrame: show me only those rows
- 06:29that have a value greater than 1000 for the total rooms attribute.
- 06:34When we do this, we see, of course, not all the rows
- 06:40but only the rows that meet this condition.
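Such a filter can be sketched with a boolean condition on a column. The data below is hypothetical, and the column name total_rooms is an assumption about how the attribute is spelled in the file.

```python
import pandas as pd

# Hypothetical sample data standing in for the housing DataFrame.
df = pd.DataFrame(
    {
        "total_rooms": [880, 7099, 1467, 520],
        "households": [126, 1138, 177, 98],
    }
)

# The comparison yields one True/False value per row;
# indexing with it keeps only the rows where the condition holds.
large = df[df["total_rooms"] > 1000]

print(large)
```

The original DataFrame is left unchanged; the filter returns a new DataFrame containing only the matching rows.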
- 06:45Now, you first have to know that pandas provides such functionality,
- 06:52that there is a read_csv, or that there is data filtering.
- 06:57And that is not always easy, because there are simply so many functions,
- 07:03and the question often arises: how do we know that we can do this with pandas
- 07:08at all? The simplest answer is actually that, if you don't know,
- 07:15you mostly build up the knowledge just by getting started.
- 07:18That is, you have an idea of something you want to program, for example
- 07:22a calculator or a machine learning model or whatever,
- 07:26and you get going until you realize that you have run into a problem
- 07:30that you cannot simply solve yourself.
- 07:33That could be, for example, reading in the file, where we do not want to
- 07:38write the function that reads this CSV file for us ourselves.
- 07:43That is one possible example.
- 07:45So what we could do is run a simple Google search, or use any other
- 07:50search engine, for the term "Python read csv file", and we
- 07:58would then very likely come across libraries or simple
- 08:03how-tos that address exactly this problem.
- 08:06And if we then, for example, find this solution, we can look it up,
- 08:10read the documentation, and then apply it.
- 08:14And typically, once you have done this, you can remember it for
- 08:20next time. Or you run into it over and over and have to search for it every time.
- 08:24But at least you know that this library exists.
- 08:28And so it happens that over time you build up knowledge about such tools, and,
- 08:33for example, when I want to read CSV files, I know: oh, there was pandas,
- 08:38that was this library. I can do another search now
- 08:41and then perhaps find the solution I need.
- 08:44Exactly, and with that, libraries ultimately give you a
- 08:49powerful tool to use.
- 08:51And that does not apply only to pandas. Pandas is not the only
- 08:57programming library; in the data
- 09:01science field there are also, for example, libraries for visualization techniques.
- 09:04For example, there is matplotlib for that. Or, to perform more complex mathematics,
- 09:10e.g. more complex formulas or numerical methods,
- 09:14there is NumPy. And what we will also be looking at in particular,
- 09:18and what I am personally very much looking forward to,
- 09:21is scikit-learn, the standard machine learning library for
- 09:28Python. And we will really be coming across these kinds of programming libraries in many places in the current course
- 09:34and returning to them again and again.
About this video
- Erratum, from approx. 0:12 min: the quote dates from 1675, not 1965.
- On GitHub, we have compiled and prepared all materials for the practical units for you.