This video belongs to the openHPI course Let’s Git - Versionsverwaltung und OpenSource.
- 00:00Hello and welcome.
- 00:01In this video we will look at the Git data model and how it works, how Git internally represents data.
- 00:08First we start with a small recap of what you've already learned in the first week.
- 00:27You already know that Git has three important areas: the working directory, where we can edit data; the staging area, as an intermediate step; and the repository.
- 00:29And you have also learned, if you want to add versions of files to the repository, then you must first run git add to add files to the staging area.
- 00:40And then git commit, to store them in the repository.
- 00:46Right. Now, if we look at the Git data model, several things happen to the files during the git add and git commit steps.
- 00:54First, Git stores the files we want to add with git add as so-called blobs, which is what Git internally calls its representation of the data.
- 01:05Git then calculates a checksum for each file. You already know checksums: a checksum is a value that Git can use to uniquely identify each file.
- 01:15Afterwards the checksums are added to the staging area.
- 01:19That is, if we were now to execute the command git add liste1.txt liste2.txt, Git would create a blob for each of these files,
- 01:29create a checksum for each blob and then add them to the staging area.
- 01:34So when we look at our overview again, in the first step, from working directory to staging area,
- 01:39Git creates the blobs, calculates their checksums, and adds both to the staging area.
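The checksum step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming Git's default SHA-1 object format (newer repositories can be configured to use SHA-256 instead); the function name `blob_checksum` and the sample content are hypothetical and not taken from the video.

```python
import hashlib

def blob_checksum(content: bytes) -> str:
    """Return the checksum Git's default SHA-1 mode assigns to a blob."""
    # Git does not hash the raw file alone: it prepends a short header,
    # "blob <size in bytes>" followed by a NUL byte.
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

# Hypothetical file content; the video does not show what the lists contain.
print(blob_checksum(b"hello world\n"))
```

Because the checksum depends only on the content, two files with identical content end up as the same blob. The value printed here should match what `git hash-object` reports for a file with the same bytes.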
- 01:45Right. What happens at git commit? A little more:
- 01:50first, Git creates checksums for all subdirectories and stores each of them as a so-called tree object with pointers to the stored file versions, i.e. to the blobs.
- 02:00Git also creates a commit object with metadata and a pointer to the tree object.
- 02:06The metadata in this case is the name and e-mail address of the author of the commit, the commit message, and pointers to the commits that came directly before it, the so-called predecessors or parents.
- 02:18Now, for example, let's create a commit with the commit message "Added the first two lists". You already know this command from week 1: git commit -m followed by the commit message, "Added the first two lists".
- 02:32We have now inserted this again here in the graphic, so with git commit, checksums are created for the subdirectories, the tree object is created, and the commit object is also created.
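How a tree object points to its blobs can be sketched as follows. This is an illustrative reconstruction of Git's default SHA-1 object layout, not code shown in the video; the helper names `object_checksum` and `tree_body` and the file contents are assumptions made for the example.

```python
import hashlib

def object_checksum(kind: str, body: bytes) -> str:
    # Every Git object is hashed as "<kind> <size>", a NUL byte, then the body.
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """Serialize (mode, file name, blob checksum) entries of one directory."""
    body = b""
    # Git keeps entries sorted by name; plain byte order is enough for
    # regular files (subdirectories follow a slightly different rule).
    for mode, name, checksum in sorted(entries, key=lambda e: e[1].encode()):
        # Each entry: "<mode> <name>", a NUL byte, then the 20 raw checksum bytes.
        body += f"{mode} {name}\0".encode() + bytes.fromhex(checksum)
    return body

# Hypothetical contents for the two lists from the video.
blob1 = object_checksum("blob", b"gift ideas\n")
blob2 = object_checksum("blob", b"cake recipes\n")

# Mode 100644 marks a regular, non-executable file.
tree = tree_body([("100644", "liste1.txt", blob1),
                  ("100644", "liste2.txt", blob2)])
print(object_checksum("tree", tree))
```

Since the tree's checksum is computed over the file names and blob checksums it contains, changing any file changes the tree's checksum as well.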
- 02:44Now let's look at our example again, step by step.
- 02:49We have our two lists: the first list, where we collect gift ideas, for example, and the second list, where we collect recipes, e.g. for a birthday cake.
- 02:58When we run git add liste1 liste2, Git creates a blob for each one.
- 03:04These blobs have a certain size, which is not further relevant here.
- 03:07Then, as we recall, the checksums are created; they are written in bold here.
- 03:14These checksums are then also added to the staging area.
- 03:18Right. And then, remember, we do git commit, and the first thing that is created is a tree object.
- 03:27This tree object also has a checksum, but what is particularly interesting is what's inside it: the pointers to the two blobs, each stored with its file name.
- 03:38The commit object is also created. It references the tree object by its checksum, and it also stores the metadata already mentioned,
- 03:53for example the name of the commit's author, which we have shown here, and the commit message.
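The content of a commit object can be sketched like this. It is a hedged illustration of Git's default SHA-1 commit layout; the helper names, the author "Alice Example", the e-mail address, the timestamps, and the second commit message are all hypothetical. Note that Git records a committer line next to the author line, which the video does not mention. The empty tree is used only as a stand-in for the real tree that would list liste1.txt and liste2.txt.

```python
import hashlib

def object_checksum(kind: str, body: bytes) -> str:
    # Every Git object is hashed as "<kind> <size>", a NUL byte, then the body.
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree, parents, name, email, timestamp, timezone, message):
    """Build the text that Git stores inside a commit object."""
    lines = [f"tree {tree}"]                       # pointer to the snapshot
    lines += [f"parent {p}" for p in parents]      # none for the first commit
    identity = f"{name} <{email}> {timestamp} {timezone}"
    lines += [f"author {identity}", f"committer {identity}", "", message]
    return ("\n".join(lines) + "\n").encode()

# Stand-in tree checksum (the empty tree); a real commit would reference
# the tree that lists liste1.txt and liste2.txt.
tree_id = object_checksum("tree", b"")

first = commit_body(tree_id, [], "Alice Example", "alice@example.com",
                    1700000000, "+0100", "Added the first two lists")
first_id = object_checksum("commit", first)

# A later commit names the first one as its parent (hypothetical message).
second = commit_body(tree_id, [first_id], "Alice Example", "alice@example.com",
                     1700000600, "+0100", "Extended the lists")
print(second.decode())
```

The first commit has no parent line at all, which corresponds to the empty Parent entry discussed next in the video.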
- 03:58Right. So now we've looked at an example with only one commit, but what does it look like when we have multiple commits in a row?
- 04:08Here you can see the example we just had, in a slightly smaller version.
- 04:13We still see all the metadata, and you can see that we now have an entry called Parent.
- 04:17However, this entry is still empty, because this is the first commit in the repository, so no commit has come before it.
- 04:26We'll see in a moment that Parent holds the checksums of the commits that came directly before.
- 04:33We also see a reference down to snapshot A.
- 04:37This snapshot A now represents our tree object and the blobs underneath, we won't go into any more detail here.
- 04:46Now, if we make another commit, we see that a new commit has been added on the right.
- 04:51This commit also has all the familiar metadata as well as a checksum, and we now see that Parent contains the checksum of the previous commit.
- 05:00So that way Git knows exactly which commit came after which commit, and so we're sort of building a tree.
- 05:07Next, we'll take another look at what happens when another commit is added.
- 05:13Here we have colour-coded the different pointers that are stored in the commit object.
- 05:18In blue we always see the pointer to the tree object, and thus, for us, to the snapshot of the files.
- 05:28Well, that's always the blue down arrow from a commit object.
- 05:32And we also drew in red the pointer to the predecessor of the respective commit object, that's the arrow that goes to the left.
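The two kinds of pointers can be modelled with a small in-memory sketch. This is a simplified illustration, not how Git stores commits on disk: the `Commit` class, the `history` function, and the second and third commit messages are hypothetical, and the snapshot labels A, B and C stand in for the tree checksums.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Commit:
    message: str
    snapshot: str                       # blue arrow: pointer to the tree object
    parent: Optional["Commit"] = None   # red arrow: pointer to the predecessor

def history(commit: Optional[Commit]) -> list:
    """Follow the parent pointers from a commit back to the first commit."""
    result = []
    while commit is not None:
        result.append(commit)
        commit = commit.parent
    return result

first = Commit("Added the first two lists", snapshot="A")
second = Commit("Second commit (hypothetical)", snapshot="B", parent=first)
third = Commit("Third commit (hypothetical)", snapshot="C", parent=second)

for c in history(third):
    print(c.snapshot, c.message)
```

Walking the parent pointers from the newest commit yields the snapshots in reverse order (C, B, A), which is essentially how a history listing is assembled.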
- 05:42Exactly, so that's about it for the git objects.
- 05:49So now you know that there are blobs, trees, and commit objects in Git.
- 05:53We will be talking about commits in the next few videos as well, and so that we don't have to write down this long list of metadata every time, we will use one of the most important principles of computer science: abstraction.
- 06:07That is, up here we see our familiar example with 3 commits, and below it the same example with C1, C2 and C3, which is how we'll draw it in future videos.
- 06:16But remember, every commit also comes with a snapshot: a tree object, with the blobs holding the versions of the files attached below it.
- 06:25That's it for this video. Thank you so much for joining. See you next time.
- 06:31With that in mind: Let's Git!
About this video
Slide 9, 1:30: git add of course only adds changes to the staging area, not to the repository.
Slide 12, visible from 2:34: "Unterzeichnis" should read "Unterverzeichnis" (subdirectory).
Slide 17, visible from 3:40: the commit message should read "Die ersten zwei Listen hinzugefügt" ("Added the first two lists").
Slides 20/21, visible from 5:11: the last commit on the far right should point to snapshot C, not snapshot B.