- 1. Getting geeky with Git #1. Remotes and upstream branches
- 2. Getting geeky with Git #2. Building blocks of a commit
- 3. Getting geeky with Git #3. The branch is a reference
- 4. Getting geeky with Git #4. Fast-forward merge and merge strategies
- 5. Getting geeky with Git #5. Improving merge workflow with rebase
- 6. Getting geeky with Git #6. Interactive Rebase
- 7. Getting geeky with Git #7. Cherry Pick with Reflog
- 8. Getting geeky with Git #8. Improving our debugging flow with Bisect and Worktree
- 9. Getting geeky with Git #9. Understanding the revert feature
- 10. Getting geeky with Git #10. The overview of Git hooks with Husky
- 11. Getting geeky with Git #11. Keeping our Git history clean with fixup commits
Branches are the bread and butter of a software developer using a Version Control System (VCS) of any kind.
Today we explore how they work in Git.
In the previous part of this series, we learned that a commit is a full snapshot of the project state. In Git, a branch is a pointer to a particular snapshot. We can think of it as a label attached to the tip of a line of commits.
The above is in contrast to some other Version Control Systems, where creating a branch means making a copy of the source code. Because a Git branch is only a pointer, branches are lightweight and effortless to create.
The branch is a pointer
One of the best ways to learn is to do it with examples. Below, we inspect the express-typescript repository that is a part of the TypeScript Express series.
First, let’s look into the latest commit in the repository using git log:
commit 5b5bb249e4990e672a96bbe4800a6e36d9a60962 (HEAD -> master, origin/master, origin/HEAD)
Author: Marcin Wanago <wanago.marcin@gmail.com>
Date: Sun Apr 26 19:11:32 2020 +0200

    chore(): update @types/mongoose
The long string identifying the above commit is a hash. In the previous part of this series, we learned that it acts as an identifier generated from the contents of the commit.
Using the --points-at argument, we can get a list of branches that point to a specific commit.
git branch --points-at 5b5bb249e4990e672a96bbe4800a6e36d9a60962
* master
It is hard to find a Git repository without a master branch. Having one is not mandatory, but git init has traditionally created it and it is considered a standard. Newer versions of Git let us change the default name through the init.defaultBranch setting.
Every time we make a commit, the branch pointer moves forward. To see which commit a branch points to, we can use git show-branch:
git show-branch --sha1-name postgres
[4fcd357] test(Authentication): test if there is a token in the registration response
The above output contains a shortened version of the hash. This is common in Git, and other commands can produce abbreviated hashes too.
A good example is git rev-parse with the --short argument. By default, it produces a hash that is at least seven characters long. If that prefix is not unique within the repository, it returns more characters.
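The abbreviation behavior can be tried out in isolation. Below is a minimal sketch that assumes a throwaway repository created with mktemp (not the express-typescript repository); the commit message and the user identity are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical setup, removed when the script exits).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # use the branch name from the article
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "chore(): initial commit"

full="$(git rev-parse HEAD)"                # full hash
short="$(git rev-parse --short HEAD)"       # abbreviated, still unambiguous
short10="$(git rev-parse --short=10 HEAD)"  # ask for at least 10 characters

echo "full:    $full"
echo "short:   $short"
echo "short10: $short10"

# The abbreviation resolves back to the very same commit.
resolved="$(git rev-parse "$short")"
echo "resolved: $resolved"

# Abbreviated hashes are accepted wherever a commit is expected.
branches="$(git branch --points-at "$short" --format='%(refname:short)')"
echo "branches pointing at $short: $branches"
```

The short form is always a prefix of the full hash, so both identify the same commit as long as the prefix stays unique.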
The HEAD
In the output of the git log command above, we can see a list of references in parentheses:
commit 5b5bb249e4990e672a96bbe4800a6e36d9a60962 (HEAD -> master, origin/master, origin/HEAD)
Simplifying, HEAD is a pointer to the commit that our repository is currently checked out on. In most cases, this means that HEAD points to the branch we currently use, and through it to the latest commit of that branch.
If we make a commit, the current branch moves to it, and HEAD follows.
A detached head
Although HEAD usually points to a current branch, it is not always the case.
When we use the git checkout command, we specify which revision of our repository we want to work with. A typical way to use it is with a branch name:
git checkout postgres
The above causes the HEAD to point to the last commit in the postgres branch. However, when using git checkout, we can also provide a hash of a specific commit:
git checkout 3238d85
HEAD is now at 3238d85 feat(Posts): create a relation between the Post and the User
The above puts us in a detached HEAD state. It happens when we check out a specific commit instead of a branch. If we make some changes now and commit them, they don’t belong to any branch!
When we make changes while having a detached HEAD, we can still create a new branch containing the new code. To do so, we can use the git checkout -b new-branch command.
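The whole round trip can be sketched in a scratch repository. This is a minimal example under stated assumptions: the repository is a temporary one created with mktemp, the commits are empty placeholders, and new-branch is just an illustrative name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical setup, removed when the script exits).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

# Check out a commit by hash instead of a branch name.
git checkout -q "$(git rev-parse HEAD~1)"

# git symbolic-ref fails when HEAD holds a raw hash instead of a branch.
if git symbolic-ref -q HEAD >/dev/null; then
  state="attached"
else
  state="detached"
fi
echo "HEAD is $state"

git commit -q --allow-empty -m "experiment"
experiment="$(git rev-parse HEAD)"

# No local branch contains the new commit yet.
before="$(git for-each-ref --contains "$experiment" --format='%(refname:short)' refs/heads)"
echo "branches containing it before: '${before}'"

# Rescue the work by creating a branch at the current commit.
git checkout -q -b new-branch
after="$(git for-each-ref --contains "$experiment" --format='%(refname:short)' refs/heads)"
current="$(git symbolic-ref --short HEAD)"
echo "branches containing it after: '${after}', current branch: ${current}"
```

Until the branch is created, the experimental commit is reachable only through HEAD, which is why switching away from it without creating a branch makes it easy to lose.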
Git reset
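Understanding HEAD can come in handy. An example is deleting an unpushed commit. Let’s take a closer look at the commands provided in this StackOverflow answer:
git reset --soft HEAD~1
git reset --hard HEAD~1
The job of git reset is to reset HEAD to a specified state. When HEAD points to a branch, that branch moves along with it.
By providing HEAD~1, we point to the parent of the last commit, so either command removes the last commit from the branch. The two are alternatives: --soft keeps the changes from the removed commit staged, while --hard also discards them from the index and the working directory.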
We could also delete more than just one commit, for example, by typing HEAD~4 and removing four commits.
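The difference between the two variants is easiest to see side by side. The following is a minimal sketch, assuming a temporary repository and a made-up file named notes.txt; it removes the same commit twice, once with each flag.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical setup, removed when the script exits).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "one" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "two" >> notes.txt
git commit -q -am "second"

# --soft: the branch moves back one commit, the change stays staged.
git reset -q --soft HEAD~1
soft_tip="$(git log -1 --format=%s)"
soft_staged="$(git diff --cached --name-only)"
soft_lines="$(grep -c '' notes.txt)"
echo "after --soft: tip='$soft_tip', staged='$soft_staged', lines=$soft_lines"

# Recreate the commit from the staged change, then remove it with --hard.
git commit -q -m "second"
git reset -q --hard HEAD~1
hard_tip="$(git log -1 --format=%s)"
hard_status="$(git status --porcelain)"
hard_content="$(cat notes.txt)"
echo "after --hard: tip='$hard_tip', status='$hard_status', content='$hard_content'"
```

After the hard reset the second line is gone from the working directory, so --hard is best reserved for changes we are sure we no longer need.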
We can check what the HEAD points to by looking up the HEAD file in the .git directory. If we are using a Unix-like system, we can do this with the cat command:
cat ./.git/HEAD
ref: refs/heads/master
The branch is a type of a reference
In the HEAD file above, we can see a path: refs/heads/master. It leads to a file in the .git/refs/heads directory that contains the hash of the commit that the master branch points to.
cat ./.git/refs/heads/master
5b5bb249e4990e672a96bbe4800a6e36d9a60962
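The two lookups can be chained by hand and compared with what Git itself reports. This is a minimal sketch in a temporary repository; it assumes the ref is still stored as a loose file, which holds for a freshly created repository with the default file-based ref storage.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical setup, removed when the script exits).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "first"

# Step 1: HEAD stores the name of a reference, not a hash.
head_content="$(cat .git/HEAD)"
echo "HEAD file:   $head_content"

# Step 2: strip the "ref: " prefix and read the file it names.
ref_path="${head_content#ref: }"
manual="$(cat ".git/${ref_path}")"
echo "manual hash: $manual"

# Git's own commands perform the same two steps for us.
symbolic="$(git symbolic-ref HEAD)"
resolved="$(git rev-parse HEAD)"
echo "symbolic:    $symbolic"
echo "resolved:    $resolved"
```

In everyday use, git symbolic-ref and git rev-parse are the safer choice, because they keep working when a ref is not stored as a loose file.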
In Git, references point to a specific commit.
The .git/refs/heads directory contains all of our local branches.
If you want to know what a local branch is, check out the first part of this series.
ls ./.git/refs/heads/
master postgres
Based on the above, we can determine that a branch is a type of Git reference. Other types of references are tags and remotes.
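The different reference types can be listed together with git for-each-ref. The sketch below assumes a temporary repository with one extra branch and one lightweight tag; the tag name v1.0.0 is made up for illustration, and no remote is configured, so no refs/remotes entries appear.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical setup, removed when the script exits).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "first"

git branch postgres   # a second branch, stored under refs/heads
git tag v1.0.0        # a lightweight tag, stored under refs/tags

# List every reference by its full name.
refs="$(git for-each-ref --format='%(refname)')"
echo "$refs"

# Restrict the listing to a single type by passing a prefix.
heads="$(git for-each-ref --format='%(refname:short)' refs/heads)"
tags="$(git for-each-ref --format='%(refname:short)' refs/tags)"
echo "branches: $(echo "$heads" | tr '\n' ' ')"
echo "tags:     $tags"
```

The prefix of the full name (refs/heads, refs/tags, refs/remotes) is what tells the reference types apart.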
When we make a commit to the master branch, Git moves the master pointer so that it refers to the new commit. To do so, it has to update the .git/refs/heads/master file.
We can also update refs ourselves using the git update-ref command.
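Both ways of moving a ref can be observed in a scratch repository. This is a minimal sketch, assuming a temporary repository with empty placeholder commits and loose ref files; the postgres branch here is created only for the demonstration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical setup, removed when the script exits).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

# Committing rewrites the file that backs the current branch.
git commit -q --allow-empty -m "first"
before="$(cat .git/refs/heads/master)"
git commit -q --allow-empty -m "second"
after="$(cat .git/refs/heads/master)"
echo "master before: $before"
echo "master after:  $after"

# git update-ref moves a ref explicitly, without touching HEAD or files.
git branch postgres                           # starts at "second"
git update-ref refs/heads/postgres "$before"  # now points at "first"
moved="$(git rev-parse postgres)"
echo "postgres now:  $moved"
```

Using git update-ref is preferable to editing the files directly, since it validates the new value and works regardless of how the ref is stored.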
Packed-refs
As our repository grows significantly, keeping every ref in a separate file becomes inefficient. Because of that, Git periodically compresses refs into a single file.
By doing so, Git moves all branches and tags into a single packed-refs file. If you ever wonder why your .git/refs directory looks empty, this might have taken place.
We can trigger the above behavior manually with the git gc command. Let’s do so on the express-typescript repository.
git gc
cat ./.git/packed-refs
# pack-refs with: peeled fully-peeled sorted
5b5bb249e4990e672a96bbe4800a6e36d9a60962 refs/heads/master
4fcd357f55f9eee74c492ce687475679c0890e25 refs/heads/postgres
5b5bb249e4990e672a96bbe4800a6e36d9a60962 refs/remotes/origin/master
4fcd357f55f9eee74c492ce687475679c0890e25 refs/remotes/origin/postgres
The more branches we have locally, the more of them end up in the packed-refs file.
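The packing step can also be run on its own with git pack-refs, which git gc invokes internally. The sketch below assumes a temporary repository with two branches and the default file-based ref storage; it counts the loose ref files before and after packing.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical setup, removed when the script exits).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "first"
git branch postgres

# Each branch starts out as its own file under .git/refs.
loose_before="$(find .git/refs -type f | wc -l | tr -d ' ')"
echo "loose ref files before: $loose_before"

# Pack all refs into .git/packed-refs; the loose files are pruned.
git pack-refs --all
loose_after="$(find .git/refs -type f | wc -l | tr -d ' ')"
packed_heads="$(grep -c ' refs/heads/' .git/packed-refs)"
echo "loose ref files after:  $loose_after"
echo "branches in packed-refs: $packed_heads"

# Branches still resolve exactly as before.
tip="$(git rev-parse postgres)"
head="$(git rev-parse HEAD)"
echo "postgres -> $tip"
```

A later commit on a packed branch recreates its loose file, and Git prefers the loose file over the packed entry when both exist.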
Summary
It turns out that branches are references stored in the .git/refs/heads directory or, once packed, in the packed-refs file. Another important piece is the HEAD file, which tells us what is currently checked out. Usually, HEAD refers to the current branch. We call it a detached HEAD if it points directly to a specific commit instead of a branch. Knowing all of the above might help us avoid issues and resolve them if we bump into any.