Git explained: Rewriting history
One of Git's core features is "rewriting history", i.e., "altering" existing commits. I'm using quotation marks because, despite appearances, the Git history is immutable. By design, regular Git commands cannot modify or delete an existing commit.
And yet, among many Git users, there exists this fear of losing some changes or creating a mess. In reality, any committed change can be considered safe, i.e., it can be retrieved even after "altering" the history.
- Part 1: Rewriting history
- Part 2: Commit ranges
What does rewriting mean?
Why would you ever want to alter Git history anyway? Let's imagine you've made a typo in the message of the most recent commit:
* 5c7a782 - (HEAD -> main) Fix deefct 47
* fb2546f - Add checkout page
Usually, you can use commands like reset or rebase (-i) to "rewrite" the Git history. However, correcting the last commit is fairly common, so there is an easier alternative:
git commit --amend -m "Fix defect 47"
Another look into the Git history shows the correct message:
* 2b049ea - (HEAD -> main) Fix defect 47
* fb2546f - Add checkout page
It appears as if we have just altered the last commit. However, the hash has changed from 5c7a782 to 2b049ea, which means that we have created a new commit. This is the reason: a commit's hash is calculated from its entire content, including the file snapshot, the parent commit, the author and committer details, and the commit message.
Since we've just changed the commit message, Git created a new commit with a different hash. The same thing happens when "moving" (e.g. rebasing) commits, because the parent commit changes.
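To see this for yourself, here is a minimal sketch in a throwaway repository. The temporary directory, the demo identity, and the file names are assumptions made for this example and are not part of the scenario above.

```shell
set -eu
# Throwaway repository (path, identity, and file names are made up for this sketch).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the initial branch "main" on any Git version
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "checkout" > page.txt
git add page.txt
git commit -q -m "Add checkout page"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix deefct 47"
old=$(git rev-parse HEAD)

# Only the message changes, yet Git writes a brand-new commit object.
git commit -q --amend -m "Fix defect 47"
new=$(git rev-parse HEAD)

echo "before amend: $old"
echo "after amend:  $new"

# The raw commit object: tree, parent, author, committer, and message.
# All of these feed into the hash.
git cat-file -p "$new"

# The original commit still exists, untouched.
git cat-file -p "$old" | tail -n 1
```

Both commits share the same parent and the same tree. They differ in the message (and possibly the committer timestamp), which is enough to produce a different hash.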
Unreachable commits
But where did the other commit go? By default, git log hides every commit that is not reachable from any pointer, like HEAD or a branch. Those unreachable (or dangling) commits can be displayed with the --reflog flag:
git log --graph --oneline --reflog
* 2b049ea - (HEAD -> main) Fix defect 47
| * 5c7a782 - Fix deefct 47
|/
* fb2546f - Add checkout page
Alternatively, you can use git reflog to find unreachable commits:
git reflog
2b049ea (HEAD -> main) HEAD@{0}: commit (amend): Fix defect 47
5c7a782 HEAD@{1}: commit: Fix deefct 47
fb2546f HEAD@{2}: commit: Add checkout page
As we can see, "altering" history is nothing more than creating new commits and moving the HEAD and main pointers. Hence, the term "alternative history" is more suitable. It also means that we can undo our "destructive" commit --amend command if required:
git reset --hard 5c7a782
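If you don't want to copy the hash by hand, the reflog syntax HEAD@{1} ("where HEAD pointed one step ago") does the same job. The following is a sketch in a throwaway repository; the directory, identity, and file names are assumptions for illustration.

```shell
set -eu
# Throwaway repository (path, identity, and file names are made up for this sketch).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "checkout" > page.txt
git add page.txt
git commit -q -m "Add checkout page"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix deefct 47"
git commit -q --amend -m "Fix defect 47"

# HEAD@{1} is the commit HEAD pointed to before the amend.
typo=$(git rev-parse 'HEAD@{1}')
git reset -q --hard "$typo"
git log --format='%h %s'

# The reset added another reflog entry, so the amended commit
# is now the one that sits one step back.
amended=$(git rev-parse 'HEAD@{1}')
git reflog
```

Note that the undo itself is just another pointer move: the amended commit is not deleted either and can be restored the same way.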
Author date vs. commit date
Git keeps two timestamps for each commit:
- Author date: When the change was originally committed.
- Commit date: When the commit was (last) "altered", e.g., by amending or rebasing.
When creating a new commit, both timestamps will be equal. However, they will differ after "altering" an existing commit. When using commands like show or log, Git displays the author date by default. To view both timestamps, use the fuller format:
git show --format=fuller 2b049ea
commit 2b049eadac74e183e48b918e377e41765fca2a99
Author: Darek Kay
AuthorDate: Thu Mar 31 19:18:02 2022
Commit: Darek Kay
CommitDate: Fri May 6 18:26:49 2022
If you want to sync the commit date with the original (author) date when "altering" Git history, use the --committer-date-is-author-date flag:
git rebase -i --committer-date-is-author-date
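Here is a small sketch of how the two timestamps drift apart and how the flag brings them back together. The dates are pinned through the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables so that the output is reproducible; the repository, identity, and file names are assumptions for this example. I'm using a non-interactive rebase of a single commit here to keep the script unattended.

```shell
set -eu
# Throwaway repository (path, identity, and file names are made up for this sketch).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "checkout" > page.txt
git add page.txt
git commit -q -m "Add checkout page"

# Original commit: author date and commit date are identical.
echo "fix" > fix.txt
git add fix.txt
GIT_AUTHOR_DATE="2022-03-31 19:18:02 +0000" \
GIT_COMMITTER_DATE="2022-03-31 19:18:02 +0000" \
  git commit -q -m "Fix deefct 47"

# Amending keeps the author date but records a new commit date.
GIT_COMMITTER_DATE="2022-05-06 18:26:49 +0000" \
  git commit -q --amend -m "Fix defect 47"
git log -1 --date=iso --format='author date: %ad%ncommit date: %cd'

# Rewrite the last commit once more, reusing the author date as commit date.
git rebase -q --committer-date-is-author-date HEAD~1
git log -1 --date=iso --format='author date: %ad%ncommit date: %cd'
```

The first log call prints two different dates, the second one prints the same point in time twice.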
Garbage collection
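Earlier, I claimed that all commits can be considered safe. However, there is a limitation: commits that are no longer reachable from any branch, tag, or reflog entry are eventually deleted by Git's garbage collector. By default, reflog entries for unreachable commits expire after 30 days.
Especially in a rebase flow, you will create and copy a lot of commits. The garbage collector does some housekeeping and removes all abandoned commits after a certain time. In my daily work, I rarely want to keep them. If you do, assign a branch to a dangling commit: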
git branch my-branch 5c7a782
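The following sketch simulates the passing of time by expiring the reflog and running the garbage collector immediately. It is meant for a throwaway repository only, because it really does delete unreachable commits. The directory, identity, file names, and the branch name keep-first-attempt are assumptions for illustration.

```shell
set -eu
# Throwaway repository (path, identity, and names are made up for this sketch).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "checkout" > page.txt
git add page.txt
git commit -q -m "Add checkout page"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix deefct 47"
first=$(git rev-parse HEAD)

git commit -q --amend -m "Fix defetc 47"
second=$(git rev-parse HEAD)

git commit -q --amend -m "Fix defect 47"

# Protect only the first attempt with a branch.
git branch keep-first-attempt "$first"

# Simulate "a certain time" having passed: drop all reflog entries,
# then collect everything that is no longer reachable.
git reflog expire --expire=now --all
git gc -q --prune=now

git cat-file -e "$first" && echo "first attempt: still there (a branch points at it)"
git cat-file -e "$second" 2>/dev/null || echo "second attempt: garbage-collected"
```

In a real repository you would not run the last two commands by hand. Git triggers the housekeeping on its own, and the reflog keeps abandoned commits around until its entries expire.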
Altering public history
As long as we are "altering" commits that are not public (i.e., they were never pushed to a remote repository), we're free to do whatever we want. Things get tricky when we want to move a public branch around.
A common Git best practice states:
Do not alter public history.
I think this is great advice for Git beginners, but it can be limiting if you and your colleagues know what you're doing.
Let's first see the consequences of rewriting public Git history. Let's assume the flawed commit from the last section had already been pushed to the origin remote. After running git commit --amend, the log would look like this:
* 2b049ea - (HEAD -> main) Fix defect 47
| * 5c7a782 - (origin/main) Fix deefct 47
|/
* fb2546f - Add checkout page
If we try to push our corrected commit to origin, we'll get an error:
git push
To ../origin/
! [rejected] main -> main (non-fast-forward)
error: failed to push some refs to '../origin/'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
This is the expected behavior, because by default a git push is only allowed to add new commits onto the latest "tip" of the remote branch (here, main on origin). Git suggests a simple solution: "integrating the remote changes". In our case, git pull would fetch the remote branch and then merge origin/main, which is not what we want. Instead, we want to replace the remote commit. This can be achieved with a force push:
git push -f
git push --force-with-lease # safer alternative: refuses to overwrite if the remote branch moved since your last fetch
Now everything looks fine:
* 2b049ea - (HEAD -> main, origin/main) Fix defect 47
* fb2546f - Add checkout page
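The rejected push and the force push can be reproduced without a server by using a local bare repository as the remote. This is a sketch under assumptions: the directory layout (origin.git, local), the identity, and the file names are made up for this example.

```shell
set -eu
# A local bare repository stands in for the remote (paths are made up for this sketch).
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git -C "$work/origin.git" symbolic-ref HEAD refs/heads/main

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"

echo "checkout" > page.txt
git add page.txt
git commit -q -m "Add checkout page"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix deefct 47"
git push -q -u origin main

# Rewrite the commit that has already been pushed.
git commit -q --amend -m "Fix defect 47"

# A regular push is rejected, because it would not be a fast-forward.
if git push -q origin main 2>/dev/null; then
  echo "unexpected: regular push succeeded"
else
  echo "regular push rejected (non-fast-forward)"
fi

# --force-with-lease replaces the remote commit, but only if origin/main
# is still where our local repository last saw it.
git push -q --force-with-lease origin main
git log --oneline --decorate
```

The difference to a plain -f is the safety check: if somebody else had pushed to main in the meantime, --force-with-lease would refuse instead of silently discarding their commits.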
But now Janet and Steve and all your other colleagues who work on the main branch will run into the same issue you've just had. That's why projects often disallow force pushes to shared branches (e.g. main, develop).
How do we deal with this situation?
When in doubt, follow the best practice "Do not alter public history". A typo looks bad, but do you want to go through all the hassle to fix it?
If you do have to force-push a shared public branch, the first and most important step is communication. All your colleagues working on the same branch should know to expect an issue when interacting with the shared repository. They should also know how to fix this situation. For amended changes, here's a solution that covers most use cases:
git pull --rebase
This command will fetch the remote branch and rebase all local commits on top of it. Amended commits will be resolved automatically (so the typo commit will be skipped).
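Here is a sketch of the whole round trip from a colleague's point of view, again with a local bare repository as the remote. The directory names (origin.git, author, colleague), the identities, and the "Add cart page" commit are assumptions made for this example.

```shell
set -eu
# Local bare repository as the shared remote (all paths are made up for this sketch).
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git -C "$work/origin.git" symbolic-ref HEAD refs/heads/main

# The author pushes the commit with the typo.
git init -q "$work/author"
cd "$work/author"
git symbolic-ref HEAD refs/heads/main
git config user.name "Author"
git config user.email "author@example.com"
git remote add origin "$work/origin.git"
echo "checkout" > page.txt
git add page.txt
git commit -q -m "Add checkout page"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix deefct 47"
git push -q -u origin main

# A colleague clones the repository and commits on top of the typo commit.
git clone -q "$work/origin.git" "$work/colleague"
cd "$work/colleague"
git config user.name "Janet"
git config user.email "janet@example.com"
echo "cart" > cart.txt
git add cart.txt
git commit -q -m "Add cart page"

# Meanwhile, the author amends the commit and force-pushes.
cd "$work/author"
git commit -q --amend -m "Fix defect 47"
git push -q --force-with-lease origin main

# The colleague integrates the rewritten history.
cd "$work/colleague"
git pull -q --rebase
git log --format='%s'
```

In the colleague's log, the local commit now sits on top of the corrected commit, and the typo commit is no longer part of the branch.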
Another solution might be to discard any local commits and changes and reset the local main branch to origin/main. Fetch first, so that origin/main reflects the rewritten remote branch:
git fetch origin
git reset --hard origin/main
Other use cases may be harder to fix, including (interactive) rebasing and cherry-picking. Always consider the trade-offs before force-pushing.
What if you're working alone on a public branch? Then you're usually free to force-push as much as you like. Again, communication is key here. I would rephrase the Git best practice from above:
In general, do not alter public history on branches that multiple people work on.
Conclusion
Version control is an underrated skill. Most software engineers use it daily, and yet, many are not willing to invest more than necessary to learn it. That's fine, but knowing more than commit/push/pull will at least make you more efficient. It will also help you solve issues you (and your colleagues) may encounter.
I hope this article explains Git's default "safety net" behavior and motivates you to try out some advanced features.