# Git Tools - Rewriting History
Many times when working with Git, you may want to revise your commit history. One of the great things about Git is that it allows you to make decisions at the last possible moment. You can decide which files go into which commits just before committing with the staging area, you can decide you don't want to work on something yet with git stash, and you can rewrite commits that have already happened to make it look like they happened differently. This can involve changing the order of commits, changing messages or files in a commit, squashing or splitting commits, or removing commits entirely -- all before sharing your work with others.
In this section, you will learn how to accomplish these tasks so your commit history looks the way you want before sharing it with others.
| Note | Don't push your work until you're happy with it. One of the fundamental principles of Git is that, since a lot of work in your clone is local, you have tremendous freedom to rewrite your history locally. However, once you push your work, that is a completely different story. You should consider pushed work as final unless you have a good reason to change it. In short, avoid pushing your work until you're satisfied with it and ready to share it with others. |
|---|---|
# Changing the Last Commit
Modifying your most recent commit is probably the most common history rewriting operation. You typically want to do one of two things with your last commit: simply change the commit message, or change the actual content by adding, removing, or modifying files.
# Changing the Commit Message
If you just want to modify the most recent commit message, it's simple:
$ git commit --amend
This command loads the last commit message into your editor for modification. When you save and close the editor, it writes the updated message as a new commit, which becomes your new most recent commit.
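If you already know the new wording, you can pass it on the command line instead of going through the editor. The following is a minimal sketch in a scratch repository; the identity, file name, and messages are placeholders chosen for illustration, not taken from a real project.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

echo "hello" > README
git add README
git commit -q -m "add READNE"            # message with a typo

# Replace the message of the last commit without opening an editor.
git commit -q --amend -m "add README"

subject=$(git log -1 --format=%s)
count=$(git rev-list --count HEAD)
echo "last commit: $subject (commits in history: $count)"
```

The history still contains a single commit: the amended commit replaces the old one rather than being added on top of it.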
# Changing the Actual Content
On the other hand, if you want to change the actual content of the last commit, the process is similar: first make the changes you want, stage them, then use git commit --amend to replace the old last commit with the new improved one.
Be careful with this technique because amending changes the commit's SHA-1 checksum. It's like a mini-rebase -- don't amend your last commit if you have already pushed it.
| Tip | An amended commit may need an amended message. When you amend a commit, you can modify both the content and the message. If you amend the content, you almost certainly need to update the message to reflect the changes. On the other hand, if your amendment is trivial (fixing a typo or adding a forgotten file), you can simply make the change, stage it, and avoid an unnecessary editor step with: $ git commit --amend --no-edit |
|---|---|
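To see why amending counts as rewriting history, it can help to compare the commit ID before and after. This sketch, run in a scratch repository with placeholder identity and file names (NOTES is a hypothetical forgotten file), adds the file to the last commit with --no-edit and prints both IDs.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

echo "first line" > README
git add README
git commit -q -m "add README and NOTES"
before=$(git rev-parse HEAD)

# The forgotten file: stage it and fold it into the last commit,
# keeping the existing message.
echo "forgotten file" > NOTES
git add NOTES
git commit -q --amend --no-edit
after=$(git rev-parse HEAD)

echo "before: $before"
echo "after:  $after"
git ls-tree --name-only HEAD             # files recorded in the amended commit
```

The message is unchanged, but the two IDs differ because the commit now records a different tree.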
# Changing Multiple Commit Messages
To modify a commit further back in your history, you must use more complex tools. Git doesn't have a dedicated history-modification tool, but you can use the rebase tool to replay a series of commits onto the HEAD they were originally based on, rather than moving them to a new base. With the interactive rebase tool, you can stop after each commit you want to modify and change the message, add files, or do whatever else you need. You run rebase interactively by adding the -i option to git rebase. You must indicate how far back you want to rewrite by telling the command which commit to rebase onto.
For example, to change the last three commit messages (or any of those three), supply the parent of the oldest commit you want to edit as the argument to git rebase -i, i.e., HEAD~2^ or HEAD~3. It may be easier to remember ~3 because you're trying to edit the last three commits, but keep in mind that you're actually designating the commit four back, the parent of the oldest commit you want to edit:
$ git rebase -i HEAD~3
Remember again that this is a rebasing command -- every commit in the HEAD~3..HEAD range with a changed message, along with all its descendants, will be rewritten. Don't include any commits that you've already pushed to a central server -- doing so will create two versions of the same change and confuse others.
Running this command gives you a list of commits in your text editor that looks like this:
pick f7f3f6d changed my name a bit
pick 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
# Rebase 710f0f8..a5f4a0d onto 710f0f8
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# . create a merge commit using the original merge commit's
# . message (or the oneline, if no original merge commit was
# . specified). Use -c <commit> to reword the commit message.
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out
An important thing to note is that these commits are listed in the opposite order from what you normally see with log. Running log would show something like:
$ git log --pretty=format:"%h %s" HEAD~3..HEAD
a5f4a0d added cat-file
310154e updated README formatting and added blame
f7f3f6d changed my name a bit
Note the reverse order. The interactive rebase gives you a script that it's going to run. It will start with the commit you specified on the command line (HEAD~3) and replay each commit's changes from top to bottom. It lists the oldest commit at the top instead of the newest, because that will be the first one to be replayed.
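If you want to preview the order in which the rebase script will list the commits, git log --reverse prints the same range oldest first. The sketch below rebuilds a comparable history in a scratch repository (the identity and the notes.txt file are placeholders) and also checks that HEAD~3 and HEAD~2^ name the same commit.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# One base commit plus three commits with the messages used in the text.
for msg in "initial commit" "changed my name a bit" \
           "updated README formatting and added blame" "added cat-file"; do
    echo "$msg" >> notes.txt
    git add notes.txt
    git commit -q -m "$msg"
done

echo "newest first (default log order):"
git log --format="%h %s" HEAD~3..HEAD

echo "oldest first (the order of the rebase script):"
git log --reverse --format="%h %s" HEAD~3..HEAD

first_in_script=$(git log --reverse --format="%s" HEAD~3..HEAD | head -n 1)
base_a=$(git rev-parse HEAD~3)
base_b=$(git rev-parse 'HEAD~2^')
echo "rebase base: $base_a"
```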
You need to edit the script so it stops at the commit you want to change. To do this, change the word 'pick' to 'edit' for each commit you want to modify. For example, to only modify the third commit message, change the file like this:
edit f7f3f6d changed my name a bit
pick 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
When you save and exit the editor, Git rewinds to the commit you marked 'edit' (f7f3f6d, the first line of the list) and drops you to the command line with this message:
$ git rebase -i HEAD~3
Stopped at f7f3f6d... changed my name a bit
You can amend the commit now, with
git commit --amend
Once you're satisfied with your changes, run
git rebase --continue
These instructions tell you exactly what to do. Type:
$ git commit --amend
Change the commit message, then exit the editor. Then run:
$ git rebase --continue
This command automatically applies the other two commits, and you're done. If you change 'pick' to 'edit' on more than one line, you need to repeat these steps for each commit marked as 'edit'. Each time, Git stops, lets you amend the commit, and then continues.
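The whole stop, amend, continue cycle can be rehearsed in a scratch repository before you try it on real work. In this sketch the rebase script is edited by a sed command supplied through the GIT_SEQUENCE_EDITOR environment variable, standing in for the manual edit described above; the identity, the notes.txt file, and the new message are placeholders.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

for msg in "initial commit" "changed my name a bit" \
           "updated README formatting and added blame" "added cat-file"; do
    echo "$msg" >> notes.txt
    git add notes.txt
    git commit -q -m "$msg"
done

# Stand-in for editing the script by hand: change "pick" to "edit"
# on the first line (the oldest of the three commits).
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i HEAD~3

# Git has stopped at that commit; give it a new message and carry on.
git commit -q --amend -m "changed my name slightly"
git rebase --continue

git log --format="%h %s" HEAD~3..HEAD
oldest=$(git log --reverse --format="%s" HEAD~3..HEAD | head -n 1)
```

Only the message of the oldest commit changes, but all three commits get new IDs because each one records its parent.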
# Reordering Commits
You can also use interactive rebase to reorder or completely remove commits. If you want to remove the "added cat-file" commit and change the order of the other two, you can change the rebase script from this:
pick f7f3f6d changed my name a bit
pick 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
to this:
pick 310154e updated README formatting and added blame
pick f7f3f6d changed my name a bit
When you save and exit the editor, Git rewinds your branch to the parent of these commits, applies 310154e then f7f3f6d, and then stops. You effectively changed the order of those commits and completely removed the "added cat-file" commit.
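Here is a rough dry run of the same reorder-and-remove edit in a scratch repository. Each commit touches its own file so that the reordered commits apply without conflicts; the file names and identity are placeholders, and the sed command passed through GIT_SEQUENCE_EDITOR is only a stand-in for rearranging the lines by hand.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Each commit adds a separate file, so any order applies cleanly.
echo base > base.txt;     git add base.txt;     git commit -q -m "initial commit"
echo name > name.txt;     git add name.txt;     git commit -q -m "changed my name a bit"
echo docs > README;       git add README;       git commit -q -m "updated README formatting and added blame"
echo cat  > cat-file.txt; git add cat-file.txt; git commit -q -m "added cat-file"

# Stand-in for the manual edit: swap lines 1 and 2 of the script
# and delete line 3 (the "added cat-file" commit).
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1{h;d;}' -e '2G' -e '3d'" git rebase -i HEAD~3

git log --format="%h %s"
ls
```

After the rebase, "changed my name a bit" sits on top of the README commit, and cat-file.txt is gone from both the history and the working directory.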
# Squashing Commits
With the interactive rebase tool, you can also squash a series of commits into a single commit. The rebase message script provides useful instructions:
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# . create a merge commit using the original merge commit's
# . message (or the oneline, if no original merge commit was
# . specified). Use -c <commit> to reword the commit message.
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out
If you specify "squash" instead of "pick" or "edit", Git applies both changes and merges the commit messages together. So if you want these three commits to become a single commit, modify the script like this:
pick f7f3f6d changed my name a bit
squash 310154e updated README formatting and added blame
squash a5f4a0d added cat-file
When you save and exit the editor, Git applies all three changes and then puts you back into the editor to merge the three commit messages:
# This is a combination of 3 commits.
# The first commit's message is:
changed my name a bit
# This is the 2nd commit message:
updated README formatting and added blame
# This is the 3rd commit message:
added cat-file
When you save, you have a single commit that includes all the changes from the previous three commits.
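The squash can be tried out the same way in a scratch repository. In this sketch the script edit is done by sed through GIT_SEQUENCE_EDITOR, and GIT_EDITOR=true accepts the combined message as Git proposes it; both are stand-ins for the interactive steps, and the file names and identity are placeholders.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

echo base > base.txt;     git add base.txt;     git commit -q -m "initial commit"
echo name > name.txt;     git add name.txt;     git commit -q -m "changed my name a bit"
echo docs > README;       git add README;       git commit -q -m "updated README formatting and added blame"
echo cat  > cat-file.txt; git add cat-file.txt; git commit -q -m "added cat-file"

# Mark the second and third entries as "squash"; "true" as the message
# editor keeps the merged message exactly as Git proposes it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,3s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i HEAD~3

count=$(git rev-list --count HEAD)
echo "commits in history: $count"
git log -1 --format=%B                   # merged message of the squashed commit
git ls-tree --name-only HEAD             # all three files are in that one commit
```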
# Splitting a Commit
Splitting a commit undoes that commit, then partially stages and commits as many times as needed. For example, suppose you want to split the middle of three commits. You want to split "updated README formatting and added blame" into two commits: "updated README formatting" for the first, and "added blame" for the second. You can do this by modifying the rebase -i script, changing the instruction for the commit to split to "edit":
pick f7f3f6d changed my name a bit
edit 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
Then, when the script drops you to the command line, reset that commit, take the reset changes, and create multiple commits from them. When you save and exit the editor, Git rewinds to the parent of the first commit in the list, applies the first commit (f7f3f6d), applies the second (310154e), and then drops you to the command line. There you can do a mixed reset of that commit with git reset HEAD^, which effectively undoes that commit and leaves the modified files unstaged. Now you can stage and commit files until you have several commits, and then run git rebase --continue when done:
$ git reset HEAD^
$ git add README
$ git commit -m 'updated README formatting'
$ git add lib/simplegit.rb
$ git commit -m 'added blame'
$ git rebase --continue
Git applies the last commit in the script (a5f4a0d), and the history looks like this:
$ git log -4 --pretty=format:"%h %s"
1c002dd added cat-file
9b29157 added blame
35cfb2b updated README formatting
f3cc40e changed my name a bit
Once again, this changes the SHA-1 checksums of all commits in the list, so make sure none of those commits have been pushed to a shared repository.
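To rehearse the split end to end, you can run the same steps in a scratch repository. This sketch assumes a middle commit that changes both README and lib/simplegit.rb; the file contents, the AUTHORS and cat-file.txt files, and the identity are placeholders, and sed (via GIT_SEQUENCE_EDITOR) stands in for marking the commit as "edit" by hand.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

mkdir lib
echo "readme v1" > README
echo "# simplegit" > lib/simplegit.rb
git add README lib/simplegit.rb
git commit -q -m "initial commit"

echo "name" > AUTHORS; git add AUTHORS; git commit -q -m "changed my name a bit"

# The commit to split: it changes two files at once.
echo "readme v2" > README
echo "def blame; end" >> lib/simplegit.rb
git commit -q -am "updated README formatting and added blame"

echo "cat" > cat-file.txt; git add cat-file.txt; git commit -q -m "added cat-file"

# Stop at the middle commit (second line of the script).
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/edit/'" git rebase -i HEAD~3

# Undo the commit but keep its changes, then commit them one file at a time.
git reset -q HEAD^
git add README
git commit -q -m "updated README formatting"
git add lib/simplegit.rb
git commit -q -m "added blame"
git rebase --continue

git log -4 --format="%h %s"
```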
# The Nuclear Option: filter-branch
There is another history-rewriting option you can use if you need to rewrite a large number of commits programmatically -- for example, changing your email address globally or removing a file from every commit. The command is filter-branch, and it can rewrite large swaths of your history. You should not use it unless your project is not yet public and nobody else has based work on the commits you're about to rewrite. Within those limits, however, it can be very useful. You will learn a few common uses to get an idea of where it might be appropriate.
| Caution | git filter-branch has many pitfalls and is no longer recommended for rewriting history. Please consider using git-filter-repo, a Python script that does a better job for most use cases than filter-branch. Its documentation and source code are available at https://github.com/newren/git-filter-repo. |
|---|---|
# Removing a File from Every Commit
This happens fairly often. Someone accidentally commits a huge binary file with git add ., and you want to remove it everywhere. Maybe you accidentally committed a file containing a password and you want to open source the project. filter-branch is the tool you can use to scrub your entire history. To remove a file called passwords.txt from your entire history, use the --tree-filter option with filter-branch:
$ git filter-branch --tree-filter 'rm -f passwords.txt' HEAD
Rewrite 6b9b3cf04e7c5686a9cb838c3f36a8cb6a0fc2bd (21/21)
Ref 'refs/heads/master' was rewritten
The --tree-filter option runs the specified command after each checkout of the project and then recommits the results. In this case, you remove a file called passwords.txt from every snapshot, whether it exists or not. To remove all accidentally committed editor backup files, you can run something like git filter-branch --tree-filter 'rm -f *~' HEAD.
You can watch Git rewrite trees and commits as it runs; it moves the branch pointer at the end. It's generally a good idea to do this in a test branch and then hard-reset your master branch once you've verified the outcome is what you want. To make filter-branch run on all branches, pass the --all option.
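A quick way to verify the outcome is to ask whether any commit reachable from HEAD still touches the file. The sketch below does that in a scratch repository with placeholder file names and identity. It also lists the backup reference that filter-branch leaves under refs/original/, which still points at the unfiltered commits, and sets FILTER_BRANCH_SQUELCH_WARNING=1 to skip the deprecation notice that recent Git versions print.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation notice

echo "code" > app.txt
echo "hunter2" > passwords.txt
git add app.txt passwords.txt
git commit -q -m "add app (and, by mistake, passwords.txt)"
echo "more code" >> app.txt
git commit -q -am "extend app"

git filter-branch --tree-filter 'rm -f passwords.txt' HEAD

# Commits reachable from HEAD that still touch the file (should be none).
remaining=$(git rev-list --count HEAD -- passwords.txt)
# filter-branch keeps the pre-rewrite branch under refs/original/.
backup=$(git for-each-ref --format='%(refname)' refs/original/)
in_backup=$(git rev-list --count "$backup" -- passwords.txt)

echo "commits on HEAD touching passwords.txt: $remaining"
echo "backup ref: $backup (commits touching passwords.txt: $in_backup)"
```

As long as that backup reference exists, the old commits and the file they contain are still stored in the repository.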
# Making a Subdirectory the New Root
Suppose you've imported from another source control system and have several meaningless subdirectories (trunk, tags, etc.). If you want trunk to be the new project root for every commit, filter-branch can help:
$ git filter-branch --subdirectory-filter trunk HEAD
Rewrite 856f0bf61e41a27326cdae8f09fe708d679f596f (12/12)
Ref 'refs/heads/master' was rewritten
Now the new project root is the trunk subdirectory. Git will automatically remove all commits that didn't affect the subdirectory.
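The effect is easy to observe in a scratch repository with one commit that touches only tags. In this sketch the directory layout, file names, and identity are placeholders; FILTER_BRANCH_SQUELCH_WARNING=1 skips the deprecation notice printed by recent Git versions.

```shell
#!/bin/sh
set -eu

# Scratch repository; the identity below is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation notice

mkdir trunk tags
echo "main" > trunk/main.txt; git add trunk; git commit -q -m "add trunk/main.txt"
echo "v1"   > tags/v1.txt;    git add tags;  git commit -q -m "add tags/v1.txt"
echo "more" >> trunk/main.txt;               git commit -q -am "extend trunk/main.txt"

before=$(git rev-list --count HEAD)
git filter-branch --subdirectory-filter trunk HEAD
after=$(git rev-list --count HEAD)

echo "commits before: $before, after: $after"
git ls-tree -r --name-only HEAD          # paths are now relative to trunk/
```

The commit that only added tags/v1.txt is gone from the rewritten branch, and main.txt now sits at the top level.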
# Changing Email Addresses Globally
Another common scenario is when you forgot to run git config to set your name and email before starting work, or you want to open source a project and change all your work email addresses to your personal email. In any case, you can change email addresses in multiple commits at once with filter-branch. Be careful to only change your own email addresses, using --commit-filter:
$ git filter-branch --commit-filter '
        if [ "$GIT_AUTHOR_EMAIL" = "schacon@localhost" ];
        then
                GIT_AUTHOR_NAME="Scott Chacon";
                GIT_AUTHOR_EMAIL="schacon@example.com";
                git commit-tree "$@";
        else
                git commit-tree "$@";
        fi' HEAD
This goes through and rewrites every commit to include your new email address. Because commits contain the SHA-1 of their parent commits, this command changes every commit SHA-1 in your history, not just those matching the email address.
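The following sketch checks both claims in a scratch repository: the matching commit gets the new author, and the later commit by someone else keeps its author but still receives a new ID. The second author, the file names, and the committer identity are placeholders; FILTER_BRANCH_SQUELCH_WARNING=1 skips the deprecation notice printed by recent Git versions.

```shell
#!/bin/sh
set -eu

# Scratch repository; the committer identity is a placeholder for this sketch.
cd "$(mktemp -d)"
git init -q
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation notice

echo one > a.txt; git add a.txt
GIT_AUTHOR_NAME="schacon" GIT_AUTHOR_EMAIL="schacon@localhost" \
    git commit -q -m "first"
echo two > b.txt; git add b.txt
GIT_AUTHOR_NAME="Other Dev" GIT_AUTHOR_EMAIL="other@example.com" \
    git commit -q -m "second"

tip_before=$(git rev-parse HEAD)

# Same filter as in the text: only commits by schacon@localhost are changed.
git filter-branch --commit-filter '
        if [ "$GIT_AUTHOR_EMAIL" = "schacon@localhost" ];
        then
                GIT_AUTHOR_NAME="Scott Chacon";
                GIT_AUTHOR_EMAIL="schacon@example.com";
                git commit-tree "$@";
        else
                git commit-tree "$@";
        fi' HEAD

tip_after=$(git rev-parse HEAD)
git log --format="%h %an <%ae> %s"
echo "tip before: $tip_before"
echo "tip after:  $tip_after"
```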