reduce git repo size
git repos can grow fat after a long time of use; fortunately, a few simple commands can reduce their size;
garbage collection
the primary downsize technique is garbage collection; the command we use here is:
git gc --aggressive --prune=now
this deletes unreachable objects in the repo right away (--prune=now) and repacks the remaining objects more thoroughly (--aggressive); you can find more about this in git help gc;
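to see what garbage collection actually changes, you can compare the object counts before and after; the following is a minimal sketch that runs in a throwaway repo (the scratch directory, the file name file.txt and the demo identity are made up for illustration); git count-objects -v reports loose objects as count and packed objects as in-pack:

```shell
#!/bin/sh
# sketch only: everything happens in a scratch repo, not in your real one
set -eu
demo=$(mktemp -d)                      # hypothetical scratch directory
cd "$demo"
git init -q .
git config user.name demo              # hypothetical identity for the demo commits
git config user.email demo@example.com

# create a few commits so that there are loose objects to pack
for i in 1 2 3; do
  echo "revision $i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done

# count loose objects before garbage collection
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

git gc --aggressive --prune=now --quiet

# count loose objects again; everything reachable should now live in a pack
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
echo "loose objects: $loose_before -> $loose_after"
git count-objects -vH                  # human-readable sizes
```

in a real repo, run only the git count-objects and git gc lines; the size-pack field is the number to watch;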
prune reflog
if you have indeed read the doc shown by git help gc, you see it says:
git gc tries very hard not to delete objects that are referenced anywhere in your repository. In particular, it will keep not only objects referenced by your current set of branches and tags, but also objects referenced by the index, remote-tracking branches, reflogs (which may reference commits in branches that were later amended or rewound), and anything else in the refs/* namespace. Note that a note (of the kind created by git notes) attached to an object does not contribute in keeping the object alive. If you are expecting some objects to be deleted and they aren’t, check all of those locations and decide whether it makes sense in your case to remove those references.
in short, we need to remove unnecessary references to let gc actually delete the objects; as it says, there are several kinds of references:
- current set of branches and tags;
- index;
- remote-tracking branches;
- reflogs;
- anything else in the refs/* namespace;
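to check which of these locations still hold on to old objects, you can list them; the sketch below builds a throwaway repo (scratch directory, file.txt and the demo identity are assumptions for illustration), amends a commit, and then asks git fsck whether the old commit is unreachable, first counting reflogs as references and then ignoring them with --no-reflogs:

```shell
#!/bin/sh
# sketch only: runs in a scratch repo
set -eu
demo=$(mktemp -d)                      # hypothetical scratch directory
cd "$demo"
git init -q .
git config user.name demo              # hypothetical identity
git config user.email demo@example.com

echo one > file.txt
git add file.txt
git commit -q -m "first"
old=$(git rev-parse HEAD)              # the commit that is about to be replaced

echo two > file.txt
git add file.txt
git commit -q --amend -m "first, amended"

# branches, tags, remote-tracking branches and anything else under refs/
git for-each-ref --format='%(refname)'
# the reflog of HEAD still mentions the old commit
git reflog

# count how often the old commit is reported as unreachable
with_reflog=$(git fsck --unreachable 2>/dev/null | grep -c "$old" || true)
without_reflog=$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c "$old" || true)
echo "unreachable when reflogs count: $with_reflog; when they are ignored: $without_reflog"
```

the old commit only shows up as unreachable once reflogs are ignored, which means the reflog is the only thing keeping it alive;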
the one that is common but often overlooked is the reflog, which we can prune with:
git reflog expire --expire=90.days.ago --expire-unreachable=all --all
this command prunes reflog entries that are older than 90 days, and all reflog entries that are unreachable from the current tip of their branch, regardless of age; when you use both commands, run this one before git gc so that git gc can actually delete the objects:
git reflog expire --expire=90.days.ago --expire-unreachable=all --all
git gc --aggressive --prune=now
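the order matters, and a small experiment can show why; this is a sketch under the same assumptions as before (scratch repo, made-up file.txt and demo identity): an amended commit survives git gc on its own, and disappears only after its reflog entries are expired first:

```shell
#!/bin/sh
# sketch only: runs in a scratch repo
set -eu
demo=$(mktemp -d)                      # hypothetical scratch directory
cd "$demo"
git init -q .
git config user.name demo              # hypothetical identity
git config user.email demo@example.com

echo one > file.txt
git add file.txt
git commit -q -m "first"
old=$(git rev-parse HEAD)              # will be left behind by the amend

echo two > file.txt
git add file.txt
git commit -q --amend -m "first, amended"

# gc alone: the reflog still references the old commit, so it is kept
git gc --aggressive --prune=now --quiet
if git cat-file -e "$old" 2>/dev/null; then after_gc=kept; else after_gc=gone; fi

# expire the reflog first, then gc: now the old commit can be deleted
git reflog expire --expire=90.days.ago --expire-unreachable=all --all
git gc --aggressive --prune=now --quiet
if git cat-file -e "$old" 2>/dev/null; then after_expire=kept; else after_expire=gone; fi

echo "after gc only: $after_gc; after reflog expire + gc: $after_expire"
```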
delete all commits, but one
warning: this can be destructive!
sometimes we have a lot of legacy history in our git repo, and what we really want is merely its current state (i.e. the last commit); when this is the case, we can remove all the other commits, which can reduce the repo size quite a bit:
git checkout --orphan newbranch
git add -A
git commit
git branch -D master
git branch -m master
git push -f origin master
this creates a new orphan branch with a single commit that holds the current state, deletes the old master branch (including all the commits only reachable from it), and renames the new branch to master; this technique is very potent, so use it with care; if you need to push the new branch to a remote, use a forced push; finally, this technique is not limited to the master branch, and the name newbranch is arbitrary;
when you have done this, you can run the same downsize commands shown above; because the old branch is no longer referenced, there may be many objects to delete, so this can be a huge saving (at the expense of losing the commit history); as such, you may not want to do this in a public git repo;
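before trying this on a real repo, it may help to rehearse the whole sequence on a throwaway one; the sketch below is such a rehearsal, with assumptions stated plainly: the scratch directory, file.txt, the demo identity and the commit message are made up, the branch name is read from the repo instead of being hard-coded to master, and the push step is left out because the scratch repo has no remote:

```shell
#!/bin/sh
# sketch only: rehearsal in a scratch repo, nothing is pushed anywhere
set -eu
demo=$(mktemp -d)                      # hypothetical scratch directory
cd "$demo"
git init -q .
git config user.name demo              # hypothetical identity
git config user.email demo@example.com

# build a small history of three commits
for i in 1 2 3; do
  echo "revision $i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done
branch=$(git symbolic-ref --short HEAD)   # master, main, or whatever the default is
commits_before=$(git rev-list --count HEAD)

# replace the branch with a single commit holding the current state
git checkout -q --orphan newbranch
git add -A
git commit -q -m "snapshot of current state"
git branch -D "$branch"
git branch -m "$branch"
commits_after=$(git rev-list --count HEAD)

# drop the leftovers of the old history
git reflog expire --expire=90.days.ago --expire-unreachable=all --all
git gc --aggressive --prune=now --quiet
objects_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')

echo "commits: $commits_before -> $commits_after; packed objects left: $objects_after"
```

only the snapshot commit, its tree and the single file remain in the pack; on a real repo the remote keeps its old history until you force-push;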