shrink git repo to a small size

oh yeah, typical BHAW post!!

simple recipe to delete stuff from a repo that is getting too large and clunky: trying to push even a small(ish) repo of only 50MB to a remote can stall, break, get refused, and generally just be a pain when you decide to do some cleaning on the repo branches.

sometimes a lot of stuff will hang around, because some stubborn commit has a reference to a big file, a branch was not erased from existence, whatever.

so, step by step:
– use biggit_file_finder.sh to get the biggest files.
– check for any dead ducks to kill (big unneeded files, or whole dirs with lots of files or big files)
make sure it’s not stuff currently in use
– backup the repo just to be sure (just copy or zip the whole dir to another place)
– run: git gc --aggressive --auto (git --help is your friend)
– run: git filter-branch --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch PATH-TO-DELETE' --tag-name-filter cat -- --all (PATH-TO-DELETE is just a placeholder for whatever you picked in the steps above)
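
for the first step, the finder script itself is not shown here, so this is only a rough stand-in and not the actual biggit_file_finder.sh: a sketch that lists the largest blobs in the whole history using plain git plumbing. the function name biggest_blobs is made up for this example.

```shell
# hypothetical stand-in for the finder script: print "<size in bytes> <path>"
# for the N largest blobs anywhere in history (default 10), biggest first
biggest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -rn |
        head -n "${1:-10}"
}
```

run it from inside the repo, e.g. biggest_blobs 20. files that were deleted long ago still show up, which is the point: they are still taking up space in the history.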

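the trick is that option, --ignore-unmatch. the name reads like a double negative, but all it does is make git rm exit with success even when the path you gave matches nothing. that matters because filter-branch runs the git rm on every single commit, and the junk is not in all of them: without the option, the first commit that doesn’t have the file makes git rm fail and the whole rewrite stops.
so: what gets removed is whatever MATCHES the path you give; --ignore-unmatch just shrugs at the commits where nothing matches.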
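
a quick way to see for yourself what --ignore-unmatch changes: a small sketch that runs git rm on a path that does not exist, with and without the option, and prints the exit status. the scratch repo and the file name no-such-file are made up for the demo.

```shell
# scratch repo; nothing in it matches the path used below
scratch=$(mktemp -d)
cd "$scratch"
git init -q .

# plain git rm: complains that the path matches nothing and fails
plain=0
git rm -q --cached no-such-file 2>/dev/null || plain=$?

# same command with --ignore-unmatch: nothing matched, but no failure
ignored=0
git rm -q --cached --ignore-unmatch no-such-file || ignored=$?

echo "plain git rm: exit $plain"
echo "with --ignore-unmatch: exit $ignored"
```

the first status should be non-zero and the second zero, and a non-zero status inside --index-filter is what makes filter-branch give up.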

so, say that the magical script gives me a lot of files in /my/junk/filled/dir/… i can just put /my/junk/filled/… right after --ignore-unmatch, or /my/forgotten/branch/big-file.mp4, and all that stuff goes away (the paths are taken relative to the repo root).
double negatives, you gotta love them. git, you gotta get off those drugs… and then people wonder how they become like this.
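
to make the filter-branch step less of a mouthful, here is a sketch that wraps it in a function. purge_path is a made-up name, and FILTER_BRANCH_SQUELCH_WARNING is an assumption about your git version: newer ones print a long warning and wait a bit before starting, older ones just ignore the variable. the extra -q only keeps git rm quiet.

```shell
# hypothetical helper: rewrite every branch and tag so that the given path
# (relative to the repo root) never existed
purge_path() {
    # skip the warning and pause that newer git versions add to filter-branch
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty \
        --index-filter "git rm -rf -q --cached --ignore-unmatch -- '$1'" \
        --tag-name-filter cat -- --all
}
```

usage would be something like purge_path some/dir from the top of the repo, on a clean working tree. the quoting is naive, so a path with a single quote in it will break the command.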

but it’s not over yet:
– go to the parent dir (can be any dir, really, just to keep it simple)
– run: git clone --no-hardlinks file:///path/to/git/repo/dir new-repo-name

and now it’s over. in the ‘new-repo-name’ there is a cleaner, faster, leaner, meaner, much smaller (hopefully) repo packed in a pocket size. for my case, it went down from 53MB to 513k in a few commands. pop open the beer and back to work!!! 😀
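
the clone step can be sketched as a small function that also prints both sizes, so you can see what you won. clone_fresh is a made-up name; it turns the source dir into an absolute path first so the file:// URL comes out right.

```shell
# hypothetical helper: clone the cleaned repo into a new dir, then show the
# size of both .git dirs
clone_fresh() {
    src=$(cd "$1" && pwd)    # absolute path, so the URL becomes file:///...
    git clone -q --no-hardlinks "file://$src" "$2"
    du -sh "$src/.git" "$2/.git"
}
```

usage: clone_fresh path/to/git/repo/dir new-repo-name. as far as i can tell this works because a clone only copies what the branches and tags can still reach, so the backup refs that filter-branch leaves behind and the old orphaned objects stay in the old dir.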