Gurus on Stack Overflow have already answered it. I have written a Python script to automate this process. The script takes a file size in bytes after the -s switch and a regular expression after the -e switch, which is matched against the name of the file. For example, to delete files bigger than 20000 bytes whose names end in pdf, I would run the script as follows:
python git_search_and_purge.py -s 20000 -e .*pdf$
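To make the meaning of the two switches concrete, here is a minimal sketch of how such arguments could be parsed and applied. It is not the code of git_search_and_purge.py; the names parse_args and is_candidate, the strict "bigger than" comparison, and the sample file list are assumptions made for illustration.

```python
import argparse
import re


def parse_args(argv=None):
    # Hypothetical re-creation of the two switches described above.
    parser = argparse.ArgumentParser(description="Select files by size and name")
    parser.add_argument("-s", dest="size", type=int, required=True,
                        help="size threshold in bytes")
    parser.add_argument("-e", dest="pattern", default=None,
                        help="regular expression matched against the file name")
    return parser.parse_args(argv)


def is_candidate(name, size, min_size, pattern=None):
    # A file qualifies when it is strictly bigger than the threshold and,
    # if a pattern was given, the pattern matches its name.
    if size <= min_size:
        return False
    return pattern is None or re.search(pattern, name) is not None


if __name__ == "__main__":
    args = parse_args(["-s", "20000", "-e", ".*pdf$"])
    # Made-up file list, only to exercise the filter.
    files = [("docs/manual.pdf", 512000), ("notes.txt", 90000), ("tiny.pdf", 100)]
    for name, size in files:
        if is_candidate(name, size, args.size, args.pattern):
            print(name, size)
```

With the sample list above, only docs/manual.pdf passes both tests: notes.txt is big enough but does not end in pdf, and tiny.pdf matches the pattern but is too small. This covers only the selection step; the purge script then removes the selected files from history.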
The purge can take a long time to complete. It rewrites the full branch tree as many times as there are commits. I believe you know the danger of doing this on a shared repository.

There is another script that only searches for files bigger than a given size whose names match a regular expression. The pattern is optional. If it is not given, all files bigger than the given size are printed to the console. This script is available here. Using it is safe: it does not change the state of the repository in any way. You can dump its output to a file and then execute your evil plans accordingly.

GitHub has an article on ‘removing sensitive data’ or something like that. Do read that article. Happy gitting!
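For the curious, a read-only search like the one described above can be sketched with git's plumbing commands. This is a rough sketch and not the script linked above: it assumes git is on the PATH, and the function names parse_batch_check, list_blobs and find_big_files are made up for this example. It only reads from the repository and never writes to it.

```python
import re
import subprocess


def parse_batch_check(output):
    # Each line looks like "<type> <size> <path>"; keep only blobs (file contents).
    for line in output.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "blob":
            yield parts[2], int(parts[1])


def list_blobs(repo="."):
    # List every object reachable from any ref, together with its path.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True).stdout
    # Ask git for the type and size of each object; %(rest) echoes the path.
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True, text=True).stdout
    return parse_batch_check(sizes)


def find_big_files(blobs, min_size, pattern=None):
    # Keep blobs bigger than min_size whose path matches the optional pattern,
    # biggest first.
    regex = re.compile(pattern) if pattern else None
    hits = [(path, size) for path, size in blobs
            if size > min_size and (regex is None or regex.search(path))]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Inside a repository, find_big_files(list_blobs(), 20000, r".*pdf$") would return (path, size) pairs for every matching blob anywhere in history, including files that were deleted in later commits. The sizes are the uncompressed object sizes that git reports, not the space taken on disk.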