https://rtyley.github.io/bfg-repo-cleaner/

an alternative to git-filter-branch

The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:

  • Removing Crazy Big Files
  • Removing PasswordsCredentials & other Private data

The git-filter-branch command is enormously powerful and can do things that the BFG can't - but the BFG is much better for the tasks above, because:

  • Faster : 10 - 720x faster
  • Simpler : The BFG isn't particularily clever, but is focused on making the above tasks easy
  • Beautiful : If you need to, you can use the beautiful Scala language to customise the BFG. Which has got to be better than Bash scripting at least some of the time.

Usage

First clone a fresh copy of your repo, using the --mirror flag:

$ git clone --mirror git://example.com/some-big-repo.git

This is a bare repo, which means your normal files won't be visible, but it is a full copy of the Git database of your repository, and at this point you should make a backup of it to ensure you don't lose anything.

Now you can run the BFG to clean your repository up:

$ java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git

The BFG will update your commits and all branches and tags so they are clean, but it doesn't physically delete the unwanted stuff. Examine the repo to make sure your history has been updated, and then use the standard git gc command to strip out the unwanted dirty data, which Git will now recognise as surplus to requirements:

$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

Finally, once you're happy with the updated state of your repo, push it back up (note that because your clone command used the --mirror flag, this push will update all refs on your remote server):

$ git push

At this point, you're ready for everyone to ditch their old copies of the repo and do fresh clones of the nice, new pristine data. It's best to delete all old clones, as they'll have dirty history that you don't want to risk pushing back into your newly cleaned repo.

Examples

In all these examples bfg is an alias for java -jar bfg.jar.

Delete all files named 'id_rsa' or 'id_dsa' :

$ bfg --delete-files id_{dsa,rsa}  my-repo.git

Remove all blobs bigger than 50 megabytes :

$ bfg --strip-blobs-bigger-than 50M  my-repo.git

Replace all passwords listed in a file (prefix lines 'regex:' or 'glob:' if required) with ***REMOVED***wherever they occur in your repository :

$ bfg --replace-text passwords.txt  my-repo.git

Remove all folders or files named '.git' - a reserved filename in Git. These often become a problemwhen migrating to Git from other source-control systems like Mercurial :

$ bfg --delete-folders .git --delete-files .git  --no-blob-protection  my-repo.git

For further command-line options, you can run the BFG without any arguments, which will output text like this.

【转】BFG Repo-Cleaner: Removes large or troublesome blobs like git-filter-branch does, but faster.的更多相关文章

  1. Mysql_大字段问题Row size too large.....not counting BLOBs, is 8126.

    [问题描述] 1.从myslq(5.7.19-0ubuntu0.16.04.1)中导出sql脚本,导入到mysql(5.5.27)中,报如下错误:Row size too large. The max ...

  2. How to get started with GIT and work with GIT Remote Repo

    https://www.ntu.edu.sg/home/ehchua/programming/howto/Git_HowTo.html#zz-7. 1.  Introduction GIT is a ...

  3. (AOSP)repo checkout指定版本

    aosp 怎么切换分支? To properly switch Android version, all you need to change is branch for your manifest ...

  4. Git与Repo入门

    版本控制 版本控制是什么已不用在说了,就是记录我们对文件.目录或工程等的修改历史,方便查看更改历史,备份以便恢复以前的版本,多人协作... 一.原始版本控制 最原始的版本控制是纯手工的版本控制:修改文 ...

  5. repo的用法

    转自:http://blog.csdn.net/junglyfine/article/details/6299636 注:repo只是google用Python脚本写的调用Git的一个脚本,主要是用来 ...

  6. Git与Repo入门(转载)

    aaarticlea/png;base64,iVBORw0KGgoAAAANSUhEUgAAAykAAADuCAIAAACyDd+sAAAAA3NCSVQICAjb4U/gAAAgAElEQVR4Xu ...

  7. [git]Git与Repo入门

    转自:http://www.cnblogs.com/angeldevil/archive/2013/11/26/3238470.html 注:非常推荐的一篇关于git的博文 目录: 版本控制 一.原始 ...

  8. git和repo入门

    版本控制 版本控制是什么已不用在说了,就是记录我们对文件.目录或工程等的修改历史,方便查看更改历史,备份以便恢复以前的版本,多人协作... 一.原始版本控制 最原始的版本控制是纯手工的版本控制:修改文 ...

  9. repo的小结

    repo仅仅是google用Python脚本写的调用git的一个脚本,主要是用来下载.管理Android项目的软件仓库. 1. 下载 repo 的地址: http://android.git.kern ...

随机推荐

  1. eclipse项目改为maven项目导致svn无法比较历史数据的解决办法

    这个问题没有找到合适的答案,最终自己想出了一个解决方案,在此记录下. 问题描述 在将老的eclipse项目重构为maven项目时,我这边是新建了一个maven项目,然后将对应的数据分别放到相应的位置, ...

  2. MySQL运行内存不足时应采取的措施?

    排除故障指南:MySQL运行内存不足时应采取的措施? 天一阁@ 老叶茶馆 1周前 导读 排除故障指南:MySQL运行内存不足时应采取的措施? 翻译团队:知数堂藏经阁项目 - 天一阁 团队成员:天一阁- ...

  3. java随机分配端口占用其它服务端口问题完美解决

    问题描述:  java创建socket连接,创建的随机客户端端口占用了其它服务的端口,导致该服务无法启动 解决: 1.linux系统为java或其它程序提供随机端口配置项 查看端口范围:sysctl ...

  4. webpack打包vue -->简易讲解

    ### 1. 测试环境: 推荐这篇文章:讲的很细致 https://www.cnblogs.com/lhweb15/p/5660609.html 1. webpack.config.js自行安装 { ...

  5. nginx ssl通讯优化思路

    TLS通讯过程中主要做的两件事情: 1.交换密钥 2.加密数据 如果优化的话,主要也是从这两个点来考虑优化: 1.nginx 打开session cache 如一天内不需要再次协商密钥. 2.小文件较 ...

  6. excel表格公式无效、不生效的解决方案及常见问题、常用函数

    1.表格公式无效.不生效 使用公式时碰到了一个问题,那就是公式明明已经编辑好了,但是在单元格里不生效,直接把公式显示出来了,网上资料说有4种原因,但是我4种都不是,是第5种原因,如下图: 这种情况是由 ...

  7. 无法生成core dump文件的几个原因

    1. 进程无写权限(如目录不可写.存在同名的非regular文件(目录或符号链接)等) 2. 存在同名文件且有多个hard link 3. 文件系统空间不足 4. 指定目录不存在 5. 进程的RLIM ...

  8. 【sql】ALTER更新数据库字段

    --数据库表字段更新sql脚本,以下是示例 --新增列字段 --alter table table_name add column `address` varchar() COLLATE utf8_b ...

  9. note 12 集合Set

    集合Set +无序不重复元素(键)集 +和字典类似,但是无"值" 创建 x = set() x = {key1,key2,...} 添加和删除 x.add('body') x.re ...

  10. 二、Python发展始

    1989年的圣诞节,Guido开始编写Python语言的编译器.Python这个名字,来自Guido所挚爱的电视剧Monty Python’s Flying Circus.他希望这个新的叫做Pytho ...