https://rtyley.github.io/bfg-repo-cleaner/

an alternative to git-filter-branch

The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:

  • Removing Crazy Big Files
  • Removing PasswordsCredentials & other Private data

The git-filter-branch command is enormously powerful and can do things that the BFG can't - but the BFG is much better for the tasks above, because:

  • Faster : 10 - 720x faster
  • Simpler : The BFG isn't particularily clever, but is focused on making the above tasks easy
  • Beautiful : If you need to, you can use the beautiful Scala language to customise the BFG. Which has got to be better than Bash scripting at least some of the time.

Usage

First clone a fresh copy of your repo, using the --mirror flag:

$ git clone --mirror git://example.com/some-big-repo.git

This is a bare repo, which means your normal files won't be visible, but it is a full copy of the Git database of your repository, and at this point you should make a backup of it to ensure you don't lose anything.

Now you can run the BFG to clean your repository up:

$ java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git

The BFG will update your commits and all branches and tags so they are clean, but it doesn't physically delete the unwanted stuff. Examine the repo to make sure your history has been updated, and then use the standard git gc command to strip out the unwanted dirty data, which Git will now recognise as surplus to requirements:

$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

Finally, once you're happy with the updated state of your repo, push it back up (note that because your clone command used the --mirror flag, this push will update all refs on your remote server):

$ git push

At this point, you're ready for everyone to ditch their old copies of the repo and do fresh clones of the nice, new pristine data. It's best to delete all old clones, as they'll have dirty history that you don't want to risk pushing back into your newly cleaned repo.

Examples

In all these examples bfg is an alias for java -jar bfg.jar.

Delete all files named 'id_rsa' or 'id_dsa' :

$ bfg --delete-files id_{dsa,rsa}  my-repo.git

Remove all blobs bigger than 50 megabytes :

$ bfg --strip-blobs-bigger-than 50M  my-repo.git

Replace all passwords listed in a file (prefix lines 'regex:' or 'glob:' if required) with ***REMOVED***wherever they occur in your repository :

$ bfg --replace-text passwords.txt  my-repo.git

Remove all folders or files named '.git' - a reserved filename in Git. These often become a problemwhen migrating to Git from other source-control systems like Mercurial :

$ bfg --delete-folders .git --delete-files .git  --no-blob-protection  my-repo.git

For further command-line options, you can run the BFG without any arguments, which will output text like this.

【转】BFG Repo-Cleaner: Removes large or troublesome blobs like git-filter-branch does, but faster.的更多相关文章

  1. Mysql_大字段问题Row size too large.....not counting BLOBs, is 8126.

    [问题描述] 1.从myslq(5.7.19-0ubuntu0.16.04.1)中导出sql脚本,导入到mysql(5.5.27)中,报如下错误:Row size too large. The max ...

  2. How to get started with GIT and work with GIT Remote Repo

    https://www.ntu.edu.sg/home/ehchua/programming/howto/Git_HowTo.html#zz-7. 1.  Introduction GIT is a ...

  3. (AOSP)repo checkout指定版本

    aosp 怎么切换分支? To properly switch Android version, all you need to change is branch for your manifest ...

  4. Git与Repo入门

    版本控制 版本控制是什么已不用在说了,就是记录我们对文件.目录或工程等的修改历史,方便查看更改历史,备份以便恢复以前的版本,多人协作... 一.原始版本控制 最原始的版本控制是纯手工的版本控制:修改文 ...

  5. repo的用法

    转自:http://blog.csdn.net/junglyfine/article/details/6299636 注:repo只是google用Python脚本写的调用Git的一个脚本,主要是用来 ...

  6. Git与Repo入门(转载)

    aaarticlea/png;base64,iVBORw0KGgoAAAANSUhEUgAAAykAAADuCAIAAACyDd+sAAAAA3NCSVQICAjb4U/gAAAgAElEQVR4Xu ...

  7. [git]Git与Repo入门

    转自:http://www.cnblogs.com/angeldevil/archive/2013/11/26/3238470.html 注:非常推荐的一篇关于git的博文 目录: 版本控制 一.原始 ...

  8. git和repo入门

    版本控制 版本控制是什么已不用在说了,就是记录我们对文件.目录或工程等的修改历史,方便查看更改历史,备份以便恢复以前的版本,多人协作... 一.原始版本控制 最原始的版本控制是纯手工的版本控制:修改文 ...

  9. repo的小结

    repo仅仅是google用Python脚本写的调用git的一个脚本,主要是用来下载.管理Android项目的软件仓库. 1. 下载 repo 的地址: http://android.git.kern ...

随机推荐

  1. nc--windows下工具分享

    1.在windows下安装了9个memcached. 一些测试需要经常对这9个memcached的执行flush_all的操作 由于windows没有linux那样可以使用nc命令. 经过不懈搜索,找 ...

  2. 密码疑云 (3)——详解RSA的加密与解密

    上一篇文章介绍了RSA涉及的数学知识,本章将应用这些知识详解RSA的加密与解密. RSA算法的密钥生成过程 密钥的生成是RSA算法的核心,它的密钥对生成过程如下: 1. 选择两个不相等的大素数p和q, ...

  3. Java中List集合去除重复数据的方法

    1. 循环list中的所有元素然后删除重复 public static List removeDuplicate(List list) { for ( int i = 0 ; i < list. ...

  4. php7 使用simplexml扩展处理xml

    <?php $xmldoc = "<?xml version=\"1.0\" encoding=\"gb2312\"?> <s ...

  5. RN 获取组件的宽度和高度

    https://www.cnblogs.com/zhiyingzhou/p/7471212.html https://blog.csdn.net/calvin_zhou/article/details ...

  6. ASP.NET页面跳转的三大方法详解

    ASP.NET页面跳转有什么方法呢?,现在给大家介绍三种方法,他们的区别是什么呢?让我们开始吧: ASP.NET页面跳转1.response.redirect 这个跳转页面的方法跳转的速度不快,因为它 ...

  7. python3学习笔记12(变量作用域)

    变量作用域 参考http://www.runoob.com/python3/python3-function.html Python 中,程序的变量并不是在哪个位置都可以访问的,访问权限决定于这个变量 ...

  8. centos7安装nginx,以及使用node测试反向代理

    1.添加nginx的安装源 vi /etc/yum.repos.d/nginx.repo 2.输入下面内容,并保存退出 [nginx] name=nginx repo baseurl=http://n ...

  9. socket通信中select函数的使用和解释

    select函数的作用: select()在SOCKET编程中还是比较重要的,可是对于初学SOCKET的人来说都不太爱用select()写程序,他们只是习惯写诸如 conncet().accept() ...

  10. Shiro与基本web环境整合登陆验证实例

    1. 用maven导入Shiro依赖包 <dependency> <groupId>org.apache.shiro</groupId> <artifactI ...