When two different inputs produce the same SHA-1 value, the result is called a collision.
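To make the definition concrete, here is a minimal sketch that searches for a collision on a deliberately weakened hash: only the first two bytes of SHA-1. The function names and the choice of decimal strings as inputs are assumptions made for illustration. This is not how the real full-length SHA-1 collision was found; it only shows what "two different inputs, same hash value" means.

```python
import hashlib


def truncated_sha1(data: bytes, nbytes: int = 2) -> bytes:
    """Return only the first `nbytes` bytes of the SHA-1 digest (a toy, weakened hash)."""
    return hashlib.sha1(data).digest()[:nbytes]


def find_truncated_collision(nbytes: int = 2):
    """Try b"0", b"1", b"2", ... until two inputs share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    i = 0
    while True:
        msg = str(i).encode("ascii")
        tag = truncated_sha1(msg, nbytes)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg
        i += 1


if __name__ == "__main__":
    a, b = find_truncated_collision()
    print(a, b, truncated_sha1(a).hex())
```

With a 16-bit digest there are only 65536 possible values, so the loop is guaranteed to stop after at most 65537 inputs. The full 160-bit SHA-1 is far out of reach for this kind of brute-force search.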

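On February 23, 2017, a SHA-1 collision was successfully produced with Google's help. Hacker News lit up over the news, and many people's first reaction was to ask how Git, which uses SHA-1, would cope. Shortly after the announcement, Git's creator Linus Torvalds responded on the mailing list. His main point was that Git does not hash the raw data alone: it also stores a header containing information such as the data length. To fool Git, an attacker would need not only a matching SHA-1 value but also a matching data length, which makes a collision considerably harder, or at least much harder than generating two PDFs with different contents and the same SHA-1. Linus added that the collision works by adding or removing opaque, dispensable data, so Git should in future tighten its checks on opaque data so that such collisions have nowhere to hide.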
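The header Linus mentions can be sketched as follows. For a blob, Git hashes the bytes `blob <length>\0` followed by the content, so the object ID differs from the plain SHA-1 of the file. The function names below are illustrative assumptions, not Git's API.

```python
import hashlib


def plain_sha1(content: bytes) -> str:
    """SHA-1 of the raw bytes, as a tool like sha1sum would report."""
    return hashlib.sha1(content).hexdigest()


def git_blob_sha1(content: bytes) -> str:
    """SHA-1 of a Git blob: header "blob <size>" + NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    data = b"test content\n"
    print("plain:", plain_sha1(data))
    print("blob: ", git_blob_sha1(data))  # d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

The blob value should match what `git hash-object` prints for the same bytes. Because the header is part of the hashed input, two files that collide under plain SHA-1 need not collide as Git objects.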

Git's developers have started discussing a move to a new hash mechanism. For details, see: Another proposed hash function transition plan

How would git handle a SHA-1 collision on a blob? https://stackoverflow.com/questions/9392365/how-would-git-handle-a-sha-1-collision-on-a-blob/9392525#9392525

fatal: SHA1 COLLISION FOUND!

$ git pull coding webossgoo

remote: Enumerating objects: 185, done.
remote: Counting objects: 100% (185/185), done.
remote: Compressing objects: 100% (143/143), done.
remote: Total 151 (delta 47), reused 9 (delta 0)
fatal: SHA1 COLLISION FOUND WITH 114b6e64f6031196a1c45c4e4c7ddb4e2e2528fd !
fatal: index-pack failed


$ git show 114b6e64f6031196a1c45c4e4c7ddb4e2e2528fd
fatal: packed object 114b6e64f6031196a1c45c4e4c7ddb4e2e2528fd (stored in ./objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack) is corrupt

$ ls -lah .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack
-r--r--r-- 1 dml dml 474M Jul 23 09:54 .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack


$ git fsck  # this command takes a while to run
.....
.....
error: bad offset for revindex
error: cannot unpack e87c6020c59406df2d95e651ec9a0eab90692373 from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack at offset 290068674
error: index CRC mismatch for object 6f721ba0a9ecbedce42a817371c4cdb1940934ee from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack at offset 290068721
error: inflate: data stream error (incorrect header check)
error: cannot unpack 6f721ba0a9ecbedce42a817371c4cdb1940934ee from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack at offset 290068721
error: index CRC mismatch for object aaa4977b41bb08114897e07bbb0c142f4cbe7db9 from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack at offset 290068767
error: unknown object type 0 at offset 290068767 in .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack
error: cannot unpack aaa4977b41bb08114897e07bbb0c142f4cbe7db9 from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack at offset 290068767
error: index CRC mismatch for object 1488f747269987f281b1828be1da6208bb99aa8f from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack at offset 290068809
fatal: BUG: total_in mismatch


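$ git gc --auto
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
error: The last gc run reported the following. Please correct the root cause
and remove .git/gc.log.
Automatic cleanup will not be performed until the file is removed.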

error: failed to validate delta base reference at offset 284563497 from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack
error: failed to read object d4b8121cdd55f7ab558ee9ccfd4376034faa9e24 at offset 284563474 from .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack
fatal: packed object d4b8121cdd55f7ab558ee9ccfd4376034faa9e24 (stored in .git/objects/pack/pack-5757457a32191837209154b230a9c5ed971e74e8.pack) is corrupt
error: failed to run repack

$ rm -rf .git/gc.log 

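Pulling again at this point still fails with the same SHA1 collision error.

In the end, the practical fix was to clone a fresh copy and keep the old repository as a backup. Judging from the git fsck and git gc output above, the local pack file was corrupt, so the error most likely came from damaged local data rather than a genuine SHA-1 collision.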

For more information, see the following pages:

http://git-scm.com/book/en/Git-Internals-Git-Objects

http://en.wikipedia.org/wiki/SHA-1

http://en.wikipedia.org/wiki/Xiaoyun_Wang
