author	Nicolas Pitre <nico@cam.org>
	Wed, 3 May 2006 03:31:00 +0000 (23:31 -0400)
committer	Junio C Hamano <junkio@cox.net>
	Wed, 3 May 2006 04:32:39 +0000 (21:32 -0700)
commit	06a9f9203570d21f9ef5fe219cdde527dcdf0990
tree	7a06db8f14535b45ba0c34c2d252a2b833314554
parent	2d08e5dd730680f7f8645a6326ec653435e032df

improve diff-delta with sparse and/or repetitive data

It is useless to preserve multiple hash entries for consecutive blocks
with the same hash. Keeping only the first one will allow for matching
the longest string of identical bytes while subsequent blocks will only
allow for shorter matches. The backward matching code will match the
end of it as necessary.
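
As a rough illustration of the idea above (not the actual diff-delta.c
code), the following Python sketch builds a block hash index and skips
any block whose hash equals that of the block immediately before it.
The block size, the hash function, and the names BLOCK, block_hash and
build_index are assumptions made up for this example.

```python
BLOCK = 16  # assumed block size, chosen only for illustration


def block_hash(block: bytes) -> int:
    # Stand-in hash for the sketch; the real code uses its own hash.
    h = 0
    for b in block:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h


def build_index(src: bytes, dedupe: bool = True) -> dict:
    """Map each block hash to the offsets of the blocks that produced it.

    With dedupe=True, a block is skipped when its hash equals the hash of
    the block right before it, so a run of identical blocks contributes
    only its first offset.
    """
    index = {}
    prev = None
    for off in range(0, len(src) - BLOCK + 1, BLOCK):
        h = block_hash(src[off:off + BLOCK])
        if dedupe and h == prev:
            continue  # same hash as the previous block: keep only the first
        index.setdefault(h, []).append(off)
        prev = h
    return index


if __name__ == "__main__":
    # Sparse-looking input: two long runs of zero bytes around one data block.
    data = bytes(1024) + b"hello world 1234" + bytes(1024)
    for dedupe in (False, True):
        idx = build_index(data, dedupe)
        total = sum(len(offsets) for offsets in idx.values())
        print("dedupe=%s: %d index entries" % (dedupe, total))
```

With this input each run of zero blocks is represented by a single entry
at the start of the run, which is the position that allows the longest
forward match.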

This improves both performance (no repeated string compares over long
runs of identical bytes, or even small groups of bytes) and compression
(random hash bucket entry culling is less likely to be needed),
especially with sparse files.

With well-behaved data sets this patch doesn't change much.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>