<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/gitano/libgit2.git/src/hashsig.h, branch replace-luagit2</title>
<subtitle>git.gitano.org.uk: libgit2.git
</subtitle>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitano/libgit2.git/'/>
<entry>
<title>Refine pluggable similarity API</title>
<updated>2013-02-20T23:09:41+00:00</updated>
<author>
<name>Russell Belfer</name>
<email>rb@github.com</email>
</author>
<published>2013-02-19T18:25:41+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitano/libgit2.git/commit/?id=9bc8be3d7e5134de1d912c7ef08d6207079bd8c1'/>
<id>9bc8be3d7e5134de1d912c7ef08d6207079bd8c1</id>
<content type='text'>
This plugs in the three basic similarity strategies for handling
whitespace via internal use of the pluggable API.  In so doing, I
realized that the use of git_buf in the hashsig API was not needed
and actually just made it harder to use, so I tweaked that API as
well.

Note that the similarity metric is still not hooked up in the
find_similarity code - this is just setting out the function that
will be used.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This plugs in the three basic similarity strategies for handling
whitespace via internal use of the pluggable API.  In so doing, I
realized that the use of git_buf in the hashsig API was not needed
and actually just made it harder to use, so I tweaked that API as
well.

Note that the similarity metric is still not hooked up in the
find_similarity code - this is just setting out the function that
will be used.
</pre>
</div>
</content>
</entry>
<entry>
<title>Change similarity metric to sampled hashes</title>
<updated>2013-02-20T23:09:40+00:00</updated>
<author>
<name>Russell Belfer</name>
<email>rb@github.com</email>
</author>
<published>2013-02-15T01:25:10+00:00</published>
<link rel='alternate' type='text/html' href='http://git.baserock.org/cgit/delta/gitano/libgit2.git/commit/?id=5e5848eb15cc0dd8476d1c6882a9f770e6556586'/>
<id>5e5848eb15cc0dd8476d1c6882a9f770e6556586</id>
<content type='text'>
This moves the similarity metric code out of buf_text and into a
new file.  Also, this implements a different approach to similarity
measurement based on a Rabin-Karp rolling hash where we only keep
the top 100 and bottom 100 hashes.  In theory, that should be
sufficient samples to give a fairly accurate measurement while
limiting the amount of data we keep for file signatures no matter
how large the file is.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This moves the similarity metric code out of buf_text and into a
new file.  Also, this implements a different approach to similarity
measurement based on a Rabin-Karp rolling hash where we only keep
the top 100 and bottom 100 hashes.  In theory, that should be
sufficient samples to give a fairly accurate measurement while
limiting the amount of data we keep for file signatures no matter
how large the file is.
</pre>
</div>
</content>
</entry>
</feed>
