gsoc-2011-git
My GSoC patches for `git diff --histogram`.
# Description

The patches have been grouped into directories corresponding to each
patch series that was sent to the mailing list.

The date and message-id of each cover letter email are provided here,
for posterity's sake; it should be trivial to find the thread and/or the
other emailed patches in the series on most common online mailing list
archives.

# Content listing

## pre-v1

A preliminary patch series.

* 0000-cover-letter.patch

    Date: Wed,  6 Jul 2011 14:15:42 +0800

    Message-Id: <1309932945-5048-1-git-send-email-rctay89@gmail.com>
* 0001-xdiff-xprepare-use-memset.patch
* 0002-xdiff-xpatience-factor-out-fall-back-diff-function.patch
* 0003-t4033-diff-patience-factor-out-tests.patch

## pre-v2

v2 of the preliminary series.

* 0000-cover-letter.patch

    Date: Thu,  7 Jul 2011 00:38:53 +0800

    Message-Id: <1309970337-6016-1-git-send-email-rctay89@gmail.com>
* 0001-xdiff-xprepare-use-memset.patch
* 0002-xdiff-xprepare-refactor-abort-cleanups.patch
* 0003-xdiff-xpatience-factor-out-fall-back-diff-function.patch
* 0004-t4033-diff-patience-factor-out-tests.patch

## pre-v3

v3 of the preliminary series.

* 0000-cover-letter.patch

    Date: Thu,  7 Jul 2011 12:23:54 +0800

    Message-Id: <1310012638-3668-1-git-send-email-rctay89@gmail.com>
* 0001-xdiff-xprepare-use-memset.patch
* 0002-xdiff-xprepare-refactor-abort-cleanups.patch
* 0003-xdiff-xpatience-factor-out-fall-back-diff-function.patch
* 0004-t4033-diff-patience-factor-out-tests.patch

## v1

The actual histogram diff implementation.

* 0000-cover-letter.patch

    Date: Tue, 12 Jul 2011 14:10:24 +0800

    Message-Id: <1310451027-15148-1-git-send-email-rctay89@gmail.com>
* 0001-teach-histogram-to-diff.patch
* 0002-xdiff-xprepare-skip-classification.patch
* 0003-xdiff-xprepare-use-a-smaller-sample-size-for-histogr.patch

## v2

v2 of the implementation.

* 0000-cover-letter.patch

    Date: Mon,  1 Aug 2011 11:16:40 +0800

    Message-Id: <1312168608-10828-1-git-send-email-rctay89@gmail.com>
* 0001-xdiff-xprepare-use-memset.patch
* 0002-do-away-with-xdl_mmfile_next.patch
* 0003-xdiff-xprepare-refactor-abort-cleanups.patch
* 0004-xdiff-xpatience-factor-out-fall-back-diff-function.patch
* 0005-t4033-diff-patience-factor-out-tests.patch
* 0006-teach-histogram-to-diff.patch
* 0007-xdiff-xprepare-skip-classification.patch
* 0008-xdiff-xprepare-use-a-smaller-sample-size-for-histogr.patch

## v2-rebased

The 'v2' series rebased on 'next'. Integrating 'v2' as-is would have
required re-rolling the series, which had already graduated to 'next',
so this series was sent in to make the maintainer's life easier.

* 0000-cover-letter.patch

    Date: Mon,  1 Aug 2011 12:20:06 +0800

    Message-Id: <1312172410-4380-1-git-send-email-rctay89@gmail.com>
* 0001-xdiff-do-away-with-xdl_mmfile_next.patch
* 0002-xdiff-xhistogram-rework-handling-of-recursed-results.patch
* 0003-xdiff-xhistogram-rely-on-xdl_trim_ends.patch
* 0004-xdiff-xhistogram-drop-need-for-additional-variable.patch

# Un-mailed patches

## raw-v*

These patches were a naive, line-by-line conversion of JGit's histogram
diff implementation. You can see that information is stored as an
integer bitmask, with shifts being used to read/write different fields.
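
To illustrate the bitmask style described above, here is a minimal sketch
in C. The field layout (a 24-bit line index and an 8-bit occurrence count
packed into one `uint32_t`) and all names are hypothetical, chosen only
for illustration; they are not taken from JGit or from these patches.

```c
#include <stdint.h>

/* Hypothetical layout: upper 24 bits = line index, lower 8 bits = count. */
#define REC_CNT_BITS 8
#define REC_CNT_MASK ((1u << REC_CNT_BITS) - 1)

/* Pack both fields into one integer; excess bits of either field are lost. */
static uint32_t rec_pack(uint32_t line, uint32_t cnt)
{
	return (line << REC_CNT_BITS) | (cnt & REC_CNT_MASK);
}

/* Reading a field means shifting and/or masking. */
static uint32_t rec_line(uint32_t rec)
{
	return rec >> REC_CNT_BITS;
}

static uint32_t rec_cnt(uint32_t rec)
{
	return rec & REC_CNT_MASK;
}

/* Writing one field means clearing its old bits before OR-ing in the new. */
static uint32_t rec_set_cnt(uint32_t rec, uint32_t cnt)
{
	return (rec & ~REC_CNT_MASK) | (cnt & REC_CNT_MASK);
}
```

Every access goes through a shift or a mask, which is compact but easy to
get wrong when the layout changes.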

## ptr-v*

These patches provided an implementation of the histogram diff algorithm
based on structs, instead of bitmasks. This reduced the complexity of
reading/writing of different fields and made code more readable.
However, there was a noticeable performance penalty, which held me back
from submitting this series. After running the code through a profiler,
I managed to pin the problem down to excessive `malloc()` calls.

This implementation is the one that was submitted for inclusion.
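
As a rough sketch of the struct-based style, and of one common way to cut
down on `malloc()` calls, the example below carves records out of a single
pre-allocated block. The names (`struct record`, `struct record_pool`,
`pool_get()`) are hypothetical, and this is not a claim about how the
submitted patches handle allocation.

```c
#include <stdlib.h>

struct record {
	unsigned int line;   /* line index */
	unsigned int cnt;    /* occurrence count */
	struct record *next; /* next record in the same hash chain */
};

struct record_pool {
	struct record *slab; /* one allocation backing all records */
	size_t used, cap;
};

/* Allocate room for `cap` records up front; returns 0 on success. */
static int pool_init(struct record_pool *p, size_t cap)
{
	p->slab = malloc(cap * sizeof(*p->slab));
	p->used = 0;
	p->cap = p->slab ? cap : 0;
	return p->slab ? 0 : -1;
}

/* Hand out the next free record, or NULL once the pool is exhausted. */
static struct record *pool_get(struct record_pool *p, unsigned int line)
{
	struct record *r;

	if (p->used >= p->cap)
		return NULL;
	r = &p->slab[p->used++];
	r->line = line;
	r->cnt = 1;
	r->next = NULL;
	return r;
}

/* Release every record with a single free(). */
static void pool_free(struct record_pool *p)
{
	free(p->slab);
	p->slab = NULL;
	p->used = p->cap = 0;
}
```

Fields are read and written by name (`r->cnt++`), and the number of
`malloc()` calls no longer grows with the number of records.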

## xd-v*

These patches were an attempt at improving the C histogram diff
implementation by reducing the calls to `scanA()` (the step that builds
the occurrence table) for each recursive sub-diff. This, of course,
ramped up the complexity of the code, but the performance gains were
minimal, so I junked this in the end.
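
For readers unfamiliar with the idea of an occurrence table, the sketch
below counts how often each line appears in one side of the diff, using a
small chained hash table. It is a simplified illustration only, not the
actual `scanA()`; the names, the table size and the hash function are all
assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>

#define TABLE_BITS 8
#define TABLE_SIZE (1u << TABLE_BITS)

struct occ {
	const char *line; /* the line's text (not copied) */
	unsigned int cnt; /* how many times it occurs */
	struct occ *next; /* collision chain */
};

/* Simple multiplicative string hash, reduced to a bucket index. */
static unsigned int hash_line(const char *s)
{
	unsigned int h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h & (TABLE_SIZE - 1);
}

/* Fill `table` (TABLE_SIZE zeroed buckets); returns 0, or -1 if out of memory. */
static int build_occurrences(struct occ **table, const char **lines, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int h = hash_line(lines[i]);
		struct occ *o;

		for (o = table[h]; o; o = o->next)
			if (!strcmp(o->line, lines[i]))
				break;
		if (o) {
			o->cnt++; /* seen before: bump the count */
			continue;
		}
		o = malloc(sizeof(*o));
		if (!o)
			return -1;
		o->line = lines[i];
		o->cnt = 1;
		o->next = table[h];
		table[h] = o;
	}
	return 0;
}

/* Look up how often `line` occurred; 0 if it never did. */
static unsigned int occurrences(struct occ **table, const char *line)
{
	struct occ *o;

	for (o = table[hash_line(line)]; o; o = o->next)
		if (!strcmp(o->line, line))
			return o->cnt;
	return 0;
}

static void free_occurrences(struct occ **table)
{
	unsigned int h;

	for (h = 0; h < TABLE_SIZE; h++) {
		while (table[h]) {
			struct occ *next = table[h]->next;
			free(table[h]);
			table[h] = next;
		}
	}
}
```

Rebuilding such a table for every recursive sub-diff repeats work on lines
that were already scanned, which is the cost these patches tried to avoid.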

## ptr2-v1

Like xd-v*, this was an initial attempt at reducing `scanA()` calls.

## guilt/master

Behind the scenes, I used [guilt][guilt] to manage changes to my
patches. The folder contains a dump of my `.git/patches/` repo (also
available in this repo at the [guilt/master branch][guilt master]).

## guilt/trim-ends

I toyed with the idea of skipping hashing of common head and tail lines
for a performance gain. However, as I worked on another diff-related
issue (see 'Related'), I began to doubt the correctness of this.

This series is still a work-in-progress and has not been submitted for
inclusion.
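
The basic idea of trimming can be sketched as follows: count the lines
that both files share at the start and at the end, so that later steps
only look at the middle. This is a minimal sketch under the assumption
that lines are compared as plain C strings; `struct trim` and
`common_ends()` are hypothetical names, not functions from these patches.

```c
#include <stddef.h>
#include <string.h>

struct trim {
	size_t head; /* number of identical leading lines */
	size_t tail; /* number of identical trailing lines */
};

/* Count common leading and trailing lines of a[0..na) and b[0..nb). */
static struct trim common_ends(const char **a, size_t na,
			       const char **b, size_t nb)
{
	struct trim t = { 0, 0 };
	size_t max = na < nb ? na : nb;

	while (t.head < max && !strcmp(a[t.head], b[t.head]))
		t.head++;
	/* The tail must not overlap the lines already counted as head. */
	while (t.tail < max - t.head &&
	       !strcmp(a[na - 1 - t.tail], b[nb - 1 - t.tail]))
		t.tail++;
	return t;
}
```

Only the lines between `head` and `n - tail` in each file would then need
to be hashed. Whether skipping the rest always leaves the resulting diff
unchanged is the correctness question raised above.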

## gists/gist-*

* [gist: 1063164 [gsoc-diff] numbers](https://gist.github.com/1063164)
* [gist: 1063175 [gsoc-diff] more numbers](https://gist.github.com/1063175)
* [gist: 1066703 [gsoc-diff] alternate hashing ("golden ratio")](https://gist.github.com/1066703)
* [gist: 1066773 [gsoc-diff] try_lcs: skip chains](https://gist.github.com/1066773)
* [gist: 1066906 [gsoc-diff] diff --assume-text](https://gist.github.com/1066906)
* [gist: 1068912 [gsoc-diff] malloc()/free() once](https://gist.github.com/1068912)
* [gist: 1069099 [gsoc-diff] reduce number of rec accesses](https://gist.github.com/1069099)
* [gist: 1074709 [gsoc-diff] reduce number of mallocs for gd-ptr](https://gist.github.com/1074709)
* [gist: 1098863 [gsoc-diff] make trim tail support context](https://gist.github.com/1098863)
* [gist: 1099236 [gsoc-diff] skip hashing of common leading lines](https://gist.github.com/1099236)
* [gist: 1099515 [gsoc-diff] xdl_trim_head()](https://gist.github.com/1099515)

# Related

Towards the close of the gsoc coding period, there was [a thread][1] on
the git mailing list regarding performance issues with git-diff. With
the knowledge I had gained of git diff's machinery over the course of
gsoc, I managed to diagnose the problem and [submitted a patch][2].

[1]: http://mid.gmane.org/loom.20110809T093124-847@post.gmane.org
[2]: http://mid.gmane.org/1313464312-5132-1-git-send-email-rctay89@gmail.com
[guilt]: http://www.kernel.org/pub/linux/kernel/people/jsipek/guilt/
[guilt master]: https://github.com/rctay/gsoc-2011-git/tree/guilt/master
