To find duplicates of the commit $hash, excluding merge commits:
git rev-list --no-merges --all | xargs -r git show | git patch-id \
    | grep ^$(git show $hash|git patch-id|cut -c1-40) | cut -c42-80 \
    | xargs -r git show -s --oneline
To find duplicates of the merge commit $mergehash, replace $(git show $hash|git patch-id|cut -c1-40) above with one of the two patch IDs (1st column) printed by git diff-tree -m -p $mergehash | git patch-id. They correspond to the diffs of the merge against each of its two parents.
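As a rough sketch of the merge case, the helper below collects the patch IDs of a merge and feeds them into the same search pipeline as above. The function name find_merge_duplicates is hypothetical, and instead of picking one of the two patch IDs by hand it searches for both at once with grep -F, which is an assumption about what is convenient rather than part of the original recipe.

```shell
# Sketch only: find_merge_duplicates is a hypothetical helper, not a git command.
# Usage: find_merge_duplicates <merge-commit>
find_merge_duplicates() {
    # One patch ID per parent: the diff of the merge against that parent.
    pids=$(git diff-tree -m -p "$1" | git patch-id | cut -d' ' -f1)
    # Nothing to search for if the merge produced no textual diff.
    [ -n "$pids" ] || return 1
    # grep -F treats each line of $pids as a separate fixed-string pattern,
    # so commits matching either parent-wise diff are listed.
    git rev-list --no-merges --all | xargs -r git show | git patch-id \
        | grep -F "$pids" | cut -d' ' -f2 \
        | xargs -r git show -s --oneline
}
```

Calling find_merge_duplicates $mergehash inside a repository would then list the non-merge commits whose patch matches the merge's diff against either parent.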
To find duplicates among all commits, excluding merge commits:
git rev-list --no-merges --all | xargs -r git show | git patch-id \
    | sort | uniq -w40 -D | cut -c42-80 \
    | xargs -r git log --no-walk --pretty=format:"%h %ad %an (%cn) %s" --date-order --date=iso
The search for duplicate commits can be widened or narrowed by changing the arguments to git rev-list, which accepts many options. For example, to limit the search to a specific branch, give its name instead of the --all option; to search only the last 100 commits, pass the arguments HEAD ^HEAD~100.
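One way to make the scope adjustable is to wrap the pipeline in a function that forwards its arguments to git rev-list. This is a minimal sketch: find_duplicates_in is a hypothetical name, mybranch is a placeholder branch, and uniq -w40 -D assumes GNU uniq and 40-character (SHA-1) patch IDs.

```shell
# Sketch only: find_duplicates_in is a hypothetical helper, not a git command.
# All arguments are passed through to git rev-list to select the commits.
find_duplicates_in() {
    git rev-list --no-merges "$@" | xargs -r git show | git patch-id \
        | sort | uniq -w40 -D | cut -d' ' -f2 \
        | xargs -r git log --no-walk --pretty=format:"%h %ad %an (%cn) %s" --date=iso
}

# Example invocations (run inside a repository):
#   find_duplicates_in --all              # every ref
#   find_duplicates_in mybranch           # one branch (placeholder name)
#   find_duplicates_in HEAD ^HEAD~100     # the last 100 commits
```

Because the function only forwards arguments, any other git rev-list option (for example --since or a path limiter) should work the same way.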
Note that these commands are fast because they avoid a shell loop and process commits in batches.
To include merge commits, remove the --no-merges option and replace xargs -r git show with xargs -r -L1 git diff-tree -m -p. This is much slower because git diff-tree runs once per commit.
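Put together, the merge-inclusive variant might look like the sketch below. The function name find_duplicates_with_merges is hypothetical; the substitutions are the ones described above. Note that git diff-tree prints nothing for a root commit unless --root is given, so root commits are not compared here, and a merge can appear in the map twice (once per parent).

```shell
# Sketch only: find_duplicates_with_merges is a hypothetical helper.
# Arguments are passed to git rev-list, e.g. --all or a branch name.
find_duplicates_with_merges() {
    # -L1 makes xargs run git diff-tree once per commit hash;
    # -m -p emits one diff per parent, so merges yield several patch IDs.
    git rev-list "$@" | xargs -r -L1 git diff-tree -m -p | git patch-id \
        | sort | uniq -w40 -D | cut -d' ' -f2 \
        | xargs -r git log --no-walk --pretty=format:"%h %ad %an (%cn) %s" --date=iso
}
```

On large histories it is probably worth narrowing the rev-list arguments first, since the per-commit process spawn dominates the run time.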
Explanation:
The first line builds a map from patch IDs to commit hashes (two columns of 40 characters each).
The second line keeps the commit hashes (2nd column) whose patch IDs (1st column) are duplicated.
The last line prints summary information about the duplicate commits.
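To make the three stages easier to inspect, they can be split into separate steps with the intermediate results held in variables. This is only an illustrative sketch: list_duplicates_staged is a hypothetical name, keeping the whole map in a shell variable is an assumption that suits small to medium repositories, and uniq -w40 -D again assumes GNU uniq and 40-character patch IDs.

```shell
# Sketch only: list_duplicates_staged is a hypothetical helper.
list_duplicates_staged() {
    # Stage 1: one "<patch-id> <commit-hash>" row per non-merge commit.
    map=$(git rev-list --no-merges --all | xargs -r git show | git patch-id)
    # Stage 2: keep rows whose first 40 characters (the patch ID) repeat,
    # then keep only the commit-hash column.
    dups=$(printf '%s\n' "$map" | sort | uniq -w40 -D | cut -d' ' -f2)
    # Stage 3: print one summary line per duplicate commit.
    # xargs -r skips the git call entirely when there are no duplicates.
    printf '%s\n' "$dups" | xargs -r git show -s --oneline
}
```

Adding a line such as printf '%s\n' "$map" after stage 1 is an easy way to look at the raw map while debugging.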