How to filter history based on gitignore?

To be clear, I am not asking how to delete a single file from history, as in this question: Completely delete a file from all Git repository commit history. Nor am I asking how to untrack files listed in .gitignore, as in this question: Ignore files that have already been linked to the Git repository.

I am asking about updating the .gitignore file and then removing everything that matches it from history, more or less like this question: Ignore files that have already been committed to the Git repository. Unfortunately, the answer there does not work for this purpose, so I am restating the question here in the hope of a good answer that does not require someone to walk the entire source tree and run filter-branch by hand on each affected file.

Below is a test script that performs the procedure from the answer to Ignore files that were already bound to a Git repository. It deletes and recreates a folder named root under PWD, so be careful before running it. I describe my goal after the code.

#!/bin/bash -e

TESTROOT=${PWD}
GREEN="\e[32m"
RESET="\e[39m"

rm -rf root
mkdir -v root
pushd root

mkdir -v repo
pushd repo
git init

touch a b c x 
mkdir -v main
touch main/{a,x,y,z}

# Initial commit
git add .
git commit -m "Initial Commit"
echo -e "${GREEN}Contents of first commit${RESET}"
git ls-files | tee ../00-Initial.txt

# Add another commit just for demo
touch d e f y z main/{b,c}
## Make some other changes
echo "Test" | tee a | tee b | tee c | tee x | tee main/a > main/x
git add .
git commit -m "Some edits"

echo -e "${GREEN}Contents of second commit${RESET}"
git ls-files | tee ../01-Changed.txt

# Now I want to ignore all 'a' and 'b', and all 'main/x', but not 'main/b'
## Checkout the root commit
git checkout -b temp $(git rev-list HEAD | tail -1)
## Add .gitignores
echo "a" >> .gitignore
echo "b" >> .gitignore
echo "x" >> main/.gitignore
echo "!b" >> main/.gitignore
git add .
git commit --amend -m "Initial Commit (2)"
## --v Not sure if it is correct
git rebase --onto temp master
git checkout master
## --v Now, why should I delete this branch?
git branch -D temp
echo -e "${GREEN}Contents after rebase${RESET}"
git ls-files | tee ../02-Rebased.txt

# Supposedly, rewrite history
git filter-branch --tree-filter 'git clean -f -X' -- --all
echo -e "${GREEN}Contents after filter-branch${RESET}"
git ls-files | tee ../03-Rewritten.txt

echo "History of 'a'"
git log -p a

popd # repo

popd # root

The goal is to remove a, b, and main/x from the entire history while keeping main/b. How can this be done?



This can be done with git filter-branch --tree-filter, but the script needs a few changes.


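Be aware that on a repository with many commits and/or many files, git filter-branch is slow. filter-branch copies every commit it filters and has to visit the files in each one. With 100 thousand commits of 100 thousand files each, that is 100 thousand times 100 thousand (10 000 000 000) files to examine.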

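The choice of filter matters here. --index-filter is the fast one, because it works on the index without checking out each commit. --tree-filter is the slow but convenient one, because the filter runs in an ordinary checked-out tree. With an index filter, the script has to do all of its work through index operations (for example, git update-index).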




--tree-filter

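With git filter-branch --tree-filter, each commit is checked out into a temporary directory, the filter command runs there, and whatever is left afterward becomes the new commit: new files are added and missing files are removed automatically. The temporary directory is separate from the normal work tree (by default .git-rewrite, which the -d option can relocate).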


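If the files to drop are known in advance, rather than derived from .gitignore, the tree filter can be as simple as rm -f known/path/to/file.ext.

With the paths listed in a file, that becomes: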

rm -f $(cat /tmp/files-to-remove)

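(If any path contains whitespace, feed the list to rm -f through xargs instead. xargs -0 reads entries separated by \0, which is the format Git commands emit with -z.)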
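As a minimal sketch of the NUL-separated variant, the following deletes two files, one of which has a space in its name. The scratch directory and the file names are made up for the illustration; only the xargs -0 rm -f pattern matters.

```shell
#!/bin/sh
set -e

# Hypothetical scratch directory with three files, two of them awkwardly named.
work=$(mktemp -d)
cd "$work"
touch "plain" "with space" "keep me"

# Build a NUL-separated list of the paths to delete.
printf '%s\0' "plain" "with space" > files-to-remove

# xargs -0 splits its input on NUL only, so embedded spaces are safe.
xargs -0 rm -f < files-to-remove
rm -f files-to-remove

ls    # lists what is left in the scratch directory
```

A newline-separated list read with $(cat ...) would have split "with space" into two arguments.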


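If /tmp/files-to-remove is NUL-separated, it can be read with xargs -0. To remove the paths from the index instead of the work tree:

xargs -0 git rm --cached -f --ignore-unmatch < /tmp/files-to-remove

This does to the index what rm -f does to the checked-out files, so it also works as an --index-filter. (Add -q to git rm --cached to silence it.)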
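A small sketch of that index-only removal, run in a throwaway repository rather than inside filter-branch. The repository, file names, and list location are assumptions made for the demonstration.

```shell
#!/bin/sh
set -e

# Identity for the throwaway commit (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q
touch "a file" b keep
git add .
git commit -q -m "track everything"

# NUL-separated list; 'missing' is not tracked and is skipped
# thanks to --ignore-unmatch.
list=$(mktemp)
printf '%s\0' "a file" b missing > "$list"
xargs -0 git rm --cached -f -q --ignore-unmatch < "$list"

git ls-files   # paths still in the index
ls             # the work tree itself is untouched
```

Only the index changes, which is why the same command can serve as an index filter.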

.gitignore

Your script uses --tree-filter like this:

git filter-branch --tree-filter 'git clean -f -X' -- --all

(after the rebase line is corrected as follows):

-git rebase --onto temp master
+git rebase --onto temp temp master

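The problem is that git clean -f -X only removes files that are both ignored and untracked. Files that are already part of the commit being filtered are tracked, so nothing gets removed.

The fix is to untrack everything and add it back before cleaning: ignored files then stay out of the index, and git clean removes them. Replace git clean -f -X as follows: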

-git filter-branch --tree-filter 'git clean -f -X' -- --all
+git filter-branch --tree-filter 'git rm --cached -qrf . && git add . && git clean -fqX' -- --all

(The extra q flags only make the commands quiet.)
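The difference can be seen outside filter-branch as well. This is a sketch in a throwaway repository with made-up file names, assuming a tracked file a that is later listed in .gitignore.

```shell
#!/bin/sh
set -e

# Identity for the throwaway commit (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q
touch a keep
git add a keep
git commit -q -m "track a and keep"

echo a > .gitignore       # 'a' is now ignored but still tracked

git clean -fqX            # removes nothing, because 'a' is tracked
[ -e a ] && echo "plain clean: a is still there"

git rm --cached -qrf .    # empty the index, leave the files alone
git add .                 # re-add everything that is not ignored
git clean -fqX            # 'a' is now untracked and ignored
[ -e a ] || echo "rm --cached + add + clean: a is gone"
```

The second git clean succeeds only because the preceding two commands left a out of the index.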

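All of this depends on the .gitignore files being present in each commit. Commits that do not have them yet need the .gitignore files copied in from somewhere outside the repository (a temporary directory, for instance). To set that up: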

mkdir -p /tmp/ignores-to-add/main
cp .gitignore /tmp/ignores-to-add
cp main/.gitignore /tmp/ignores-to-add/main

(Writing a script that finds and copies every .gitignore automatically is not covered here.) The --tree-filter then becomes:

cp -R /tmp/ignores-to-add/. . &&
    git rm --cached -qrf . &&
    git add . &&
    git clean -fqX

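The cp -R step, which can go anywhere before git add ., installs the desired .gitignore files. Because it runs for every commit, each rewritten commit gets them.

git rm --cached -qrf . empties the index without touching the checked-out files. (Removing $GIT_INDEX_FILE might do the same; that is untested here.)

git add . then adds everything back, except for whatever the .gitignore files now exclude.

git clean -qfX finally deletes the ignored files from the temporary tree, so that filter-branch does not put them back into the commit.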
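Put together, the procedure might look like the sketch below. It builds a two-commit throwaway repository and keeps the .gitignore to install in a second temporary directory; both locations and all file names are assumptions for the demonstration. FILTER_BRANCH_SQUELCH_WARNING only suppresses the start-up notice that newer Git versions print.

```shell
#!/bin/sh
set -e

# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

demo=$(mktemp -d)      # the repository to rewrite
ignores=$(mktemp -d)   # .gitignore files to install, kept outside the repo

cd "$demo"
git init -q
touch a keep
git add .
git commit -q -m "one"
echo change > keep
git commit -q -am "two"

echo a > "$ignores/.gitignore"

# Tree filter: install the ignores, rebuild the index, drop ignored files.
filter="cp -R '$ignores/.' . && git rm --cached -qrf . && git add . && git clean -fqX"
git filter-branch --tree-filter "$filter" -- --all

git ls-tree -r --name-only HEAD   # files in the rewritten tip commit
git rev-list HEAD -- a            # commits on HEAD that touch 'a'
```

The original commits remain reachable through refs/original until those refs are deleted.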


Consider the tree filter from the answer above:

cp -R /tmp/ignores-to-add/. . &&
git rm --cached -qrf . &&
git add . &&
git clean -fqX

The same can be done with an index filter.

With the existing .gitignore:

git filter-branch --index-filter '
  git ls-files -c -i --exclude-from=.gitignore | xargs git rm --cached -q
' -- --all

With a new .gitignore copied into every commit:

cp ../.gitignore /d/tmp-gitignore
git filter-branch --index-filter '
  cp /d/tmp-gitignore ./.gitignore
  git add .gitignore
  git ls-files -c -i --exclude-from=.gitignore | xargs git rm --cached -q
' -- --all
rm /d/tmp-gitignore

Add grep -v to keep certain files, for example files named empty:

git ls-files -c -i --exclude-from=.gitignore | grep -vE "empty$" | xargs git rm --cached -q
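To see what that pipeline selects before letting it remove anything, the listing half can be run on its own. The sketch below uses a throwaway repository with made-up files. The -c flag is spelled out because recent Git versions refuse -i without -c or -o.

```shell
#!/bin/sh
set -e

# Identity for the throwaway commit (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q
mkdir cache
touch app.c debug.log cache/empty
git add .
git commit -q -m "everything tracked"

# Ignore rules written after the fact.
printf '%s\n' '*.log' 'empty' > .gitignore

# Tracked files that match the ignore rules.
git ls-files -c -i --exclude-from=.gitignore

# The same list with placeholder files named 'empty' filtered out.
git ls-files -c -i --exclude-from=.gitignore | grep -vE "empty$"
```

Whatever the second command prints is what git rm --cached would receive.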

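This method makes git forget ignored files completely (past/present/future), without deleting them from the working directory (even after re-pulling from the remote).

It requires /.git/info/exclude (preferred) or a pre-existing .gitignore in every commit that contains files to be ignored/forgotten. 1

Every way of enforcing git ignore behavior after the fact rewrites history, which has serious consequences for any public/shared/collaborative repo that is pulled afterward. 2

General advice: start with a clean repo, with everything committed, and make a backup!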

Also, the comments/revision history of this answer (and revision history of this question) may be useful/enlightening.

#commit up-to-date .gitignore (if not already existing)
#this command must be run on each branch

git add .gitignore
git commit -m "Create .gitignore"

#apply standard git ignore behavior only to current index, not working directory (--cached)
#if this command returns nothing, ensure /.git/info/exclude AND/OR .gitignore exist
#this command must be run on each branch

git ls-files -c -z --ignored --exclude-standard | xargs -0 git rm --cached

#Commit to prevent working directory data loss!
#this commit will be automatically deleted by the --prune-empty flag in the following command
#this command must be run on each branch

git commit -m "ignored index"

#Apply standard git ignore behavior RETROACTIVELY to all commits from all branches (--all)
#This step WILL delete ignored files from working directory UNLESS they have been dereferenced from the index by the commit above
#This step will also delete any "empty" commits.  If deliberate "empty" commits should be kept, remove --prune-empty and instead run git reset HEAD^ immediately after this command

git filter-branch --tree-filter 'git ls-files -c -z --ignored --exclude-standard | xargs -0 git rm -f --ignore-unmatch' --prune-empty --tag-name-filter cat -- --all

#List all still-existing files that are now ignored properly
#if this command returns nothing, it is time to restore from backup and start over
#this command must be run on each branch

git ls-files --other --ignored --exclude-standard

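Finally, follow the rest of this GitHub guide (starting at step 6), which includes important warnings and information about the commands below.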

git push origin --force --all
git push origin --force --tags
git for-each-ref --format="delete %(refname)" refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc --prune=now

Collaborators who pull from the rewritten remote repo should make a backup and then:

#fetch modified remote

git fetch --all

#"Pull" changes WITHOUT deleting newly-ignored files from working directory
#This will overwrite local tracked files with remote - ensure any local modifications are backed-up/stashed
#Switching branches after this procedure WILL LOSE all newly-gitignored files in working directory because they are no longer tracked when switching branches

git reset FETCH_HEAD

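1 Because /.git/info/exclude applies to every commit that gets filtered, wherever it sits in history, the details of getting a .gitignore file into the historical commits that need it are outside the scope of this answer. A .gitignore in the root commit makes the history look as if ignoring had been set up from the start, but /.git/info/exclude achieves the same filtering no matter where the .gitignore exists in the history.

For what it is worth, possible methods include git rebase or a git filter-branch that copies an external .gitignore into each commit, as in the other answers to this question.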

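2 Enforcing git ignore behavior after the fact by committing the result of a standalone git rm --cached can delete the newly ignored files on later pulls from the force-pushed remote. The --prune-empty flag of git filter-branch avoids this by removing that "ignored index" commit automatically. Rewriting git history also changes commit hashes, which disrupts later pulls from public/shared/collaborative repos, so understand the consequences before doing this. The GitHub guide says:

Tell your collaborators to rebase, rather than merge any branches they created from your old (damaged) repository history. A single merge commit can restore some or all of the tainted history that you just cleared.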

Alternative solutions that do not affect the remote repo are git update-index --assume-unchanged </path/file> or git update-index --skip-worktree <file>, examples of which can be found here.


Source: https://habr.com/ru/post/1675013/