Git merge duplication after botched use of BFG

I somehow thoroughly messed up my entire repository (used only by me) and could use some help sorting it out.

Here is what I did. I realized that my commit history contained some files with credentials that I did not want lying around. So I decided to do the right thing and use the BFG Repo-Cleaner to fix the problem. I added all the credential files to .gitignore and went on to try to clear them out of the history. Following the instructions in the documentation, I ran these commands:

 git clone --mirror myrepo.git
 java -jar bfg.jar --delete-files stuffthatshouldbedeleted.txt myrepo.git

At that point, the BFG told me that x files had been found and deleted. Sweet.

 cd myrepo.git
 git reflog expire --expire=now --all
 git gc --prune=now --aggressive
 git push

According to the terminal logs, it updated the repo. So far so good, right? I log in to my GitHub account and, after a few clicks, find the credentials in my history, still there, file and all. I went back and tried the same set of commands, but with this line instead of the file deletion:

 java -jar bfg.jar --replace-text passwords.txt myrepo.git 

where passwords.txt is a file containing the literal strings of all the credentials I wanted to remove. Again, the BFG logs show that it fixed several instances. I push, check, and the credentials are still there, sitting on GitHub. I notice that the SHA-1 hashes of all my commits have changed, so presumably the BFG did something, just not what I wanted it to do.

At this point I give up and try to get back to work, figuring I'll sort it out later. I do some work, try to push, and get a strange merge conflict (you are 50 commits ahead and 50 behind). What? I try to pull and merge, and suddenly every commit in my Git history is duplicated by name, and some of them are just empty. I check my GitHub network graph, and there seems to be a second branch, starting from my initial commit, that exactly mirrors all my commits and was merged with my last commit (I never branched, just committed linearly).

I cannot go back to a previous commit because they are all duplicated chronologically. My credentials are still there, now twice as many times, and my history is doubled and very confusing to read. When I try to run the BFG from scratch, making a fresh mirror clone of the repo, it tells me there are no credentials in it, despite the fact that I can see them on GitHub. I could really use some help understanding what happened and how, if at all, I can get back to a sane state.

I am close to just deleting the entire repo and starting over. I really don't want to do that.

TL;DR: I tried to use the BFG, somehow duplicated half-baked versions of every commit in my repo, and can't untangle it. To add insult to injury, the BFG did nothing while claiming it had done its job.

1 answer

I am the author of the BFG. I will try to describe what I think happened, step by step, based on your account:

Manual cleaning before running the BFG ...

You first:

added all the credential files to .gitignore and went on to try to clear them out of the history.

Two important steps are missing from this description of your actions:

If you did not do these things, that would explain why your credentials were not completely cleared from your repository.
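The two steps themselves are not spelled out above. Judging by the later remark about manually removing bad content from your latest commits, a reasonable reading is that they are (1) untracking the credential files and (2) committing that change, since adding a path to .gitignore does not untrack a file that is already committed. A minimal sketch under that assumption, run in a throwaway repository (the file name credentials.txt is hypothetical):

```shell
# Throwaway demo repo; credentials.txt is a hypothetical file name.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo "password=hunter2" > credentials.txt
git add credentials.txt
git commit -q -m "Add credentials by mistake"

# Ignoring the file is not enough: it is already tracked.
echo "credentials.txt" > .gitignore
git add .gitignore
git commit -q -m "Ignore credentials"
tracked_before=$(git ls-files credentials.txt)

# Assumed step 1: remove the file from the index (the copy on disk is kept).
git rm -q --cached credentials.txt
# Assumed step 2: commit the removal so the latest commit is clean.
git commit -q -m "Stop tracking credentials"
tracked_after=$(git ls-files credentials.txt)

echo "tracked before: '$tracked_before', after: '$tracked_after'"
```

Only after the latest commit is clean does it make sense to make the mirror clone and run the BFG on the older history.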

Launching BFG for the first time ...

Moving on, you:

  • made a new mirror clone of your repo from GitHub
  • ran the BFG, filtering with the --delete-files option (did you see a warning about protected content?)
  • pushed the updated repository to GitHub

... at what point:

According to the terminal logs, it updated the repo. So far so good, right? I log in to my GitHub account and, after a few clicks, find the credentials in my history, still there, file and all.

So, assuming that you correctly removed the bad content from your latest commits by hand before running the BFG, what you saw is rather strange. Possible reasons:

a) The repository was not cloned with the --mirror flag, so not all branches on GitHub were overwritten, leaving dirty history in non-main branches. However, you explicitly stated that you used the --mirror flag.

b) Even after a mirror push to GitHub, old commits remain available there when they are referenced by an explicit commit id (i.e. a GitHub URL that has a commit id in it), up to the point where GitHub runs automatic garbage collection on your repository. Pull requests and forks can also keep commits from the old history alive. This would be another possible explanation for the dirty commits you saw.
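Explanation (a) can be checked locally by searching every commit reachable from any ref for the credential string. The sketch below builds a throwaway repository to show the idea; the file name and the string hunter2 are hypothetical, and in a real mirror clone the ref listing may also include refs beyond your branches (pull request refs on GitHub, as far as I know). It cannot detect case (b), since unreferenced commits held by GitHub are not in your clone.

```shell
# Throwaway demo repo; credentials.txt and "hunter2" are hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo "password=hunter2" > credentials.txt
git add credentials.txt
git commit -q -m "Add credentials by mistake"
git rm -q credentials.txt
git commit -q -m "Remove credentials"

# Every ref in the repository: these are what keep commits reachable.
git for-each-ref --format='%(refname)'

# Search the tree of every commit reachable from any ref.
# Output lines have the form <commit-id>:<path>.
hits=$(git grep -l "hunter2" $(git rev-list --all) || true)
echo "$hits"
```

An empty result in a fresh mirror clone suggests that no ref still points at dirty history, which leaves (b) as the more likely explanation.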

Launching BFG a second time ...

In any case, at that moment you were worried, and:

  • ran BFG again, this time with --replace-text passwords.txt , which updates the contents of the file rather than deleting the entire file.

Again, the BFG logs show that it fixed several instances. I push, check, and the credentials are still there, sitting on GitHub.

It's a little curious that the BFG said there was more content to clean up (maybe your credentials were in more places than you thought), but in any case, whatever the reason you still saw them after the first run, it is the same reason you saw them after the second run.

Return to work

At this point I give up and try to get back to work, figuring I'll sort it out later.

So, at this point, you had rewritten your Git repository history (twice!) and pushed it to GitHub. But your account does not mention deleting all your old copies of the repo, as the BFG instructions say:

"At this point, you're ready for everyone to ditch their old copies of the repo and do fresh clones of the nice, new pristine data."

So, did you delete the old working copy of the Git repository on your work machine and re-clone the new Git repository history? The history in your old repo would be different from the "cleaned" history present on GitHub at that point (even if the "cleaned" history was not as clean as you wanted!).

I do some work, try to push, and get a strange merge conflict (you are 50 commits ahead and 50 behind).

If you did the work in the old local copy of your Git repository (and not in a fresh clone from GitHub), then this is exactly what you would see. You are, in effect, trying to push 50 commits of old dirty history to GitHub, where the branch already holds 50 completely different commits (different as far as Git is concerned, since it only cares about commit ids here). Git thinks you're doing something a bit odd ("ahead 50 and behind 50") and is trying to tell you so.
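The ahead/behind numbers can be reproduced on a small scale. The sketch below does not run the BFG; it simulates a rewrite by building a second, independent history with the same commit messages and force-pushing it over the first. All directory and file names are hypothetical.

```shell
# Simulation only: "clean" stands in for a BFG-rewritten history.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q --bare origin.git

make_history() {  # $1 = directory, $2 = content of the first file
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Demo"
  git -C "$1" config user.email "demo@example.com"
  echo "$2" > "$1/config.txt"
  git -C "$1" add config.txt
  git -C "$1" commit -q -m "Add config"
  echo "more" > "$1/notes.txt"
  git -C "$1" add notes.txt
  git -C "$1" commit -q -m "Add notes"
  git -C "$1" remote add origin "$demo/origin.git"
}

# The old working copy pushes the dirty history first.
make_history old "password=hunter2"
git -C old push -q origin master

# The rewritten history then replaces it on the remote.
make_history clean "password=***REMOVED***"
git -C clean push -q -f origin master

# Back in the old working copy: compare local and remote branches.
git -C old fetch -q origin
counts=$(git -C old rev-list --left-right --count master...origin/master)
echo "ahead/behind: $counts"
```

Both sides report every one of their commits as missing from the other, because the two lines share no commit ids even though the messages match.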

Making things worse ...

What? I try to pull and merge, and suddenly every commit in my Git history is duplicated by name, and some of them are just empty. I check my GitHub network graph, and there seems to be a second branch, starting from my initial commit, that exactly mirrors all my commits and was merged with my last commit

So, by pulling and merging, you combined the cleaned history and the dirty history, joining them with a merge commit. In terms of sorting out your history, this was a bad idea. It would have been better to rebase the new work on top of the cleaned history, push it, delete your old working repo and make a fresh clone.
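A minimal sketch of that rebase, again using a simulated rewrite rather than a real BFG run. The names old, clean, config.txt and feature.txt are hypothetical, and last_dirty stands for the last commit of the old history that has a cleaned counterpart on the remote.

```shell
# Simulation only: "clean" stands in for a BFG-rewritten history.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q --bare origin.git

new_repo() {  # $1 = directory
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Demo"
  git -C "$1" config user.email "demo@example.com"
  git -C "$1" remote add origin "$demo/origin.git"
}
commit_file() {  # $1 = directory, $2 = file name, $3 = content
  echo "$3" > "$1/$2"
  git -C "$1" add "$2"
  git -C "$1" commit -q -m "Add $2"
}

# Old working copy with dirty history, pushed first.
new_repo old
commit_file old config.txt "password=hunter2"
git -C old push -q origin master

# Simulated cleaned history, force-pushed over it.
new_repo clean
commit_file clean config.txt "password=***REMOVED***"
git -C clean push -q -f origin master

# New work done in the old working copy after the rewrite.
commit_file old feature.txt "new work"
last_dirty=$(git -C old rev-parse master~1)

# Replay only the commits after last_dirty onto the cleaned history.
git -C old fetch -q origin
git -C old rebase -q --onto origin/master "$last_dirty" master
```

After this, master in the old working copy is the cleaned history plus the new work, so a normal push would succeed without a merge.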

Aftermath

When I try to run the BFG from scratch, making a fresh mirror clone of the repo, it tells me there are no credentials in it, despite the fact that I can see them on GitHub.

This is rather strange, and I have no real explanation for it other than operator error, apart from the GitHub gc explanation above. You could share the repository with me (if you wish) so that I can inspect it in more detail, or just send me a ZIP of the ".bfg-report" directory so that I can see what diagnostics the BFG captured during its run.

Recovery

I could really use some help understanding what happened and how, if at all, I can get back to a sane state.

I hope I have managed to explain what happened.

In terms of sorting out your history (i.e. getting rid of the two duplicate lines), you need to reset your Git history back to the (cleaned) point before the merge commit was added. Look at the merge commit and decide which parent's history you prefer. What is the last commit ( xxxx ) in that history before the merge?
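One way to answer that question is to list the parents of the most recent merge commit and look at what each one contains. The sketch below recreates the situation in a throwaway directory by merging two unrelated histories; the repository and file names are hypothetical, and --allow-unrelated-histories assumes Git 2.9 or later.

```shell
# Simulation only: "clean" stands in for a BFG-rewritten history.
set -e
demo=$(mktemp -d)
cd "$demo"

new_repo() {  # $1 = directory
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Demo"
  git -C "$1" config user.email "demo@example.com"
}

# Dirty line of history.
new_repo old
echo "password=hunter2" > old/credentials.txt
echo "notes" > old/notes.txt
git -C old add credentials.txt notes.txt
git -C old commit -q -m "Initial commit"

# Cleaned line: same message, but no credentials file.
new_repo clean
echo "notes" > clean/notes.txt
git -C clean add notes.txt
git -C clean commit -q -m "Initial commit"

# Merge the cleaned line into the dirty one, as a pull would.
cd old
dirty_tip=$(git rev-parse HEAD)
git fetch -q ../clean master
clean_tip=$(git rev-parse FETCH_HEAD)
git merge -q --allow-unrelated-histories -m "Merge cleaned history" FETCH_HEAD

# Find the latest merge commit and inspect each parent.
merge=$(git rev-list --merges -n 1 HEAD)
parents=$(git show -s --format=%P "$merge")
for p in $parents; do
  git log -1 --format='%h %s' "$p"
  git ls-tree -r --name-only "$p"
done
```

In this simulated case the second parent is the line without credentials.txt, so it would be the candidate for xxxx. In a real repository, check the file listings or search the file contents of each parent before choosing.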

 git checkout master
 git reset --hard xxxx

This may well lose the last bit of work that you did on your old, dirty history. Identify that commit ( yyyy ) and rebase it onto your history, or just cherry-pick it:

 git cherry-pick yyyy 

Finally, push the restored history to GitHub with the "force" flag:

 git push origin master -f 

... make a ZIP archive of your old repo, and then delete all the old local copies of your repo to prevent further confusion. Make a fresh clone.


Source: https://habr.com/ru/post/971682/