Git fsck - directory check only

Question


I serve bare Git repositories from my Raspberry Pi. My goal is to run git fsck --full every night to detect file system problems early. I expect fsck to check both "object directories" and "objects", and to produce output such as:

 pi@raspi2:/media/usb/git/dw.git $ git fsck --full
 Checking object directories: 100% (256/256), done.
 Checking objects: 100% (14538/14538), done.

For one of my repositories, objects are not scanned:

 pi@raspi2:/media/usb/git/ts-ch.git.borken $ git --version
 git version 2.11.0
 pi@raspi2:/media/usb/git/ts-ch.git.borken $ git fsck --full
 Checking object directories: 100% (256/256), done.
 pi@raspi2:/media/usb/git/ts-ch.git.borken $

I modified one file in objects/ (a 322 KB .pdf file) and ran fsck again. It showed the same message as before, and no errors.

 cd objects/86/
 chmod u+w f3e6e674431ab3006cbb56fddecbdb4a7724b4
 echo "foosel" >> f3e6e674431ab3006cbb56fddecbdb4a7724b4
 chmod u-w f3e6e674431ab3006cbb56fddecbdb4a7724b4

All repositories are set up the same way: they are bare and have no special configuration:

 pi@raspi2:/media/usb/git/ts-ch.git $ git config --list
 core.repositoryformatversion=0
 core.filemode=true
 core.bare=true

Am I missing something? Why is this modified object not detected? Its SHA-1 will certainly no longer match. Thanks for any tips!

+5

git

dnswlt Jan 7 '17 at 11:52


2 answers

I still don't understand why git refuses to report that it is checking objects in this repo,
I am going to raise it on the Git mailing list, because I think git fsck should check things carefully enough for all operations to work

This may be addressed by these two patch sets in Git 2.12 (Q1 2017): compiling Git 2.12 on your Raspberry Pi may give better results now.

"git fsck" now inspects loose objects more carefully.

See commit cce044d, commit c68b489, commit f6371f9, commit 118e6ce, commit 771e7d5, commit 0b20f1a (January 13, 2017) by Jeff King (peff).
(Merged by Junio C Hamano (gitster) in commit 42ace93, January 31, 2017.)

And:

"git fsck --connectivity-check" was not working at all.

See commit a2b2285, commit 97ca7ca (January 26, 2017), commit c20d4d7 (January 24, 2017), commit c2d17b3, commit c3271a0, commit c6c7b16, commit b4584e4, commit 1ada11e (January 16, 2017), and commit 3e3f8bd (January 17, 2017) by Jeff King (peff).
(Merged by Junio C Hamano (gitster) in commit 4ba6197, January 31, 2017.)

0

VonC Feb 12 '17 at 10:51


jszakmeister · Accepted Answer · 2017-01-07T12:19:13+0000

On the issue of corruption

Yes, you are missing something. Namely, you did not corrupt the file in a way that Git pays attention to. An object stored on disk generally starts with the object type, followed by a space, followed by the size (in ASCII digits), followed by a NUL byte. The size states how large the object is, and that is all Git ends up reading. So appending data to the end like this does not actually corrupt the object. If you had replaced the contents of the file with something else, you would have seen a problem.
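To illustrate why appended bytes go unnoticed, here is a minimal Python sketch. It assumes the loose object file is a single zlib stream wrapping the header and data described above; the helper names make_loose_object and read_loose_object are made up for this example and are not part of Git.

```python
import hashlib
import zlib


def make_loose_object(obj_type: str, data: bytes) -> tuple[str, bytes]:
    """Return (SHA-1 hex name, zlib-compressed bytes) for a loose object."""
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    store = header + data
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


def read_loose_object(raw: bytes) -> tuple[bytes, bytes]:
    """Inflate one zlib stream; return (inflated bytes, bytes left over after it)."""
    inflater = zlib.decompressobj()
    inflated = inflater.decompress(raw)
    return inflated, inflater.unused_data


oid, on_disk = make_loose_object("blob", b"hello\n")

# Same effect as: echo "foosel" >> <object file>
tampered = on_disk + b"foosel\n"

inflated, garbage = read_loose_object(tampered)
print(hashlib.sha1(inflated).hexdigest() == oid)  # the hash still matches
print(garbage)  # the appended bytes sit after the end of the zlib stream
```

The inflated header and data are unchanged, so a check that only inflates and hashes them has nothing to complain about. A stricter reader could treat a non-empty leftover as an error.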

For more information about the object format, see the Git User's Manual:

Object storage format
All objects have a statically determined type that identifies the format of the object (i.e. how it is used and how it can refer to other objects). There are currently four different object types: blob, tree, commit, and tag.
Regardless of object type, all objects share the following characteristics: they are all deflated with zlib and have a header that not only specifies their type, but also provides size information about the data in the object. It is worth noting that the SHA-1 hash used to name the object is the hash of the original data plus this header, so running sha1sum on the file does not yield the object name.
As a result, the general consistency of an object can always be tested independently of the contents or the type of the object: all objects can be validated by verifying that (a) their hashes match the content of the file and (b) the object successfully inflates to a stream of bytes that forms a sequence of <ascii type without space> + <space> + <ascii decimal size> + <byte\0> + <binary object data>.
The structured objects can further have their structure and connectivity to other objects verified. This is generally done with the git fsck program, which generates a full dependency graph of all objects and verifies their internal consistency (in addition to just verifying their superficial consistency through the hash).
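The two checks from the quoted passage, (a) the hash and (b) the inflated byte sequence, can be sketched in Python as follows. This is an illustration under assumptions, not how git fsck is implemented: it assumes SHA-1 object names, and check_loose_object and loose_object are hypothetical helper names.

```python
import hashlib
import zlib

OBJECT_TYPES = {b"blob", b"tree", b"commit", b"tag"}


def check_loose_object(expected_oid: str, raw: bytes) -> list[str]:
    """Return the problems found in one loose object (empty list if none)."""
    inflater = zlib.decompressobj()
    try:
        inflated = inflater.decompress(raw)
    except zlib.error as exc:
        return [f"cannot inflate: {exc}"]
    if not inflater.eof:
        return ["truncated zlib stream"]

    # (b) the inflated bytes must look like: <type> <space> <size> <NUL> <data>
    header, nul, body = inflated.partition(b"\x00")
    if not nul:
        return ["no NUL byte terminating the header"]

    problems = []
    obj_type, _, size = header.partition(b" ")
    if obj_type not in OBJECT_TYPES:
        problems.append("unknown object type")
    if not size.isdigit() or int(size) != len(body):
        problems.append("size in header does not match data length")
    # (a) the object name must be the SHA-1 of header plus data
    if hashlib.sha1(inflated).hexdigest() != expected_oid:
        problems.append("hash mismatch")
    return problems


def loose_object(obj_type: bytes, data: bytes) -> tuple[str, bytes]:
    """Build (object name, compressed bytes) for demonstration purposes."""
    store = obj_type + b" " + str(len(data)).encode("ascii") + b"\x00" + data
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


oid, good = loose_object(b"blob", b"hello\n")
_, replaced = loose_object(b"blob", b"something else\n")

print(check_loose_object(oid, good))                  # no problems
print(check_loose_object(oid, replaced))              # contents were replaced
print(check_loose_object(oid, b"not a zlib stream"))  # file overwritten with junk
```

Replacing the file contents trips the hash check or the inflate step, which is the kind of damage the answer says would be noticed. Bytes appended after the zlib stream are deliberately not examined here.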

However, there is an interesting interaction that makes me think git fsck should work harder and notice when there is garbage at the end of the file. If you try to run git gc in this repo, you will see an error similar to this:

 :: git gc
 Counting objects: 9, done.
 Delta compression using up to 4 threads.
 Compressing objects: 100% (3/3), done.
 error: garbage at end of loose object '45b983be36b73c0788dc9cbcb76cbb80fc7bb057'
 fatal: loose object 45b983be36b73c0788dc9cbcb76cbb80fc7bb057 (stored in .git/objects/45/b983be36b73c0788dc9cbcb76cbb80fc7bb057) is corrupt
 error: failed to run repack

It seems that if git gc cannot run, then git fsck should catch the problem.

Why you do not see "Checking objects"

This one is actually very simple: there are no packed objects to check. Those live in .git/objects/pack (objects/pack in a bare repository). If you do not have any of those files, you will not see the "Checking objects" bit.
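To see whether a repository has any pack files at all, one can count the files directly. The Python sketch below assumes the usual layout: loose objects in two-hex-digit directories under objects/, packs as *.pack files under objects/pack. The function name count_objects is hypothetical, and the demo runs on a throwaway directory rather than a real repository.

```python
import re
import tempfile
from pathlib import Path

FAN_OUT = re.compile(r"[0-9a-f]{2}")
LOOSE_NAME = re.compile(r"[0-9a-f]{38}")


def count_objects(objects_dir: Path) -> tuple[int, int]:
    """Return (loose object files, pack files) under an objects directory."""
    loose = 0
    for fan_out in objects_dir.iterdir():
        # Loose objects live in directories named after the first two hex digits.
        if fan_out.is_dir() and FAN_OUT.fullmatch(fan_out.name):
            loose += sum(
                1 for f in fan_out.iterdir() if LOOSE_NAME.fullmatch(f.name)
            )
    pack_dir = objects_dir / "pack"
    packs = len(list(pack_dir.glob("*.pack"))) if pack_dir.is_dir() else 0
    return loose, packs


# Demo on a temporary layout that mimics a repository with one loose object
# and no packs (the object name is the one from the question).
with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"
    (objects / "86").mkdir(parents=True)
    (objects / "pack").mkdir()
    (objects / "86" / "f3e6e674431ab3006cbb56fddecbdb4a7724b4").write_bytes(b"")
    demo_counts = count_objects(objects)

print(demo_counts)  # (loose, packs)
```

A pack count of zero matches the situation described in this answer, where only the "Checking object directories" line appears. Pointing count_objects at the objects directory of a real bare repository is left to the reader.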
