How does git detect that a file has been modified?

How does git detect file modification so quickly?

Does it hash every file in the repo and compare the SHA-1s? That would take a long time, right?

Or does it compare atime, ctime, or mtime?

+47
git
Nov 22 '09 at 14:35
4 answers

Git tries hard to convince itself from the lstat() values alone that the working tree matches the index, because falling back to reading file contents is very expensive.

Documentation/technical/racy-git.txt describes which stat fields are used and how some race conditions caused by the coarse granularity of mtime are avoided. This article contains more detailed information.

The stat values are not tamper-proof; see futimens(3). Git can be fooled into missing a change to a file, but that does not compromise the integrity of the content hashing.
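To make the "stat first, contents only as a fallback" idea concrete, here is a minimal sketch in Python. It is not Git's implementation: the `CachedEntry` record, the helper names, and the particular set of stat fields are assumptions chosen for illustration (racy-git.txt lists the fields Git actually compares), and the content hash here is a plain SHA-1 of the file bytes rather than a Git object ID.

```python
import hashlib
import os
from typing import NamedTuple


class CachedEntry(NamedTuple):
    """Hypothetical stand-in for one index entry (not Git's real layout)."""
    mtime_ns: int
    size: int
    mode: int
    ino: int
    sha1: str


def content_sha1(path: str) -> str:
    """Expensive path: read the whole file and hash it."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def snapshot(path: str) -> CachedEntry:
    """Record stat data and a content hash, roughly what staging a file would cache."""
    st = os.lstat(path)
    return CachedEntry(st.st_mtime_ns, st.st_size, st.st_mode, st.st_ino,
                       content_sha1(path))


def is_modified(path: str, cached: CachedEntry) -> bool:
    """Cheap check first: only read the file if the stat data differs."""
    st = os.lstat(path)
    stat_now = (st.st_mtime_ns, st.st_size, st.st_mode, st.st_ino)
    if stat_now == cached[:4]:
        # Stat data matches: assume unchanged without opening the file.
        # (This sketch ignores the "racy" case that racy-git.txt covers.)
        return False
    # Stat data differs: fall back to comparing contents.
    return content_sha1(path) != cached.sha1
```

With this shape, a file whose timestamp changed but whose bytes did not costs one extra read and is then reported as unmodified, while a file that was never touched costs only a single lstat() call.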

+28
Nov 03 2018-10-11T00:

There is an initial timestamp check for reports such as "git status", but when the final commit is computed, mtimes don't matter ... it's the SHA-1 that matters.
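As an illustration of the hash that ends up mattering, the sketch below computes the object ID Git assigns to a file's contents in a SHA-1 repository: the SHA-1 of the header "blob <size>\0" followed by the raw bytes. The function name `git_blob_sha1` is made up for this example; the result should match what `git hash-object` prints for the same bytes.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Return the SHA-1 object ID Git would give a blob with these contents."""
    # Git hashes a short header ("blob", the size in bytes, a NUL) plus the contents.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known ID.
    print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Timestamps do not appear anywhere in the input, so two files with identical bytes get the same ID no matter when they were written.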

+6
Nov 22 '09 at 17:53

Well, I'd hazard a guess that it uses a combination of stat() calls to work out what looks like it has changed, and then engages its diffing engine to confirm that this is really the case.

You can see the code for the diff engine here. I followed through the codebase to confirm that the status command does indeed call down into this code (it looks like it does so in a lot of places!). It all makes a lot of sense when you know that Git performs quite badly on Windows, where it uses an emulation layer to make these POSIX calls: git status runs an order of magnitude slower on that platform.

Anyway, short of reading all the code from top to bottom (which I may do later, if I have the time!), that's as far as I can take you for now ... maybe someone who has worked with the codebase can be more definitive.

Note: another possible speedup comes from the judicious use of inline functions where that clearly makes sense; you can see this in the headers.

[edit: see here for an explanation of stat()]

+3
Nov 22 '09 at 15:32

Depending on your platform, you can find out which system calls Git uses to determine status. Try strace git status on Linux, truss git status on SunOS, or the DTrace tool that Apple ships with the developer tools on Mac OS X.

+2
Nov 22 '09 at 22:59


