CRC-32 is not good enough: it is trivial to build collisions, i.e. two files (of the same length, if you want) that have the same CRC-32. Even in the absence of a malicious attacker, collisions will occur at random once you have about 65,000 distinct files of the same length (roughly 2^16, the birthday bound for a 32-bit checksum).
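To see where the figure of about 65,000 files comes from, here is a minimal sketch of the usual birthday approximation. It assumes checksums are uniformly distributed, which is an idealization; the function name `collision_probability` is made up for illustration.

```python
import math

def collision_probability(n_files: int, bits: int = 32) -> float:
    """Approximate chance that at least two of n_files share a checksum.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * 2**bits)),
    assuming checksum values are uniformly distributed (an idealization).
    """
    space = 2 ** bits
    pairs = n_files * (n_files - 1) / 2
    # -expm1(-x) is 1 - exp(-x), computed accurately even for tiny x.
    return -math.expm1(-pairs / space)

print(f"32-bit checksum, 65,000 files: {collision_probability(65_000):.2f}")
print(f"128-bit hash, 65,000 files:    {collision_probability(65_000, bits=128):.1e}")
```

With a 32-bit checksum the probability is already a bit under 40% at 65,000 files, while for a 128-bit output it is negligible.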
A cryptographic hash function is designed to prevent collisions. With MD5 or SHA-1 you will not get random collisions. If your setup is security-related (i.e. there is someone, somewhere, who might actively try to create collisions), you need a secure hash function. MD5 is no longer secure (creating MD5 collisions is easy), and SHA-1 is somewhat weak in this respect (at the time of writing no actual collision had been computed, but a method for creating one is known and, although expensive, it is much cheaper than it should be). The usual recommendation is to use SHA-256 or SHA-512 (SHA-256 is enough for security; SHA-512 can be a little faster on large 64-bit systems, but file-reading bandwidth will be more of a limit than hashing speed).
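As an illustration of the SHA-256 recommendation, here is a minimal sketch using Python's standard `hashlib`. The helper name `sha256_of_file` and the 1 MiB chunk size are arbitrary choices for this example, not something prescribed above.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Reading in chunks keeps memory use constant for large files; the
    chunk size (1 MiB here) is an arbitrary choice for this sketch.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

Since the file is streamed, disk read speed rather than the hash itself is likely to dominate the running time for large files.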
Note: when using a cryptographic hash function, there is no need to store and compare the file length; the hash is sufficient to disambiguate the files.
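A minimal sketch of what this means in practice: duplicate files can be grouped by digest alone, with no separate length check. The function name `group_by_digest` is hypothetical, and reading each file fully into memory is a simplification that only suits small files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def group_by_digest(paths):
    """Group files whose SHA-256 digests match, ignoring file length.

    Returns a list of groups (lists of path strings) with more than
    one member. Files are read whole, which is fine for a sketch but
    not for very large files.
    """
    groups = defaultdict(list)
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        groups[digest].append(str(p))
    return [group for group in groups.values() if len(group) > 1]
```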
In a non-security setup (i.e. you only fear random collisions), use MD4. It is thoroughly “broken” as a cryptographic hash function, but it is still a very good checksum and it is very fast (on some ARM-based platforms it is even faster than CRC-32, with much better resistance to accidental collisions). Basically, you should not use MD5: if you have security concerns, MD5 must not be used (it is broken; use SHA-256); and if you have no security concerns, MD4 is faster than MD5.
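For what it is worth, here is a hedged sketch of using MD4 as a plain checksum from Python. Whether `hashlib.new("md4")` works depends on the underlying OpenSSL build (many recent builds disable MD4), so this example falls back to SHA-256 when it is unavailable; the fallback and the name `fast_checksum` are assumptions of this sketch, not part of the advice above.

```python
import hashlib

def fast_checksum(data: bytes) -> tuple[str, str]:
    """Return (algorithm_name, hex_digest) for a non-security checksum.

    Tries MD4 first; hashlib raises ValueError when the local OpenSSL
    build does not provide it, in which case SHA-256 is used instead.
    """
    try:
        h = hashlib.new("md4")
        name = "md4"
    except ValueError:
        h = hashlib.sha256()
        name = "sha256"
    h.update(data)
    return name, h.hexdigest()

name, digest = fast_checksum(b"example payload")
print(name, digest)
```

This is only suitable when nobody has an incentive to craft collisions; for anything security-related, call SHA-256 directly.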