Why doesn't having the code for the MD5 function help hackers break it?

Question

I believe I can download the source of PHP or Linux or whatever and look directly at the source code for the MD5 function. Could I not then reverse engineer the encryption?

Here is the code: http://dollar.ecom.cmu.edu/sec/cryptosource.htm

It seems that any encryption method would be useless if the "enemy" has the code it was created with. Am I wrong?

+6

security cryptography open-source hash md5

shady Apr 28 '11 at 0:51

3 answers

This is a really good question.

MD5 is a hash function: it “mixes” the input data in such a way that certain things should be infeasible, including recovering the input from the output (it is not encryption, there is no key, and it is not meant to be inverted, quite the opposite). The hand-waving description is that each input bit is injected several times into a large enough internal state, which is mixed so that any difference quickly propagates to the whole state.
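To make the "any difference quickly spreads" point concrete, here is a rough sketch using Python's standard `hashlib`. The helper names `md5_bits` and `differing_bits` are made up for this illustration. It flips a single input bit and counts how many of the 128 output bits change; for a well-mixed function you would expect roughly half of them.

```python
import hashlib

def md5_bits(data: bytes) -> int:
    """Return the 128-bit MD5 digest of data as an integer."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")

def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 128 output bits differ between two inputs."""
    return bin(md5_bits(a) ^ md5_bits(b)).count("1")

original = b"hello world"
# Flip the lowest bit of the first byte: 'h' (0x68) becomes 'i' (0x69).
flipped = bytes([original[0] ^ 0x01]) + original[1:]

print(hashlib.md5(original).hexdigest())
print(hashlib.md5(flipped).hexdigest())
print(differing_bits(original, flipped), "of 128 output bits differ")
```

Knowing exactly how the mixing works does not tell you which input produced a given output; it only lets you run the function forward.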

MD5 has been public since 1992. There is no secret, and there never was one, in the design of MD5.

MD5 has been considered cryptographically broken since 2004, the year the first collision was published (two distinct input messages that yield the same output); it had been considered “weak” since 1996 (when some structural properties were found that were believed to ultimately help in building collisions). However, there are other hash functions that are just as public as MD5 and for which no weakness is known yet: the SHA-2 family. Newer hash functions are currently being evaluated as part of the SHA-3 competition.

The really troubling part is that there is no known mathematical proof that a hash function can actually exist. A hash function is a publicly described, efficient algorithm that can be embedded as a logic circuit of finite, fixed and small size. For practitioners of computational complexity, it is somewhat surprising that one can exhibit a circuit that cannot be inverted. So right now we only have candidates: functions for which nobody has found a weakness yet, rather than functions for which no weakness exists. On the other hand, the case of MD5 shows that getting from known structural weaknesses to actual collisions, and then to attacks, apparently takes a substantial amount of time (weaknesses in 1996, collisions in 2004, applied collisions, to a pair of X.509 certificates, in 2008). So the current trend is to rely on algorithm agility: when we use a hash function in a protocol, we also think about how we could transition to another one should that hash function prove to be weak.
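As a small sketch of what algorithm agility can look like in practice (my own illustration, not something from a particular protocol): store the name of the hash function next to the digest, so that old records stay verifiable while new ones use a stronger function. The `algorithm$hexdigest` format and the function names below are assumptions made up for the example.

```python
import hashlib
import hmac

DEFAULT_ALGORITHM = "sha256"  # assumption: whatever is currently preferred

def make_tagged_digest(data: bytes, algorithm: str = DEFAULT_ALGORITHM) -> str:
    """Return 'algorithm$hexdigest' so the record says how it was produced."""
    return f"{algorithm}${hashlib.new(algorithm, data).hexdigest()}"

def verify_tagged_digest(data: bytes, tagged: str) -> bool:
    """Recompute the digest with whichever algorithm the record names."""
    algorithm, _, expected = tagged.partition("$")
    actual = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(actual, expected)

def needs_upgrade(tagged: str) -> bool:
    """True when the record was made with something other than the default."""
    return tagged.partition("$")[0] != DEFAULT_ALGORITHM

old_record = make_tagged_digest(b"some message", "md5")
print(old_record, verify_tagged_digest(b"some message", old_record), needs_upgrade(old_record))
```

Switching functions then means changing `DEFAULT_ALGORITHM` and re-hashing records as they are touched, rather than redesigning the storage format.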

+9

Thomas Pornin Apr 28 '11 at 13:34

One of the criteria for good cryptographic operations is that knowledge of the algorithm should not make it easier to break the encryption. So encryption should not be reversible without knowledge of both the algorithm and the key, and a hash function must not be reversible regardless of knowledge of the algorithm (the term used is “computationally infeasible”).

MD5 and other hash functions (such as SHA-1, SHA-256, etc.) perform a one-way operation on data that creates a digest or “fingerprint”, which is usually much smaller than the plaintext. This one-way function cannot be reversed to recover the plaintext, even if you know exactly what the function does.
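A quick way to see the "much smaller than the plaintext" part, sketched with Python's `hashlib` (the helper name `digest_length` is just for this example): the digest has the same length no matter how large the input is, so a huge number of different inputs must share each possible output.

```python
import hashlib

def digest_length(data: bytes, algorithm: str = "md5") -> int:
    """Return the digest size in bytes for the given input and algorithm."""
    return len(hashlib.new(algorithm, data).digest())

for size in (0, 5, 1_000_000):
    data = b"a" * size
    print(f"{size:>9} input bytes -> {digest_length(data)} digest bytes (md5), "
          f"{digest_length(data, 'sha256')} digest bytes (sha256)")
```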

Similarly, knowledge of the encryption algorithm does not (given a good algorithm) make it any easier to recover the plaintext from the ciphertext. The reverse process is “computationally infeasible” without knowledge of the encryption key used.

+8

Andrew Cooper Apr 28 '11 at 1:07

alex · Accepted Answer · 2011-04-28T00:53:16+0000

This is not encryption, but a one-way hashing mechanism. It digests the string and produces a (hopefully) unique hash.

If it were reversible encryption, the zip and tar.gz formats would be pretty verbose. :)

The reason it doesn't help hackers too much (obviously, knowing how it works is useful) is that if they find a hashed password for a system, e.g. 2fcab58712467eab4004583eb8fb7f89, they still need to know the original string used to create it, and also whether a salt was used. That is because when you log in, for obvious reasons, the password string is hashed with the same method that was used to generate the stored hash, and the resulting hash is then compared with what is stored.
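The login flow described above looks roughly like this. This is only a sketch of the flow: the function names are made up, and a single SHA-256 pass with a random salt is used purely to keep the example short, not because it is a good way to store passwords.

```python
import hashlib
import hmac
import os
from typing import Tuple

def hash_password(password: str, salt: bytes) -> str:
    """Hash salt + password; the same method is used at signup and at login."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def register(password: str) -> Tuple[bytes, str]:
    """Return what the system would store: a random salt and the hash."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

def check_login(attempt: str, salt: bytes, stored_hash: str) -> bool:
    """Hash the attempt the same way and compare it with the stored hash."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)

salt, stored = register("correct horse")
print(check_login("correct horse", salt, stored))  # True
print(check_login("wrong guess", salt, stored))    # False
```

Note that the system never needs to reverse the hash; it only ever runs it forward and compares. An attacker holding `stored` is in the same position and can only guess inputs and hash them.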

In addition, many developers are migrating to bcrypt, which includes a work factor: if hashing takes 1 second rather than 0.01 seconds, building a rainbow table for your application slows down dramatically, and those old PHP sites using md5() become the only low-hanging fruit.
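To get a feel for what a work factor does, here is a rough sketch. bcrypt is not in Python's standard library, so this swaps in `hashlib.pbkdf2_hmac`, where the iteration count plays a similar role; the function names and the fixed demo salt are assumptions for the illustration only.

```python
import hashlib
import time

def slow_hash(password: str, salt: bytes, iterations: int) -> bytes:
    """PBKDF2-HMAC-SHA256; more iterations means more work per guess."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def time_one_guess(iterations: int) -> float:
    """Measure how long hashing a single candidate password takes."""
    start = time.perf_counter()
    slow_hash("guess", b"fixed-demo-salt!", iterations)
    return time.perf_counter() - start

if __name__ == "__main__":
    # Timings depend on the machine; the point is how they scale.
    for iterations in (1, 1_000, 100_000):
        print(f"{iterations:>7} iterations: {time_one_guess(iterations):.6f} s per guess")
```

A legitimate login pays this cost once, while someone precomputing or brute-forcing hashes pays it for every single candidate.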
