Disclaimer: I originally posted this on Quora, but thought the answer was more suitable for Stack Overflow.
The method used to store and verify user passwords without actually storing passwords is to compare user input with a stored hash.
What is hashing?
Hashing is the process of passing variable-length data (short passwords, long passwords, binary files, etc.) through an algorithm that returns a fixed-length value called the hash value. Hashes work in one direction only: you cannot feasibly recover the input from the hash. A *.img file of several MB can be hashed in the same way as a password. (In fact, it's common practice to hash large files to verify their integrity: say you download a file using BitTorrent; when it finishes, the software hashes it and compares the hash of what you have with the hash of what you should have. If they match, the download is not corrupted.)
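As a small illustration of the fixed-length property, here is a sketch using Python's standard hashlib module. The choice of SHA-256 and the 5 MB buffer of zero bytes standing in for a large file are assumptions made for the example only.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

short_input = b"pass123"
# Stand-in for a file of several MB (assumption: 5 MB of zero bytes).
large_input = b"\x00" * (5 * 1024 * 1024)

short_digest = sha256_hex(short_input)
large_digest = sha256_hex(large_input)

# Both digests have the same length, whatever the input size.
print(len(short_digest), len(large_digest))

# The same input always gives the same hash...
print(sha256_hex(b"pass123") == short_digest)
# ...while changing a single character gives a different one.
print(sha256_hex(b"pass124") == short_digest)
```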
How does auth work with hashes?
When a user signs up, they give a password, say pass123, which is then hashed (by any available hashing algorithm: sha1, sha256, etc.; in this case md5) to the value 32250170a0dca92d53ec9624f336ca24, and that value is stored in the database. Each time you try to log in, the system hashes your password in real time and compares it with the stored hash; if they match, you are good to go. You can try an online md5 hasher here: http://md5-hash-online.waraxe.us/
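The sign-up and login flow described above can be sketched roughly like this. The `users` dictionary is a hypothetical stand-in for the database table, and `register` and `login` are made-up names; md5 is used only because it is the algorithm in the example, not as a recommendation.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the users table: username -> stored hash.
users = {}

def md5_hex(password: str) -> str:
    """Hash a password with md5, as in the example above."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    """Store only the hash of the password, never the password itself."""
    users[username] = md5_hex(password)

def login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = users.get(username)
    if stored is None:
        return False
    return hmac.compare_digest(stored, md5_hex(password))

register("alice", "pass123")
print(login("alice", "pass123"))  # correct password
print(login("alice", "pass321"))  # wrong password
```

The comparison goes through `hmac.compare_digest` rather than `==` so that it takes the same time wherever the first mismatch occurs; that is a design choice of this sketch, not something the flow requires.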
What if two hashes are the same? Can a user log in with a different pass?
They could! This is called a collision. Say that, with a fictitious hashing algorithm, the value pass123 produced the hash ec9624 and the value pass321 produced the same hash: that hashing algorithm would be broken. Both of the common algorithms md5 and sha1 (the one LinkedIn used) are broken, as collisions have been found. Being broken does not necessarily mean being unsafe.
How can you exploit collisions?
If you can generate a hash that is the same as the hash produced by a user's password, you can identify yourself to the site as that user.
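To make the idea concrete, here is a sketch with a deliberately weak, made-up hash (the first 3 hex digits of SHA-256, so only 4096 possible values). `toy_hash`, `find_collision`, and the `guessN` candidates are all hypothetical names for this example; real algorithms have vastly larger output spaces, so this only shows the principle.

```python
import hashlib
from itertools import count

def toy_hash(password: str) -> str:
    """Fictitious weak hash: only the first 3 hex digits of SHA-256."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()[:3]

def find_collision(target_password: str) -> str:
    """Try candidates until a different input yields the same toy hash."""
    target = toy_hash(target_password)
    for i in count():
        candidate = f"guess{i}"
        if candidate != target_password and toy_hash(candidate) == target:
            return candidate

other = find_collision("pass123")
# A site storing toy_hash("pass123") would also accept `other` as the password.
print(other, toy_hash(other) == toy_hash("pass123"))
```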
Rainbow table attacks.
Crackers quickly realized that once they had grabbed a table of hashed passwords, it would be infeasible to try passwords one by one, so they developed a new attack vector. They generate every possible password (aaa, aab, aac, aad, etc.) and store all the hashes in a database. Then they only need to search for the stolen hash in that database of sequentially generated hashes (a sub-second query) to get the corresponding password.
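The precompute-then-look-up idea can be sketched as follows, assuming unsalted md5 and only 3-letter lowercase passwords so the table stays tiny. `build_lookup_table` is a hypothetical helper. This is the plain lookup table the paragraph describes; actual rainbow tables use chains of hashes to trade storage for computation, which this sketch does not attempt.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def build_lookup_table(length: int) -> dict:
    """Map the hash of every lowercase password of the given length to that password."""
    table = {}
    for chars in product(ascii_lowercase, repeat=length):
        password = "".join(chars)
        table[md5_hex(password)] = password
    return table

# Done once, ahead of time: aaa, aab, aac, ... zzz.
table = build_lookup_table(3)

# Later, with a stolen hash in hand, recovery is a single dictionary lookup.
stolen_hash = md5_hex("cat")
print(table.get(stolen_hash))
```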
Salt to the rescue (and where LinkedIn failed badly!)
Security is determined by the amount of time it takes a cracker to crack your password and by how often you change it. With rainbow tables, security drops very quickly, so the industry came up with salt. What if each password had a unique twist? That is the salt! For each user who registers, you generate a random string, say 3 characters long (the industry recommends 16 characters - https://stackoverflow.com/a/18419 ... ). Then you combine the user's password with that random string and hash the result.
password - salt - sha1 hash
qwerty   - 123  - 5cec175b165e3d5e62c9e13ce848ef6feac81bff
qwerty   - 321  - b8b92ab870c50ce5fc59571dc0c77f9a4a90323c
qazwsx   - abc  - c6aec64efe2a25c6bc35aeea2aafb2e86ac96a0c
qazwsx   - cba  - 31e42c24f71dc5a453b2635e6ec57eadf03090fd
As you can see, the same passwords, given different salt values, generate completely different hashes. This is the purpose of salt and why LinkedIn failed. Note that in the table you store only the hash and the salt! Never the password!
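A minimal sketch of salting, under a few assumptions: the salt is appended to the password before hashing (the answer does not say in which order they are combined), the salt is 16 random hex characters, and sha1 is used only to mirror the table above. `make_record` and `verify` are hypothetical names.

```python
import hashlib
import hmac
import secrets

def hash_with_salt(password: str, salt: str) -> str:
    # Assumption: password followed by salt; the order is a convention of this sketch.
    return hashlib.sha1((password + salt).encode("utf-8")).hexdigest()

def make_record(password: str) -> tuple:
    """Return the (salt, hash) pair to store. The password itself is never stored."""
    salt = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    return salt, hash_with_salt(password, salt)

def verify(password: str, salt: str, stored_hash: str) -> bool:
    """Re-hash the submitted password with the stored salt and compare."""
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)

salt_a, hash_a = make_record("qwerty")
salt_b, hash_b = make_record("qwerty")

print(hash_a == hash_b)                  # same password, different salts
print(verify("qwerty", salt_a, hash_a))  # correct password with its own salt
print(verify("qazwsx", salt_a, hash_a))  # wrong password
```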
The first thing the guys who got their hands on the LinkedIn hashes did was sort the hashes and see if there were any matches (there were, because several users had the same password - shame on them!). Those users were the first to fall. If the password table had been salted ... none of this would have happened, and it would have taken an excruciating amount of time (and computing resources) to crack every single password. That would have given LinkedIn enough time to enforce a new password policy.
Hopefully the technical side of the answer gave an idea of how authentication works (or should work).
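The duplicate-spotting step can be illustrated with a toy table. The users and passwords below are invented for the example, and `shared_hashes` is a hypothetical helper; sha1 is used because it is the algorithm discussed above.

```python
import hashlib
import secrets
from collections import Counter

# Invented sample data: two users happen to share a password.
passwords = {"ann": "qwerty", "bob": "qazwsx", "cid": "qwerty", "dee": "letmein"}

def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# Unsalted table: identical passwords produce identical hashes.
unsalted = {user: sha1_hex(pw) for user, pw in passwords.items()}

# Salted table: each user gets a fresh random salt, so the hashes differ.
salted = {user: sha1_hex(pw + secrets.token_hex(8)) for user, pw in passwords.items()}

def shared_hashes(table: dict) -> dict:
    """Return each hash that appears more than once, with its count."""
    counts = Counter(table.values())
    return {h: n for h, n in counts.items() if n > 1}

print(len(shared_hashes(unsalted)))  # number of hashes shared by several users
print(len(shared_hashes(salted)))    # nothing to group once salts differ
```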