Unique identifier for email

I am writing a C# application that lets users store emails in an MS SQL Server database. Often, several users are copied on the same email from a client. If they all try to add that email to the database, I want the message to be stored only once.

MD5 comes to mind as a way to do this. I don't need to worry about malicious tampering; I just need to make sure that the same email always maps to the same hash and that no two emails with different content map to the same hash.

My question really comes down to how to combine multiple fields into one MD5 (or other) hash value. Some of these fields have a single value per email (for example, subject, body, sender address), while others have several values (a varying number of attachments and recipients). I want a way to uniquely identify an email that is platform and language independent (not based on serialization). Any tips?

+3
3 answers

How many emails do you plan to archive? Unless you expect the archive to grow to many terabytes, I think this is a premature optimization.

EDIT: Pseudocode

# initialize the hash object
hash = md5()

# feed each field into the hash
hash.update(from_str)
hash.update(to_str)
hash.update(cc_str)
hash.update(body_str)
hash.update(...) # the rest of the email fields

# compute the identifier string
id = hash.hexdigest()

Or, equivalently,

# concatenate all fields and hash
hash.update(from_str + to_str + cc_str + body_str + ...)
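
One caveat with plain concatenation is that field boundaries disappear: the values "ab" and "c" produce the same bytes as "a" and "bc". A possible way around this, sketched below in Python, is to prefix each field with its length and to sort multi-valued fields before hashing. The function name email_fingerprint, its parameters, the UTF-8 encoding and the 8-byte length prefix are all assumptions made for illustration, not part of the original answer.

```python
import hashlib


def email_fingerprint(sender, subject, body, recipients, attachments):
    """Return a hex digest identifying an email by its content.

    Hypothetical helper: recipients is an iterable of address strings,
    attachments is an iterable of bytes objects (raw file contents).
    """
    digest = hashlib.md5()

    def add_field(data: bytes) -> None:
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)

    def add_list(items) -> None:
        # Sort so that storage order does not change the result, and
        # record the item count so list boundaries stay unambiguous.
        encoded = sorted(items)
        digest.update(len(encoded).to_bytes(8, "big"))
        for item in encoded:
            add_field(item)

    # Single-valued fields, always in the same order (assumed UTF-8).
    for text in (sender, subject, body):
        add_field(text.encode("utf-8"))

    # Multi-valued fields; attachments are reduced to their own digests.
    add_list(addr.encode("utf-8") for addr in recipients)
    add_list(hashlib.md5(blob).digest() for blob in attachments)

    return digest.hexdigest()


# Same email, recipients listed in a different order: same fingerprint.
first = email_fingerprint("a@example.com", "Hi", "Body",
                          ["x@example.com", "y@example.com"], [b"file1"])
second = email_fingerprint("a@example.com", "Hi", "Body",
                           ["y@example.com", "x@example.com"], [b"file1"])
print(first == second)
```

Because the scheme only relies on byte lengths, UTF-8 and a fixed field order, it should be reproducible from C# or any other language. Swapping hashlib.md5 for hashlib.sha256 would be a one-line change if a different hash is preferred.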

+2

Emails usually already carry identifying headers (for example, from OS X Mail):

X-Universally-Unique-Identifier: 82d00eb8-2a63-42fd-9817-a3f7f57de6fa
Message-Id: <EE7CA968-13EB-47FB-9EC8-5D6EBA9A4EB8@example.com>
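
As a rough illustration, the Message-Id header shown above could be read with Python's standard email package and used as the deduplication key. This is only a sketch: the function name message_key and the sample message are made up, and since not every message is guaranteed to carry the header, the function returns None so the caller can fall back to a content hash.

```python
from email import message_from_string
from typing import Optional

# Hypothetical sample message, reusing the Message-Id from the example.
RAW = """\
Message-Id: <EE7CA968-13EB-47FB-9EC8-5D6EBA9A4EB8@example.com>
From: alice@example.com
To: bob@example.com
Subject: Hello

Body text
"""


def message_key(raw_message: str) -> Optional[str]:
    """Return the Message-Id header value, or None if it is missing."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive, so "Message-Id" also matches.
    message_id = msg.get("Message-ID")
    if message_id is None:
        return None
    return message_id.strip()


print(message_key(RAW))
```

Storing this value in a column with a unique constraint would let the database itself reject a second copy of the same message.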

+1

-? , , , . , .., . - mikerobi.

+1

Source: https://habr.com/ru/post/1742121/

