What is the normal way to create a unique identifier for POP3 emails?

IMAP messages have a UID, for which we all rejoice. However, I am having trouble figuring out how to create a unique identifier for a POP3 message (older systems like hotmail.com only allow POP3).

Available messages to the client are fixed when a POP session opens the maildrop, and are identified by a message-number local to that session or, optionally, by a unique identifier assigned to the message by the POP server. This unique identifier is permanent and unique to the maildrop and allows a client to access the same message in different POP sessions. Mail is retrieved and marked for deletion by message-number. When the client exits the session, the mail marked for deletion is removed from the maildrop. - Wikipedia

It seems that the basic LIST command simply returns an array of temporary numbers that you can use to retrieve the email. These numbers are by no means unique across sessions, although there appears to be an optional command called UIDL, which servers advertise through CAPA (the POP3 extension mechanism).

The POP3 specification says that a UIDL is unique for as long as the message exists.

The unique-id of a message is an arbitrary server-determined string, consisting of one to 70 characters in the range 0x21 to 0x7E, which uniquely identifies a message within a maildrop and which persists across sessions. This persistence is required even if a session ends without entering the UPDATE state. The server should never reuse an unique-id in a given maildrop, for as long as the entity using the unique-id exists.

Note that messages marked as deleted are not listed.

While it is generally preferable for server implementations to store arbitrarily assigned unique-ids in the maildrop, this specification is intended to permit unique-ids to be calculated as a hash of the message. Clients should be able to handle a situation where two identical copies of a message in a maildrop have the same unique-id.
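As a rough illustration of the UIDL listing described above, here is a minimal sketch using Python's standard `poplib`. The helper names `parse_uidl` and `fetch_uidl` are made up for this example, and the host and credentials are placeholders. Because the specification allows two identical copies to share a unique-id, the sketch maps each unique-id to a list of message numbers rather than to a single one.

```python
import poplib
from collections import defaultdict


def parse_uidl(lines):
    """Map each unique-id to the session-local message numbers that carry it."""
    by_uid = defaultdict(list)
    for raw in lines:
        # Each listing line has the form b"<message-number> <unique-id>".
        number, uid = raw.decode("ascii").split(" ", 1)
        by_uid[uid.strip()].append(int(number))
    return dict(by_uid)


def fetch_uidl(host, user, password):
    """Hypothetical helper: log in over TLS and return the parsed UIDL listing."""
    conn = poplib.POP3_SSL(host)
    try:
        conn.user(user)
        conn.pass_(password)
        # uidl() with no argument returns (response, [b"num uid", ...], octets).
        _response, lines, _octets = conn.uidl()
        return parse_uidl(lines)
    finally:
        conn.quit()
```

Only `parse_uidl` can be exercised without a server; `fetch_uidl` assumes the server actually supports the optional UIDL command.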

This makes me think it is possible that, a year later (after the first message has been deleted), I could download a different message that has the same UIDL and collides with the first one in my system.

Should I just hash the entire text of the message and use it as an identifier?

Instead of retrieving the whole email to hash it, perhaps I could just use TOP [id] 1 and hash the headers (plus the first body line). That should never match an existing email, since the receiving server always adds some kind of information, correct? An attacker could then never force a collision, because the Received header or something similar would have to differ, right?
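A minimal sketch of the TOP-based idea, assuming the lines come from `poplib`'s `top(which, howmuch)` call, which returns the headers, a blank line, and the requested number of body lines. The function name `hash_top_lines` and the choice of SHA-256 are assumptions made for this example. It does not check whether the server really adds a distinguishing Received header; that remains an open assumption of the question.

```python
import hashlib


def hash_top_lines(lines):
    """Return a SHA-256 hex digest over the lines returned by TOP."""
    digest = hashlib.sha256()
    for line in lines:
        digest.update(line)
        # Re-add the line terminator so that line boundaries affect the hash.
        digest.update(b"\r\n")
    return digest.hexdigest()


# Intended use with an already authenticated poplib.POP3 connection `conn`:
#     _response, lines, _octets = conn.top(message_number, 1)
#     identifier = hash_top_lines(lines)
```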

The MDaemon software seems to take the approach of partially hashing the headers:

MDaemon creates UIDL results using the message name, date stamp, size, and several other pieces of message information. As a result, if a message is altered on the server, it will appear as “new” to mail clients, even if you do not rename it.

What is the correct way to create an identifier for POP3 email?

Note: email often contains a Message-ID header, but I cannot rely on it because it could be used as an attack vector to confuse my system. It is also omitted by some email clients.

+6
3 answers

Personally, I would just hash a small subset of the email headers: something like Date, From, Subject and Message-ID, if available.

I subscribe to a lot of mailing lists, where you usually receive several copies of the same message when someone replies to you: one that comes directly from them and another through the list server. In that case many of the headers differ between the copies, but I would still prefer not to receive the message twice.

And it seems extremely unlikely that I would receive two different emails from the same person at the same time with the same subject and the same Message-ID.

Of course, it is not impossible. The sender may not generate Message-IDs, may leave the subject line empty, may have a broken clock, or may have all of these problems at once. But then again, the router their email passes through could be destroyed by a giant meteor from space.

Honestly, the most likely scenario is that the email gets caught by a spam filter and I never see it. Email is simply not a reliable form of communication. You need something that works reasonably well, but if it fails on that one-in-a-million edge case, you will probably be fine anyway.
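The header-subset idea from this answer could be sketched roughly as follows. The function name `header_key`, the `KEY_HEADERS` tuple, and the use of SHA-256 are assumptions for illustration. The sketch also assumes the mailing list does not rewrite any of the four headers (for example by adding a prefix to Subject); if it does, the two copies will hash differently.

```python
import hashlib
from email.parser import BytesHeaderParser

# Headers suggested in the answer; a missing header is treated as empty.
KEY_HEADERS = ("Date", "From", "Subject", "Message-ID")


def header_key(raw_headers):
    """Hash a small, fixed subset of headers from a raw header block (bytes)."""
    msg = BytesHeaderParser().parsebytes(raw_headers)
    digest = hashlib.sha256()
    for name in KEY_HEADERS:
        value = str(msg.get(name, "")).strip()
        # NUL separators keep adjacent fields from running into each other.
        digest.update(name.encode("ascii") + b"\0")
        digest.update(value.encode("utf-8", "replace") + b"\0")
    return digest.hexdigest()
```

Two copies of a reply that differ only in Received or List-Id headers produce the same key, which is the deduplication behaviour the answer is after.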

+3

Forgive me for questioning your question, but... the real question is: why do you care? It seems to me that you are really trying to come up with a natural primary key for emails. You do not need one, and there really isn't one anyway. What problem are you trying to solve?

Your understanding of UIDL is correct. A message must keep the same UIDL for as long as it is in a given mailbox; identical messages may share a UIDL (but are not required to); and a UIDL should not be reused within a mailbox, although that is not strictly required either. That last point, in particular, highlights the scope and purpose of UIDL. Once a client has deleted a message from the mailbox, it should (and can) forget that message's UIDL, because the value, if it ever appears again, carries no relation to the earlier message.
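One way to read this advice in code, as a minimal sketch: keep a set of UIDLs already handled, download only the ones not in it, and drop every UIDL that is no longer listed by the server. The function name `sync_seen` is hypothetical, and the sketch assumes that every message in `to_download` is fetched successfully before the returned set is saved.

```python
def sync_seen(seen, server_uidls):
    """Return (uidls_to_download, updated_seen_set) for one POP3 session."""
    # dict.fromkeys removes duplicates while preserving the server's order.
    listed = list(dict.fromkeys(server_uidls))
    to_download = [uid for uid in listed if uid not in seen]
    # Remember only what is still on the server. A UIDL that vanished is
    # forgotten, so a later reuse of that value counts as a new message.
    return to_download, set(listed)
```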

+1

I would combine the UIDL with the current timestamp, which should ensure the identifier is unique. Since the UIDL is unique for as long as the message exists, adding a timestamp ensures that the scenario you describe (a different message with the same UIDL) does not happen!
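A minimal sketch of this suggestion, under one assumption the answer does not spell out: the timestamp has to be taken once, when the UIDL is first seen, and stored, otherwise the identifier would change on every session. The function name `assign_keys` and the `"<uidl>:<timestamp>"` format are made up for this example.

```python
import time


def assign_keys(known, server_uidls, now=None):
    """Return a new {uidl: local_key} mapping for the UIDLs now on the server."""
    stamp = int(time.time() if now is None else now)
    keys = {}
    for uid in server_uidls:
        # Reuse the stored key so a message keeps its identifier across
        # sessions; mint "<uidl>:<first-seen time>" only for unseen UIDLs.
        keys[uid] = known.get(uid) or f"{uid}:{stamp}"
    return keys
```

Because UIDLs that are no longer listed drop out of the mapping, a value the server reuses later gets a fresh timestamp and therefore a different local key.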

0

Source: https://habr.com/ru/post/944053/

