Two Base64 encodings give the same decoding

Should two different Base64 encodings ever decode to the same string? I am trying to track down a digital signature issue by sanity-checking the intermediate Base64-encoded strings.

For example, the following base64 encoding:

R0VUDQoNCg0KRnJpLCAwNCBTZXAgMjAwOSAxMTowNTo0OSBHTVQrMDA6MDANCi8=

and

R0VUCgoKRnJpLCAwNCBTZXAgMjAwOSAxMDozMzoyOCBHTVQrMDA6MDAKLw==

both are decoded to:

GET


Fri, 04 Sep 2009 11:05:49 GMT+00:00
/

(in C escapes: GET\n\n\nFri, 04 Sep 2009 11:05:49 GMT+00:00\n/)

The first encoding was produced by two different online Base64 encoders.

The second encoding comes from the Base64 Objective-C encoder available here.

Is there something wrong with the result that I am generating with the Obj-C encoder?

+3
5 answers

No, you should not expect that. The two strings encode different data: somewhere along the way, one of your tools converted the line endings "\n" β†’ "\r\n" (LF (\n) vs. CRLF (\r\n)) before the data was Base64-encoded.

The same input bytes always produce the same Base64, and different input bytes always produce different Base64.
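The point is easy to demonstrate: encoding the same text once with LF and once with CRLF line endings yields two different Base64 strings. A minimal sketch in Python 3 (an illustration, not part of the original answer):

```python
import base64

text = "GET\n\n\nFri, 04 Sep 2009 11:05:49 GMT+00:00\n/"
lf = text.encode("ascii")                           # Unix-style LF line endings
crlf = text.replace("\n", "\r\n").encode("ascii")   # Windows-style CRLF line endings

# Different input bytes always yield different Base64 output.
print(base64.b64encode(lf).decode())
print(base64.b64encode(crlf).decode())  # matches the first string in the question
```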

+8

No, they do not decode to the same string, as is easy to check in Python:

>>> from base64 import decodestring as d
>>> a = "R0VUDQoNCg0KRnJpLCAwNCBTZXAgMjAwOSAxMTowNTo0OSBHTVQrMDA6MDANCi8="
>>> b = "R0VUCgoKRnJpLCAwNCBTZXAgMjAwOSAxMDozMzoyOCBHTVQrMDA6MDAKLw=="
>>> d(a)
'GET\r\n\r\n\r\nFri, 04 Sep 2009 11:05:49 GMT+00:00\r\n/'
>>> d(b)
'GET\n\n\nFri, 04 Sep 2009 10:33:28 GMT+00:00\n/'
>>> d(a) == d(b)
False

The first string decodes with CRLF line endings where the second has only LF (and the timestamps differ too: 11:05:49 vs. 10:33:28).
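Note that normalizing the line endings is not enough to make the payloads equal, because the timestamps differ as well. A quick check, assuming Python 3's bytes API:

```python
import base64

a = base64.b64decode("R0VUDQoNCg0KRnJpLCAwNCBTZXAgMjAwOSAxMTowNTo0OSBHTVQrMDA6MDANCi8=")
b = base64.b64decode("R0VUCgoKRnJpLCAwNCBTZXAgMjAwOSAxMDozMzoyOCBHTVQrMDA6MDAKLw==")

# Convert the CRLF payload to LF and compare again.
print(a.replace(b"\r\n", b"\n") == b)  # still False: the timestamps differ
```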

+14

They do not decode to the same bytes; a hex dump makes the difference obvious.

$ echo 'R0VUCgoKRnJpLCAwNCBTZXAgMjAwOSAxMDozMzoyOCBHTVQrMDA6MDAKLw==' | base64 -d | hexdump 
0000000 4547 0a54 0a0a 7246 2c69 3020 2034 6553
0000010 2070 3032 3930 3120 3a30 3333 323a 2038
0000020 4d47 2b54 3030 303a 0a30 002f          
000002b
$ echo 'R0VUDQoNCg0KRnJpLCAwNCBTZXAgMjAwOSAxMTowNTo0OSBHTVQrMDA6MDANCi8=' | base64 -d | hexdump
0000000 4547 0d54 0d0a 0d0a 460a 6972 202c 3430
0000010 5320 7065 3220 3030 2039 3131 303a 3a35
0000020 3934 4720 544d 302b 3a30 3030 0a0d 002f
000002f
+4

As @sharptooth says, the first string contains \r\n where the second has \n.

>>> import base64
>>> base64.b64decode("R0VUDQoNCg0KRnJpLCAwNCBTZXAgMjAwOSAxMTowNTo0OSBHTVQrMDA6MDANCi8=")
'GET\r\n\r\n\r\nFri, 04 Sep 2009 11:05:49 GMT+00:00\r\n/'
>>> base64.b64decode("R0VUCgoKRnJpLCAwNCBTZXAgMjAwOSAxMDozMzoyOCBHTVQrMDA6MDAKLw==")
'GET\n\n\nFri, 04 Sep 2009 10:33:28 GMT+00:00\n/'
+3

The key point is that Base64 strings decode to a sequence of bytes, not characters. Comparing the byte arrays produced by your two Base64 strings shows that the difference is in the line endings: wherever the first has byte 13 followed by byte 10, the second has only byte 10. That is the classic Windows-vs-Unix line-ending difference.
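That byte-level difference can be located programmatically. A small sketch in Python 3, where iterating over bytes yields integer byte values:

```python
import base64

a = base64.b64decode("R0VUDQoNCg0KRnJpLCAwNCBTZXAgMjAwOSAxMTowNTo0OSBHTVQrMDA6MDANCi8=")
b = base64.b64decode("R0VUCgoKRnJpLCAwNCBTZXAgMjAwOSAxMDozMzoyOCBHTVQrMDA6MDAKLw==")

# Walk both byte sequences and report the first offset where they diverge.
for i, (x, y) in enumerate(zip(a, b)):
    if x != y:
        print(f"first difference at offset {i}: {x} vs {y}")  # offset 3: 13 vs 10
        break
```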

+2

Source: https://habr.com/ru/post/1716893/

