I am writing a Python utility that will map integers to word strings, where many integers may map to the same string. As I understand it, Python interns short strings and most hard-coded strings by default, which saves memory by keeping a single "canonical" copy of each string in a table. I thought I could benefit from this by interning the string values myself, even though string interning is designed more for optimizing key hashing. I wrote a quick test that checks object identity for long strings, first with the strings stored in lists, and then with the strings stored in a dictionary as values. The result was not what I expected:
    import sys

    top = 10000

    non1 = []
    non2 = []
    for i in range(top):
        s1 = '{:010d}'.format(i)
        s2 = '{:010d}'.format(i)
        non1.append(s1)
        non2.append(s2)

    same = True
    for i in range(top):
        same = same and (non1[i] is non2[i])
    print("non: ", same)  # prints False
    del non1[:]
    del non2[:]

    with1 = []
    with2 = []
    for i in range(top):
        s1 = sys.intern('{:010d}'.format(i))
        s2 = sys.intern('{:010d}'.format(i))
        with1.append(s1)
        with2.append(s2)

    same = True
    for i in range(top):
        same = same and (with1[i] is with2[i])
    print("with: ", same)  # prints True

    ###############################

    non_dict = {}
    non_dict[1] = "this is a long string"
    non_dict[2] = "this is another long string"
    non_dict[3] = "this is a long string"
    non_dict[4] = "this is another long string"

    with_dict = {}
    with_dict[1] = sys.intern("this is a long string")
    with_dict[2] = sys.intern("this is another long string")
    with_dict[3] = sys.intern("this is a long string")
    with_dict[4] = sys.intern("this is another long string")

    print("non: ", non_dict[1] is non_dict[3] and non_dict[2] is non_dict[4])  # prints True ???
    print("with: ", with_dict[1] is with_dict[3] and with_dict[2] is with_dict[4])  # prints True
I thought the dictionary check without interning would print "False", but I was clearly mistaken. Does anyone know what is going on, and would string interning give any benefit in my case? I could have many more keys than unique values if I combine data from several input texts, so I'm looking for a way to save memory. (I may have to use a database, but that is beyond the scope of this question.) Thank you in advance!
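One possible reason for the surprising "True", noted here as an assumption and not something the test above proves: the dictionary test uses identical string literals, and CPython stores equal constants within one compiled code object only once, so both keys already point at the same object before any interning happens. A fairer version of the dictionary test might build the values at run time, as the list test does. The sketch below does that; the helper name `build` and its `intern` flag are hypothetical and only for illustration.

```python
import sys


def build(parts, intern=False):
    """Build a string at run time (hypothetical helper for this test)."""
    # Joining at run time keeps the compiler from turning the result
    # into a single shared constant.
    s = " ".join(parts)
    return sys.intern(s) if intern else s


words = ["this", "is", "a", "long", "string"]

# Values built separately, without interning.
non_dict = {1: build(words), 3: build(words)}

# Values built separately, then passed through sys.intern().
with_dict = {1: build(words, intern=True), 3: build(words, intern=True)}

# Equal contents either way; identity is what differs.
non_same = non_dict[1] is non_dict[3]     # typically False on CPython
with_same = with_dict[1] is with_dict[3]  # sys.intern returns one shared object

print("non: ", non_same)
print("with: ", with_same)
```

If the non-interned values turn out to be distinct objects here, that would suggest interning values read from input texts can still save memory when many keys share a value.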
synchronizer Jan 01 '16 at 2:15 2017-01-01 02:15