You can try preallocating the list at its full size in one expression instead of appending one element at a time; one large memory allocation should be faster than many small ones:
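    from collections import defaultdict
    from types import StringType, UnicodeType  # Python 2 only

    import pyExcelerator

    book = pyExcelerator.parse_xls(filepath)
    parsed_dictionary = defaultdict(lambda: '', book[0][1])
    number_of_columns = 44
    number_of_rows = 500000
    # Create one empty row per expected line up front.
    # ([] * number_of_rows would just be [], so build the rows explicitly.)
    result_list = [[] for _ in range(number_of_rows)]
    for i in range(0, number_of_rows):
        ok = False
        # result_list.append([])  # no longer needed: the row already exists
        for h in range(0, number_of_columns):
            item = parsed_dictionary[i, h]
            if type(item) is StringType or type(item) is UnicodeType:
                item = item.replace("\t", "").strip()
            result_list[i].append(item)
            if item != '':
                ok = True
        if not ok:
            break  # first completely empty row: stop reading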
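Whether preallocation actually helps is worth measuring rather than assuming. Below is a rough timing sketch; the function names `fill_by_append` and `fill_by_index` and the size `n` are made up for illustration. It also shows that multiplying an empty list yields an empty list, so the placeholder list must contain at least one element.

```python
import timeit

def fill_by_append(n):
    """Grow the list one element at a time."""
    out = []
    for i in range(n):
        out.append(i)
    return out

def fill_by_index(n):
    """Allocate all n slots up front, then assign by index."""
    out = [None] * n
    for i in range(n):
        out[i] = i
    return out

# Multiplying an empty list allocates nothing: there is no element to repeat.
empty = [] * 5
slots = [None] * 5

if __name__ == "__main__":
    n = 100000
    for fn in (fill_by_append, fill_by_index):
        seconds = timeit.timeit(lambda: fn(n), number=20)
        print("%s: %.3f s" % (fn.__name__, seconds))
```

The numbers will vary by machine and interpreter version, so treat the output as a hint about your own setup rather than a general result.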
If this gives a noticeable performance increase, you can also preallocate each row with the number of columns and then assign items by index instead of appending one value at a time. Here is a snippet that creates a 10x10 two-dimensional list, initialized to 0, in one expression:
L = [[0] * 10 for i in range(10)]
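The list comprehension matters here. A tempting shortcut is to multiply the outer list as well, but that repeats a reference to a single inner list. A small sketch of the difference (the variable names `aliased` and `independent` are just for illustration):

```python
# Outer multiplication copies the reference to ONE inner list, ten times.
aliased = [[0] * 10] * 10
aliased[0][0] = 1

# The comprehension evaluates [0] * 10 once per row, giving ten separate lists.
independent = [[0] * 10 for i in range(10)]
independent[0][0] = 1

print(aliased[1][0])      # 1: every row is the same list object
print(independent[1][0])  # 0: only row 0 was changed
```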
So, folded into your code, it might work something like this:
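    from collections import defaultdict
    from types import StringType, UnicodeType  # Python 2 only

    import pyExcelerator

    book = pyExcelerator.parse_xls(filepath)
    parsed_dictionary = defaultdict(lambda: '', book[0][1])
    number_of_columns = 44
    number_of_rows = 500000
    # number_of_rows rows, each preallocated with number_of_columns empty strings
    result_list = [[''] * number_of_columns for x in range(number_of_rows)]
    for i in range(0, number_of_rows):
        ok = False
        for h in range(0, number_of_columns):
            item = parsed_dictionary[i, h]
            if type(item) is StringType or type(item) is UnicodeType:
                item = item.replace("\t", "").strip()
            # A list of lists is indexed as [i][h], not [i, h]
            result_list[i][h] = item
            if item != '':
                ok = True
        if not ok:
            # First completely empty row: stop reading. Rows from i onward
            # remain in result_list as preallocated empty strings.
            break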
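If you want to try the idea without pyExcelerator at hand, here is a self-contained sketch of the same loop run against a small hand-made dictionary standing in for `book[0][1]`. The `cells` data and the `build_rows` helper are invented for illustration, `isinstance(item, str)` replaces the Python 2 `StringType`/`UnicodeType` check, and the final slice that drops the unused preallocated rows is an extra step that is not part of the code above.

```python
from collections import defaultdict

def build_rows(cells, number_of_rows, number_of_columns):
    """Copy a {(row, col): value} mapping into a preallocated 2-D list."""
    parsed = defaultdict(lambda: '', cells)
    result = [[''] * number_of_columns for _ in range(number_of_rows)]
    used = number_of_rows
    for i in range(number_of_rows):
        ok = False
        for h in range(number_of_columns):
            item = parsed[i, h]
            if isinstance(item, str):
                item = item.replace("\t", "").strip()
            result[i][h] = item
            if item != '':
                ok = True
        if not ok:
            used = i  # first completely empty row: stop here
            break
    return result[:used]  # drop the preallocated rows that were never filled

# Hypothetical sheet contents: two filled rows, everything else blank.
cells = {(0, 0): " name\t", (0, 1): "qty", (1, 0): "apple", (1, 1): 3.0}
rows = build_rows(cells, number_of_rows=5, number_of_columns=2)
print(rows)  # [['name', 'qty'], ['apple', 3.0]]
```

Trimming at the end keeps the result the same shape you would get from appending, at the cost of one extra slice copy.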