Python IndexError when trying to go through a large list

I have a list of approximately 200,000+ objects, each of which represents a file (but does not actually contain the contents of the file, just the full path name and date).

The program I am writing copies a subset of these files, depending on a date range provided by the user. First I build a list of all the files in the source directory (using the glob module), creating an instance of my file-representation class for each file and adding it to the list, for example:

for f in glob.glob(srcdir + "/*.txt"):
    LOG_FILES.append(LogFile(f))

Now, to keep the copy step fast and the code clean, I delete the LogFile objects that fall outside the date range.

for i in xrange(0, len(LOG_FILES)):
    if LOG_FILES[i].DATE < from_date or LOG_FILES[i].DATE > to_date:
        del(LOG_FILES[i])

Subsequently, I can simply copy the files remaining in the list:

for logfile in LOG_FILES:
    shutil.copy(logfile.PATH, destdir)  # shutil.copy: the os module has no copy()

However, the for i in xrange... loop raises an IndexError when i is 63792:

IndexError: list index out of range.

What is going wrong?

+3
7

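Build the list with a list comprehension ([...]) instead. Note that I changed "<" and ">" so that the bounds include 'equals'.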

LOG_FILES = [LogFile(f) for f in glob.glob(srcdir + "/*.txt")
                        if from_date <= f.DATE <= to_date]

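This builds LOG_FILES directly. (To get a generator expression instead of a list, replace the [] with ().)

In plain English, the statement reads:

"Make a list (or generator) of LogFile objects, one from each 'f' where f comes from 'glob.glob(...)', if the if condition is true."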
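
A minimal sketch of that comprehension, assuming Python 3 and a hypothetical LogFile class that takes its DATE from the file's modification time (the question does not show the real class). The date test has to run on the LogFile object rather than on the path string that glob.glob yields, so the objects are created first and filtered second:

```python
import datetime
import glob
import os


class LogFile(object):
    """Hypothetical stand-in for the question's class: path plus date."""

    def __init__(self, path):
        self.PATH = path
        # Assumption: the date comes from the file's modification time.
        self.DATE = datetime.date.fromtimestamp(os.path.getmtime(path))


def select_log_files(srcdir, from_date, to_date):
    """Return LogFile objects whose DATE lies in [from_date, to_date]."""
    # Lazily wrap every matching path in a LogFile ...
    candidates = (LogFile(p) for p in glob.glob(os.path.join(srcdir, "*.txt")))
    # ... and keep only the ones inside the inclusive date range.
    return [lf for lf in candidates if from_date <= lf.DATE <= to_date]
```

The helper name select_log_files is made up for this example; the filtering expression is the part that matters.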

+2

Another option is itertools.ifilter, which filters the items lazily instead of building a new list.
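
A lazy filter along those lines might look like the sketch below. It is written for Python 3, where the built-in filter already returns an iterator the way itertools.ifilter did in Python 2. The namedtuple LogFile and the sample dates are made up for illustration:

```python
import collections
import datetime

# Hypothetical stand-in for the question's LogFile class.
LogFile = collections.namedtuple("LogFile", ["PATH", "DATE"])

LOG_FILES = [
    LogFile("a.txt", datetime.date(2011, 1, 5)),
    LogFile("b.txt", datetime.date(2011, 2, 10)),
    LogFile("c.txt", datetime.date(2011, 3, 15)),
]
from_date = datetime.date(2011, 2, 1)
to_date = datetime.date(2011, 2, 28)


def in_range(log_file):
    """True when the file's date lies inside the inclusive range."""
    return from_date <= log_file.DATE <= to_date


# Nothing is tested yet; items are checked one by one as they are consumed.
selected = filter(in_range, LOG_FILES)
selected_paths = [log_file.PATH for log_file in selected]
print(selected_paths)  # ['b.txt']
```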

+7

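The problem is that del() removes the element from the list, so the list becomes shorter while the loop still counts up to the original length.

After del(), every element behind the deleted one moves down by one index, for example: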

list = [1,2,3,4,5]
del(list[2])
print list     # outputs [1, 2, 4, 5]
print list[2]  # outputs 4

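xrange(0, len(LOG_FILES)) is evaluated only once, before the loop starts, so i eventually runs past the end of the shortened list, and the element that slides into a deleted slot is never checked.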
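
The failure can be reproduced on a small list. This sketch (Python 3, so range stands in for xrange; the numbers are arbitrary) deletes values above 15 the same way the question's loop deletes out-of-range files, and records where the index runs off the end:

```python
values = [1, 30, 30, 2, 30, 4]
failed_at = None

try:
    # range() is computed once from the original length (6).
    for i in range(0, len(values)):
        if values[i] > 15:
            del values[i]  # the list shrinks, later items shift left
except IndexError:
    failed_at = i

print(values, failed_at)  # [1, 30, 2, 4] 4
```

The 30 that is left over shows the second symptom: it moved into the slot of a deleted element and was never examined.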

Instead, filter the files while building the list:

for f in glob.glob(srcdir + "/*.txt"):
    lf = LogFile(f)
    if from_date <= lf.DATE <= to_date:
        LOG_FILES.append(lf)

There may be a more pythonic way to write it, but this avoids deleting from the list altogether.

+3

Deleting elements from a list while iterating over it by index will generate index errors. Either iterate over a copy, or use a dynamic index. Since you said the list is large, use the latter:

limit, i = len(LOG_FILES), 0
while i < limit:
    if LOG_FILES[i].DATE < from_date or LOG_FILES[i].DATE > to_date:
        del(LOG_FILES[i])
        limit -= 1
    else:
        i += 1
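
If the list has to be modified in place (for example because other code holds a reference to it), a different technique from the loop above is slice assignment. It rebuilds the contents in a single pass instead of shifting the tail of the list on every del. A sketch with made-up integers standing in for real dates:

```python
# Hypothetical data: plain integers play the role of LogFile.DATE values.
LOG_FILES = [3, 12, 7, 25, 9, 1]
alias = LOG_FILES  # another reference to the same list object
from_date, to_date = 5, 10

# Slice assignment replaces the contents but keeps the same list object.
LOG_FILES[:] = [d for d in LOG_FILES if from_date <= d <= to_date]

print(LOG_FILES)           # [7, 9]
print(alias is LOG_FILES)  # True
```
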
+1

You can also use filter:

LOG_FILES = filter(lambda log_file: from_date <= log_file.DATE <= to_date,
                   LOG_FILES)
+1

Cpfohl's answer has a problem:

LOG_FILES = [LogFile(f) for f in glob.glob(srcdir + "/*.txt")
             if f.DATE >= from_date and f.DATE <= to_date]

Compare it with the loop from the question:

for f in glob.glob(srcdir + "/*.txt"):
    LOG_FILES.append(LogFile(f))

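In the loop, LOG_FILES[i] is LogFile(f), so LOG_FILES[i].DATE is LogFile(f).DATE, not f.DATE. In the comprehension, f is only the path string returned by glob.glob, which has no DATE attribute.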

+1

1) Removing elements while iterating from the end of the list to the beginning does not cause problems:

LOG_FILES = [ 1,2,30,2,5,8,30,3,2,37,22,30,27,30,4 ]

print LOG_FILES

L = len(LOG_FILES)-1
for i,x in enumerate(LOG_FILES[::-1]):
    print i,L-i,' ',LOG_FILES[L-i],x
    if x>15:
        del LOG_FILES[L-i]

print LOG_FILES

result

[1, 2, 30, 2, 5, 8, 30, 3, 2, 37, 22, 30, 27, 30, 4]
0 14   4 4
1 13   30 30
2 12   27 27
3 11   30 30
4 10   22 22
5 9   37 37
6 8   2 2
7 7   3 3
8 6   30 30
9 5   8 8
10 4   5 5
11 3   2 2
12 2   30 30
13 1   2 2
14 0   1 1
[1, 2, 2, 5, 8, 3, 2, 4]
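
The same idea can be written without the reversed copy that LOG_FILES[::-1] creates, by walking the indices backwards. A sketch in Python 3 syntax, reusing the sample numbers above:

```python
LOG_FILES = [1, 2, 30, 2, 5, 8, 30, 3, 2, 37, 22, 30, 27, 30, 4]

# Deleting at index i only shifts items at higher indices, which have
# already been visited, so the remaining indices stay valid.
for i in reversed(range(len(LOG_FILES))):
    if LOG_FILES[i] > 15:
        del LOG_FILES[i]

print(LOG_FILES)  # [1, 2, 2, 5, 8, 3, 2, 4]
```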

2) By the way,

if LOG_FILES[i].DATE < to_date and LOG_FILES[i].DATE > from_date:

can be written as

if from_date < LOG_FILES[i].DATE < to_date:
0

Source: https://habr.com/ru/post/1786778/

