How to create and populate pandas dataframe from for loop?

Here is a simple example of the code I'm running. I would like to put the results into a pandas DataFrame (unless there is a better option):

for p in game.players.passing():
    print p, p.team, p.passing_att, p.passer_rating()

R.Wilson SEA 29 55.7
J.Ryan SEA 1 158.3
A.Rodgers GB 34 55.8

Using this code:

d = []
for p in game.players.passing():
    d = [{'Player': p, 'Team': p.team, 'Passer Rating':
        p.passer_rating()}]

pd.DataFrame(d)

I can get:

    Passer Rating   Player      Team
  0 55.8            A.Rodgers   GB

The frame size is 1x3, and I understand why there is only one row, but I cannot figure out how to get multiple rows with the columns in the correct order. Ideally, the solution would handle n rows (based on p), and it would be great (though not necessary) if the number of columns were set by the number of requested statistics. Any suggestions? Thanks in advance!

+43
4 answers

Try this using a list comprehension:

from pandas import DataFrame as df

d = df([[p, p.team, p.passing_att, p.passer_rating()] for p in game.players.passing()])
+26

The simplest fix is to append to the list inside the loop instead of reassigning it:

d = []
for p in game.players.passing():
    d.append({'Player': p, 'Team': p.team, 'Passer Rating':
        p.passer_rating()})

pd.DataFrame(d)
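The question also asks about column order. One way to pin it is the `columns` argument of the constructor. A minimal sketch, assuming a stand-in `Passer` record in place of the real objects from `game.players.passing()` (the class and its field names are hypothetical, the values come from the output in the question):

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for the player objects returned by game.players.passing()
Passer = namedtuple("Passer", ["name", "team", "passing_att", "rating"])

players = [
    Passer("R.Wilson", "SEA", 29, 55.7),
    Passer("J.Ryan", "SEA", 1, 158.3),
    Passer("A.Rodgers", "GB", 34, 55.8),
]

rows = []
for p in players:
    # append() adds one dict per player instead of overwriting the list
    rows.append({"Player": p.name, "Team": p.team, "Passer Rating": p.rating})

# columns= fixes the column order regardless of the key order in the dicts
frame = pd.DataFrame(rows, columns=["Player", "Team", "Passer Rating"])
print(frame)
```

Each dict becomes one row, so the frame grows with the number of players.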

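If you really want to "build and populate the dataframe in the loop" (which, by the way, I would not recommend), this is how it can be done.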

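d = pd.DataFrame()

for p in game.players.passing():
    # wrap the dict in a list so pandas builds a one-row frame;
    # a dict of bare scalars raises ValueError unless an index is passed
    temp = pd.DataFrame([{'Player': p, 'Team': p.team, 'Passer Rating':
        p.passer_rating()}])

    d = pd.concat([d, temp], ignore_index=True)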
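The loop above calls `pd.concat` on every pass, and each call builds a new frame from everything collected so far. If per-row frames are really needed, a common alternative is to collect them in a list and concatenate once at the end. A sketch under that assumption, with plain tuples from the question's output standing in for the player objects:

```python
import pandas as pd

# Sample rows from the question's output; stand-ins for the real player objects
records = [
    ("R.Wilson", "SEA", 55.7),
    ("J.Ryan", "SEA", 158.3),
    ("A.Rodgers", "GB", 55.8),
]

frames = []
for player, team, rating in records:
    # build one single-row frame per iteration
    frames.append(
        pd.DataFrame([{"Player": player, "Team": team, "Passer Rating": rating}])
    )

# a single concat at the end; ignore_index=True renumbers the rows 0..n-1
d = pd.concat(frames, ignore_index=True)
print(d)
```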
+57

Build a list of tuples with your data, then create the DataFrame from it:

d = []
for p in game.players.passing():
    d.append((p, p.team, p.passer_rating()))

pd.DataFrame(d, columns=('Player', 'Team', 'Passer Rating'))
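The tuple approach also extends to the "number of columns given by the requested statistics" part of the question. A rough sketch, assuming each statistic is a plain attribute on the player object. The `SimpleNamespace` players and the `stats` mapping are made up for illustration; real player objects may expose some statistics as methods (like `passer_rating()`), which would need a call instead of `getattr` alone:

```python
from types import SimpleNamespace

import pandas as pd

# Hypothetical stand-ins for the player objects
players = [
    SimpleNamespace(name="R.Wilson", team="SEA", passing_att=29),
    SimpleNamespace(name="J.Ryan", team="SEA", passing_att=1),
    SimpleNamespace(name="A.Rodgers", team="GB", passing_att=34),
]

# Requested statistics: column label -> attribute name (assumed names)
stats = {"Player": "name", "Team": "team", "Attempts": "passing_att"}

# One tuple per player, one element per requested statistic
rows = [tuple(getattr(p, attr) for attr in stats.values()) for p in players]

# Column labels come from the same mapping, so rows and columns stay aligned
frame = pd.DataFrame(rows, columns=list(stats))
print(frame)
```

Adding or removing an entry in `stats` changes the number of columns without touching the loop.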

A list of tuples is cheaper to build than a list of dictionaries, as the comparison below shows.

Test functions:

def with_tuples(loop_size=1e5):
    res = []

    for x in range(int(loop_size)):
        res.append((x-1, x, x+1))

    return pd.DataFrame(res, columns=("a", "b", "c"))

def with_dict(loop_size=1e5):
    res = []

    for x in range(int(loop_size)):
        res.append({"a":x-1, "b":x, "c":x+1})

    return pd.DataFrame(res)

Results:

%timeit -n 10 with_tuples()
# 10 loops, best of 3: 55.2 ms per loop

%timeit -n 10 with_dict()
# 10 loops, best of 3: 130 ms per loop
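The `%timeit` magic only works in IPython or Jupyter. Outside a notebook, roughly the same comparison can be run with the standard `timeit` module. This is a sketch with a smaller loop size so it finishes quickly; the absolute numbers depend on the machine and the pandas version:

```python
import timeit

import pandas as pd


def with_tuples(loop_size=1e5):
    res = []
    for x in range(int(loop_size)):
        res.append((x - 1, x, x + 1))
    return pd.DataFrame(res, columns=("a", "b", "c"))


def with_dict(loop_size=1e5):
    res = []
    for x in range(int(loop_size)):
        res.append({"a": x - 1, "b": x, "c": x + 1})
    return pd.DataFrame(res)


# Total seconds for 3 runs of each function at a reduced loop size
t_tuples = timeit.timeit(lambda: with_tuples(1e4), number=3)
t_dict = timeit.timeit(lambda: with_dict(1e4), number=3)

print(f"tuples: {t_tuples:.4f} s, dicts: {t_dict:.4f} s")
```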
+17

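A follow-up on the list-comprehension approach by @amit: the comprehension has to be passed to the constructor in parentheses, not placed in square brackets directly after df.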

from pandas import DataFrame as df
x = [1,2,3]
y = [7,8,9,10]

# this gives me a syntax error at 'for' (Python 3.7), so it is commented out
# d1 = df[[a, "A", b, "B"] for a in x for b in y]

# this works
d2 = df([a, "A", b, "B"] for a in x for b in y)

# and if you want to add the column names on the fly
# note the additional parentheses
d3 = df(([a, "A", b, "B"] for a in x for b in y), columns = ("l","m","n","o"))
0

Source: https://habr.com/ru/post/1667947/

