Conditional mean over a Pandas DataFrame

I have a dataset from which I want the averages of several variables that I created.

I started with:

data2['socialIdeology2'].mean()

data2['econIdeology'].mean()

^ which works fine and gives me the averages I'm looking for.

Now I'm trying to compute a conditional mean, that is, the mean for only a selected group within the dataset. (I want the ideologies broken down by whom respondents voted for in the 2016 election.) In Stata, the code would look something like: mean(variable) if voteChoice == 'Clinton'

Having looked into it, I came to the conclusion that a conditional mean is simply not a thing (although I hope I'm wrong?), so I started writing my own function for it.

I'm just starting out with a plain "mean" function, as the basis for a conditional mean function:

def mean():
    sum = 0.0
    count = 0
    for index in range(0, len(data2['socialIdeology2'])):
        sum = sum + (data2['socialIdeology2'][index])
        print(data2['socialIdeology2'][index])
        count = count + 1
    return sum / count

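print(mean())

Yet I keep getting "nan" as the result. Printing data2['socialIdeology2'][index] inside the loop prints nan over and over.

So my question is: if the data stored in socialIdeology2 really is nan (which I don't understand how it could be), why does .mean() work on it?

And how can I get the means by category?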

+4
2

A conditional mean is indeed a thing in pandas. You can use DataFrame.groupby():

means = data2.groupby('voteChoice').mean()

or, in your case, the following may be more efficient:

means = data2.groupby('voteChoice')['socialIdeology2'].mean()

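This gives just the mean you are looking for. (The first version calculates the means for all columns.) It assumes that voteChoice is the name of the column you want to condition on.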
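
To make the difference between the two groupby calls concrete, here is a minimal sketch on a made-up stand-in for data2 (the rows and values are hypothetical; only the column names come from the question). Passing numeric_only=True is my assumption about what is wanted here: in recent pandas versions, calling .mean() on a grouped frame that still contains text columns raises an error unless you restrict it to numeric columns.

```python
import pandas as pd

# Hypothetical stand-in for data2; the values are invented for illustration.
data2 = pd.DataFrame({
    "voteChoice": ["Clinton", "Trump", "Clinton", "Trump", "Other"],
    "socialIdeology2": [2.0, 6.0, 3.0, 5.0, 4.0],
    "econIdeology": [1.0, 7.0, 3.0, 5.0, 4.0],
})

# Mean of every numeric column, one row per voteChoice value.
all_means = data2.groupby("voteChoice").mean(numeric_only=True)

# Mean of a single column, one entry per voteChoice value.
social_means = data2.groupby("voteChoice")["socialIdeology2"].mean()

print(all_means)
print(social_means)
```

The first result is a DataFrame with one column per variable; the second is a Series indexed by voteChoice, so social_means['Clinton'] picks out a single group's mean.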

+5

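If you are only interested in the mean for a single group (for example, Clinton voters), you can create a boolean Series that is True for members of that group, then use it to select rows of the DataFrame before taking the mean: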

voted_for_clinton = data2['voteChoice'] == 'Clinton'
mean_for_clinton_voters = data2.loc[voted_for_clinton, 'socialIdeology2'].mean()
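
For anyone who wants to try the boolean-mask approach without the original dataset, here is a small self-contained sketch. The DataFrame contents are hypothetical; only the column names and the 'Clinton' label are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for data2; the values are invented for illustration.
data2 = pd.DataFrame({
    "voteChoice": ["Clinton", "Trump", "Clinton", "Trump", "Other"],
    "socialIdeology2": [2.0, 6.0, 3.0, 5.0, 4.0],
})

# Boolean Series: True where the row belongs to the group of interest.
voted_for_clinton = data2["voteChoice"] == "Clinton"

# .loc[row_mask, column] keeps only the selected rows of that one column.
mean_for_clinton_voters = data2.loc[voted_for_clinton, "socialIdeology2"].mean()

print(voted_for_clinton.tolist())
print(mean_for_clinton_voters)
```

This is roughly the pandas counterpart of the Stata "if voteChoice == 'Clinton'" qualifier: the mask plays the role of the condition.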

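If you want the means for several groups separately, then groupby is the way to go. However, I would suggest writing it like this: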

means_by_vote_choice = data2.groupby('voteChoice')['socialIdeology2'].mean()

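Placing the ['socialIdeology2'] index before .mean() means that the mean is computed only over the column you are interested in. If you place the indexing expression after .mean() (i.e. data2.groupby('voteChoice').mean()['socialIdeology2']), the means are computed over all columns and only then is the 'socialIdeology2' column selected from the result, which is less efficient.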

See the pandas documentation for more on indexing DataFrames with .loc and on groupby.
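
As for why the hand-written loop returns nan while .mean() does not: one likely explanation, assuming the column contains at least some missing values, is that Series.mean() skips NaN by default (skipna=True), whereas adding a NaN to a running total turns the whole total into NaN. A minimal sketch with a made-up Series:

```python
import math
import pandas as pd

# Hypothetical column with one missing value.
s = pd.Series([2.0, float("nan"), 4.0])

# A hand-written loop propagates NaN: anything plus NaN is NaN.
total = 0.0
for value in s:
    total += value
manual_mean = total / len(s)

# Series.mean() skips missing values by default (skipna=True).
pandas_mean = s.mean()

# Asking pandas not to skip them reproduces the loop's result.
strict_mean = s.mean(skipna=False)

print(manual_mean, pandas_mean, strict_mean)
```

If that is what is happening in data2, data2['socialIdeology2'].isna().sum() will report how many values are missing.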

+1

Source: https://habr.com/ru/post/1680240/

