Convert Int64Index to Int

I iterate through a data frame (called hdf) and apply changes line by line. hdf is sorted by group_id, and each row is assigned a rank from 1 to n within its group according to some criteria.

# Groupby function creates subset dataframes (a dataframe per distinct group_id).
grouped = hdf.groupby('group_id')

# Iterate through each subdataframe. 
for name, group in grouped:

    # This grabs the top index for each subdataframe
    index1 = group[group['group_rank']==1].index

    # If criteria1 == 0, flag all rows for removal
    if(max(group['criteria1']) == 0):    
        for x in range(rank1, rank1 + max(group['group_rank'])):
            hdf.loc[x,'remove_row'] = 1

I get the following error:

TypeError: int() argument must be a string or a number, not 'Int64Index'

I get the same error when I try to convert rank1 to an int explicitly:

rank1 = int(group[group['auction_rank']==1].index)

Can someone explain what is happening and provide an alternative?

1 answer

The answer to your specific question is that index1 is an Int64Index (basically a list), even if it has only one element. To get that one item, use index1[0].
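As a minimal sketch of that point, the snippet below builds a small made-up frame (the values and the choice of group 20 are assumptions for illustration, not the asker's data) and shows that selecting the rank-1 row yields an index object from which a single label has to be extracted before it can be used as an int:

```python
import pandas as pd

# Hypothetical data: two groups, each ranked from 1 to n.
hdf = pd.DataFrame({
    "group_id":   [10, 10, 20, 20, 20],
    "group_rank": [1, 2, 1, 2, 3],
    "criteria1":  [0, 0, 1, 0, 2],
})

# Take one group, as the loop over hdf.groupby('group_id') would.
group = hdf[hdf["group_id"] == 20]

# Boolean selection returns an index object, even when only one row matches.
index1 = group[group["group_rank"] == 1].index

# Extract the single label first, then convert it to a plain Python int.
rank1 = int(index1[0])
```

Here rank1 can be passed to range(), whereas index1 itself cannot.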

That said, you probably do not need to iterate row by row. To "remove" whole groups, you can use filter:

hdf = hdf.groupby('group_id').filter(lambda group: group['criteria1'].max() != 0)
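A small runnable sketch of the filter call, using made-up data (the values are assumptions for illustration): every row of a group is dropped when the group's maximum criteria1 is 0, and the surviving rows keep their original index labels.

```python
import pandas as pd

# Hypothetical data: group 10 has criteria1 == 0 everywhere, group 20 does not.
hdf = pd.DataFrame({
    "group_id":  [10, 10, 20, 20, 20],
    "criteria1": [0, 0, 1, 0, 2],
})

# Keep only the groups for which the lambda returns True.
kept = hdf.groupby("group_id").filter(lambda group: group["criteria1"].max() != 0)
```

The result is assigned to a new name here so the original frame stays available for comparison; the line above it overwrites hdf instead.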

If you need more complex per-group logic, you can use apply:

def filter_group(group):
    if group['criteria1'].max() != 0:
        return group
    else:
        return group.loc[other criteria here]

hdf = hdf.groupby('group_id').apply(filter_group)

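(Alternatively, if you want to keep the loop, loc accepts a whole index of labels, so you can flag every row of a group at once: hdf.loc[group.index, 'remove_row'] = 1.)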
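A minimal sketch of the loop rewritten around hdf.loc[group.index, ...], again on made-up data. Initializing remove_row to 0 before the loop is an assumption; the question does not show how that column is created.

```python
import pandas as pd

# Hypothetical data: group 10 has criteria1 == 0 everywhere, group 20 does not.
hdf = pd.DataFrame({
    "group_id":  [10, 10, 20, 20, 20],
    "criteria1": [0, 0, 1, 0, 2],
})
hdf["remove_row"] = 0  # assumed default: no row is flagged

for name, group in hdf.groupby("group_id"):
    # If criteria1 is 0 for the whole group, flag all of its rows in one assignment.
    if group["criteria1"].max() == 0:
        hdf.loc[group.index, "remove_row"] = 1
```

This removes the inner range() loop entirely, so no index label ever has to be converted to an int.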


Source: https://habr.com/ru/post/1611493/

