How to extract unique permutations from pandas DataSeries?

Question

Working in Jupyter with a pandas DataFrame, I have a dataset with rows like the following:

color: white engineType: diesel make: Ford manufacturingYear: 2004 accidentCount: 123

What I need to do is build charts of accident counts (y axis) by year of manufacture (x axis) for every combination of color / engineType / make. Any ideas how to do this?

To speed things up, I have this initial setup:

    import numpy as np
    import pandas as pd
    from pandas import DataFrame, Series
    import random

    colors = ['white', 'black', 'silver']
    engineTypes = ['diesel', 'petrol']
    makes = ['ford', 'mazda', 'subaru']
    years = range(2000, 2005)
    rowCount = 100

    def randomEl(data):
        rand_items = [data[random.randrange(len(data))] for item in range(rowCount)]
        return rand_items

    df = DataFrame({
        'color': Series(randomEl(colors)),
        'engineType': Series(randomEl(engineTypes)),
        'make': Series(randomEl(makes)),
        'year': Series(randomEl(years)),
        'accidents': Series([int(1000*random.random()) for i in range(rowCount)])
    })

python pandas jupyter data-science

wciesiel Apr 15 '17 at 12:50

1 answer

ASGM · Answer 1 · 2017-04-15T13:29:57+0000

You can get the total number of accidents for each unique combination of color, engineType and make using groupby():

 accident_counts = df.groupby(['color', 'engineType', 'make'])['accidents'].sum()
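To make the shape of that result concrete, here is a minimal sketch on a tiny hand-made frame. The rows in `sample` are made up for illustration and are not taken from the question's data. The result is a Series whose index is a MultiIndex over the three grouping columns, with one entry per combination that actually occurs.

```python
import pandas as pd

# Hypothetical rows, made up for illustration only.
sample = pd.DataFrame({
    'color': ['white', 'white', 'black', 'white'],
    'engineType': ['diesel', 'diesel', 'petrol', 'petrol'],
    'make': ['ford', 'ford', 'mazda', 'ford'],
    'accidents': [10, 5, 7, 3],
})

# Same call as above, applied to the small frame.
sample_counts = sample.groupby(['color', 'engineType', 'make'])['accidents'].sum()

# One row per combination present in the data.
print(sample_counts)

# A single combination can be looked up with a tuple of its three values.
print(sample_counts.loc[('white', 'diesel', 'ford')])  # 15
```

Combinations that never appear in the data are simply absent from the result.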

Matplotlib is one way to plot the results:

    import matplotlib.pyplot as plt

    accident_counts.plot(kind='bar')
    plt.show()
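Note that the grouping above does not include the year, so the bar chart shows one total per combination. If the year should go on the x axis, one possible approach (a sketch, not the only way) is to add `year` to the grouping and then unstack the other three levels into columns. The `sample` rows and the name `by_year` below are made up for illustration.

```python
import pandas as pd

# Hypothetical rows, made up for illustration only.
sample = pd.DataFrame({
    'color': ['white', 'white', 'black', 'black', 'white'],
    'engineType': ['diesel', 'diesel', 'petrol', 'petrol', 'diesel'],
    'make': ['ford', 'ford', 'mazda', 'mazda', 'ford'],
    'year': [2000, 2001, 2000, 2001, 2000],
    'accidents': [10, 5, 7, 3, 2],
})

# Sum accidents per (year, combination), then move the combination levels
# into columns: rows are years, each column is one color/engineType/make.
by_year = (
    sample.groupby(['year', 'color', 'engineType', 'make'])['accidents']
    .sum()
    .unstack(['color', 'engineType', 'make'], fill_value=0)
)

print(by_year)
```

With matplotlib installed, calling `by_year.plot(marker='o')` followed by `plt.show()` should then draw one line per combination, with the year along the x axis.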
