How to extract unique combinations from a pandas DataFrame?

Working in Jupyter with a pandas DataFrame, I have a dataset with rows like the following:

    color: white, engineType: diesel, make: Ford, manufacturingYear: 2004, accidentCount: 123

What I need to do is plot accident counts (y axis) against year of manufacture (x axis) for every combination of color / engineType / make. Any ideas how to do this?

To speed things up, I have this initial setup:

    import numpy as np
    import pandas as pd
    from pandas import DataFrame, Series
    import random

    colors = ['white', 'black', 'silver']
    engineTypes = ['diesel', 'petrol']
    makes = ['ford', 'mazda', 'subaru']
    years = range(2000, 2005)
    rowCount = 100

    def randomEl(data):
        # Pick rowCount random elements (with replacement) from data
        rand_items = [data[random.randrange(len(data))] for _ in range(rowCount)]
        return rand_items

    df = DataFrame({
        'color': Series(randomEl(colors)),
        'engineType': Series(randomEl(engineTypes)),
        'make': Series(randomEl(makes)),
        'year': Series(randomEl(years)),
        'accidents': Series([int(1000 * random.random()) for _ in range(rowCount)])
    })
1 answer

You can get the total number of accidents for each unique combination of color, engineType and make using groupby():

 accident_counts = df.groupby(['color', 'engineType', 'make'])['accidents'].sum() 
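To make the shape of that result concrete, here is a minimal sketch on a small hand-written frame. The `toy` data and the `toy_counts` name are made up for illustration and are not the question's random `df`. The grouped sum comes back as a Series whose index is a three-level MultiIndex, one entry per combination that actually occurs in the data:

```python
import pandas as pd

# Hypothetical toy data, standing in for the question's random df
toy = pd.DataFrame({
    'color': ['white', 'white', 'black', 'white'],
    'engineType': ['diesel', 'diesel', 'petrol', 'petrol'],
    'make': ['ford', 'ford', 'mazda', 'ford'],
    'accidents': [10, 5, 7, 3],
})

# Same call as in the answer: sum accidents per (color, engineType, make)
toy_counts = toy.groupby(['color', 'engineType', 'make'])['accidents'].sum()

# One row per combination present in the data
print(toy_counts)

# A single combination can be looked up with a tuple key
print(toy_counts.loc[('white', 'diesel', 'ford')])
```

Combinations that never occur in the data (for example black / diesel / ford here) simply do not appear in the result.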

Matplotlib is one way to plot the results:

    import matplotlib.pyplot as plt

    accident_counts.plot(kind='bar')
    plt.show()
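The bar chart above aggregates over all years, whereas the question asks for the year on the x axis. One possible way to get there, sketched under the assumption that the frame has the `year` and `accidents` columns from the question's setup, is to pivot so that years become rows and each color / engineType / make combination becomes a column. The helper names `accidents_by_year` and `plot_accidents_by_year` and the `sample` data are hypothetical, introduced only for this example:

```python
import pandas as pd


def accidents_by_year(frame):
    """Sum accidents per year (rows) for each color/engineType/make combination (columns)."""
    return frame.pivot_table(
        index='year',
        columns=['color', 'engineType', 'make'],
        values='accidents',
        aggfunc='sum',
        fill_value=0,  # a combination with no rows in a given year counts as 0
    )


def plot_accidents_by_year(table):
    """Draw one line per combination: year on the x axis, accidents on the y axis."""
    import matplotlib.pyplot as plt  # imported here so the table can be built without it

    ax = table.plot(kind='line', marker='o')
    ax.set_xlabel('year')
    ax.set_ylabel('accidents')
    plt.show()


# Hypothetical sample rows, standing in for the question's random df
sample = pd.DataFrame({
    'color': ['white', 'white', 'white', 'black'],
    'engineType': ['diesel', 'diesel', 'diesel', 'petrol'],
    'make': ['ford', 'ford', 'ford', 'mazda'],
    'year': [2000, 2000, 2001, 2001],
    'accidents': [10, 5, 7, 3],
})

table = accidents_by_year(sample)
print(table)
```

In a notebook, `plot_accidents_by_year(accidents_by_year(df))` would then draw the chart for the question's `df`. With up to 18 combinations a single set of axes gets crowded, so passing `subplots=True` to `table.plot` to get one panel per combination may be easier to read.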

Source: https://habr.com/ru/post/1266762/

