Problem with Pandas boxplot within a subplot

I'm having trouble drawing a Pandas boxplot within a subplot. Depending on which of the two approaches I try, creating the boxplot either removes all the subplots I have already created, or draws the boxplot after the subplot grid. Either way, I cannot get it to draw inside the subplot grid.

    import matplotlib.pyplot as plt
    import pandas
    from pandas import DataFrame, Series

    data = {'day': Series([1, 1, 1, 2, 2, 2, 3, 3, 3]),
            'val': Series([3, 4, 5, 6, 7, 8, 9, 10, 11])}
    df = pandas.DataFrame(data)

The first thing I tried is the following:

    plt.figure()
    plt.subplot(2, 2, 1)
    plt.plot([1, 2, 3])
    plt.subplot(2, 2, 4)
    df.boxplot('val', 'day')

But it just creates a plot without subplots:

[Image: Attempt A]

So, I then tried to set the axis manually:

    plt.figure()
    plt.subplot(2, 2, 1)
    plt.plot([1, 2, 3])
    plt.subplot(2, 2, 4)
    ax = plt.gca()
    df.boxplot('val', 'day', ax=ax)

But that just destroyed the subplot grid altogether, along with the initial plot:

[Image]

Any ideas on how I can get the boxplot to appear in the bottom-right cell of the subplot grid (the one that is empty in the first set of images)?

1 answer

This appears to be a bug, or at least undesirable behavior, in the pandas plotting setup. What happens is that if you supply a by argument to boxplot, pandas issues its own subplots call, erasing any existing subplots. It apparently does this so that if you want to plot more than one value, it creates a subplot for each value (for example, one boxplot for Y1 by day, another for Y2 by day, etc.).

However, what it should do, but doesn't, is check whether you are plotting only a single value and, in that case, use the provided ax object (if any) rather than making its own subplots. When you plot only one value, it creates a 1-by-1 subplot grid, which is not very useful. Its logic is also a bit strange, since it creates a grid based on the number of columns you are plotting (the length of the first argument), but it only does this if you supply a by argument. The intention seems to be to allow multi-boxplot figures such as df.boxplot(['col1', 'col2']), but in doing so it prevents your quite reasonable attempt at df.boxplot('col1', 'grouper1').
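To illustrate the behavior I would expect (one grouped column drawn on the axes you pass in), here is a minimal sketch using plain matplotlib instead of the pandas boxplot method. Note that boxplot_by is a hypothetical helper I made up for illustration, not part of pandas, and the Agg backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd


def boxplot_by(df, column, by, ax=None):
    """Hypothetical helper: draw one box per group of `by` on the given axes."""
    if ax is None:
        ax = plt.gca()
    grouped = df.groupby(by)[column]
    labels = [str(key) for key, _ in grouped]
    samples = [group.to_numpy() for _, group in grouped]
    ax.boxplot(samples)            # draws on `ax` only, no new subplots
    ax.set_xticklabels(labels)     # one tick label per group
    ax.set_xlabel(by)
    ax.set_ylabel(column)
    return ax


df = pd.DataFrame({'day': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'val': [3, 4, 5, 6, 7, 8, 9, 10, 11]})

fig = plt.figure()
ax_line = fig.add_subplot(2, 2, 1)
ax_line.plot([1, 2, 3])
ax_box = fig.add_subplot(2, 2, 4)
returned = boxplot_by(df, 'val', 'day', ax=ax_box)
```

Because the helper only ever touches the axes it is given, the existing line plot in the top-left cell is left alone.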

I would suggest raising an issue on the pandas bug tracker.

In the meantime, a somewhat hacky workaround is to do this:

    df.pivot(index='val', columns='day', values='val').boxplot(ax=ax)

This reshapes your data so that the group-by values (the days) become columns. The reshaped table has many NA entries for val values that do not occur with a particular day value, but these NAs are ignored when plotting, so you get the right plot at the right position in the subplot grid.
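Put together with the 2-by-2 grid from the question, the workaround might look like the sketch below. It assumes a recent pandas where pivot takes keyword arguments, and it relies on every (val, day) pair being unique, as in the sample data; the names wide, ax_line and ax_box are mine, and the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'day': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'val': [3, 4, 5, 6, 7, 8, 9, 10, 11]})

fig = plt.figure()
ax_line = fig.add_subplot(2, 2, 1)   # top-left cell
ax_line.plot([1, 2, 3])
ax_box = fig.add_subplot(2, 2, 4)    # bottom-right cell

# One column per day; cells where a val does not belong to that day are NaN.
wide = df.pivot(index='val', columns='day', values='val')

# No `by` argument here, so pandas draws on the axes it is given.
wide.boxplot(ax=ax_box)
```

Each column of wide holds only the three values for its day, so the three boxes match what a grouped boxplot of val by day would show.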


Source: https://habr.com/ru/post/1480269/

