Plotting a pair plot (scatter matrix) from a pandas DataFrame

I am trying to display a pair plot created with scatter_matrix from a pandas DataFrame. This is how I create the pair plot:

    # Create dataframe from data in X_train
    # Label the columns using the strings in iris_dataset.feature_names
    iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)

    # Create a scatter matrix from the dataframe, color by y_train
    grr = pd.scatter_matrix(iris_dataframe, c=y_train, figsize=(15, 15), marker='o',
                            hist_kwds={'bins': 20}, s=60, alpha=.8, cmap=mglearn.cm3)

I want the pair plot to look something like this:

[image]

I am using Python 3.6 with PyCharm, not Jupyter Notebook.

3 answers

This code worked for me using Python 3.5.2:

    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline
    from sklearn import datasets

    iris_dataset = datasets.load_iris()
    X = iris_dataset.data
    Y = iris_dataset.target

    iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)

    # Create a scatter matrix from the dataframe, color by y_train
    grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                     hist_kwds={'bins': 20}, s=60, alpha=.8)

For pandas versions before v0.20.0, use the older top-level call instead (thanks to michael-szczepaniak for pointing out that this API is now deprecated):

 grr = pd.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o', hist_kwds={'bins': 20}, s=60, alpha=.8) 

I just had to remove the cmap=mglearn.cm3 fragment because I could not get mglearn to work. There is a version mismatch issue with sklearn.
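If the per-class colors are still wanted without mglearn, a plain matplotlib colormap can be passed as cmap instead. The following is only a sketch: the three colors are an illustrative stand-in and not necessarily the ones mglearn.cm3 uses, and the data is a small synthetic frame (the names three_class_cmap, features and labels are made up for the example) rather than the iris set.

```python
import numpy as np
import pandas as pd
from matplotlib.colors import ListedColormap

# Hypothetical three-color map standing in for mglearn.cm3
# (the colors are chosen for illustration only).
three_class_cmap = ListedColormap(["#0000aa", "#ff2020", "#50ff50"])

# Synthetic stand-in for the iris data: 30 rows, 4 features, 3 class labels.
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(30, 4)), columns=["a", "b", "c", "d"])
labels = np.repeat([0, 1, 2], 10)

# Same call as in the answer, with the mglearn colormap swapped out.
axes = pd.plotting.scatter_matrix(
    features, c=labels, figsize=(8, 8), marker="o",
    hist_kwds={"bins": 20}, s=60, alpha=0.8, cmap=three_class_cmap,
)
```

With the real data, features would be iris_dataframe and labels would be Y. Outside a notebook the figure still has to be shown or saved explicitly.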

To save the figure directly to a file instead of displaying it, you can use:

 plt.savefig('foo.png') 

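Also remove the following line, which is an IPython/Jupyter magic command and is not valid in a plain Python script: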

 # %matplotlib inline 
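Put together, a script that only writes the plot to disk might look roughly like this. It is a sketch under a few assumptions: the non-interactive Agg backend is selected so that no window is opened, the data is a small synthetic frame instead of the iris set, and the output file name and location (scatter_matrix_demo.png in the system temp directory) are arbitrary.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: no GUI window is wanted, so use a non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data; replace with the iris DataFrame in practice.
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.normal(size=(30, 3)), columns=["x", "y", "z"])

# Draw the scatter matrix onto a new figure.
pd.plotting.scatter_matrix(frame, figsize=(6, 6))

# Write the current figure to a PNG file, then release the figure.
output_path = os.path.join(tempfile.gettempdir(), "scatter_matrix_demo.png")
plt.savefig(output_path)
plt.close("all")
```

Selecting the backend before importing pyplot keeps the script usable on machines without a display.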



Just an update to Vikash's great answer. The last statement should now be:

 grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o', hist_kwds={'bins': 20}, s=60, alpha=.8) 

The scatter_matrix function has been moved to the pandas.plotting package, so the original answer, while correct, is now deprecated.

So the full code is now:

    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline
    from sklearn import datasets

    iris_dataset = datasets.load_iris()
    X = iris_dataset.data
    Y = iris_dataset.target

    iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)

    # create a scatter matrix from the dataframe, color by y_train
    grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                     hist_kwds={'bins': 20}, s=60, alpha=.8)
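If the same script has to run on both older and newer pandas installations, one possible approach is to try the new location first and fall back to the old top-level name. This is just a sketch of that idea; whether the fallback is needed depends on the installed pandas version.

```python
import pandas as pd

try:
    # Newer pandas: the function lives in the pandas.plotting package.
    from pandas.plotting import scatter_matrix
except ImportError:
    # Older pandas: fall back to the top-level pd.scatter_matrix used in the question.
    scatter_matrix = pd.scatter_matrix
```

After this, scatter_matrix(iris_dataframe, c=Y, ...) can be called the same way regardless of which branch was taken.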

I finally figured out how to do this in PyCharm.

Just import matplotlib.pyplot as plt and call plt.show() at the end:

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    import mglearn
    from pandas.plotting import scatter_matrix
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    iris_dataset = load_iris()
    X_train, X_test, Y_train, Y_test = train_test_split(
        iris_dataset['data'], iris_dataset['target'], random_state=0)
    iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
    grr = scatter_matrix(iris_dataframe, c=Y_train, figsize=(15, 15), marker='o',
                         hist_kwds={'bins': 20}, s=60, alpha=.8, cmap=mglearn.cm3)
    plt.show()

Then it works fine, as shown below:

Plot image


Source: https://habr.com/ru/post/1265013/

