Python - data frame size

New to Python.

In R, you can get the dimensions of a matrix using dim(...). What is the corresponding function in Python pandas for a DataFrame?

+46
python pandas
Dec 17 '12 at 20:27
3 answers

df.shape where df is your DataFrame.
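
A minimal sketch of how this lines up with R's dim(), assuming a small made-up DataFrame (the column names and values are illustrative and do not come from the question):

```python
import pandas as pd

# Hypothetical example data, not taken from the question.
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [10.0, 20.0, 30.0, 40.0]})

# shape is an attribute (no parentheses) holding (rows, columns),
# the same ordering that dim() reports in R.
n_rows, n_cols = df.shape
print(n_rows, n_cols)
```

Unpacking the tuple this way is often handier than indexing df.shape[0] and df.shape[1] separately.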

+80
Dec 17 '12 at 20:29

Summary of all methods for retrieving DataFrame or Series size information

There are several ways to get size and dimension information about your DataFrame or Series.

Create Sample DataFrame and Series

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'a': [5, 2, np.nan], 'b': [9, 2, 4]})
    df

         a  b
    0  5.0  9
    1  2.0  2
    2  NaN  4

    s = df['a']
    s

    0    5.0
    1    2.0
    2    NaN
    Name: a, dtype: float64

shape attribute

The shape attribute returns a two-element tuple of the number of rows and the number of columns in the DataFrame. For a Series, it returns a one-element tuple.

    df.shape
    (3, 2)

    s.shape
    (3,)



len function

To get the number of rows of a DataFrame or the length of a Series, use the len function. It returns an integer.

    len(df)
    3

    len(s)
    3
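
Since len only counts rows, a small sketch of the analogous column count may help. It rebuilds the sample df from above so that it runs on its own; the variable names n_rows and n_cols are just illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [5, 2, np.nan], 'b': [9, 2, 4]})

n_rows = len(df)          # same as len(df.index) and df.shape[0]
n_cols = len(df.columns)  # same as df.shape[1]
print(n_rows, n_cols)
```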



size attribute

To get the total number of elements in a DataFrame or Series, use the size attribute. For a DataFrame, this is the product of the number of rows and the number of columns. For a Series, it is equivalent to the len function:

    df.size
    6

    s.size
    3
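
The relationship between size, shape, and len can be checked directly. This is only a sketch using the same sample data, rebuilt here so the snippet is self-contained:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [5, 2, np.nan], 'b': [9, 2, 4]})
s = df['a']

# For a DataFrame, size is rows times columns (NaN cells are included).
rows, cols = df.shape
print(df.size == rows * cols)

# For a Series, size and len agree.
print(s.size == len(s))
```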



ndim attribute

The ndim attribute returns the number of dimensions of your DataFrame or Series. It is always 2 for a DataFrame and 1 for a Series:

    df.ndim
    2

    s.ndim
    1



The tricky count method

The count method returns the number of non-missing values for each column or row of the DataFrame. This can be confusing, because most people expect a count to be the length of each column or row, which is not the case here. When called on a DataFrame, it returns a Series with the column names in the index and the number of non-missing values as the values.

    df.count()  # by default, get the count of each column

    a    2
    b    3
    dtype: int64

    df.count(axis='columns')  # change direction to get count of each row

    0    2
    1    2
    2    1
    dtype: int64

For a Series, there is only one axis to count along, so it simply returns a scalar:

    s.count()
    2
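
Because count tallies non-missing values, the number of missing values has to be obtained another way. One common option, shown here as a sketch on the same sample data, is isna().sum():

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [5, 2, np.nan], 'b': [9, 2, 4]})

non_missing = df.count()   # non-NaN values per column
missing = df.isna().sum()  # NaN values per column

# For each column, the two counts add up to the number of rows.
print((non_missing + missing == len(df)).all())
```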



Use the info method to extract metadata

The info method prints the number of non-null values and the data type of each column.

    df.info()

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 3 entries, 0 to 2
    Data columns (total 2 columns):
    a    2 non-null float64
    b    3 non-null int64
    dtypes: float64(1), int64(1)
    memory usage: 128.0 bytes
+1
Nov 06 '17 at 14:44

A DataFrame's shape is given as (rows, columns). To find it, use the shape attribute; it is not a method, so it takes no parentheses:

    dataframe_name.shape
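
Note that shape is an attribute holding a tuple, so calling it with parentheses fails. A small sketch with a hypothetical one-column DataFrame shows the difference:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3]})  # hypothetical data

print(df.shape)  # attribute access works

try:
    df.shape()  # the tuple itself is not callable
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```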
-3
Aug 16 '17 at 11:28