Formatting the thousands separator for integers in a pandas DataFrame

I am trying to use '{:,}'.format(number), as in the example below, to format a number in a pandas DataFrame:

    # This works for floats and integers
    print('{:,}'.format(20000))    # 20,000
    print('{:,}'.format(20000.0))  # 20,000.0

The problem is that it does not work with a DataFrame that holds integers, although it works fine with a DataFrame that holds floats. See the examples:

    from pandas import DataFrame

    # Does not work: the format stays the same, no thousands separator is shown
    df_int = DataFrame({"A": [20000, 10000]})
    print(df_int.to_html(float_format=lambda x: '{:,}'.format(x)))
    # Example of result
    # <tr>
    #   <th>0</th>
    #   <td> 20000</td>
    # </tr>

    # Works OK
    df_float = DataFrame({"A": [20000.0, 10000.0]})
    print(df_float.to_html(float_format=lambda x: '{:,}'.format(x)))
    # Example of result
    # <tr>
    #   <th>0</th>
    #   <td>20,000.0</td>
    # </tr>

What am I doing wrong?

2 answers

The formatters parameter of to_html accepts a dictionary that maps column names to formatting functions. Below is an example of a function that builds such a dict, mapping the same formatting function to every float and integer column.

    import numpy as np

    # One formatting function shared by all numeric columns
    num_format = lambda x: '{:,}'.format(x)

    def build_formatters(df, format):
        # Map each int64/float64 column name to the formatting function
        return {column: format
                for (column, dtype) in df.dtypes.items()
                if dtype in [np.dtype('int64'), np.dtype('float64')]}

    formatters = build_formatters(df_int, num_format)
    print(df_int.to_html(formatters=formatters))

    # Output:
    # <table border="1" class="dataframe">
    #   <thead>
    #     <tr style="text-align: right;">
    #       <th></th>
    #       <th>A</th>
    #     </tr>
    #   </thead>
    #   <tbody>
    #     <tr>
    #       <th>0</th>
    #       <td>20,000</td>
    #     </tr>
    #     <tr>
    #       <th>1</th>
    #       <td>10,000</td>
    #     </tr>
    #   </tbody>
    # </table>
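
As a variation on the same idea, the sketch below picks the numeric columns with the dtype helpers from pandas.api.types instead of comparing against int64 and float64 directly, so other integer and float widths (int32, float32 and so on) should be covered too. The names build_numeric_formatters, thousands and df_mixed are made up for this illustration and do not come from the answer above.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype


def build_numeric_formatters(df, fmt):
    """Map every integer or float column name to the same formatting function."""
    return {
        column: fmt
        for column, dtype in df.dtypes.items()
        if is_integer_dtype(dtype) or is_float_dtype(dtype)
    }


def thousands(x):
    # Same format string as in the question
    return '{:,}'.format(x)


# Hypothetical frame with one integer, one float and one text column
df_mixed = pd.DataFrame({
    "A": [20000, 10000],
    "B": [20000.0, 10000.0],
    "C": ["x", "y"],
})

formatters = build_numeric_formatters(df_mixed, thousands)
html = df_mixed.to_html(formatters=formatters)
print(html)
```

Only columns A and B end up in the dict, so the text column C is rendered with the default formatting.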

pandas (as of 0.20.1) does not let you override the default integer format in an easy way. It is hard-coded in pandas.io.formats.format.IntArrayFormatter (the lambda function):

    class IntArrayFormatter(GenericArrayFormatter):

        def _format_strings(self):
            formatter = self.formatter or (lambda x: '% d' % x)
            fmt_values = [formatter(x) for x in self.values]
            return fmt_values

I assume that what you are really asking is how to override the format for all integers: replace ("monkey patch") IntArrayFormatter so that it prints integer values with thousands separated by commas, as follows:

    import pandas

    class _IntArrayFormatter(pandas.io.formats.format.GenericArrayFormatter):

        def _format_strings(self):
            # Same as the pandas default, but with a thousands separator
            formatter = self.formatter or (lambda x: ' {:,}'.format(x))
            fmt_values = [formatter(x) for x in self.values]
            return fmt_values

    # Replace the pandas class with the patched version
    pandas.io.formats.format.IntArrayFormatter = _IntArrayFormatter

Note:

  • Before 0.20.0, the formatters were in pandas.formats.format.
  • Before 0.18.1, the formatters were in pandas.core.format.

Aside

For floats, you do not need to jump through these hoops, because there is a configuration option for it:

display.float_format: The callable should accept a floating point number and return a string with the desired format of the number. This is used in some places like SeriesFormatter. See core.format.EngFormatter for an example.
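
A minimal sketch of that option in use, assuming a two-decimal format with a thousands separator is wanted. The variable names and sample values are made up for illustration. pd.option_context is used here so that the setting is restored afterwards; pd.set_option('display.float_format', ...) would keep it for the rest of the session.

```python
import pandas as pd

floats = pd.Series([20000.0, 10000.0])
ints = pd.Series([20000, 10000])

# Any callable that takes a float and returns a string can be used
with pd.option_context('display.float_format', '{:,.2f}'.format):
    floats_text = floats.to_string()
    # The option only applies to floats, so integers keep the default format
    ints_text = ints.to_string()

print(floats_text)
print(ints_text)
```

The integer series is included to show the contrast: it still prints without separators, which is why the integer case needs the formatters dict or the monkey patch described above.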


Source: https://habr.com/ru/post/972737/

