Numpy.float64 changes when writing to Excel (.xlsx)

Question

I noticed that when certain NumPy float64 values are saved to an Excel file (via a Pandas DataFrame), they change. At first I thought this was due to some imprecision in Excel, but Excel seems to encode floating-point numbers as double precision, so I'm a little confused by this observation.

>>> import numpy as np
>>> import pandas as pd

# Create a floating point number that exhibits the problem.
>>> ba = bytearray(b'\x53\x2a\xb0\x49\xf3\x79\x90\x40')
>>> ba
bytearray(b'S*\xb0I\xf3y\x90@')
>>> f = np.frombuffer(ba)
>>> f[0]
1054.4875857854684

# Write to dataframe to save as Excel file.
>>> df = pd.DataFrame({'a': f})
>>> df.to_excel('test.xlsx', engine='xlsxwriter')

# Read excel file (when viewing the file in LibreOffice, the 
# value isn't 1054.4875857854684 any more).
>>> df2 = pd.read_excel('test.xlsx')
>>> df2.loc[0, 'a']
1054.4875857854699
>>> df2.loc[0, 'a'] == f[0]
False

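Why is it not possible to read back the same float64 that was previously written to Excel?

I also tried this with Openpyxl (.xlsx format) and Xlwt (.xls format) as engines. While the former produced the same erroneous result as xlsxwriter, Xlwt worked as expected and wrote the float exactly as the variable's value. Is there perhaps a parameter that I am missing for the .xlsx writer engines?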

# this uses the xlwt engine
>>> df.to_excel('test.xls')
>>> df2 = pd.read_excel('test.xls')
>>> df2.loc[0, 'a'] == f[0]
True

python numpy pandas xlwt xlsxwriter

orange 24 . '17 10:31

Pandas with the XlsxWriter engine converts numpy.float64 to .xlsx in two steps:

1) numpy.float64 => float (lossless), in pandas/io/excel.py

def _conv_value(val):
    # Convert numpy types to Python types for the Excel writers.
    if com.is_integer(val):
        val = int(val)
    elif com.is_float(val):
        val = float(val)
    elif com.is_bool(val):
        val = bool(val)
    elif isinstance(val, Period):
        val = "%s" % val
    elif com.is_list_like(val):
        val = str(val)

    return val

2) float => string, via the '%.15g' format in the final write call (in xlsxwriter/xmlwriter.py)

def _xml_number_element(self, number, attributes=[]):
    # Optimised tag writer for <c> cell number elements in the inner loop.
    attr = ''

    for key, value in attributes:
        value = self._escape_attributes(value)
        attr += ' %s="%s"' % (key, value)

    self.fh.write("""<c%s><v>%.15g</v></c>""" % (attr, number))

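So the second step (2) is where precision is lost. I assume that when writing to xls, the float can be written directly, without being converted to a string.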
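To see the effect of step 2 in isolation, here is a minimal sketch that imitates the `<v>%.15g</v>` write from the snippet above and parses the text back. The helpers `write_cell` and `read_cell` are made up for illustration and are not part of XlsxWriter or Pandas; the number is the one from the question.

```python
import re

def write_cell(number):
    # Hypothetical helper: formats the number the same way as the
    # """<c%s><v>%.15g</v></c>""" line quoted above (no attributes).
    return "<c><v>%.15g</v></c>" % number

def read_cell(xml):
    # Hypothetical helper: extracts the text inside <v>...</v> and parses it.
    return float(re.search(r"<v>(.*?)</v>", xml).group(1))

original = 1054.4875857854684
cell = write_cell(original)
restored = read_cell(cell)

print(cell)                  # <c><v>1054.48758578547</v></c>
print(restored == original)  # False
print("%.17g" % restored)    # 1054.4875857854699
```

The last line matches the value that came back from read_excel in the question, which suggests the 15-digit string is where the difference comes from.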

orange 25 . '17 10:08

jmcnamara · Accepted Answer · 2017-06-25T05:06:34+0000

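"I also tried this with Openpyxl (.xlsx format) and Xlwt (.xls format) as engines. While the former produced the same erroneous result as xlsxwriter, Xlwt worked as expected and wrote the float exactly as the variable's value."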

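The difference is that .xls is a binary file format, so the 64-bit representation of the IEEE 754 double is written exactly to the file and can be read back as the same 64 bits.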
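As a rough illustration of why a binary format can be exact, the sketch below packs the number from the question into its 8-byte IEEE 754 form with the standard `struct` module and unpacks it again. This only demonstrates the principle; it does not reproduce the actual .xls record layout.

```python
import struct

original = 1054.4875857854684

# Pack the double into its 8-byte IEEE 754 form (little-endian). A binary
# file format can store these bytes directly.
raw = struct.pack("<d", original)

# Unpacking the same 8 bytes yields the identical double.
restored = struct.unpack("<d", raw)[0]

print(raw.hex())             # 532ab049f3799040
print(restored == original)  # True
```

The hex string is the same byte sequence the question used to construct the number.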

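The .xlsx file format, however, is a collection of XML text files in a zip container. As such, doubles are written as a string representation (using a format like '%.16g') and read back by converting that string to a double. That is essentially a lossy process, since most IEEE 754 doubles cannot be represented exactly by a string with that few significant digits.

For example, if you take the numpy number from your example and format it with different precisions, you get different representations: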

>>> '%.16g' % f[0]
'1054.487585785468'

>>> '%.17g' % f[0]
'1054.4875857854684'

>>> '%.18g' % f[0]
'1054.48758578546835'

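You can also demonstrate this yourself by pasting 1054.4875857854684 into a cell in Excel, saving the file, and examining the XML inside it.

For a file saved like that, you would get something like this: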

$ unzip numpy.xlsx -d numpy

$ xmllint --format numpy/xl/worksheets/sheet1.xml | grep 1054
        <v>1054.4875857854599</v>

This is more or less what you are seeing when you read the file back in with Pandas.
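The unzip and xmllint inspection can also be done from Python with the standard `zipfile` module. The sketch below is only an approximation: `cell_values` is a hypothetical helper, the sheet path is assumed to be the same one used in the shell commands, and the archive is a stand-in built in memory so the code runs without Excel.

```python
import io
import re
import zipfile

def cell_values(xlsx, sheet="xl/worksheets/sheet1.xml"):
    # Hypothetical helper: return the raw text of every <v> element
    # found in one worksheet part of the archive.
    with zipfile.ZipFile(xlsx) as archive:
        xml = archive.read(sheet).decode("utf-8")
    return re.findall(r"<v>(.*?)</v>", xml)

# Stand-in archive: a zip holding only a sheet part with the <v> line
# shown in the xmllint output. It is not a valid workbook.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "xl/worksheets/sheet1.xml",
        "<worksheet><sheetData><row><c><v>1054.4875857854599</v></c>"
        "</row></sheetData></worksheet>",
    )

stored = cell_values(buffer)
print(stored)                                  # ['1054.4875857854599']
print(float(stored[0]) == 1054.4875857854684)  # False
```

With a real file, passing its path (for example 'numpy.xlsx') to `cell_values` should return the stored strings in the same way.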
