At its core, this is really a NumPy problem, not a pandas problem.
map iterates over the values in the column and passes them to the lambda function one at a time. Underneath, the columns and rows in pandas are just (slices of) NumPy arrays, so pandas defines the following helper function to fetch a value from the underlying array. map calls it on each iteration:
    PANDAS_INLINE PyObject* get_value_1d(PyArrayObject* ap, Py_ssize_t i) {
        char *item = (char *) PyArray_DATA(ap) + i * PyArray_STRIDE(ap, 0);
        return PyArray_Scalar(item, PyArray_DESCR(ap), (PyObject*) ap);
    }
The key bit is PyArray_Scalar, a NumPy API function that copies a section of a NumPy array's data in order to return a scalar value.
The code that makes up this function is too long to post here, but it can be found in the NumPy codebase. All we need to know is that the scalar it returns will match the dtype of the array it was read from.
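As a rough illustration of that rule, here is a minimal sketch using plain NumPy indexing. This goes through Python-level indexing rather than the C helper above, and the array names and contents are made up for the example, but the element types follow the same dtype rule:

```python
import numpy as np

# Hypothetical arrays: one with object dtype, one with float64 dtype.
obj_arr = np.array(["abc", np.nan], dtype=object)
flt_arr = np.array([1.5, np.nan], dtype=np.float64)

# Pull the NaN element out of each array.
obj_item = obj_arr[1]
flt_item = flt_arr[1]

# An object array hands back the stored Python object unchanged,
# while a float64 array hands back a NumPy scalar.
print(type(obj_item).__name__)  # float
print(type(flt_item).__name__)  # float64
```

Note that numpy.float64 is a subclass of Python's float, so isinstance checks will not tell the two apart; comparing type(...) directly does.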
Back to your series: s0 has an object dtype, and s1 has a float64 dtype. This means that PyArray_Scalar returns a different type of scalar for each series: a plain Python float and a NumPy scalar (numpy.float64), respectively:
    >>> type(s0[2])
    float
    >>> type(s1[0])
    numpy.float64
The NaN values are returned as two different types, so you get different errors when the lambda function tries to index them.
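The difference can be reproduced without pandas at all. The sketch below assumes the lambda in question does something like x[0]; the helper names first_char and error_name are made up for the example:

```python
import numpy as np

def first_char(x):
    # Stand-in for the lambda passed to map (assumed to index its argument).
    return x[0]

def error_name(value):
    # Return the name of the exception raised by indexing, or None.
    try:
        first_char(value)
    except (TypeError, IndexError) as exc:
        return type(exc).__name__
    return None

py_nan = float("nan")        # what an object-dtype series hands to the lambda
np_nan = np.float64(np.nan)  # what a float64 series hands to the lambda

print(error_name(py_nan))  # TypeError
print(error_name(np_nan))  # IndexError
```

A Python float has no indexing support at all, hence the TypeError, whereas a NumPy scalar does accept the indexing operation but rejects an integer index, hence the IndexError.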