This seems to be related to how the standard deviation is calculated.
```python
>>> import numpy as np
>>> a = np.array([[1, 2], [3, 1]])
>>> np.std(a, axis=0)
array([ 1. ,  0.5])
>>> np.std(a, axis=0, ddof=1)
array([ 1.41421356,  0.70710678])
```
From the numpy.std documentation:

> **ddof** : int, optional
>
> Means Delta Degrees Of Freedom. The divisor used in calculations is N - ddof, where N is the number of elements. By default, ddof is zero.
Apparently R's `scale()` uses `ddof=1`, while `sklearn.preprocessing.StandardScaler()` uses `ddof=0`, which explains the differing results.
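You can verify this directly: the per-feature standard deviation that `StandardScaler` stores in its `scale_` attribute matches the population std (`ddof=0`), not the sample std. A small sketch using the same array as above:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

a = np.array([[1, 2], [3, 1]], dtype=float)

sc = StandardScaler()
sc.fit(a)

# scale_ holds the per-column std computed with ddof=0
print(sc.scale_)           # matches np.std(a, axis=0)
print(np.std(a, axis=0))
```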
EDIT: (to explain how to use an alternate ddof)
There seems to be no easy way to compute the std with an alternative ddof without accessing the attributes of the `StandardScaler()` object itself.
```python
sc = StandardScaler()
sc.fit(data)
```
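From here, one workaround is to overwrite the fitted `scale_` attribute with a sample std (`ddof=1`) before calling `transform`, since `transform` divides by `scale_` after subtracting `mean_`. This is a sketch, not an officially supported API pattern, and it assumes the scaler was fitted with `with_std=True` (the default):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([[1, 2], [3, 1]], dtype=float)

sc = StandardScaler()
sc.fit(data)

# Replace the stored population std (ddof=0) with the sample std (ddof=1)
sc.scale_ = np.std(data, axis=0, ddof=1)

scaled = sc.transform(data)
# Equivalent to: (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
print(scaled)
```

If you don't need the `StandardScaler` object at all, the plain NumPy expression in the comment above does the same thing in one line.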