Rounding error in variable information from RxSpssData

I have found what I believe is a bug in the way Microsoft R handles metadata from SPSS .sav files.

Here is a summary of the Variable View:

ColumnA: 1 - Yes, 2 - No
ColumnB: 0.33 - Yes, 0.5 - Maybe, 0.66 - No, 0.99 - Why not, 1.00 - Yes, for sure.
ColumnC: A - Yes, B - No

My code is:

library(RevoScaleR)

df <- RxSpssData(
  "RoundingTest.sav", 
  stringsAsFactors = FALSE, 
  labelsAsInfo = TRUE, 
  labelsAsLevels = TRUE,
  mapMissingCodes = "none" 
)

test <- rxImport(df)

The data itself is read in correctly:

  ColumnA ColumnB ColumnC Var0001
1     Yes    0.33     Yes      NA
2      No    0.50     Yes      NA
3     Yes    0.66      No      NA

However, the InfoCodes values are wrong:

attr(test$ColumnA, ".rxValueInfoCodes") # NULL
attr(test$ColumnB, ".rxValueInfoCodes") # "0" "0" "0" "0" "1"
attr(test$ColumnC, ".rxValueInfoCodes") # NULL

It seems that some kind of floor function is applied to the value codes of numeric columns before they are converted to character strings.
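To illustrate the suspicion, here is a minimal base-R sketch. It is not RevoScaleR's actual code, which I have not seen; it only shows that flooring the ColumnB value codes from the Variable View reproduces the strings returned above. The variable names are mine, and truncation or integer conversion would give the same result for these values.

```r
# Value codes defined for ColumnB in the SPSS Variable View
codes <- c(0.33, 0.5, 0.66, 0.99, 1.00)

# Hypothesis: the codes are floored before being converted to strings
floored <- as.character(floor(codes))
print(floored)

# What I would expect if the codes were converted without flooring
expected <- as.character(codes)
print(expected)
```

The first vector matches the `.rxValueInfoCodes` attribute I get for ColumnB, while the second is what the metadata should contain.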

I tried options(scipen = 12) and rxOptions(numDigits = 12) with no success. Using rxDataStep instead of rxImport does not help either. I believe the error is somewhere in the RxSpssData() function.

  • Has anyone experienced this with RxSpssData or any other type of file?
  • Is there a workaround?
  • Is there an official way to report this to Microsoft if this is a genuine error?

Thanks!

Session info:

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

EDIT: The SAV file is available on GitHub.


Source: https://habr.com/ru/post/1016636/

