Milliseconds puzzles when calling strptime in R

> options(digits.secs = 3)
> strptime("2007-03-30 15:00:00.007", format = "%Y-%m-%d %H:%M:%OS")
[1] "2007-03-30 15:00:00.007"
> strptime("2007-03-30 15:00:00.008", format = "%Y-%m-%d %H:%M:%OS")
[1] "2007-03-30 15:00:00.008"
> strptime("2007-03-30 15:00:00.009", format = "%Y-%m-%d %H:%M:%OS")
[1] "2007-03-30 15:00:00.008"
> strptime("2007-03-30 15:00:00.010", format = "%Y-%m-%d %H:%M:%OS")
[1] "2007-03-30 15:00:00.01"
> strptime("2007-03-30 15:00:00.011", format = "%Y-%m-%d %H:%M:%OS")
[1] "2007-03-30 15:00:00.010"
> strptime("2007-03-30 15:00:00.999", format = "%Y-%m-%d %H:%M:%OS")
[1] "2007-03-30 15:00:00.998"

I am confused about why the displayed milliseconds are off by one for "009", and then again for "011".

2 answers

This is related to R FAQ 7.31, although it appears in an unusual guise.

The behavior you see results from a combination of: (a) the inexact representation of (most) decimal values by binary computers; and (b) the documented behavior of strftime and strptime, which truncate rather than round the fractional part of seconds to the specified number of decimal places.

From the help file ?strptime (the key word being "truncated"):

Specific to R is '%OSn', which for output gives the seconds truncated to '0 <= n <= 6' decimal places (and if '%OS' is not followed by a digit, it uses the setting of 'getOption("digits.secs")', or if that is unset, 'n = 3').

An example will probably illustrate what is happening more effectively than further explanation:

 strftime('2011-10-11 07:49:36.3', format="%Y-%m-%d %H:%M:%OS6")
 [1] "2011-10-11 07:49:36.299999"
 strptime('2012-01-16 12:00:00.3', format="%Y-%m-%d %H:%M:%OS1")
 [1] "2012-01-16 12:00:00.2"

In the above example, the fractional ".3" is best approximated by a binary number that is slightly less than 0.3, something like 0.29999999999999999. Because strptime and strftime truncate rather than round to the specified decimal place, 0.3 is converted to 0.2 when the number of decimal places is set to 1. The same logic holds for your example times, about half of which show this behavior, as would be expected on average.
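The difference between truncating and rounding a stored value can be reproduced with plain arithmetic, independently of the date-time functions. The following is only a minimal sketch: `truncate_decimals` is a hypothetical helper written for this illustration (it is not part of base R, and it is not how strftime is implemented); it simply cuts the decimal expansion of the stored double for a single non-negative number.

```r
# The literal 0.3 is stored as the nearest double, which lies just below 0.3.
expansion <- sprintf("%.20f", 0.3)
print(expansion)

# Hypothetical helper (not part of base R): keep the first n decimal digits
# of the stored value without rounding, by cutting its decimal expansion.
truncate_decimals <- function(x, n) {
  digits <- sprintf("%.20f", x)
  point <- regexpr(".", digits, fixed = TRUE)[1]  # position of the decimal point
  substr(digits, 1, point + n)
}

truncated <- truncate_decimals(0.3, 1)  # cuts the expansion after one decimal
rounded <- sprintf("%.1f", 0.3)         # sprintf() rounds to one decimal
print(c(truncated = truncated, rounded = rounded))
```

Here the truncated string is "0.2" while the rounded one is "0.3", which mirrors the ".3" versus ".2" discrepancy in the example above.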


I know this has already been "answered", but these problems still exist in 32-bit R, and the implementation is inconsistent between the 32-bit and 64-bit versions. The truncation explanation is partially true, but in this particular case the truncation is not the result of the strptime function; it comes from the print.POSIXlt method.
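One way to check where the discrepancy arises is to look at the seconds field of the parsed object directly, bypassing the print method. This is a small sketch; the variable names and the `tz = "UTC"` argument are choices made for this illustration (the time zone is fixed only to keep the result independent of the local setting).

```r
# Parse the problematic timestamp from the question.
x <- strptime("2007-03-30 15:00:00.009",
              format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# The POSIXlt object stores seconds, including the fraction, as a double.
stored_sec <- x$sec
print(sprintf("%.20f", stored_sec))

# The stored value agrees with 0.009 up to floating-point error,
# so the parsing step itself has not lost the ninth millisecond.
close_to_9ms <- abs(stored_sec - 0.009) < 1e-9
print(close_to_9ms)
```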

This can be demonstrated by overriding the print method with a function that produces the expected behavior. For instance:

 print.POSIXlt = function(posix) {
   # Build the timestamp by hand; sprintf() rounds the seconds to 3 decimals
   print(paste0(
     posix$year + 1900, "-",
     sprintf("%02d", posix$mon + 1), "-",
     sprintf("%02d", posix$mday), " ",
     sprintf("%02d", posix$hour), ":",
     sprintf("%02d", posix$min), ":",
     sprintf("%002.003f", posix$sec)
   ))
 }

Now the time is displayed as expected:

 > strptime("2007-03-30 15:00:00.009", format = "%Y-%m-%d %H:%M:%OS")
 [1] "2007-03-30 15:00:0.009"
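A less invasive alternative to replacing print.POSIXlt globally is a separate formatting helper that rounds the fractional seconds itself. The sketch below is an assumption-laden illustration, not an established API: `format_ms` is a hypothetical name, and the helper assumes inputs with at most millisecond precision (a fraction of .9995 or more would need carry handling that is omitted here).

```r
# Hypothetical helper: format a date-time with milliseconds rounded,
# rather than truncated, to three decimal places.
format_ms <- function(x) {
  lt <- as.POSIXlt(x)
  # Round the fractional part of the seconds to whole milliseconds.
  ms <- as.integer(round((lt$sec %% 1) * 1000))
  # %S prints only the whole seconds; the milliseconds are appended by hand.
  sprintf("%s.%03d", format(lt, "%Y-%m-%d %H:%M:%S"), ms)
}

x <- strptime("2007-03-30 15:00:00.009",
              format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")
formatted <- format_ms(x)
print(formatted)
```

Because the helper leaves the print method untouched, other code that relies on the default POSIXlt output is not affected.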

For more information, I examined this in greater detail in a separate question on R's rounding of milliseconds.


Source: https://habr.com/ru/post/906171/

