Spark Streaming Bug - Windowed DStream window not working

Windowed Dstream does not work in Spark Streaming. It seems to be a scheduler bug inside Spark Streaming.

val layer0= // Input data val layer1 = layer0.window(Seconds(30), Seconds(30)) // Works layer1.foreachRDD(...) val layer2 = layer1.window(Seconds(60), Seconds(60)) // Does not work layer2.foreachRDD(...) 

Has anyone encountered this problem and found out how to fix it in Spark.

Add more details from the driver log:

Time 1433141250000:

2015-06-01 06:47:30 INFO MapValuedDStream - the time 1433141250000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 30000 ms, and the difference is 10000 ms

2015-06-01 06:47:30 INFO MapValuedDStream - the time 1433141250000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 60,000 ms, and the difference is 10000 ms

Time 1433141260000:

2015-06-01 06:47:40 INFO MapValuedDStream - time 1433141260000 ms is not valid, because zeroTime is 1433141240000 ms and slideDuration is 30,000 ms, and the difference is 20,000 ms

2015-06-01 06:47:40 INFO MapValuedDStream - the time 1433141260000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 60,000 ms, and the difference is 20,000 ms

Time 1433141270000: (30S)

2015-06-01 06:47:50 INFO FilteredDStream - slicing from 1433141250000 ms to 1433141270000 ms (aligned on 1433141250000 ms and 1433141270000 ms)

2015-06-01 06:47:50 INFO MapValuedDStream - the time 1433141270000 ms is invalid, because zeroTime is 1433141240000 ms, and slideDuration is 60,000 ms, and the difference is 30,000 ms

Time 1433141280000: 2015-06-01 06:48:00 INFO MapValuedDStream - time 1433141280000 ms is invalid, because zeroTime is 1433141240000 ms and slideDuration is 30,000 ms, and the difference is 40,000 ms.

2015-06-01 06:48:00 INFO MapValuedDStream - the time 1433141280000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 60,000 ms, and the difference is 40,000 ms

Time 1433141290000:

2015-06-01 06:48:10 INFO MapValuedDStream - the time 1433141290000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 30,000 ms, and the difference is 50,000 ms

2015-06-01 06:48:10 INFO MapValuedDStream - the time 1433141290000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 60,000 ms, and the difference is 50,000 ms

Time 1433141300000: (60S)

2015-06-01 06:48:20 INFO WindowedDStream - slicing from 1433141270000 ms to 1433141300000 ms (agreed with 1433141250000 ms and 1433141280000 ms)

2015-06-01 06:48:20 INFO WindowedDStream - the time 1433141250000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 30,000 ms, and the difference is 10,000 ms

2015-06-01 06:48:20 INFO WindowedDStream - time 1433141280000 ms is not valid, because zeroTime is 1433141240000 ms, and slideDuration is 30,000 ms, and the difference is 40,000 ms

+6
source share
1 answer

This is really a mistake, and I registered it as SPARK-7326 . I also fixed it myself. See my retrieval request that was merged with the wizard. I believe the fix will be in version 1.4.0.

+2
source

Source: https://habr.com/ru/post/988437/


All Articles