I have a use case where a Spark Streaming job receives its input from a Kafka topic. I also have about 1 million rows of reference data, which are updated every hour. I currently load the reference data on the driver and broadcast it to the workers. I would like to update this broadcast variable on the driver every hour and push the new version out to the workers.
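To make the setup concrete, this is roughly what the job looks like today (the topic name, broker address, and the shape of the reference data are simplified placeholders, and the loader is stubbed out):

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object StreamingJob {

  // Stub: in reality this loads ~1 million rows from our reference source.
  def loadReferenceData(): Map[String, String] =
    Map("some-key" -> "some-value")

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("streaming-job").setMaster("local[*]")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Loaded once on the driver and broadcast to the workers.
    val referenceData: Broadcast[Map[String, String]] =
      ssc.sparkContext.broadcast(loadReferenceData())

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker:9092",              // placeholder
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "streaming-job"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams)
    )

    // Workers enrich each record by looking it up in the broadcast reference data.
    // The broadcast is fixed for the lifetime of the job, which is the problem.
    stream
      .map(record => (record.value, referenceData.value.get(record.value)))
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```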
What would be the best way to do this inside Spark itself, without introducing an external store such as HBase, Redis, or Cassandra?
And how reliable is this?
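For reference, the kind of approach I was imagining is sketched below: keep the Broadcast handle in a driver-side variable, re-broadcast a fresh snapshot on an hourly schedule while unpersisting the old one, and read the handle inside transform() so each micro-batch picks up the latest version. The object and method names here are my own illustration, not an existing Spark API.

```scala
import java.util.concurrent.{Executors, TimeUnit}

import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.spark.SparkContext
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.streaming.dstream.DStream

// Illustrative wrapper (my own naming), not an existing Spark API.
object RefreshableReferenceData {

  // Driver-side handle to the current broadcast; replaced atomically on refresh.
  @volatile private var current: Broadcast[Map[String, String]] = _

  def start(sc: SparkContext, load: () => Map[String, String]): Unit = {
    current = sc.broadcast(load())

    // Driver-side timer that re-broadcasts a fresh snapshot every hour.
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    scheduler.scheduleAtFixedRate(new Runnable {
      override def run(): Unit = {
        val old = current
        current = sc.broadcast(load())   // ship the new hourly snapshot
        old.unpersist(blocking = false)  // let executors drop the stale copy
      }
    }, 1, 1, TimeUnit.HOURS)
  }

  // transform() is evaluated on the driver for every micro-batch, so the closure
  // re-captures whichever Broadcast handle is current at that moment.
  def enrich(stream: DStream[ConsumerRecord[String, String]]): DStream[(String, Option[String])] =
    stream.transform { rdd =>
      val snapshot = current
      rdd.map(record => (record.value, snapshot.value.get(record.value)))
    }
}
```

In the job above I would then call RefreshableReferenceData.start(ssc.sparkContext, loadReferenceData) once on the driver and replace the fixed stream.map(...) with RefreshableReferenceData.enrich(stream).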
Let me know if additional information is needed. Thank you in advance. =)