How to improve code performance with Sink?

Question


I have noticed something strange about scalaz-stream sinks: they run slowly. Does anyone know why? And is there a way to improve performance?

Here are the relevant parts of my code. Version without a sink:

// p is a parameter of type Process[Task, Pixel]
def printToImage(img: BufferedImage)(pixel: Pixel): Unit = {
  img.setRGB(pixel.x, pixel.y, 1, 1, Array(pixel.rgb), 0, 0)
}

val image = getBlankImage(2000, 4000)
val result = p.runLog.run
result.foreach(printToImage(image))

This takes about 7 seconds to run.

Version with a sink:

// p is the same as before
def printToImage(img: BufferedImage)(pixel: Pixel): Unit = {
  img.setRGB(pixel.x, pixel.y, 1, 1, Array(pixel.rgb), 0, 0)
}

// I found this way of building a sink in a tutorial
def getImageSink(img: BufferedImage): Sink[Task, Pixel] = {
  // I have tried both Task.delay and Task.now here, with the same results
  def printToImageTask(img: BufferedImage)(pixel: Pixel): Task[Unit] = Task.delay {
    printToImage(img)(pixel)
  }
  Process.constant(printToImageTask(img))
}

val image = getBlankImage(2000, 4000)
val result = p.to(getImageSink(image)).run.run

This takes 33 seconds to complete. I am completely confused by such a significant difference.


scalaz-stream

user2963977 Oct 14 '14 at 23:56


1 answer

Eugene zhulenev · Accepted Answer · 2014-10-15T02:56:41+0000

In the second case, you create a Task for every pixel, and instead of calling printToImage directly you execute it through that Task, which adds many more steps to the call chain.

We use scalaz-stream a lot, but I firmly believe it is overkill for this kind of problem. Code running inside a Process / Channel / Sink should be doing much more complicated work than just assigning / updating a variable.
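If the sink has to stay, one way to give each step more work is to group pixels before they reach it, so that a single Task covers a whole batch instead of one pixel. The following is only a sketch under assumptions: the `Pixel` case class, the names `batchedImageSink` and `drawBatched`, and the batch size of 4096 are made up for illustration, and it assumes a scalaz-stream 0.x release where `chunk` is available on `Process`. It uses the three-argument `setRGB`, which sets a single pixel. I have not measured how much time this saves.

```scala
import java.awt.image.BufferedImage
import scalaz.concurrent.Task
import scalaz.stream.{Process, Sink}

// Hypothetical Pixel type; the question only shows that it has x, y and rgb.
final case class Pixel(x: Int, y: Int, rgb: Int)

// A sink that receives a whole batch, so one Task covers many pixels.
def batchedImageSink(img: BufferedImage): Sink[Task, Vector[Pixel]] =
  Process.constant { (batch: Vector[Pixel]) =>
    Task.delay {
      batch.foreach(px => img.setRGB(px.x, px.y, px.rgb))
    }
  }

// chunk(n) groups the stream into Vectors of up to n elements.
// The batch size 4096 is an arbitrary guess, not a tuned value.
def drawBatched(p: Process[Task, Pixel], img: BufferedImage): Unit =
  p.chunk(4096).to(batchedImageSink(img)).run.run
```

With the names from the question, the call would be `drawBatched(p, image)`.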

We use Sinks to write data from a stream to a database (Cassandra), and we batch the writes because writing single rows has too much overhead. Process / Sink is a super convenient abstraction, but for higher-level workflows. When it is easy to write a for-loop, I would suggest writing a for-loop.
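A minimal sketch of the for-loop approach, using only the JDK and the Scala standard library. The `Pixel` case class and the name `printAll` are assumptions made for illustration; the question only shows that a pixel has `x`, `y` and `rgb`. It uses the three-argument `setRGB`, which sets a single pixel.

```scala
import java.awt.image.BufferedImage

// Hypothetical Pixel type; the question only shows that it has x, y and rgb.
final case class Pixel(x: Int, y: Int, rgb: Int)

// Write every pixel straight into the image: no Task or Process step per pixel.
def printAll(img: BufferedImage, pixels: Seq[Pixel]): Unit =
  for (pixel <- pixels) img.setRGB(pixel.x, pixel.y, pixel.rgb)
```

With the names from the question, the stream would still produce the pixels and only the writing would move into the loop, roughly `printAll(image, p.runLog.run)`.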
