Is there an implementation of fast parallel syntactic sugar in Scala, e.g. map-reduce?

Messaging with actors is great. But I would like to have even simpler code.

Examples (pseudo code)

    val splicedList: List[List[Int]] = biglist.spliceIntoParts(100)
    val sum: Int = ActorPool.numberOfActors(5).getAllResults(splicedList, foldLeft(_ + _))

where spliceIntoParts turns one large list into 100 smaller part lists; the pool uses 5 actors, each receiving a new task when it finishes its previous one; and getAllResults applies the given method across the list. All of this is done with messaging in the background. Where possible, a getFirstResult would compute only the first result and stop all the other threads (for example, when cracking a password).
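The ActorPool API above is hypothetical, but the getFirstResult behavior (first answer wins, everything else is cancelled) can be sketched with plain java.util.concurrent, which Scala can use directly. The object name, firstResult, and the threads parameter below are made-up names for illustration:

```scala
import java.util.concurrent.{Callable, Executors}

object FirstResult {
  // Sketch of the hypothetical getFirstResult: run all jobs on a fixed
  // thread pool, return the first one that completes successfully,
  // and cancel the rest.
  def firstResult[A](threads: Int)(jobs: List[() => A]): A = {
    val pool = Executors.newFixedThreadPool(threads)
    try {
      val tasks = new java.util.ArrayList[Callable[A]]
      jobs.foreach(j => tasks.add(new Callable[A] { def call(): A = j() }))
      pool.invokeAny(tasks) // blocks until one task succeeds, cancels the others
    } finally pool.shutdownNow()
  }
}
```

For a password-cracking style search, each job would scan its own slice of the keyspace and return the match it finds.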

+4
source share
5 answers

You can do this with less overhead than spawning actors by using futures:

    import scala.actors.Futures._

    val nums = (1 to 1000).grouped(100).toList
    val parts = nums.map(n => future { n.reduceLeft(_ + _) })
    val whole = (0 /: parts)(_ + _())

You still have to handle decomposing the problem, wrapping the pieces in "future" blocks, and recomposing them into the final answer, but this does make it easy to execute a bunch of small blocks of code in parallel.

(Note that the _() in the fold applies the future, which means: "give me the answer you were computing in parallel!", and it blocks until the answer is available.)

A parallel collections library would automatically decompose the problem and recompose the answer for you (as pmap does in Clojure); it is not yet part of the core API.

+2
source

With Scala Parallel Collections, which will be included in 2.8.1, you can do things like this:

    val spliced = myList.par // obtain a parallel version of your collection
                             // (all operations on it run in parallel)
    spliced.map(process _)   // maps each element to a new element using `process`
    spliced.find(check _)    // searches the collection until it finds an element
                             // for which `check` returns true, at which point the
                             // search stops and the element is returned

and the code will automatically execute in parallel. Other methods found in the regular collection library are also parallelized.

Currently, 2.8.RC2 is very close (due this week or next), and the 2.8 final should follow a few weeks after that, I think. You can try parallel collections by using a nightly build of 2.8.1.
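For reference, a minimal sketch of what this looks like once parallel collections are available. Note the version caveat: on Scala 2.9-2.12, .par ships with the standard library; on 2.13+ it moved to the separate scala-parallel-collections module, which also needs an extra import:

```scala
// Minimal parallel-collections sketch; assumes .par is available
// (standard library in Scala 2.9-2.12, separate module on 2.13+).
val spliced = (1 to 1000).toVector.par // parallel view of the collection
val doubled = spliced.map(_ * 2)       // chunks are mapped on a thread pool
val total   = doubled.sum              // parallel reduction of the results
```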

+4
source

You can use Scalaz's concurrency features to achieve what you want.

    import scalaz._
    import Scalaz._
    import concurrent.strategy.Executor
    import java.util.concurrent.Executors

    implicit val s = Executor.strategy[Unit](Executors.newFixedThreadPool(5))

    val splicedList = biglist.grouped(100).toList
    val sum = splicedList.parMap(_.sum).map(_.sum).get

It would be pretty easy to make this prettier (i.e. write a mapReduce function that does the splitting and folding all in one). Also, parMap over a List is unnecessarily strict: you will want to start folding before the entire list is ready. More like:

    val splicedList = biglist.grouped(100).toStream
    val sum = splicedList.traverse(n => promise(n.sum)).map(_.sum).get
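As the answer suggests, the splitting and folding can be packaged into one helper. Here is a minimal sketch using only java.util.concurrent (no Scalaz); the object and the mapReduce, chunkSize and threads names are made-up for illustration:

```scala
import java.util.concurrent.{Callable, Executors}

object MapReduce {
  // Splits xs into chunks, maps each chunk in parallel on a fixed
  // thread pool, then reduces the per-chunk results sequentially.
  def mapReduce[A, B](xs: List[A], chunkSize: Int, threads: Int)
                     (mapChunk: List[A] => B)(reduce: (B, B) => B): B = {
    require(xs.nonEmpty, "need at least one element to reduce")
    val pool = Executors.newFixedThreadPool(threads)
    try {
      val tasks = new java.util.ArrayList[Callable[B]]
      xs.grouped(chunkSize).foreach { g =>
        tasks.add(new Callable[B] { def call(): B = mapChunk(g) })
      }
      val results = pool.invokeAll(tasks) // blocks until all chunks are done
      var acc = results.get(0).get
      var i = 1
      while (i < results.size) { acc = reduce(acc, results.get(i).get); i += 1 }
      acc
    } finally pool.shutdown()
  }
}
```

Usage, mirroring the question's example: MapReduce.mapReduce((1 to 1000).toList, 100, 5)(_.sum)(_ + _) sums the list 100 elements at a time across 5 threads.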
+3
source

Rather than waiting for Scala 2.8.1 or 2.9, I would rather write my own library or use another one, so I searched some more and found Akka: http://doc.akkasource.org/actors

which has a Futures object with the methods

    awaitAll(futures: List[Future]): Unit
    awaitOne(futures: List[Future]): Future

but http://scalablesolutions.se/akka/api/akka-core-0.8.1/ has no documentation at all. That is bad.

But the good part is that Akka's actors are more compact than Scala's native ones. With all these libraries (including Scalaz) around, it would be really cool if Scala itself could finally unify them officially.

+2
source

At Scala Days 2010, there was a very interesting talk by Aleksandar Prokopec (who works on Scala at EPFL) about Parallel Collections. They will probably be in 2.8.1, but you may have to wait a little longer. I'll see if I can find the presentation to link here for reference.

The idea is to have a collections framework that parallelizes the processing of collections, doing what you propose, but transparently to the user. All you theoretically need to do is change an import from scala.collections to scala.parallel.collections. You obviously still have to verify that what you are doing can actually be parallelized.

+1
source

Source: https://habr.com/ru/post/1308229/

