Why do Scala collection methods return the collection's own type rather than Iterable (as in .NET)?

In Scala you can do

    val l = List(1, 2, 3)
    l.filter(_ > 2)      // returns a List[Int]

    val s = Set("hello", "world")
    s.map(_.length)      // returns a Set[Int]

The question is why is this useful?

Scala's collection library is just about the only one in existence that does this. The Scala community seems to agree that this behavior is necessary, yet nobody seems to miss it in other languages. C# example (names changed to match Scala):

    var l = new List<int> { 1, 2, 3 }
    l.filter(i => i > 2)          // always returns Iterable[Int]
    l.filter(i => i > 2).toList   // if I want a List, no problem
    l.filter(i => i > 2).toSet    // or if I want a Set

In .NET I always get back an Iterable, and it is up to me what I do with it afterwards. (This also keeps the .NET collection library very simple.)

The Scala Set example forces me to build a set of lengths from a set of strings. But what if I only want to iterate over the lengths, or build a list of them, or keep an Iterable around to filter later? Building a set right away seems pointless. (EDIT: collection.view provides the simpler .NET-style behavior, good.)
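A minimal sketch of that view approach, assuming the same s: Set[String] as above: map on a view is evaluated lazily, and the caller decides what concrete collection to build, much like the .NET pattern:

    val s = Set("hello", "world")

    // A lazy wrapper; nothing is computed yet
    val lengths = s.view.map(_.length)

    // The caller chooses what to build, if anything
    lengths.foreach(println)     // just iterate
    val asList = lengths.toList  // build a List[Int]
    val asSet  = lengths.toSet   // or a Set[Int]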

I am sure you will show me examples where the .NET approach is plainly wrong or kills performance, but I just do not see it (and I have used .NET for many years).

+6
4 answers

Not a complete answer to your question, but Scala never forces you to use one type of collection over another. You can write code like this:

    import collection._
    import immutable._

    val s = Set("hello", "world")
    val l: Vector[Int] = s.map(_.length)(breakOut)

Read more about breakOut in Daniel's detailed answer to another question.

If you want your map or filter to be evaluated lazily, use this:

 s.view.map(_.length) 

All of this makes it easy to integrate your own collection classes and inherit all the powerful capabilities of the standard collections without duplicating code; all of it guarantees that YourSpecialCollection#filter returns an instance of YourSpecialCollection; that YourSpecialCollection#map returns an instance of YourSpecialCollection if it can hold the mapped element type, or an appropriate built-in fallback collection if it cannot (for example, what happens when you call map on a BitSet). Of course, a C# iterator has no .toMySpecialCollection method.
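A small REPL-style sketch of that BitSet fallback, assuming Scala 2.x behavior (the exact fallback type is whatever the standard library chooses, not something specific to this answer):

    import scala.collection.immutable.BitSet

    val bits = BitSet(1, 2, 3)

    bits.map(_ + 1)      // still a BitSet: BitSet(2, 3, 4)
    bits.map(_.toFloat)  // a BitSet cannot hold Floats, so this
                         // falls back to a SortedSet[Float]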

See also: "Integrating new sets and maps" in The Architecture of Scala Collections.
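Following that guide, here is a minimal sketch (Scala 2.x, with hypothetical class names) of how a custom collection picks up this behavior by mixing in the right *Like trait; filter, take and friends then return MySet rather than a generic Set:

    import scala.collection.SetLike

    // A hypothetical wrapper around an immutable Set
    class MySet[A](elems: Set[A]) extends Set[A] with SetLike[A, MySet[A]] {
      override def empty: MySet[A] = new MySet(Set.empty)
      def contains(a: A): Boolean  = elems.contains(a)
      def +(a: A): MySet[A]        = new MySet(elems + a)
      def -(a: A): MySet[A]        = new MySet(elems - a)
      def iterator: Iterator[A]    = elems.iterator
    }

    val m = new MySet(Set(1, 2, 3))
    m.filter(_ > 1)   // static and runtime type: MySet[Int]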

+10

Scala follows the uniform return type principle: you always get back the most specific appropriate collection type, instead of losing that information as in C#.

C#'s reason is that its type system is not expressive enough to provide this guarantee without re-implementing every method in every single subclass. Scala solves the problem with higher-kinded types.
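A minimal sketch of what higher-kinded types buy you (this is not the actual library machinery, which in Scala 2.x goes through CanBuildFrom; Mappable and BoxList are hypothetical names): one trait can promise "map returns the same kind of collection" for all implementations at once.

    // C[_] is a type constructor: List, Vector, Set, ...
    trait Mappable[A, C[_]] {
      def map[B](f: A => B): C[B]
    }

    // One implementation per collection, but the signature is declared once
    class BoxList[A](val items: List[A]) extends Mappable[A, BoxList] {
      def map[B](f: A => B): BoxList[B] = new BoxList(items.map(f))
    }

    val doubled: BoxList[Int] = new BoxList(List(1, 2, 3)).map(_ * 2)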

Why does Scala use a uniform collection framework? Because it is harder than many people think, especially when things like Strings and Arrays, which are not "real" collections, have to be integrated as well:

    // This stays a String:
    scala> "Foobar".map(identity)
    res27: String = Foobar

    // But this falls back to the "nearest" appropriate type:
    scala> "Foobar".map(_.toInt)
    res29: scala.collection.immutable.IndexedSeq[Int] = Vector(70, 111, 111, 98, 97, 114)
+9

If you have a Set and an operation on it returns an Iterable whose runtime type is still a Set, you lose important information about its behavior, and you lose access to the set-specific methods.
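For instance (a small sketch; method availability is as in the standard library), once the static type is widened to Iterable, set-specific operations are gone even though the underlying object is still a Set:

    val s: Set[Int] = Set(1, 2, 3)
    val i: Iterable[Int] = s          // static type widened

    s.contains(2)                     // fine: Set has contains
    s.subsetOf(Set(1, 2, 3, 4))       // fine: set-specific method
    // i.contains(2)                  // does not compile: Iterable has no contains
    // i.subsetOf(Set(1, 2, 3, 4))    // does not compile either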

By the way: there are other languages, such as Haskell, that strongly influenced Scala. The Haskell version of map would look like this in Scala (without the implicit magic):

    // the functor type class
    trait Functor[C[_]] {
      def fmap[A, B](f: A => B, coll: C[A]): C[B]
    }

    // an instance
    object ListFunctor extends Functor[List] {
      def fmap[A, B](f: A => B, list: List[A]): List[B] = list.map(f)
    }

    // usage
    val list = ListFunctor.fmap((x: Int) => x * x, List(1, 2, 3))

And I think the Haskell community appreciates this feature as well :-)

+7

This is a matter of consistency. Things are what they are, and operations on them return things like them. You can depend on it.

The distinction you are really drawing here is strictness. A strict method is evaluated immediately; a non-strict method is evaluated only as needed. That has consequences. Take this simple example:

    def print5(it: Iterable[Int]) = {
      var flag = true
      it.filter(_ => flag).foreach { i =>
        flag = i < 5
        println(i)
      }
    }

Test it with these two collections:

    print5(List.range(1, 10))
    print5(Stream.range(1, 10))

List is strict, so its methods are strict. Stream, on the other hand, is non-strict, so its methods are non-strict.
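Concretely, a sketch of what I would expect to see, assuming Scala 2.x semantics for List.filter and Stream.filter: the strict List filters everything before the first println runs, while the lazy Stream re-checks the flag as each element is pulled.

    print5(List.range(1, 10))    // filter runs to completion while flag is still true,
                                 // so this prints 1 through 9

    print5(Stream.range(1, 10))  // filtering is interleaved with foreach,
                                 // so this prints 1 through 5 and then stops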

So this has nothing to do with Iterable at all, because both List and Stream are Iterable. Changing the return type of a collection method could cause all sorts of problems; at the very least, it would make it harder to keep a data structure's behavior consistent.

On the other hand, there are advantages to delaying certain operations, even with a strict collection. Here are some ways to do that:

    // Get an iterator explicitly, if it is going to be used only once
    def print5I(it: Iterable[Int]) = {
      var flag = true
      it.iterator.filter(_ => flag).foreach { i =>
        flag = i < 5
        println(i)
      }
    }

    // Get a Stream explicitly, if the result will be reused
    def print5S(it: Iterable[Int]) = {
      var flag = true
      it.toStream.filter(_ => flag).foreach { i =>
        flag = i < 5
        println(i)
      }
    }

    // Use a view, which provides non-strictness for some methods
    def print5V(it: Iterable[Int]) = {
      var flag = true
      it.view.filter(_ => flag).foreach { i =>
        flag = i < 5
        println(i)
      }
    }

    // Use withFilter, which is explicitly designed to be used as a non-strict filter
    def print5W(it: Iterable[Int]) = {
      var flag = true
      it.withFilter(_ => flag).foreach { i =>
        flag = i < 5
        println(i)
      }
    }
+4

Source: https://habr.com/ru/post/889430/