Is it a good idea to replace Collection for Stream in returned values?

Question

Prior to Java 8, a property representing a collection of elements usually returned a collection. In the absence of an immutable collection interface, the general idiom was to wrap it like this:

Collection<Foo> getFoos(){ return Collections.unmodifiableCollection(foos); }

Now that Stream is here, it's tempting to start publishing Streams instead of Collections.
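To make the two options concrete, here is a minimal sketch of a class that publishes the same backing list both ways. The class name FooHolder, the method name foos() and the use of String as the element type are assumptions made for illustration; they are not from the question.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical holder class; String stands in for the question's Foo type.
class FooHolder {
    private final List<String> foos = new ArrayList<>();

    FooHolder(String... initial) {
        Collections.addAll(foos, initial);
    }

    // Pre-Java 8 idiom from the question: a read-only view of the backing list.
    Collection<String> getFoos() {
        return Collections.unmodifiableCollection(foos);
    }

    // Stream-returning variant: every call creates a fresh stream over the list.
    Stream<String> foos() {
        return foos.stream();
    }
}
```

The unmodifiable view rejects mutation only at run time, with an UnsupportedOperationException, whereas the Stream type offers no mutating methods at all. Each stream handed out can still be consumed only once.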

The advantages that I see in them:

True immutable API
Most often, a client of this property is interested in querying or iterating over the result (it would be really horrible if they wanted to update the collection).

On the other hand, streams can be consumed only once and cannot be passed around like regular collections. This is especially troubling.

This question differs from a similar question in that it is broader: the OP there explicitly states that the streams he intends to return are not going to be passed around. In my opinion, this aspect was not considered in the answers to the original question.

In other words: it seems to me that if an API returns a stream, the general mindset should be that all interaction with it must end in the immediate context. Passing the stream around should be forbidden.

But this seems very difficult to enforce when developers are not familiar with the Stream API. This means that this kind of API requires a paradigm shift. Am I right in this assertion?

+6

lambda java-8 java-stream api-design

Vitaliy Feb 16 '15 at 11:20

2 answers

It depends:

If you are returning streams from your methods, you should always make sure that they are not already closed when you return them.

Using Streams in the API of your applications will increase the likelihood that users of your application will also pass Streams instead of Collections, which implies that they also need to keep in mind that they should not return already closed streams.

In private projects using Streams will probably work, but if you are creating a public API, I would not consider Streams as a good idea.

Personally, I prefer Iterables over Collections because of their immutability. I created a wrapper called Enumerables that extends Iterable with a functional API similar to the one Stream has.

0

Sauli Tähkäpää Feb 17 '15 at 7:49

Stuart Marks · Accepted Answer · 2015-02-19T23:20:11+0000

Let me suggest a simple rule:

A Stream that is passed as an argument to a method, or returned as the return value of a method, must be the tail of an unused pipeline.

This is probably so obvious to those of us who worked on streams that we never bothered to write it down. But it is probably not obvious to people approaching streams for the first time, so it's worth discussing.

The basic rule is described in the Stream API documentation: a stream can have at most one terminal operation. After a terminal operation has been performed, it is illegal to add any intermediate or terminal operations.

Another rule is that stream pipelines must be linear; they cannot have branches. This is not very well documented, but it is mentioned in the Stream class documentation about two-thirds of the way down. It means that it is illegal to add an intermediate or terminal operation to a stream that is not the last operation in the pipeline.

Most Stream methods are either intermediate or terminal operations. If you try to use one of them on a stream that has already been terminated, or that is not the last operation in the pipeline, you will find out quickly when you get an IllegalStateException. This does happen occasionally, but I think that once people get the idea that a pipeline has to be linear, they learn to avoid the problem and it goes away. I think this is pretty easy for most people to understand; it should not require a paradigm shift.

Once you understand this, it is clear that if you are going to hand a Stream instance to another piece of code, either by passing it as an argument or by returning it to the caller, it must be either a stream source or the last intermediate operation in a pipeline. That is, it must be the tail of an unused pipeline.
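A minimal sketch of both failure modes, assuming a small in-memory stream (the class and method names are made up for illustration). In the JDK's stream implementation, both the second terminal operation and the branching attempt are rejected with an IllegalStateException.

```java
import java.util.stream.Stream;

// Hypothetical demo class; names are illustrative only.
class StreamReuseDemo {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOpFails() {
        Stream<String> names = Stream.of("a", "b", "c");
        names.count();                 // first terminal operation consumes the stream
        try {
            names.count();             // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns true if adding a second operation to a non-tail stage is rejected.
    static boolean branchingFails() {
        Stream<String> source = Stream.of("a", "b", "c");
        Stream<String> filtered = source.filter(s -> !s.isEmpty()); // source is now linked
        try {
            source.map(String::toUpperCase); // source is no longer the tail: a branch
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

In the second method, only filtered is still the tail of the pipeline, so it is the only reference that may safely be handed to other code.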

In other words: it seems to me that if an API returns a stream, the general mindset should be that all interaction with it must end in the immediate context. Passing the stream around should be forbidden.

I think this is too restrictive. As long as you stick to the rule I suggested, you should be free to pass streams around as much as you want. Indeed, there are many use cases for getting a stream from somewhere, modifying it, and passing it along. Here are a couple of examples.

1) Open a text file containing the textual representation of a POJO on each line. Call Files.lines() to get a Stream<String>. Map each line to a POJO instance and return the Stream<POJO> to the caller. The caller can apply a filter or a sort operation and return the stream to its own caller.

2) Given a Stream<POJO>, you might have a web interface that lets the user supply a complex set of search criteria. (For example, consider a shopping site with a lot of sorting and filtering options.) Instead of composing one large, complex pipeline in your code, you might have a method such as:
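The first example might look roughly like the sketch below. The Person class, the "name,age" line format and the method names are assumptions for illustration only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical example; the POJO and the "name,age" line format are assumptions.
class PersonFile {
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Parses one line of the assumed form "name,age".
    static Person parse(String line) {
        String[] parts = line.split(",");
        return new Person(parts[0].trim(), Integer.parseInt(parts[1].trim()));
    }

    // Returns the tail of an unused pipeline: a source plus one map(), no terminal operation.
    // The caller is responsible for closing the stream, which releases the file handle.
    static Stream<Person> readPeople(Path file) {
        try {
            return Files.lines(file).map(PersonFile::parse);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller can keep extending the pipeline, for example readPeople(path).filter(p -> p.age >= 18), and either finish it inside a try-with-resources block or return it further up, still as the tail of an unused pipeline.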

 Stream<POJO> applyCriteria(Stream<POJO>, SearchCriteria)

which would take the stream, apply the search criteria by appending various filters and possibly sort or distinct operations, and return the resulting stream to the caller.

From these examples, I hope you can see that there is considerable flexibility in passing streams around, as long as what you pass around is always the tail of an unused pipeline.
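One possible shape for such a method is sketched below. The Product and SearchCriteria types and their fields are hypothetical stand-ins for the POJO and criteria mentioned above, not part of the original answer.

```java
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical types; field names and semantics are assumptions for illustration.
class ProductSearch {
    static final class Product {
        final String name;
        final String category;
        final double price;

        Product(String name, String category, double price) {
            this.name = name;
            this.category = category;
            this.price = price;
        }
    }

    static final class SearchCriteria {
        final String category;     // null means "any category"
        final Double maxPrice;     // null means "no price limit"
        final boolean sortByPrice;

        SearchCriteria(String category, Double maxPrice, boolean sortByPrice) {
            this.category = category;
            this.maxPrice = maxPrice;
            this.sortByPrice = sortByPrice;
        }
    }

    // Appends intermediate operations only; the returned stream is still unconsumed.
    static Stream<Product> applyCriteria(Stream<Product> products, SearchCriteria criteria) {
        Stream<Product> result = products;
        if (criteria.category != null) {
            result = result.filter(p -> p.category.equals(criteria.category));
        }
        if (criteria.maxPrice != null) {
            result = result.filter(p -> p.price <= criteria.maxPrice);
        }
        if (criteria.sortByPrice) {
            result = result.sorted(Comparator.comparingDouble((Product p) -> p.price));
        }
        return result;
    }
}
```

Because each step reassigns result to the newest stage, the method always works on the current tail and never touches an earlier stage again, which keeps the pipeline linear.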
