General considerations
There are several ways to do this, and the right one depends heavily on the details of what you need. You can use the following helper methods that are part of scalaz-stream:
- zipWithIndex: gives you the index of the current element as a number, so you can discriminate based on that index.
- zipWithState: lets you carry state from one invocation of your function to the next. You can use this state to track whether you are still parsing the headers or have reached the body, and then handle header and body differently in the next step.
- repartition: use this to group all header elements and all body elements together, then process them in the next step.
- zipWithNext (or its counterpart zipWithPrevious): presents each element grouped with its neighbour. You can use this to detect the switch from header to body and react accordingly.
It is worth thinking again about what you really need. For the question as asked, zipWithIndex followed by map is enough. But if the problem turns out to be different, you will probably end up with repartition or zipWithState.
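The grouping and neighbour-pairing ideas from the list are not shown in code further down, so here is a rough sketch of both. It deliberately uses only plain Scala collections, not the scalaz-stream API, and the names HeaderBodySketch and sampleResponse as well as the response lines are made up for illustration.

```scala
// Plain-Scala sketch (standard library only, NOT the scalaz-stream API).
// sampleResponse is an invented HTTP response used purely for illustration.
object HeaderBodySketch {
  val sampleResponse: List[String] = List(
    "HTTP/1.1 200 OK",
    "Content-Type: text/html",
    "Set-Cookie: id=42",
    "",
    "<html>",
    "</html>"
  )

  // Grouping idea: everything before the first empty line is the header,
  // everything after that line is the body.
  val (header, rest) = sampleResponse.span(_.nonEmpty)
  val body: List[String] = rest.drop(1) // drop the empty separator line

  // Neighbour idea: pair each line with the one that follows it and look
  // for the pair whose second element is the empty separator line.
  val lastHeaderLine: Option[String] =
    sampleResponse.zip(sampleResponse.drop(1)).collectFirst {
      case (current, next) if next.isEmpty => current
    }
}
```

On a real stream the elements arrive incrementally, so the library combinators named above would take the place of span and zip; the decision logic stays the same.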
Code example
Let's take a simple example: an HTTP client that separates the HTTP header elements from the body (HTTP, not HTML). The header carries things like cookies; the body carries the real "content", such as an image or the HTML source.
A simple HTTP client might look like this:
    import scalaz.stream._
    import scalaz.concurrent.Task
    import java.net.InetSocketAddress
    import java.nio.channels.AsynchronousChannelGroup

    implicit val AG = nio.DefaultAsynchronousChannelGroup

    // Build the request text. HTTP terminates each line with CRLF and
    // ends the header block with an empty line.
    def httpGetRequest(hostname: String, path: String = "/"): Process[Nothing, String] =
      Process(
        s"GET $path HTTP/1.1",
        s"Host: $hostname",
        "Accept: */*",
        "User-Agent: scalaz-stream"
      ).intersperse("\r\n").append(Process("\r\n\r\n"))

    // Connect, send the request, and emit the response line by line.
    def simpleHttpClient(hostname: String, port: Int = 80, path: String = "/")(
        implicit AG: AsynchronousChannelGroup): Process[Task, String] =
      nio.connect(new InetSocketAddress(hostname, port))
        .flatMap(_.run(httpGetRequest(hostname, path).pipe(text.utf8Encode)))
        .pipe(text.utf8Decode)
        .pipe(text.lines())
Now we can use this code to separate the header lines from the rest. In HTTP, the header is structured in lines and is separated from the body by an empty line. So first, let's count the number of lines in the header:

    val demoHostName = "scala-lang.org"
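Counting the header lines comes down to taking lines until the first empty one. The sketch below shows that logic on a hard-coded list with plain Scala collections; HeaderCountSketch, capturedLines and the response lines themselves are hypothetical and are not real output from the host above. On the stream itself, a takeWhile on non-empty lines before collecting the result would presumably play the same role.

```scala
// Plain-Scala sketch of the counting step (standard library only).
// capturedLines is an invented response, not real server output.
object HeaderCountSketch {
  val capturedLines: Vector[String] = Vector(
    "HTTP/1.1 301 Moved Permanently",
    "Location: /elsewhere",
    "Content-Length: 0",
    "",
    "body line"
  )

  // The header ends at the first empty line, so count everything before it.
  def countHeaderLines(lines: Seq[String]): Int =
    lines.takeWhile(_.nonEmpty).size

  val headerLineCount: Int = countHeaderLines(capturedLines)
}
```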
When I ran this, there were 8 lines in the header. First we define an enumeration so that we can classify the parts of the response:

    object HttpResponsePart {
      sealed trait EnumVal
      case object HeaderLine extends EnumVal
      case object HeaderBodySeparator extends EnumVal
      case object Body extends EnumVal
      val httpResponseParts = Seq(HeaderLine, HeaderBodySeparator, Body)
    }
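Then use zipWithIndex plus map to classify the parts of the response:

    import HttpResponsePart._

    // zipWithIndex counts from 0: lines 0 to 7 are the 8 header lines,
    // line 8 is the empty separator, everything after it is the body.
    simpleHttpClient(demoHostName).zipWithIndex.map {
      case (line, idx) if idx < 8  => (line, HeaderLine)
      case (line, idx) if idx == 8 => (line, HeaderBodySeparator)
      case (line, _)               => (line, Body)
    }.take(15).runLog.run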
This works fine for me. But of course, the number of header lines can change at any time without notice. It is much more robust to use a very simple parser that takes the structure of the response into account. For this I use zipWithState:

    simpleHttpClient(demoHostName).zipWithState(HeaderLine: EnumVal) {
      case (line, HeaderLine) if line.isEmpty => HeaderBodySeparator
      case (_, HeaderLine)                    => HeaderLine
      case (_, HeaderBodySeparator)           => Body
      case (_, Body)                          => Body
    }.take(15).runLog.run
You can see that both approaches use a similar structure, and both should produce the same result. The nice thing is that both are easy to reuse: you can simply swap the source, for example for a file, and nothing else needs to change. The same holds for the processing after the classification: .take(15).runLog.run is identical in both approaches.
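To illustrate the point about swapping the source, here is a minimal sketch of the same state-based classification applied to an in-memory list. It uses only the Scala standard library, with scanLeft standing in for zipWithState, so it can be tried without a network connection. ClassifySketch, Part, step and classify are hypothetical names, and in this sketch each line is paired with its own classification.

```scala
// Plain-Scala sketch (standard library only, NOT the scalaz-stream API).
object ClassifySketch {
  sealed trait Part
  case object HeaderLine extends Part
  case object HeaderBodySeparator extends Part
  case object Body extends Part

  // Classify the current line given the classification of the previous one.
  def step(previous: Part, line: String): Part = previous match {
    case HeaderLine                 => if (line.isEmpty) HeaderBodySeparator else HeaderLine
    case HeaderBodySeparator | Body => Body
  }

  // scanLeft yields the seed followed by one state per line; dropping the
  // seed with tail leaves exactly one classification per input line.
  def classify(lines: Seq[String]): Seq[(String, Part)] =
    lines.zip(lines.scanLeft(HeaderLine: Part)(step).tail)
}
```

Because the classifier only looks at the first empty line, later empty lines inside the body stay classified as Body, and the result does not depend on how many header lines the server sends.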