HTTP 1.1 Persistent Socket Connections in Java

Say I have a Java program that makes an HTTP request to a server over HTTP 1.1 and does not close the connection. I make one request and read all the data returned from the socket's input stream. However, when I make a second request, I get no response from the server (or there is a problem with the stream: it provides no more input). If I issue the requests in the order (request, request, read), it works fine, but (request, read, request, read) does not.

Could someone shed some light on why this might happen? (Code snippets are below.) No matter what I do, the second isr_reader.read() loop only ever returns -1.

    try {
        connection = new Socket("SomeServer", port);
        con_out = connection.getOutputStream();
        con_in = connection.getInputStream();
        PrintWriter out_writer = new PrintWriter(con_out, false);
        out_writer.print("GET http://somesite HTTP/1.1\r\n");
        out_writer.print("Host: thehost\r\n");
        //out_writer.print("Content-Length: 0\r\n");
        out_writer.print("\r\n");
        out_writer.flush();

        // If we were not interpreting this data as a character stream,
        // we might need to adjust byte ordering here.
        InputStreamReader isr_reader = new InputStreamReader(con_in);
        char[] streamBuf = new char[8192];
        int amountRead;
        StringBuilder receivedData = new StringBuilder();
        while ((amountRead = isr_reader.read(streamBuf)) > 0) {
            receivedData.append(streamBuf, 0, amountRead);
        }
        // Response is processed here.

        if (connection != null && !connection.isClosed()) {
            //System.out.println("Connection Still Open...");
            out_writer.print("GET http://someSite2\r\n");
            out_writer.print("Host: somehost\r\n");
            out_writer.print("Connection: close\r\n");
            out_writer.print("\r\n");
            out_writer.flush();

            streamBuf = new char[8192];
            amountRead = 0;
            receivedData.setLength(0);
            while ((amountRead = isr_reader.read(streamBuf)) > 0 || amountRead < 1) {
                if (amountRead > 0)
                    receivedData.append(streamBuf, 0, amountRead);
            }
        }
        // Process response here
    }

Answers to questions: Yes, I do get responses from the server. I use raw sockets because of external constraints.

Apologies for the messy code: I rewrote it from memory and seem to have introduced some errors.

So, the consensus is that I should either do (request, request, read) and let the server close the stream once I reach the end, or, if I do (request, read, request, read), stop reading before I hit the end of the stream so that I never wait for the stream to be closed.

+4
5 answers

According to your code, the only way you even reach the statements that send the second request is if the server closes its output stream (your input stream) after receiving and responding to the first request.

The reason is that your code, which is supposed to read only the first response,

    while ((amountRead = isr_reader.read(streamBuf)) > 0) {
        receivedData.append(streamBuf, 0, amountRead);
    }

will block until the server closes its output stream (i.e., until read returns -1) or until the socket's read timeout expires. In the case of a read timeout, an exception is thrown, and you never get as far as sending the second request.

The problem with HTTP responses is that the stream does not tell you up front how many bytes to read before the response ends. This matters little for HTTP 1.0 responses, because the server simply closes the connection after the response, so you can get the whole response (status line + headers + body) by reading everything up to the end of the stream.

With persistent HTTP 1.1 connections, you can no longer just read everything up to the end of the stream. You first have to read the status line and the headers, line by line, and then decide, based on the status code and the headers (for example, Content-Length), how many bytes to read to get the response body (if there is one at all). If you do this correctly, your read operations complete before the connection is closed or a timeout occurs, and you will have read exactly the response the server sent. You can then send the next request and read the second response in exactly the same way as the first.
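A minimal sketch of that reading order, assuming the server frames the body with a Content-Length header. The class and method names (HttpResponseReader, Response, readLine, readResponse) are made up for illustration, and the sketch deliberately does not handle chunked bodies or body-less responses such as 204 and 304. It reads from the raw InputStream rather than through an InputStreamReader, since a reader may buffer ahead and swallow bytes that belong to the next response.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class HttpResponseReader {

    /** Holder for one parsed response; a hypothetical type for this sketch. */
    static final class Response {
        String statusLine;
        final Map<String, String> headers = new LinkedHashMap<>();
        byte[] body = new byte[0];
    }

    /** Reads one line up to LF and returns it without the CR/LF terminator. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("Connection closed before a full line was read");
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Reads exactly one response and leaves the stream at the start of the next one. */
    static Response readResponse(InputStream in) throws IOException {
        Response response = new Response();
        response.statusLine = readLine(in);

        // Header lines run until the first empty line.
        String line;
        while (!(line = readLine(in)).isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                response.headers.put(name, line.substring(colon + 1).trim());
            }
        }

        // Assumption of this sketch: the body length is given by Content-Length.
        String length = response.headers.get("content-length");
        if (length == null) {
            throw new IOException("No Content-Length; other framing is not handled here");
        }
        response.body = new byte[Integer.parseInt(length)];
        int offset = 0;
        while (offset < response.body.length) {
            int n = in.read(response.body, offset, response.body.length - offset);
            if (n == -1) {
                throw new EOFException("Connection closed in the middle of the body");
            }
            offset += n;
        }
        return response;
    }

    public static void main(String[] args) throws IOException {
        // Two responses back to back, as they would arrive on a persistent connection.
        String wire = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
                + "HTTP/1.1 404 Not Found\r\nContent-Length: 3\r\n\r\nbye";
        InputStream in = new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII));
        Response first = readResponse(in);
        Response second = readResponse(in);
        System.out.println(first.statusLine + " / " + new String(first.body, StandardCharsets.US_ASCII));
        System.out.println(second.statusLine + " / " + new String(second.body, StandardCharsets.US_ASCII));
    }
}
```

With a real socket you would pass connection.getInputStream() (ideally wrapped once in a BufferedInputStream that is reused for every response) instead of the in-memory stream used in main.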

PS: (request, request, read) may appear to work because your server supports request pipelining: it receives and processes both requests, so you end up reading both responses into the same buffer as your "first" response.

PPS: Make sure your PrintWriter uses the US-ASCII encoding. Otherwise, depending on your system's default encoding, the request line and the headers of your HTTP requests may be malformed (wrongly encoded).
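One way to pin the encoding, sketched here with hypothetical helper names (AsciiRequestWriter, asciiWriter, writeGet), is to put an OutputStreamWriter with an explicit charset between the socket's output stream and the PrintWriter:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

final class AsciiRequestWriter {

    /** Wraps the stream so that every character is encoded as US-ASCII. */
    static PrintWriter asciiWriter(OutputStream out) {
        return new PrintWriter(new OutputStreamWriter(out, StandardCharsets.US_ASCII), false);
    }

    /** Writes a minimal GET request; path and host are supplied by the caller. */
    static void writeGet(PrintWriter writer, String path, String host) {
        writer.print("GET " + path + " HTTP/1.1\r\n");
        writer.print("Host: " + host + "\r\n");
        writer.print("\r\n");
        writer.flush();
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for connection.getOutputStream().
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeGet(asciiWriter(sink), "/", "thehost");
        System.out.print(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

In the question's code this would replace new PrintWriter(con_out, false), which falls back to the platform default charset.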

+5

Writing a simple RFC-compliant HTTP/1.1 client is not that difficult. To avoid blocking I/O when reading from a socket in Java, use the java.nio classes: SocketChannel supports non-blocking I/O.

This is needed to send HTTP requests over a persistent connection.

In addition, the nio classes give better performance.

My stress test gives the following results:

  • HTTP/1.0 (java.io) → HTTP/1.0 (java.nio): 20% faster

  • HTTP/1.0 (java.io) → HTTP/1.1 (java.nio with a persistent connection): 110% faster

+3

Make sure you have Connection: keep-alive in your request. This may be a moot point, though, since HTTP/1.1 connections are persistent by default.

What response does the server return? Is it using chunked transfer encoding? If the server does not know the size of the response body, it cannot provide a Content-Length header and has to close the connection at the end of the response body to tell the client that the content has ended. In that case, keep-alive will not work. If you generate the content on the fly with PHP, JSP, etc., you can enable output buffering, check the size of the accumulated body, set the Content-Length header, and then flush the output buffer.
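If the server does answer with Transfer-Encoding: chunked, the client has to decode the chunk framing to find the end of the body. The following is only a sketch under that assumption; ChunkedBodyDecoder and its methods are hypothetical names, chunk extensions and trailer headers are skipped rather than interpreted, and the stream is expected to be positioned right after the blank line that ends the response headers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedBodyDecoder {

    /** Reads one line up to LF and returns it without the CR/LF terminator. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("Connection closed before a full line was read");
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Decodes a chunked body and stops right after its terminating blank line. */
    static byte[] readChunkedBody(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            // Each chunk starts with its size in hex, optionally followed by ";extensions".
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';');
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                break; // the zero-sized chunk marks the end of the body
            }
            byte[] chunk = new byte[size];
            int offset = 0;
            while (offset < size) {
                int n = in.read(chunk, offset, size - offset);
                if (n == -1) {
                    throw new EOFException("Connection closed in the middle of a chunk");
                }
                offset += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        // Skip optional trailer headers up to the final empty line.
        while (!readLine(in).isEmpty()) {
            // trailers are ignored in this sketch
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n";
        InputStream in = new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(readChunkedBody(in), StandardCharsets.US_ASCII));
    }
}
```

Because the decoder stops at the terminating blank line, the connection stays usable for the next request, which is what keep-alive needs.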

0

Is there a specific reason why you are using raw sockets rather than Java's URLConnection or Commons HttpClient?

HTTP is not easy to get right. I know that Commons HttpClient can reuse connections the way you are trying to.

Unless you have a specific reason for using sockets, this is what I would recommend :)

0

Writing your own correct HTTP/1.1 client implementation is non-trivial; historically, most of the people I have seen attempt it got it wrong. Their implementation usually ignores the specification and simply does whatever works with one specific test server. In particular, they usually ignore the requirement to be able to handle chunked responses.

Writing your own HTTP client is probably a bad idea unless you have VERY weird requirements.

0

Source: https://habr.com/ru/post/1277567/

