Raw HTTP request analysis

I am working on an HTTP traffic dataset that consists of full POST and GET requests, as shown below. I wrote Java code that separates these requests and stores each one as a string element in an ArrayList. Now I am unsure how to parse these raw HTTP requests in Java. Is there a better method than manual parsing?

    GET http://localhost:8080/tienda1/imagenes/3.gif/ HTTP/1.1
    User-Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.8 (like Gecko)
    Pragma: no-cache
    Cache-control: no-cache
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Encoding: x-gzip, x-deflate, gzip, deflate
    Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5
    Accept-Language: en
    Host: localhost:8080
    Cookie: JSESSIONID=FB018FFB06011CFABD60D8E8AD58CA21
    Connection: close
3 answers

"I am working on an HTTP traffic dataset that consists of full POST and GET requests"

So, you want to parse a file or list containing multiple HTTP requests. What data do you want to extract? In any case, here is a Java HTTP parsing class that reads the method, version, and URI from the request line and stores all the headers in a Hashtable.

You can use it, or write your own if you want to reinvent the wheel. Take a look at the RFC (RFC 2616, section 5) to see what a request looks like, so that you can parse it correctly:

    Request = Request-Line              ; Section 5.1
              *(( general-header        ; Section 4.5
               | request-header         ; Section 5.3
               | entity-header ) CRLF)  ; Section 7.1
              CRLF
              [ message-body ]          ; Section 4.3
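As a rough illustration of the first element of that grammar, the Request-Line can be split into its three space-separated parts (method, Request-URI, HTTP version). This is only a minimal sketch: the `RequestLine` class and its field names are hypothetical and are not part of the class linked above, and the parser is deliberately strict about single spaces.

```java
/** Hypothetical helper for illustration; not part of the linked parser class. */
final class RequestLine {
    final String method;
    final String uri;
    final String version;

    private RequestLine(String method, String uri, String version) {
        this.method = method;
        this.uri = uri;
        this.version = version;
    }

    /** Splits "Method SP Request-URI SP HTTP-Version" on single spaces. */
    static RequestLine parse(String line) {
        if (line == null) {
            throw new IllegalArgumentException("Request-Line is null");
        }
        String[] parts = line.trim().split(" ");
        // Exactly three non-empty tokens are expected, the last one starting with "HTTP/".
        if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty()
                || !parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Malformed Request-Line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine rl = RequestLine.parse(
                "GET http://localhost:8080/tienda1/imagenes/3.gif/ HTTP/1.1");
        System.out.println(rl.method + " | " + rl.uri + " | " + rl.version);
    }
}
```

The query string, if any, is part of the Request-URI, so it can be extracted from `uri` afterwards, for example with `java.net.URI`.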

Here is a general HTTP request parser for all method types (GET, POST, etc.), for your convenience:

    package util.dpi.capture;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.StringReader;
    import java.util.Hashtable;

    /**
     * Class for HTTP request parsing as defined by RFC 2616:
     *
     * Request = Request-Line              ; Section 5.1
     *           *(( general-header        ; Section 4.5
     *            | request-header         ; Section 5.3
     *            | entity-header ) CRLF)  ; Section 7.1
     *           CRLF
     *           [ message-body ]          ; Section 4.3
     *
     * @author izelaya
     */
    public class HttpRequestParser {

        /**
         * Not shown in the original answer; a minimal definition added here
         * so that the class compiles on its own.
         */
        public static class HttpFormatException extends Exception {
            public HttpFormatException(String message) {
                super(message);
            }
        }

        private String _requestLine;
        private Hashtable<String, String> _requestHeaders;
        private StringBuffer _messageBody;

        public HttpRequestParser() {
            _requestHeaders = new Hashtable<String, String>();
            _messageBody = new StringBuffer();
        }

        /**
         * Parse an HTTP request.
         *
         * @param request
         *            String holding the HTTP request.
         * @throws IOException
         *             If an I/O error occurs reading the input stream.
         * @throws HttpFormatException
         *             If the HTTP request is malformed.
         */
        public void parseRequest(String request) throws IOException, HttpFormatException {
            BufferedReader reader = new BufferedReader(new StringReader(request));

            setRequestLine(reader.readLine()); // Request-Line ; Section 5.1

            // Headers run until the first empty line. The null check avoids a
            // NullPointerException when the request lacks the terminating blank line.
            String header = reader.readLine();
            while (header != null && header.length() > 0) {
                appendHeaderParameter(header);
                header = reader.readLine();
            }

            // Everything after the blank line is the message body.
            String bodyLine = reader.readLine();
            while (bodyLine != null) {
                appendMessageBody(bodyLine);
                bodyLine = reader.readLine();
            }
        }

        /**
         * 5.1 Request-Line: The Request-Line begins with a method token, followed by
         * the Request-URI and the protocol version, and ending with CRLF. The
         * elements are separated by SP characters. No CR or LF is allowed except in
         * the final CRLF sequence.
         *
         * @return String with Request-Line
         */
        public String getRequestLine() {
            return _requestLine;
        }

        private void setRequestLine(String requestLine) throws HttpFormatException {
            if (requestLine == null || requestLine.length() == 0) {
                throw new HttpFormatException("Invalid Request-Line: " + requestLine);
            }
            _requestLine = requestLine;
        }

        private void appendHeaderParameter(String header) throws HttpFormatException {
            int idx = header.indexOf(":");
            if (idx == -1) {
                throw new HttpFormatException("Invalid Header Parameter: " + header);
            }
            // Trim the value so that "Host: localhost" yields "localhost", not " localhost".
            _requestHeaders.put(header.substring(0, idx), header.substring(idx + 1).trim());
        }

        /**
         * The message-body (if any) of an HTTP message is used to carry the
         * entity-body associated with the request or response. The message-body
         * differs from the entity-body only when a transfer-coding has been
         * applied, as indicated by the Transfer-Encoding header field (section
         * 14.41).
         *
         * @return String with message-body
         */
        public String getMessageBody() {
            return _messageBody.toString();
        }

        private void appendMessageBody(String bodyLine) {
            _messageBody.append(bodyLine).append("\r\n");
        }

        /**
         * For the list of available headers refer to sections 4.5, 5.3, and 7.1 of RFC 2616.
         *
         * @param headerName
         *            Name of header (matched exactly, including case).
         * @return String with the value of the header or null if not found.
         */
        public String getHeaderParam(String headerName) {
            return _requestHeaders.get(headerName);
        }
    }
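One thing to keep in mind with the parser above is that it looks headers up by their exact name, while HTTP header field names are case-insensitive (RFC 2616, section 4.2). The sample request in the question, for instance, sends `Cache-control` rather than `Cache-Control`. A possible variation, sketched here under the assumption that a plain `Map` is acceptable in place of the `Hashtable`, is a `TreeMap` ordered by `String.CASE_INSENSITIVE_ORDER`. The `HeaderMapDemo` class and `parseHeaders` method are hypothetical names used only for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical demo class; a sketch, not a replacement for the parser above. */
final class HeaderMapDemo {

    /** Parses "Name: value" lines into a map whose keys compare case-insensitively. */
    static Map<String, String> parseHeaders(String headerBlock) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : headerBlock.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header section
            }
            int idx = line.indexOf(':'); // first colon separates name from value
            if (idx <= 0) {
                throw new IllegalArgumentException("Invalid header: " + line);
            }
            headers.put(line.substring(0, idx).trim(), line.substring(idx + 1).trim());
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> headers = parseHeaders(
                "Cache-control: no-cache\r\nHost: localhost:8080\r\nConnection: close\r\n\r\n");
        // Lookups succeed regardless of the case used by the sender.
        System.out.println(headers.get("cache-control"));
        System.out.println(headers.get("HOST"));
    }
}
```

Note that this sketch keeps only the last value when a header name is repeated, which is the same behavior as the `Hashtable` version.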

If you just want to send the raw request as is, it is very simple: write the actual string to a TCP socket.

Something like this:

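    // host, port, and request are assumed to be defined by the caller.
    // getContents(request) is not shown here; it presumably returns the request's lines.
    Socket socket = new Socket(host, port);
    BufferedWriter out = new BufferedWriter(
            new OutputStreamWriter(socket.getOutputStream(), "UTF8"));

    for (String line : getContents(request)) {
        System.out.println(line);   // echo what is being sent
        out.write(line + "\r\n");   // HTTP lines end with CRLF
    }
    out.write("\r\n");              // blank line ends the header section
    out.flush();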

See this JoeJag blog post for full code.
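To try the same idea without opening a network connection, the writing step can target any `OutputStream`. The following is a minimal sketch, assuming the request is already available as a list of lines without line terminators. `RawRequestWriter` and `writeRequest` are hypothetical names, not taken from the blog post. With a real connection, `socket.getOutputStream()` could be passed in place of the in-memory stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only. */
final class RawRequestWriter {

    /** Writes each line followed by CRLF, then the blank line that ends the headers. */
    static void writeRequest(OutputStream out, List<String> lines) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append("\r\n");
        }
        sb.append("\r\n");
        out.write(sb.toString().getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeRequest(buffer, Arrays.asList(
                "GET /tienda1/imagenes/3.gif HTTP/1.1",
                "Host: localhost:8080",
                "Connection: close"));
        // Show the exact bytes that would go on the wire.
        System.out.print(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

This sketch only covers requests without a body. A POST request would also need its body written after the blank line, with a matching `Content-Length` header.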

UPDATE

I started the RawHTTP project to provide HTTP parsers for requests, responses, headers, and so on. It turned out well enough that it is fairly simple to write HTTP servers and clients on top of it. Check it out if you are looking for something low-level.


Source: https://habr.com/ru/post/1444414/

