2012-11-06 153 views
4

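Parsing raw HTTP requests

I am working with a dataset of HTTP traffic made up of complete POST and GET requests, like the one shown below. I have written Java code that separates these requests and stores each one as a string element in an ArrayList. Now I am unsure how to parse these raw HTTP requests in Java. Is there a better way than parsing them by hand?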

GET http://localhost:8080/tienda1/imagenes/3.gif/ HTTP/1.1 
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.8 (like Gecko) 
Pragma: no-cache 
Cache-control: no-cache 
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5 
Accept-Encoding: x-gzip, x-deflate, gzip, deflate 
Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5 
Accept-Language: en 
Host: localhost:8080 
Cookie: JSESSIONID=FB018FFB06011CFABD60D8E8AD58CA21 
Connection: close 
+0

Where do you need to parse these? In a Servlet or similar technology, or in a plain Java class? – kosa

+1

Where does the data come from? What do you need to parse out of it? – Perception

+2

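If you absolutely must do HTTP directly, and this is not for a class, I strongly recommend using something like Apache Commons HttpClient. There are many pitfalls in doing this yourself (chunked transfer encoding, for example). –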

Answers

2

I [am] working on [a] HTTP traffic dataset which is composed of complete POST and GET request[s]

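So you want to parse a file or directory containing multiple HTTP requests. What data do you want to extract? In any case, here is a Java HTTP parsing class that reads the method, version and URI from the request line, and reads all headers into a Hashtable.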

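You can use that one, or write one yourself if you feel like reinventing the wheel. Take a look at the RFC to see what a request looks like so that you parse it correctly: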

Request  = Request-Line    ; Section 5.1 
        *((general-header  ; Section 4.5 
        | request-header   ; Section 5.3 
        | entity-header) CRLF) ; Section 7.1 
        CRLF 
        [ message-body ]   ; Section 4.3 
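To make the grammar a bit more concrete, here is a minimal sketch of how a single raw request string could be split along those lines: request line first, then headers up to the empty line, then whatever is left as the body. The class name `RawRequestSketch` and its fields are made up for illustration and are not the class linked above. It assumes the whole request is already in memory as one string, ignores chunked transfer encoding, and keeps only the last value when a header is repeated.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not the class linked above.
final class RawRequestSketch {
    final String method;
    final String uri;
    final String version;
    final Map<String, String> headers; // keys are lower-cased header names
    final String body;

    private RawRequestSketch(String method, String uri, String version,
                             Map<String, String> headers, String body) {
        this.method = method;
        this.uri = uri;
        this.version = version;
        this.headers = headers;
        this.body = body;
    }

    static RawRequestSketch parse(String raw) {
        // The head and the optional message-body are separated by the first empty line.
        String[] headAndBody = raw.split("\r?\n\r?\n", 2);
        String[] lines = headAndBody[0].split("\r?\n");

        // Request-Line = Method SP Request-URI SP HTTP-Version
        String[] parts = lines[0].trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Invalid Request-Line: " + lines[0]);
        }

        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("Invalid header: " + lines[i]);
            }
            // Header names are case-insensitive, so normalise them before storing.
            String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
            headers.put(name, lines[i].substring(colon + 1).trim());
        }

        String body = headAndBody.length > 1 ? headAndBody[1] : "";
        return new RawRequestSketch(parts[0], parts[1], parts[2], headers, body);
    }
}
```

With the sample request from the question, `RawRequestSketch.parse(raw).headers.get("host")` would give `localhost:8080`.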
12

Here is a general HTTP request parser that works for all kinds of methods (GET, POST, etc.), for your convenience:

package util.dpi.capture; 

import java.io.BufferedReader; 
import java.io.IOException; 
import java.io.StringReader; 
import java.util.Hashtable; 

/** 
* Class for HTTP request parsing as defined by RFC 2616: 
* 
* Request = Request-Line            ; Section 5.1 
*           *((general-header       ; Section 4.5 
*           | request-header        ; Section 5.3 
*           | entity-header) CRLF)  ; Section 7.1 
*           CRLF 
*           [ message-body ]        ; Section 4.3 
* 
* @author izelaya 
* 
*/ 
public class HttpRequestParser { 

    private String _requestLine; 
    private Hashtable<String, String> _requestHeaders; 
    private StringBuffer _messagetBody; 

    public HttpRequestParser() { 
     _requestHeaders = new Hashtable<String, String>(); 
     _messagetBody = new StringBuffer(); 
    } 

    /** 
    * Parse an HTTP request. 
    * 
    * @param request 
    *   String holding http request. 
    * @throws IOException 
    *    If an I/O error occurs reading the input stream. 
    * @throws HttpFormatException 
    *    If HTTP Request is malformed 
    */ 
    public void parseRequest(String request) throws IOException, HttpFormatException { 
     BufferedReader reader = new BufferedReader(new StringReader(request)); 

     setRequestLine(reader.readLine()); // Request-Line ; Section 5.1 

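     String header = reader.readLine(); 
     // Stop at the empty line that ends the headers, or at end of input: 
     // readLine() returns null when the request has no trailing empty line. 
     while (header != null && header.length() > 0) { 
      appendHeaderParameter(header); 
      header = reader.readLine(); 
     } 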

     String bodyLine = reader.readLine(); 
     while (bodyLine != null) { 
      appendMessageBody(bodyLine); 
      bodyLine = reader.readLine(); 
     } 

    } 

    /** 
    * 
    * 5.1 Request-Line The Request-Line begins with a method token, followed by 
    * the Request-URI and the protocol version, and ending with CRLF. The 
    * elements are separated by SP characters. No CR or LF is allowed except in 
    * the final CRLF sequence. 
    * 
    * @return String with Request-Line 
    */ 
    public String getRequestLine() { 
     return _requestLine; 
    } 

    private void setRequestLine(String requestLine) throws HttpFormatException { 
     if (requestLine == null || requestLine.length() == 0) { 
      throw new HttpFormatException("Invalid Request-Line: " + requestLine); 
     } 
     _requestLine = requestLine; 
    } 

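    private void appendHeaderParameter(String header) throws HttpFormatException { 
     int idx = header.indexOf(":"); 
     if (idx == -1) { 
      throw new HttpFormatException("Invalid Header Parameter: " + header); 
     } 
     // Trim both parts so that "Host: localhost:8080" is stored as 
     // "localhost:8080" rather than " localhost:8080". Names are stored as 
     // written, so lookups through getHeaderParam are case-sensitive. 
     _requestHeaders.put(header.substring(0, idx).trim(), header.substring(idx + 1).trim()); 
    } 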

    /** 
    * The message-body (if any) of an HTTP message is used to carry the 
    * entity-body associated with the request or response. The message-body 
    * differs from the entity-body only when a transfer-coding has been 
    * applied, as indicated by the Transfer-Encoding header field (section 
    * 14.41). 
    * @return String with message-body 
    */ 
    public String getMessageBody() { 
     return _messagetBody.toString(); 
    } 

    private void appendMessageBody(String bodyLine) { 
     _messagetBody.append(bodyLine).append("\r\n"); 
    } 

    /** 
    * For list of available headers refer to sections: 4.5, 5.3, 7.1 of RFC 2616 
    * @param headerName Name of header 
    * @return String with the value of the header or null if not found. 
    */ 
    public String getHeaderParam(String headerName){ 
     return _requestHeaders.get(headerName); 
    } 
} 
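The parser above throws `HttpFormatException`, which is not included in the listing. A minimal sketch of what it could look like follows. This is an assumption about its shape, inferred only from how it is used (a checked exception built from a message string). To use it with the class above, it would need to be declared `public` in its own file in the same package.

```java
// Assumed definition: the original HttpFormatException is not shown in the answer.
// A checked exception carrying a message describing the malformed part of the request.
class HttpFormatException extends Exception {
    private static final long serialVersionUID = 1L;

    public HttpFormatException(String message) {
        super(message);
    }
}
```

With that in place, usage would be along the lines of creating a `HttpRequestParser`, calling `parseRequest(rawRequestString)`, and then reading values with `getRequestLine()`, `getHeaderParam("Host")` and `getMessageBody()`.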
1

If you just want to send the raw request as it is, that is easy: just send the actual string over a TCP socket!

Something like this:

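Socket socket = new Socket(host, port); 

BufferedWriter out = new BufferedWriter(
        new OutputStreamWriter(socket.getOutputStream(), "UTF8")); 

// getContents(request) is this answer's own helper and is not shown here. 
for (String line : getContents(request)) { 
    System.out.println(line); 
    out.write(line + "\r\n");   // every line of the request ends with CRLF 
} 

out.write("\r\n");  // the empty line marks the end of the headers 
out.flush(); 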

See this blog post by JoeJag for the full code.
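For reference, the writing part of the snippet can be pulled into a small helper that works on any `OutputStream`, which makes it easy to try without a live server. This is only a sketch: the name `RawRequestWriter` is made up, and it assumes the request is already available as a list of lines without line terminators and has no body.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper for illustration; not taken from the blog post.
final class RawRequestWriter {
    private RawRequestWriter() {
    }

    // Writes the request line and headers, each terminated by CRLF,
    // followed by the empty line that ends the header section.
    static void write(List<String> lines, OutputStream out) throws IOException {
        Writer writer = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8));
        for (String line : lines) {
            writer.write(line);
            writer.write("\r\n");
        }
        writer.write("\r\n");
        // Flush but do not close: the caller still needs the stream
        // (for a socket, closing it here would close the connection).
        writer.flush();
    }
}
```

With a socket this would be called as `RawRequestWriter.write(lines, socket.getOutputStream())`, after which the response can be read from `socket.getInputStream()`.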

UPDATE

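I started a project, RawHTTP, which provides HTTP parsers for requests, responses, headers and so on. It turned out well enough that it makes it quite easy to write HTTP servers and clients on top of it. Check it out if you are looking for something low-level.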