2013-02-01 31 views
1

Why does my program read extra bytes in HTTP/1.1?

I am experimenting with sockets and trying to build a very simple web bot.

Here is my code:

#include <sys/types.h> 
#include <sys/socket.h> 
#include <arpa/inet.h> 
#include <netdb.h> 
#include <errno.h> 
#include <cstring> 
#include <iostream> 
#include <string> 
#include <unistd.h> // close() 

#define HTTP_PORT "80" 

#define HOST "www.taringa.net" 
#define PORT HTTP_PORT 

#define IN_BUFFSIZE 1024 
#define OUT_BUFFSIZE 1024 

#define REQUEST "GET /Taringa/posts HTTP/1.0\r\nHost: www.taringa.net\r\nUser-Agent: foo\r\n\r\n" 

using namespace std; 

int main(int argc, char **argv) { 

    struct addrinfo hints, *res; 
    struct sockaddr_in servAddress; 
    int sockfd; 

    char addrstr[100]; 
    char buff_msg_out[OUT_BUFFSIZE], buff_msg_in[IN_BUFFSIZE]; 

    memset(&hints, 0, sizeof(hints)); 
    hints.ai_family = AF_UNSPEC; 
    hints.ai_socktype = SOCK_STREAM; 
    if (getaddrinfo(HOST, HTTP_PORT, &hints, &res) != 0) { 
        cerr << "Error en getaddrinfo" << endl; 
        return -1; 
    } 


    // Create the socket 
    if ((sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol)) < 0) 
    { 
        cerr << "Error en socket()" << endl; 
        return -1; 
    } 


    // Start the connection 
    if (connect(sockfd, res->ai_addr, res->ai_addrlen) == -1) 
    { 
        cerr << "Error en connect()" << endl; 
        cerr << "Error: " << strerror(errno) << endl; 
        return -1; 
    } 

    cout << "Conectado con éxito" << endl; 

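    // Send data 
    // Copy the request and NUL-terminate it explicitly: strncpy() limited to 
    // strlen(REQUEST) would leave the buffer without a terminator. 
    strncpy(buff_msg_out, REQUEST, OUT_BUFFSIZE - 1); 
    buff_msg_out[OUT_BUFFSIZE - 1] = '\0'; 

    if (send(sockfd, buff_msg_out, strlen(buff_msg_out), 0) <= 0) 
    { 
        cerr << "Error en write()" << endl; 
        return -1; 
    } 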

    cout << "Mensaje enviado:" << endl << buff_msg_out << endl << endl; 

    int bytes_recv = 0; 

    while ((bytes_recv = recv(sockfd, buff_msg_in, IN_BUFFSIZE-1, 0)) > 0) 
    { 
        buff_msg_in[bytes_recv] = '\0'; 
        cout << buff_msg_in << endl;   
    } 

    freeaddrinfo(res); 
    close(sockfd); 

    return 0; 
} 

This is the output when the request uses HTTP/1.0:

HTTP/1.1 200 OK 
Server: n0 
Date: Fri, 01 Feb 2013 19:57:26 GMT 
Content-Type: text/html; charset=utf8 
Connection: close 
Set-Cookie: trngssn=06359673; path=/ 
Set-Cookie: trngssn=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: taringa_user_id=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: lastNick=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: fbs=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: tws=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: iB-friendfind=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml" lang="es" xml:lang="es" > 
     <head profile="http://purl.org/NET/erdf/profile" prefix="og: http://ogp.me/ns 
# fb: http://ogp.me/ns/fb# article: http://ogp.me/ns/article#"> 
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
     <meta http-equiv=”X-Frame-Options” content=”Deny” /> 

       <link rel="alternate" type="application/atom+xml" title="Últimos Posts de Taringa" href="/rss/Taringa/posts/" /> 
       <link rel="alternate" type="application/atom+xml" title="Últimos Temas de Taringa" href="/rss/Taringa/tem 
as/" /> 

       <title>Posts de Taringa! - Taringa!</title> 

...

</body> 
</html> 

But when I specify HTTP/1.1, the reply is:

HTTP/1.1 200 OK 
Server: n0 
Date: Fri, 01 Feb 2013 20:03:54 GMT 
Content-Type: text/html; charset=utf8 
Transfer-Encoding: chunked 
Connection: keep-alive 
Set-Cookie: trngssn=81047255; path=/ 
Set-Cookie: trngssn=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: taringa_user_id=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: lastNick=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: fbs=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: tws=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 
Set-Cookie: iB-friendfind=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/; domain=.taringa.net 

d0f 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml" lang="es" xml:lang="es" > 
     <head profile="http://purl.org/NET/erdf 
/profile" prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# article: http://ogp.me/ns/article#"> 
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
     <meta http-equiv=”X-Frame-Options” content=”Deny” /> 

       <link rel="alternate" type="application/atom+xml" title="Últimos Posts de Taringa" href="/rss/Taringa/posts/" /> 
       <link rel="alternate" type="application/atom+xml" title="Últimos Te 
mas de Taringa" href="/rss/Taringa/temas/" /> 

       <title>Posts de Taringa! - Taringa!</title> 

...

And here is the problem:

</body> 
</html> 

0 

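and it waits a few seconds before closing the connection.

I just tried with stackoverflow.com/about and it works fine, except for the following text that the server sends me: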

</html>HTTP/1.0 400 Bad request 
Cache-Control: no-cache 
Connection: close 
Content-Type: text/html 

<html><body><h1>400 Bad request</h1> 
Your browser sent an invalid request. 
</body></html> 

Am I missing something?

Answers

3

Server: n0
Date: Fri, 01 Feb 2013 20:03:54 GMT
Content-Type: text/html; charset=utf8
Transfer-Encoding: chunked
Connection: keep-alive

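Don't say you support HTTP 1.1 if you don't.

All HTTP/1.1 applications MUST be able to receive and decode the "chunked" transfer-coding, and MUST ignore chunk-extension extensions they do not understand. - HTTP 1.1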
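
If the client does send HTTP/1.1, it at least has to notice when the reply is chunked. A minimal sketch of that check follows, assuming the header block has already been split off from the body; `uses_chunked_encoding` is a hypothetical helper name, not something from the question's code.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if the header block (everything before
// the first blank line) contains a "Transfer-Encoding" header that mentions
// "chunked".
bool uses_chunked_encoding(const std::string& header_block) {
    std::istringstream lines(header_block);
    std::string line;
    while (std::getline(lines, line)) {
        // Header names and the "chunked" token are case-insensitive,
        // so compare everything in lower case.
        std::transform(line.begin(), line.end(), line.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        // "transfer-encoding:" is 18 characters long.
        if (line.compare(0, 18, "transfer-encoding:") == 0 &&
            line.find("chunked", 18) != std::string::npos) {
            return true;
        }
    }
    return false;
}
```

When this returns true, the body cannot be printed as-is; it has to be run through a chunk decoder first. Sticking to HTTP/1.0 in the request, as in the original code, avoids the issue because chunked encoding is an HTTP/1.1 feature.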

5

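The server is using chunked encoding. d0f is the length of the "chunk" in octets, expressed in hexadecimal (3343 in decimal). 0 is the length of the next chunk (i.e. there is none).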
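
To illustrate the format, here is a rough sketch of a decoder for a body that has already been read completely into a string. `decode_chunked` is a hypothetical helper, not part of the question's code, and it assumes the input starts right after the blank line that ends the headers. Chunk extensions and trailers are ignored rather than parsed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: strips the chunk framing from a complete chunked body.
std::string decode_chunked(const std::string& body) {
    std::string decoded;
    std::size_t pos = 0;
    while (true) {
        // Each chunk starts with its size in hexadecimal, terminated by CRLF.
        std::size_t line_end = body.find("\r\n", pos);
        if (line_end == std::string::npos)
            throw std::runtime_error("missing chunk-size line");
        // std::stoul stops at the first non-hex character, so a trailing
        // space or a ";extension" after the size is skipped.
        std::size_t chunk_size =
            std::stoul(body.substr(pos, line_end - pos), nullptr, 16);
        pos = line_end + 2;
        if (chunk_size == 0)
            break;  // a zero-length chunk marks the end of the body
        if (pos + chunk_size + 2 > body.size())
            throw std::runtime_error("truncated chunk");
        decoded.append(body, pos, chunk_size);
        pos += chunk_size + 2;  // skip the chunk data and its trailing CRLF
    }
    return decoded;
}
```

For example, `decode_chunked("5\r\nhello\r\n0\r\n\r\n")` yields `hello`. The pause before the connection closes is most likely the `Connection: keep-alive` at work: the server keeps the socket open after the zero-length chunk, so a client that wants to finish promptly would stop reading there instead of waiting for `recv` to return 0.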

+4

Great minds think alike, but greater minds think 70 seconds faster.

+0

*laughs* That is a really great comment, I wish I could upvote it twice.
