2016-07-28 37 views
0

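C language, get HTML source

I'm trying to use C to fetch the HTML of this page: http://pastebin.com/raw/7y7MWssc. So far, I've tried to connect to pastebin with a socket on port 80 and then send an HTTP request to get the HTML of that pastebin page.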

I know what I have so far is probably way off, but here it is:

#include <stdio.h> 
#include <string.h> 
#include <sys/socket.h> 
#include <netinet/in.h> 
#include <netdb.h> 

int main() 
{ 
    /*Define socket variables */ 
    char host[1024] = "pastebin.com"; 
    char url[1024] = "/raw/7y7MWssc"; 
    char request[2000]; 
    struct hostent *server; 
    struct sockaddr_in serverAddr; 
    int portno = 80; 

    printf("Trying to get source of pastebin.com/raw/7y7MWssc ...\n"); 

    /* Create socket */ 
    int tcpSocket = socket(AF_INET, SOCK_STREAM, 0); 
    if(tcpSocket < 0) { 
     printf("ERROR opening socket\n"); 
    } else { 
     printf("Socket opened successfully.\n"); 
    } 

    server = gethostbyname(host); 
    serverAddr.sin_port = htons(portno); 
    if(connect(tcpSocket, (struct sockaddr *) &serverAddr, sizeof(serverAddr)) < 0) { 
     printf("Can't connect\n"); 
    } else { 
     printf("Connected successfully\n"); 
    } 

    bzero(request, 2000); 
    sprintf(request, "Get %s HTTP/1.1\r\n Host: %s\r\n \r\n \r\n", url, host); 
    printf("\n%s", request); 

    if(send(tcpSocket, request, strlen(request), 0) < 0) { 
     printf("Error with send()"); 
    } else { 
     printf("Successfully sent html fetch request"); 
    } 
    printf("test\n"); 

} 

The code above makes sense to me up to a point, but now I'm stuck. How do I make it fetch the web resource at http://pastebin.com/raw/7y7MWssc?

+0

1) Remove the extra spaces at the start of the lines in the request, 2) change Get to GET, 3) receive the response and print it out – immibis

+0

@immibis Done, but there is still a problem: after the request is printed on line 38, the last line, which prints "test", is never executed. – BotHam

+0

It is "never executed"? So your program... exits with an error? Hangs? If it hangs, which line does it get stuck on? If there is an error, what is it? – larsks

Answers

2

Fixed. I needed to add

serverAddr.sin_family = AF_INET; 

and bzero serverAddr. My HTTP request was also wrong: it had an extra \r\n and stray spaces, as @immibis said.

Corrected:

sprintf(request, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", url, host); 
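
If it helps, the request can be built in a small helper so the format is easy to check. This is only a sketch: the name `build_get_request` and the `Connection: close` header are my own additions, not part of the code above. HTTP/1.1 connections are persistent by default, so asking the server to close the connection lets a later read loop end when the response is complete.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): writes a minimal
 * HTTP/1.1 GET request into buf. Returns the request length, or -1 if
 * buf is too small to hold the whole request. */
int build_get_request(char *buf, size_t size, const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"      /* method must be upper case */
                     "Host: %s\r\n"             /* no leading space before header names */
                     "Connection: close\r\n"    /* assumption: ask the server to close after replying */
                     "\r\n",                    /* a single blank line ends the headers */
                     path, host);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

It would be called as `build_get_request(request, sizeof(request), host, url)`, and the return value can be passed to send() as the length instead of calling strlen().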
1

You get the pointer returned by gethostbyname(), but you never do anything with it.

You need to fill in the sockaddr_in with the address, family, and port.
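
As a side note, newer POSIX versions dropped gethostbyname() in favor of getaddrinfo(), which fills in all three fields for you. A minimal sketch of that alternative, assuming IPv4 only; `resolve_ipv4` is a name I made up for illustration:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper: resolves host to an IPv4 address and fills *out
 * with family, address and port. Returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *host, const char *port, struct sockaddr_in *out)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, to match sockaddr_in */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Use the first result; sin_family, sin_addr and sin_port are already set. */
    memcpy(out, res->ai_addr, sizeof *out);
    freeaddrinfo(res);
    return 0;
}
```

With this, `resolve_ipv4("pastebin.com", "80", &serverAddr)` would replace the gethostbyname() call and the manual field assignments.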

The following version connects and sends the request, but you still have to deal with reading the response...

#include <stdio.h> 
#include <string.h> 
#include <stdlib.h> 
#include <sys/socket.h> 
#include <netinet/in.h> 
#include <netdb.h> 

int main() 
{ 
    /*Define socket variables */ 
    char host[1024] = "pastebin.com"; 
    char url[1024] = "/raw/7y7MWssc"; 
    char request[2000]; 
    struct hostent *server; 
    struct sockaddr_in serverAddr; 
    unsigned short portno = 80;  /* htons() takes an unsigned 16-bit value */

    printf("Trying to get source of pastebin.com/raw/7y7MWssc ...\n"); 

    /* Create socket */ 
    int tcpSocket = socket(AF_INET, SOCK_STREAM, 0); 
    if(tcpSocket < 0) { 
     printf("ERROR opening socket\n"); 
     exit(-1); 
    } else { 
     printf("Socket opened successfully.\n"); 
    } 

    if ((server = gethostbyname(host)) == NULL) { 
     fprintf(stderr, "gethostbyname(): error\n"); 
     exit(-1); 
    } 

    /* Zero the struct first, then fill in address, family and port */ 
    memset(&serverAddr, 0, sizeof(serverAddr)); 
    memcpy(&serverAddr.sin_addr, server->h_addr_list[0], server->h_length); 
    serverAddr.sin_family = AF_INET; 
    serverAddr.sin_port = htons(portno); 

    if(connect(tcpSocket, (struct sockaddr *) &serverAddr, sizeof(serverAddr)) < 0) { 
     printf("Can't connect\n"); 
     exit(-1); 
    } else { 
     printf("Connected successfully\n"); 
    } 

    /* snprintf() always NUL-terminates, so no separate bzero() is needed */ 
    snprintf(request, sizeof(request), "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", url, host); 
    printf("\n%s", request); 

    if(send(tcpSocket, request, strlen(request), 0) < 0) { 
     printf("Error with send()\n"); 
    } else { 
     printf("Successfully sent html fetch request\n"); 
    } 
    printf("test\n"); 

    return 0; 
}
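
For the missing part, reading the response, something along these lines should do. It is a sketch rather than a finished implementation: `read_response` is a hypothetical helper of mine, and it assumes the whole response fits in the buffer you pass in.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: reads from fd until the peer closes the connection
 * or buf is full. Returns the number of bytes stored (buf is always
 * NUL-terminated on success), or -1 on error. */
long read_response(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    if (cap == 0)
        return -1;

    while (total < cap - 1) {
        ssize_t n = recv(fd, buf + total, cap - 1 - total, 0);
        if (n < 0)
            return -1;      /* recv() failed */
        if (n == 0)
            break;          /* peer closed the connection: end of response */
        total += (size_t)n;
    }
    buf[total] = '\0';
    return (long)total;
}
```

You would call it right after send(), for example `read_response(tcpSocket, response, sizeof(response))` with a `char response[8192]`, and then print the buffer. Keep in mind that the result contains the status line and headers as well as the body, and that an HTTP/1.1 server may hold the connection open, so the loop can block until the server times out unless the request includes a `Connection: close` header.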