2017-06-25 39 views

Nginx does not block a bot by user agent

I am trying to block an annoying bot by its user agent. I put this into the server section of my nginx config:

server { 

    listen 80 default_server; 

    .... 

    if ($http_user_agent ~* (AhrefsBot)) {
        return 444;
    }

Checking with curl:

[[email protected] site_avaliable]# curl -I -H 'User-agent: Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)' localhost/ 
curl: (52) Empty reply from server 

So I checked /var/log/nginx/access.log and saw that some connections get 444, but others get 200!

51.255.65.78 - - [25/Jun/2017:15:47:36 +0300 - -] "GET /product/kovriki-avtomobilnie/volkswagen/?PAGEN_1=10 HTTP/1.1" 444 0 "-" "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)" 1498394856.155 
217.182.132.60 - - [25/Jun/2017:15:47:50 +0300 - 2.301] "GET /product/bryzgoviki/toyota/ HTTP/1.1" 200 14500 "-" "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)" 1498394870.955 

How is this possible?

Answer


OK, got it! I added $server_name and $server_addr to the nginx log format and saw that the sly bot connects by IP, without a server name:

51.255.65.40 - _ *myip* - [25/Jun/2017:16:22:27 +0300 - 2.449] "GET /product/soyuz_96_2/mitsubishi/l200/ HTTP/1.1" 200 9974 "-" "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)" 1498396947.308 
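
For reference, a log format along these lines could produce entries like the one above. This is only a sketch: the format name `vhost_debug` is hypothetical, and the field order and the use of `$upstream_response_time` are guesses reconstructed from the log lines, not the actual configuration. The part that matters is `$server_name` and `$server_addr`, which show which server block and local address handled each request.

```nginx
# Goes in the http context. "vhost_debug" is a made-up name for illustration.
log_format vhost_debug '$remote_addr - $server_name $server_addr $remote_user '
                       '[$time_local - $upstream_response_time] "$request" '
                       '$status $body_bytes_sent "$http_referer" '
                       '"$http_user_agent" $msec';

# Write the access log using the format defined above.
access_log /var/log/nginx/access.log vhost_debug;
```

With this in place, a request that arrives without a matching host name shows up with the name of whichever server block caught it (here `_`), which makes it easy to see that the block containing the user agent check was not the one serving it.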

So I added this, and the bot can no longer connect:

server {
    listen *myip*:80;
    server_name _;
    return 403;
}
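
Note that the block above rejects every request made by IP, not only the bot. An `if` placed inside one `server` block applies only to requests that nginx routes to that block, so another option is to define the user agent match once with `map` and repeat the check in each server block. The following is a rough sketch, not a tested configuration: the variable name `$blocked_bot` and the host name `example.com` are hypothetical, and `*myip*` is the same placeholder used above.

```nginx
# http context: set $blocked_bot to 1 when the user agent matches (case-insensitive).
map $http_user_agent $blocked_bot {
    default      0;
    ~*AhrefsBot  1;
}

# Catch-all for requests that arrive by IP or with an unknown host name.
server {
    listen *myip*:80;
    server_name _;

    if ($blocked_bot) {
        return 444;   # close the connection without sending a response
    }
    return 403;
}

# The regular site ("example.com" is a stand-in for the real server name).
server {
    listen 80 default_server;
    server_name example.com;

    if ($blocked_bot) {
        return 444;
    }
    # ... rest of the site configuration ...
}
```

The `map` keeps the pattern in one place, so adding another bot later means adding one line instead of editing every server block.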