2011-02-03 41 views
7

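PHP and rotating proxies

Has anyone tried using rotating proxies? How easy is it to implement? Does it work well? Please share your experience.

PS: I see that questions like "how to make php script use a list of proxies" collect a lot of downvotes. Could you explain why before giving a -1?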

+1

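I remember trying to do this for a client. The biggest problem was getting a stable list of proxies. Initially we scraped the list from a website, but the problem is that the site can change its pages at any time, and then the script stops working. – xil3 2011-02-03 16:25:43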

+2

Please see [this link](http://scraperblog.blogspot.com/2013/07/php-scrape-website-with-rotating-proxies.html) for a proper solution. – pguardiario 2013-07-10 12:38:10

Answers

15

------ Updated March 4, 2017 -------


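I've been there, and the best solution I found is the following.

If you don't have a dedicated server, or at least a VPS, and a little patience, don't bother reading the rest of this post...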

1 - Install Squid 3.2 from source (check the notes below)
2 - Add a list of 20 or so IPs to squid.conf (costs around $25)
3 - Use the new ACL type random to rotate the outgoing IP.

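This way you don't need to rotate a list of IPs in your PHP script. Instead, you always connect to the same IP and port (e.g. 192.168.1.1:3129), and the visible outgoing IP (tcp_outgoing_address) rotates on each request according to the random ACL settings.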

You need to compile Squid 3.2 with '--enable-http-violations' to make it an elite anonymous proxy.
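"Elite" here means the target site should not receive headers that give the proxy away. As a rough way to check that, here is a minimal sketch that flags such headers in whatever the target server reports having received. The function name, the header list and the sample data are my own assumptions for illustration; you would feed it the output of any header-echo endpoint you trust.

```php
<?php
// Headers that typically reveal a proxy or the original client address.
// This list is an illustrative assumption, not an exhaustive standard.
const PROXY_REVEALING_HEADERS = ['via', 'x-forwarded-for', 'forwarded'];

/**
 * Return the names of proxy-revealing headers found among the request
 * headers that the target server received (header name => value).
 */
function findLeakedProxyHeaders(array $receivedHeaders): array
{
    $leaked = [];
    foreach ($receivedHeaders as $name => $value) {
        // Header names are case-insensitive, so compare in lower case.
        if (in_array(strtolower((string) $name), PROXY_REVEALING_HEADERS, true)) {
            $leaked[] = $name;
        }
    }
    return $leaked;
}

// Hypothetical headers, as a header-echo endpoint might report them.
$seenByTarget = [
    'Host'            => 'example.com',
    'User-Agent'      => 'Mozilla/5.0',
    'Via'             => '1.1 proxy.example (squid/3.2.13)',
    'X-Forwarded-For' => '203.0.113.7',
];
print_r(findLeakedProxyHeaders($seenByTarget)); // flags Via and X-Forwarded-For
```

An empty result for requests sent through your proxy suggests the header settings are doing their job.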

Step-by-step installation:

yum -y groupinstall 'Development Tools' 
yum -y install openssl-devel 
mkdir /meus 
cd /meus 
wget http://www.squid-cache.org/Versions/v3/3.2/squid-3.2.13.tar.gz 
tar -xvf squid-3.2.13.tar.gz 
cd squid-3.2.13 
./configure --prefix=/squid32 '--enable-removal-policies=heap,lru' '--enable-ssl' '--with-openssl' '--enable-linux-netfilter' '--with-pthreads' '--enable-ntlm-auth-helpers=SMB,fakeauth' '--enable-external-acl-helpers=ip_user,ldap_group,unix_group,wbinfo_group' '--enable-auth-basic' '--enable-auth-digest' '--enable-auth-negotiate' '--enable-auth-ntlm' '--with-winbind-auth-challenge' '--enable-useragent-log' '--enable-referer-log' '--disable-dependency-tracking' '--enable-cachemgr-hostname=localhost' '--enable-underscores' '--enable-build-info' '--enable-cache-digests' '--enable-ident-lookups' '--enable-follow-x-forwarded-for' '--enable-wccpv2' '--enable-fd-config' '--with-maxfd=16384' '--enable-http-violations'
make 
make install 

Sample squid.conf (located in this case at /squid32/etc/squid.conf):

#this will be the ip and port where squid will run 
http_port 5.5.5.5:33333 # change this ip and port ... 

#Extra parameters on squid.conf to make an elite proxy 

request_header_access Allow allow all 
request_header_access Authorization allow all 
request_header_access WWW-Authenticate allow all 
request_header_access Proxy-Authorization allow all 
request_header_access Proxy-Authenticate allow all 
request_header_access Cache-Control allow all 
request_header_access Content-Encoding allow all 
request_header_access Content-Length allow all 
request_header_access Content-Type allow all 
request_header_access Date allow all 
request_header_access Expires allow all 
request_header_access Host allow all 
request_header_access If-Modified-Since allow all 
request_header_access Last-Modified allow all 
request_header_access Location allow all 
request_header_access Pragma allow all 
request_header_access Accept allow all 
request_header_access Accept-Charset allow all 
request_header_access Accept-Encoding allow all 
request_header_access Accept-Language allow all 
request_header_access Content-Language allow all 
request_header_access Mime-Version allow all 
request_header_access Retry-After allow all 
request_header_access Title allow all 
request_header_access Connection allow all 
request_header_access Proxy-Connection allow all 
request_header_access User-Agent allow all 
request_header_access Cookie allow all 
request_header_access All deny all 

via off 
forwarded_for off 
follow_x_forwarded_for deny all 

acl vinte1 random 1/5 
acl vinte2 random 1/5 
acl vinte3 random 1/5 
acl vinte4 random 1/5 
acl vinte5 random 1/5 

tcp_outgoing_address 1.1.1.1 vinte1 # fake ip's , replace with yours 
tcp_outgoing_address 1.1.1.2 vinte2 
tcp_outgoing_address 1.1.1.3 vinte3 
tcp_outgoing_address 1.1.1.4 vinte4 
tcp_outgoing_address 1.1.1.5 vinte5 

tcp_outgoing_address 1.1.1.6 # this will be the default tcp outgoing address 
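One thing worth checking is how evenly this spreads the traffic. The sketch below assumes that Squid tries the tcp_outgoing_address lines in order and stops at the first match, and that each random ACL draws independently; both function names are mine. Under those assumptions the five 1/5 ACLs above do not give an even split: the lines get roughly 20%, 16%, 12.8%, 10.2% and 8.2%, and about 32.8% of requests fall through to the default address.

```php
<?php
/**
 * Given the match probability of each random ACL, in squid.conf order,
 * return the share of requests each tcp_outgoing_address line receives.
 * The last element is the share left over for the default address.
 */
function outgoingShares(array $aclProbabilities): array
{
    $shares = [];
    $remaining = 1.0; // probability that no earlier line has matched
    foreach ($aclProbabilities as $p) {
        $shares[] = $remaining * $p;
        $remaining *= (1.0 - $p);
    }
    $shares[] = $remaining; // default address
    return $shares;
}

/** ACL probabilities that give every address, default included, an equal share. */
function uniformAclProbabilities(int $addressCount): array
{
    $probabilities = [];
    for ($left = $addressCount; $left > 1; $left--) {
        $probabilities[] = 1 / $left; // 1/N, 1/(N-1), ..., 1/2
    }
    return $probabilities;
}

// The sample config: five ACLs at 1/5 each, plus a default address.
foreach (outgoingShares(array_fill(0, 5, 1 / 5)) as $i => $share) {
    printf("line %d: %.2f%%\n", $i + 1, $share * 100);
}
```

If you want the six addresses used equally, `uniformAclProbabilities(6)` suggests 1/6, 1/5, 1/4, 1/3 and 1/2 for the five ACLs, with the default address picking up the remaining sixth.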

Sample PHP cURL request using the Squid proxy:

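$proxy = "5.5.5.5:33333"; // the http_port address from squid.conf, not one of the outgoing IPs
$useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
$url = "https://api.ipify.org/"; // returns the IP address the request came from

$ch = curl_init();
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1); // a constant, not a quoted string
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'USER:PASS'); // only needed if you set up proxy authentication
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); // disables certificate verification; avoid outside of testing
$result = curl_exec($ch);
if ($result === false) {
    echo 'cURL error: ' . curl_error($ch); // read the error before closing the handle
}
curl_close($ch);
echo $result;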
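To see whether the rotation actually happens, you can repeat that request a number of times and count which IPs come back. A minimal sketch follows; the function names are mine, and the proxy address and api.ipify.org URL are simply the placeholders from this post. Each call uses a fresh cURL handle with connection reuse disabled, on the assumption that Squid picks the outgoing address per new connection.

```php
<?php
/**
 * Ask an IP-echo service which address it sees when the request goes
 * through the proxy. Returns null on failure.
 */
function fetchVisibleIp(string $proxy, string $echoUrl): ?string
{
    $ch = curl_init($echoUrl);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);
    curl_setopt($ch, CURLOPT_FORBID_REUSE, true); // do not keep the connection alive
    $body = curl_exec($ch);
    curl_close($ch);
    return is_string($body) ? trim($body) : null;
}

/** Call $fetchIp $requests times and count how often each IP appears. */
function tallyOutgoingIps(callable $fetchIp, int $requests): array
{
    $counts = [];
    for ($i = 0; $i < $requests; $i++) {
        $ip = $fetchIp();
        if ($ip !== null) {
            $counts[$ip] = ($counts[$ip] ?? 0) + 1;
        }
    }
    arsort($counts); // most frequent IP first
    return $counts;
}

// Usage (needs a running proxy, so it is left commented out):
// $counts = tallyOutgoingIps(
//     fn() => fetchVisibleIp('5.5.5.5:33333', 'https://api.ipify.org/'),
//     50
// );
// print_r($counts);
```

Taking the fetcher as a callable keeps the counting separate from the network call, so you can swap in another echo service or a stub.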

Useful links:
Squid 3.2 source: http://www.squid-cache.org/Versions/v3/3.2/squid-3.2.13.tar.gz
Rotating_three_IPs: http://wiki.squid-cache.org/ConfigExamples/Strange/RotatingIPs#Example:_Rotating_three_IPs_based_on_time_of_day
AclRandom: http://wiki.squid-cache.org/Features/AclRandom
Installing Squid 3.2 on CentOS 5.3: http://www.guldmyr.com/blog/installing-squid-3-2-on-centos-5-3/
Adding a password to Squid: How to set up a squid Proxy with basic username and password authentication?

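I find this the most reliable and secure way to rotate proxies, because you don't depend on third-party proxy providers and your information (passwords, data, etc.) is safer. It may sound a bit hard to set up, but it will pay back every second you spend on it. GL :)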