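What am I missing in my code (credit to @Michal Kottman) to fetch the HTML source of a website? I want the same thing you get when you right-click a page in Chrome and choose "View page source". How can I get the HTML using luacurl/libcurl/curl and Lua?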
local curl = require "luacurl"
local c = curl.new()

function GET(url)
  c:setopt(curl.OPT_URL, url)
  c:setopt(curl.OPT_PROXY, "http://myproxy.bla.com:8080")
  c:setopt(curl.OPT_HTTPHEADER, "Connection: Keep-Alive", "Accept-Language: en-us")
  c:setopt(curl.OPT_CONNECTTIMEOUT, 30)
  local t = {} -- this will collect resulting chunks
  c:setopt(curl.OPT_WRITEFUNCTION, function (param, buf)
    table.insert(t, buf) -- store a chunk of data received
    return #buf
  end)
  c:setopt(curl.OPT_PROGRESSFUNCTION, function(param, dltotal, dlnow)
    print('%', url, dltotal, dlnow) -- do your fancy reporting here
  end)
  c:setopt(curl.OPT_NOPROGRESS, false) -- use this to activate progress
  assert(c:perform())
  return table.concat(t) -- return the whole data as a string
end

--local s = GET 'http://www.lua.org/'
local s = GET 'https://www.youtube.com/watch?v=dT_fkwX4fRM'
print(s)

-- write the result to disk; fail loudly if the file cannot be opened
local file = assert(io.open("text.html", "wb"))
file:write(s)
file:close()
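Unfortunately it has to be done in Lua with luacurl (the binding to libcurl), because luasocket does not work with the proxy I have to use (at least not for me). The file I download is empty. From CMD I get the page source without any problem: curl http://mypage.com
It works for lua.org, but not for the YouTube link. What am I missing?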
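One visible difference between the two URLs is that the YouTube link is HTTPS and may answer with a redirect, while lua.org is plain HTTP. The following is only a diagnostic sketch for narrowing that down, not a confirmed fix. It assumes that luacurl exposes libcurl's `CURLOPT_FOLLOWLOCATION` and `CURLOPT_VERBOSE` as `curl.OPT_FOLLOWLOCATION` and `curl.OPT_VERBOSE`, that `c:perform()` returns a false value plus a message on failure, and that `c:close()` is available. The names `fetch` and `join_chunks` are hypothetical helpers introduced for illustration.

```lua
-- Diagnostic sketch: same request as GET above, with more visibility.
-- Assumption: luacurl mirrors libcurl's CURLOPT_* names as curl.OPT_*.
local has_curl, curl = pcall(require, "luacurl")

-- Hypothetical helper: join the received chunks into one string.
local function join_chunks(chunks)
  return table.concat(chunks)
end

-- Hypothetical helper: returns the body, or nil plus an error message.
local function fetch(url, proxy)
  if not has_curl then
    return nil, "luacurl is not installed"
  end
  local c = curl.new()
  local chunks = {}
  c:setopt(curl.OPT_URL, url)
  if proxy then
    c:setopt(curl.OPT_PROXY, proxy)
  end
  c:setopt(curl.OPT_FOLLOWLOCATION, true) -- assumed option: follow 3xx redirects
  c:setopt(curl.OPT_VERBOSE, true)        -- assumed option: log the exchange to stderr
  c:setopt(curl.OPT_WRITEFUNCTION, function(_, buf)
    chunks[#chunks + 1] = buf -- store each received chunk
    return #buf               -- tell libcurl the whole chunk was consumed
  end)
  -- Keep the error message instead of asserting, so it can be inspected.
  local ok, err = c:perform()
  c:close()
  if not ok then
    return nil, err
  end
  return join_chunks(chunks)
end
```

Calling `fetch('https://www.youtube.com/watch?v=dT_fkwX4fRM', 'http://myproxy.bla.com:8080')` and printing both return values should show whether the transfer fails, gets redirected, or really returns an empty body. The verbose log can then be compared with what `curl -v` prints from CMD for the same URL.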