Nutch crawler: accept only English pages
How can I configure the Nutch crawler so that it fetches only English pages?
I set this property in Nutch's nutch-site.xml file, but it does not work:
<property>
<name>http.accept.language</name>
<value>en-us,en-gb,en;q=0.7,*;q=0.3</value>
<description>Value of the "Accept-Language" request header field.
This allows selecting non-English language as default one to retrieve.
It is a useful setting for search engines build for certain national group.
</description>
</property>
I only want to crawl pages in English and Urdu. What can I do? – Shafiq 2015-03-05 04:15:51
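For the English-plus-Urdu case, a minimal sketch of the same property might look like the fragment below. The q-values and the omission of the `*` wildcard are assumptions made for illustration, not settings taken from the question. Note that `Accept-Language` is only a preference sent with each request: a server is free to ignore it and return a page in any language, which may be why the setting appears not to work. The header on its own does not discard pages in other languages, so actually excluding them would need a separate language-detection step after fetching.

```xml
<!-- Sketch for nutch-site.xml: state a preference for English and Urdu. -->
<!-- The q-values are illustrative assumptions; tune them as needed. -->
<property>
  <name>http.accept.language</name>
  <value>en-us,en-gb,en;q=0.9,ur;q=0.8</value>
  <description>Ask servers for English first, then Urdu.
  Servers may ignore this header and respond in another language.
  </description>
</property>
```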