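How to index all metatags in Nutch

I have installed Nutch 1.9 and configured it to crawl successfully with Solr 4.10.1. I am trying to set up Nutch to index metadata, as described at https://wiki.apache.org/nutch/IndexMetatags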
How do I set it up to index all of the metadata on a site? I set metatags.names to * like this:
<property>
<name>metatags.names</name>
<value>*</value>
<description>Names of the metatags to extract, separated by ','. Use '*' to extract all metatags. Prefixes the names with 'metatag.' in the parse-metadata. For instance to index description and keywords, you need to activate the plugin index-metadata and set the
value of the parameter 'index.parse.md' to 'metatag.description,metatag.keywords'.
</description>
</property>
However, I am not sure how to set the index.parse.md value without listing the individual metatag names. I tried this:
<property>
<name>index.parse.md</name>
<value>meta*</value>
<description>Comma-separated list of keys to be taken from the parse metadata to generate fields. Can be used e.g. for 'description' or 'keywords' provided that these values are generated by a parser (see parse-metatags plugin)
</description>
</property>
However, running
bin/nutch indexchecker http://nutch.apache.org/
does not show any metadata. I believe the site does have metadata, because Parse Metadata is returned when I run
bin/nutch parsechecker http://nutch.apache.org/
Any help would be greatly appreciated! Thanks.