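I have about 300,000 records to load into a SolrCloud suggester. The records are dynamic: new documents will be added, and some documents will be deleted periodically. The question I am facing is: how can the DocumentDictionary build be optimized on a SolrCloud suggester? The options I have considered so far: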
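1. Using FileDictionaryFactory: this approach is an operational nightmare. I would have to keep regenerating the file and uploading it to ZooKeeper (I have not yet figured out how to upload a file this large to ZooKeeper), and I might have to build the index separately on every server in the SolrCloud cluster. Doing that frequently does not seem feasible.
2. Using DocumentDictionaryFactory: this looks like the obvious choice, but building the index is a nightmare here too. Every time I try to build it, I get a "no space left on device" error. I tried building it on 5K records and it succeeded, but it took 40 minutes and used the full 10 GB of memory for the entire 40 minutes.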
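My question is: if we go with the second approach, can this index build time be reduced? Or, if I go with the first approach, what is the ideal way to handle frequent changes to the index on SolrCloud?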
My configs:
For FileDictionaryFactory:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestions</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">FileDictionaryFactory</str>
    <str name="field">searchfield</str>
    <str name="weightField">searchscore</str>
    <str name="suggestAnalyzerFieldType">text_ngram</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="storeDir">autosuggest_dict</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">suggestions</str>
    <str name="suggest.dictionary">results</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
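To make the "keep regenerating the file" step concrete, here is a rough sketch of how the `spellings.txt` named in `sourceLocation` could be produced. It assumes one entry per line with the term and weight separated by a tab, which is my understanding of the default `fieldDelimiter` for FileDictionaryFactory. The function names and the shape of the input records are hypothetical, not part of my actual setup.

```python
from typing import Iterable, List, Tuple


def to_dictionary_lines(records: Iterable[Tuple[str, int]]) -> List[str]:
    """Format (term, weight) pairs as tab-delimited dictionary lines."""
    lines = []
    for term, weight in records:
        # Tabs or newlines inside a term would break the line format,
        # so collapse every whitespace run to a single space.
        clean = " ".join(term.split())
        if clean:  # skip terms that were empty or whitespace-only
            lines.append(f"{clean}\t{weight}")
    return lines


def write_dictionary(records: Iterable[Tuple[str, int]],
                     path: str = "spellings.txt") -> None:
    """Write the dictionary file; the path is an assumption taken from sourceLocation."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in to_dictionary_lines(records):
            handle.write(line + "\n")
```

Generating the file is the easy part; it still has to be distributed to the cluster and rebuilt after every change, which is what makes this approach painful.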
For DocumentDictionaryFactory:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestions</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">searchfield</str>
    <str name="weightField">searchscore</str>
    <str name="payloadField">payload</str>
    <str name="suggestAnalyzerFieldType">text_ngram</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="storeDir">autosuggest_dict</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">suggestions</str>
    <str name="suggest.dictionary">results</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
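Since `buildOnStartup` and `buildOnCommit` are both false in these configs, the build has to be requested explicitly by sending `suggest.build=true` to the `/suggest` handler. A minimal sketch of that request follows. The base URL, port, and the collection name `mycollection` are placeholders, and the helper names are hypothetical; only the handler path, the `suggest.*` parameters, and the dictionary name `suggestions` come from the configs above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url: str, collection: str,
              dictionary: str = "suggestions") -> str:
    """Return the URL that asks the /suggest handler to rebuild one dictionary."""
    params = urlencode({
        "suggest": "true",
        "suggest.build": "true",           # triggers the (slow) build
        "suggest.dictionary": dictionary,  # suggester name from solrconfig.xml
        "wt": "json",
    })
    return f"{base_url.rstrip('/')}/{collection}/suggest?{params}"


def trigger_build(base_url: str, collection: str,
                  dictionary: str = "suggestions",
                  timeout: float = 3600.0) -> int:
    """Send the build request and return the HTTP status code.

    The generous timeout is an assumption, chosen because the build
    took about 40 minutes for 5K records.
    """
    with urlopen(build_url(base_url, collection, dictionary),
                 timeout=timeout) as response:
        return response.status
```

For example, `trigger_build("http://localhost:8983/solr", "mycollection")` would issue the request against a local node; this is the call that runs for 40 minutes or fails with the disk-space error.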
You need to show your suggester conf, especially settings such as buildOnCommit. – Persimmonium
Added the configs to the description. – diwakarb