Solr design for searching within complex JSON

What would be a good design for searching within complex JSON using Solr? For example, a document might look like this:

{ 
    "books" : [ 
     { 
      "title" : "Some title", 
      "author" : "Some author", 
      "genres" : [ 
       "thriller", 
       "drama" 
      ] 
     }, 
     { 
      "title" : "Some other title", 
      "author" : "Some author", 
      "genres" : [ 
       "comedy", 
       "nonfiction", 
       "thriller" 
      ] 
     } 
    ] 
}

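An example query would be: get all documents that contain a book whose author is "Some author" and one of whose genres is "drama".

The design I have come up with so far is to have a dynamicField in schema.xml that indexes everything as text (for now), like so: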

<dynamicField name="*" type="text" indexed="true" stored="true"/>

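SolrJ is then used to parse the JSON and create a SolrInputDocument field for each piece of data. For example, these are the field/value pairs that would be created for the JSON above: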

books0.title : "Some title" 
books0.author : "Some author" 
books0.genres0 : "thriller" 
books0.genres1 : "drama" 
books1.title : "Some other title" 
books1.author : "Some author" 
books1.genres0 : "comedy" 
books1.genres1 : "nonfiction" 
books1.genres2 : "thriller"
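
As a rough illustration of the flattening step described above, here is a minimal sketch in Python rather than SolrJ. The function name `flatten` is made up for this example; the only thing taken from the question is the naming scheme (object keys joined with ".", array positions appended as a number).

```python
import json

def flatten(value, prefix=""):
    """Return a dict mapping flat field names to scalar values.

    Hypothetical helper: object keys are joined with '.', and array
    elements get their index appended (books0, genres1, ...).
    """
    fields = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}.{key}" if prefix else key
            fields.update(flatten(child, name))
    elif isinstance(value, list):
        for i, child in enumerate(value):
            fields.update(flatten(child, f"{prefix}{i}"))
    else:
        # Scalar leaf: the accumulated prefix is the field name.
        fields[prefix] = value
    return fields

raw = """
{"books": [{"title": "Some title",
            "author": "Some author",
            "genres": ["thriller", "drama"]}]}
"""
fields = flatten(json.loads(raw))
for name, val in fields.items():
    print(f"{name} : {val!r}")
```

In the actual design each name/value pair would become a field on a SolrInputDocument; that part is not shown here.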

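At this point, we can use the LukeRequestHandler to get all the indexed fields, and then build one large Solr query that checks every field we are interested in. For the example query above, the query would check all "books#.author" and "books#.genres#" fields. This solution seems inelegant, and if there are a lot of fields, the query could get very large.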
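
To make the size problem concrete, here is a small sketch of how such a query string might be assembled once the field names are known. It is written in Python for brevity; `build_query` is a hypothetical helper, the field list is hard-coded instead of being fetched from the LukeRequestHandler, and values are not escaped.

```python
import re

def build_query(field_names, author, genre):
    """Build one large OR query over every matching flattened field.

    Hypothetical helper; assumes the books#.author / books#.genres#
    naming scheme and does not escape quotes in the values.
    """
    author_fields = [f for f in field_names
                     if re.fullmatch(r"books\d+\.author", f)]
    genre_fields = [f for f in field_names
                    if re.fullmatch(r"books\d+\.genres\d+", f)]

    def any_of(fields, value):
        return "(" + " OR ".join(f'{f}:"{value}"' for f in fields) + ")"

    return any_of(author_fields, author) + " AND " + any_of(genre_fields, genre)

names = ["books0.title", "books0.author", "books0.genres0",
         "books0.genres1", "books1.author", "books1.genres0"]
query = build_query(names, "Some author", "drama")
print(query)
```

Note that this sketch ORs across all books, so the author and the genre could match in two different books of the same document, and the query grows by one clause for every extra field.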

Being able to use wildcards in field names would be useful, but I don't think that is possible in Solr.

Is there a better way to accomplish this, perhaps with some clever combination of "copyField" and "multiValued" in the schema?

Source

2012-07-16 Thomas

You can index each book entity as a document.

<field name="id" type="string" indexed="true" stored="true" required="true" /> 
<field name="title" type="text_general" indexed="true" stored="true"/> 
<!-- Don't perform stemming on authors - You can use field with lower case, ascii folding for analysis --> 
<field name="authors" type="string" indexed="true" stored="true" multiValued="true"/> 
<field name="genre" type="string" indexed="true" stored="true" multiValued="true"/>
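
For illustration, splitting the original JSON into one document per book could look roughly like this. It is a Python sketch, not SolrJ; `books_to_docs` and the `parent_id`-based id format are assumptions made for the example, while the field names follow the schema above.

```python
def books_to_docs(parent_id, data):
    """Turn the nested JSON into one flat document per book.

    Hypothetical helper: the id format "<parent_id>-<index>" is an
    assumption; any unique string would do for the id field.
    """
    docs = []
    for i, book in enumerate(data.get("books", [])):
        docs.append({
            "id": f"{parent_id}-{i}",
            "title": book["title"],
            # authors and genre are multiValued in the schema, so use lists.
            "authors": [book["author"]],
            "genre": list(book.get("genres", [])),
        })
    return docs

data = {"books": [
    {"title": "Some title", "author": "Some author",
     "genres": ["thriller", "drama"]},
    {"title": "Some other title", "author": "Some author",
     "genres": ["comedy", "nonfiction", "thriller"]},
]}
docs = books_to_docs("doc1", data)
for doc in docs:
    print(doc)
```

Each resulting dict corresponds to one Solr document to be added to the index.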

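Use the DisMax parser to search on authors and genre.
A match on these fields should return your documents.
You can also use genre for filtering, e.g. with a filter query such as fq=genre:drama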
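
A sketch of what the request parameters for such a search might look like, built in Python. `defType`, `q`, `qf` and `fq` are standard Solr request parameters; the particular values are assumptions for the example, and whether the quoted phrase matches depends on how the `authors` field is analyzed.

```python
from urllib.parse import urlencode

# Request parameters for a DisMax search, filtered by genre.
params = {
    "defType": "dismax",       # use the DisMax query parser
    "q": '"Some author"',      # user query, quoted as a phrase
    "qf": "authors genre",     # fields DisMax should search
    "fq": "genre:drama",       # filter query on the string field
}
query_string = urlencode(params)
print(query_string)
```

The resulting string would be appended to the select handler URL of your Solr core, which is not shown here because it depends on the installation.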

If you want the search to behave differently, you can simply copy the field using copyField and apply a different analysis to the copy, e.g.

<field name="genre_search" type="text_general" indexed="true" stored="true" multiValued="true"/> 

<copyField source="genre" dest="genre_search"/>

Source

2012-07-17 04:33:16 Jayendra

Do you mean that each "book" would be a document? – Thomas 2012-07-17 14:26:14

Yes, a separate document. – Jayendra 2012-07-17 14:31:44

I accepted this answer because it led me to this http://www.lucidimagination.com/search/document/93e8b09e90b0076c/help_with_denormalizing_issues#60890dcb99a3004d, which convinced me that each "object" in the JSON needs to be indexed as its own document. – Thomas 2012-07-18 13:45:04

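It may be worth looking at Solr Joins. They are only available in version 4.0, which is currently in alpha, but they would let you model at least some, if not all, of the complex relationships. Performance is not as good as plain Solr without joins, but it may be perfectly adequate; you should verify that for your case.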
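
As a rough idea of what a join query could look like, here is a Python sketch that builds one using the `{!join from=... to=...}` local-params syntax. It assumes, hypothetically, that each book document carries a `parent_id` field pointing at the `id` of the document it came from; neither that field nor the helper `join_query` appears in the schemas above.

```python
def join_query(from_field, to_field, inner_query):
    """Wrap an inner query in Solr's join local-params syntax.

    Hypothetical helper; the field names passed in are assumptions.
    """
    return f"{{!join from={from_field} to={to_field}}}{inner_query}"

# Find parent documents that have a book by this author in this genre.
q = join_query("parent_id", "id", 'authors:"Some author" AND genre:drama')
print(q)
```

Because the inner query runs against individual book documents, the author and genre conditions have to hold for the same book.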

Source

2012-07-16 22:24:12 Persimmonium
