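Indexing a MongoDB collection with Solr

Suppose I have a test application that represents some lists of friends. The application uses a collection in which all documents have the following format: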

_id : ObjectId("someString"), 
name : "George", 
description : "some text", 
age : 35, 
friends : [ 
     { 
     name: "Peter", 
     age: 30, 
     town: { 
        name_town: "Paris", 
        country: "France" 
       } 
     }, 
     { 
     name: "Thomas", 
     age: 25, 
     town: { 
        name_town: "Berlin", 
        country: "Germany" 
       } 
     }, ...    // more friends 
] 
...       // more documents

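How can I describe such a collection in schema.xml? I need to run facet queries such as: "Give me the countries where George's friends live". Another use case might be: "Return all documents (persons) whose friends are 30 years old", and so on.

My initial idea was to declare the "friends" attribute as a text field in schema.xml: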

<fieldType name="text_wslc" class="solr.TextField" positionIncrementGap="100"> 
.... 
<field name="friends" type="text_wslc" indexed="true" stored="true" />

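and then try to search for, e.g., the words "age" and "30" in the text, but this is not a very reliable solution.

Please leave aside the fact that the collection is not logically well designed. It is only an example of a similar problem I am currently facing.

Any help or ideas would be highly appreciated.

EDIT: sample 'schema.xml'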

<?xml version="1.0" encoding="UTF-8" ?> 
<schema name="text-schema" version="1.5"> 
    <types> 
     <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/> 
     <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0" /> 
     <fieldType name="trInt" class="solr.TrieIntField" precisionStep="0" omitNorms="true" /> 
     <fieldType name="text_p" class="solr.TextField" positionIncrementGap="100"> 
      <analyzer type="index"> 
       <tokenizer class="solr.WhitespaceTokenizerFactory"/> 
       <filter class="solr.TrimFilterFactory"/> 
       <filter class="solr.WordDelimiterFilterFactory"/> 
       <filter class="solr.LowerCaseFilterFactory"/> 
      </analyzer> 
      <analyzer type="query"> 
       <tokenizer class="solr.WhitespaceTokenizerFactory"/> 
       <filter class="solr.TrimFilterFactory"/> 
       <filter class="solr.WordDelimiterFilterFactory"/> 
       <filter class="solr.LowerCaseFilterFactory"/> 
      </analyzer> 
     </fieldType> 
    </types> 

    <fields> 
      <field name="_id" type="string" indexed="true" stored="true" required="true" /> 
      <field name="_version_" type="long" indexed="true" stored="true"/> 
      <field name="_ts" type="long" indexed="true" stored="true"/> 
      <field name="ns" type="string" indexed="true" stored="true"/>    
      <field name="description" type="text_p" indexed="true" stored="true" /> 
      <field name="name" type="text_p" indexed="true" stored="true" /> 
      <field name="age" type="trInt" indexed="true" stored="true" /> 
      <field name="friends" type="text_p" indexed="true" stored="true" />   <!-- Here is the problem - when the type is text_p, all fields are considered as a text; optimal solution would be something like "collection" tag to mark name_town and town as descendant of the field 'friends' but unfortunately, this is not how the solr works--> 

      <field name="town" type="text_p" indexed="true" stored="true"/> 
      <field name="name_town" type="string" indexed="true" stored="true"/>  
      <field name="town" type="string" indexed="true" stored="true"/> 
    </fields> 

    <uniqueKey>_id</uniqueKey>
</schema>

Source

2013-10-04 user1949763

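Well, if you want to stick to your schema idea, I do not see a solution for your requirement. You will need join functionality, because you want to do something like nested entities. There is no other reliable way to query such a thing without ending up in update hell. – cheffe

Since Solr is document-centric, you need to flatten your data as far as you can. Based on the sample you provided, I would create a schema.xml like the one below.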

<?xml version="1.0" encoding="UTF-8" ?> 
<schema name="friends" version="1.0"> 

    <fields> 
     <field name="id" 
      type="int" indexed="true" stored="true" multiValued="false" /> 
     <field name="name" 
      type="text" indexed="true" stored="true" multiValued="false" /> 
     <field name="description" 
      type="text" indexed="true" stored="true" multiValued="false" /> 
     <field name="age" 
      type="int" indexed="true" stored="true" multiValued="false" /> 
     <field name="town" 
      type="text" indexed="true" stored="true" multiValued="false" /> 
     <field name="townRaw" 
      type="string" indexed="true" stored="true" multiValued="false" /> 
     <field name="country" 
      type="text" indexed="true" stored="true" multiValued="false" /> 
     <field name="countryRaw" 
      type="string" indexed="true" stored="true" multiValued="false" /> 
     <field name="friends" 
      type="int" indexed="true" stored="true" multiValued="true" /> 
    </fields> 
    <copyField source="country" dest="countryRaw" /> 
    <copyField source="town" dest="townRaw" /> 

    <types> 
     <fieldType name="string" class="solr.StrField" sortMissingLast="true"/> 
     <fieldType name="int" class="solr.TrieIntField" 
      precisionStep="0" positionIncrementGap="0" /> 
     <fieldType name="text" class="solr.TextField" 
      positionIncrementGap="100"> 
      <analyzer> 
       <tokenizer class="solr.StandardTokenizerFactory" /> 
       <filter class="solr.LowerCaseFilterFactory" /> 
      </analyzer> 
     </fieldType> 
    </types> 
</schema>

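With this approach I would model each person as a document of its own. The relation between two persons is modeled through the friends attribute, which translates to an array of IDs. So at index time you need to fetch the IDs of all friends of a person and put them into that field.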
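The index-time flattening described above could be sketched as follows. This is only an illustration under assumptions that are not in the original post: the function name `flatten_people` is hypothetical, integer IDs are handed out in order of first appearance, names are treated as unique, and friendship is stored in both directions so that a query such as `friends:1` finds the persons George lists as friends.

```python
from itertools import count

# Sample input shaped like the MongoDB document from the question.
GEORGE = {
    "name": "George", "description": "some text", "age": 35,
    "friends": [
        {"name": "Peter", "age": 30,
         "town": {"name_town": "Paris", "country": "France"}},
        {"name": "Thomas", "age": 25,
         "town": {"name_town": "Berlin", "country": "Germany"}},
    ],
}

def flatten_people(mongo_docs):
    """Return one flat Solr-style document per person (hypothetical helper)."""
    ids = {}          # name -> integer id (assumes names are unique)
    solr_docs = {}    # id -> flat document
    next_id = count(1)

    def doc_for(name):
        # Create the flat document the first time a person is seen.
        if name not in ids:
            ids[name] = next(next_id)
            solr_docs[ids[name]] = {"id": ids[name], "name": name, "friends": []}
        return solr_docs[ids[name]]

    def link(a, b):
        # Assumption: friendship is mutual, so both sides store the other's id.
        if b["id"] not in a["friends"]:
            a["friends"].append(b["id"])
        if a["id"] not in b["friends"]:
            b["friends"].append(a["id"])

    for person in mongo_docs:
        doc = doc_for(person["name"])
        doc["age"] = person["age"]
        doc["description"] = person.get("description", "")
        for friend in person.get("friends", []):
            friend_doc = doc_for(friend["name"])
            friend_doc["age"] = friend["age"]
            friend_doc["town"] = friend["town"]["name_town"]
            friend_doc["country"] = friend["town"]["country"]
            link(doc, friend_doc)
    return list(solr_docs.values())
```

Calling `flatten_people([GEORGE])` yields three flat documents (George, Peter, Thomas). The `townRaw` and `countryRaw` fields are not set here because the copyField rules in the schema fill them.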

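Most of the other fields are straightforward. The interesting ones are the two raw fields. Since you said you want to facet on country, you will need the country either unaltered or optimized for faceting. Usually the type of a field differs depending on its purpose (searching, faceting, auto-suggesting, and so on). In this case country and town are indexed in the raw fields exactly as they are given.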
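Why an unaltered copy matters for faceting can be shown with a rough simulation. The sketch below is not Solr code: `text_terms` is only a crude stand-in for the tokenizer plus lowercase filter of the `text` type, and the multi-word country values are made-up sample data chosen to make the effect visible.

```python
import re
from collections import Counter

def text_terms(value):
    # Crude approximation of StandardTokenizer + LowerCaseFilter.
    return [token.lower() for token in re.findall(r"\w+", value)]

def facet_counts(values, analyzed):
    """Count documents per indexed term, the way a field facet would."""
    counts = Counter()
    for value in values:
        # An analyzed field indexes each token; a string field indexes the whole value.
        terms = set(text_terms(value)) if analyzed else {value}
        counts.update(terms)
    return dict(counts)

countries = ["United Kingdom", "United States", "France"]  # hypothetical sample values
analyzed_facets = facet_counts(countries, analyzed=True)
raw_facets = facet_counts(countries, analyzed=False)
```

Faceting on the analyzed values gives buckets for single tokens such as `united`, while the raw values keep `United Kingdom` and `United States` as separate, readable buckets. That is the role `countryRaw` and `townRaw` play in the schema.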

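Now to your use case,

"Give me the countries where George's friends live"

This can then be done through faceting. You can

query for the ID of George
facet on countryRaw

Such a query would look like q=friends:1&rows=0&facet=true&facet.field=countryRaw&facet.mincount=1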
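Building that request from code might look like the sketch below. The parameters are the ones from the query above; the helper name and the base URL `http://localhost:8983/solr/select` are assumptions, since the actual host, port and core path depend on the installation.

```python
from urllib.parse import urlencode

def friends_country_facet_url(person_id,
                              base_url="http://localhost:8983/solr/select"):
    """Build the facet request for the countries of one person's friends."""
    params = [
        ("q", f"friends:{person_id}"),   # persons whose friends field holds this id
        ("rows", 0),                     # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", "countryRaw"),   # facet on the unanalyzed copy
        ("facet.mincount", 1),           # hide countries without hits
    ]
    # urlencode percent-escapes the colon in the q parameter.
    return f"{base_url}?{urlencode(params)}"

url = friends_country_facet_url(1)
```

The resulting URL can be sent with any HTTP client. The ID passed in must be the one assigned to George at index time.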

"Return all documents (persons) whose friends are 30 years old."

This one is harder. First of all you need Solr's join feature. You need to configure it in your solrconfig.xml.

<config> 
    <!-- loads of other stuff --> 
    <queryParser name="join" class="org.apache.solr.search.JoinQParserPlugin" /> 
    <!-- loads of other stuff --> 
</config>

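A join query based on that would look like this: q={!join from=id to=friends}age:[30 TO *]

This works as follows:

with age:[30 TO *] you search for all persons that are 30 years old or older
then you take the IDs of all these persons and join them on the friends attribute of all other persons
this returns all persons that have an ID matching the initial query in their friends attribute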
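The three steps above can be imitated on plain in-memory data to make the direction of the join easier to follow. This is a simulation of the semantics only, not how Solr implements the join; the function name is hypothetical and the sample documents assume the IDs 1, 2 and 3 for George, Peter and Thomas with mutual friendship.

```python
def join_from_id_to_friends(docs, matches_inner):
    """Simulate {!join from=id to=friends}<inner query> on a list of dicts."""
    # Step 1: run the inner query and collect the "from" field (id) of its hits.
    inner_ids = {doc["id"] for doc in docs if matches_inner(doc)}
    # Steps 2 and 3: keep every document whose "to" field (friends)
    # contains at least one of the collected ids.
    return [doc for doc in docs if inner_ids.intersection(doc.get("friends", []))]

people = [
    {"id": 1, "name": "George", "age": 35, "friends": [2, 3]},
    {"id": 2, "name": "Peter", "age": 30, "friends": [1]},
    {"id": 3, "name": "Thomas", "age": 25, "friends": [1]},
]

# Equivalent of age:[30 TO *] as the inner query.
has_friend_30_plus = join_from_id_to_friends(people, lambda d: d["age"] >= 30)
# A stricter inner query, comparable to age:[31 TO *].
has_friend_31_plus = join_from_id_to_friends(people, lambda d: d["age"] >= 31)
```

With the stricter inner query only George matches it, so the join returns Peter and Thomas, the two persons who have George's ID in their friends field.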

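Since I did not write this off the top of my head, you may want to have a look at my solrsample project on GitHub. I have added a test case there that deals with this question: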

https://github.com/chriseverty/solrsample/blob/master/src/main/java/de/cheffe/solrsample/FriendJoinTest.java

Source

2013-10-04 12:12:51 cheffe

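Cheffe, thank you for your precise answer to the question. But maybe I did not really emphasize that the schema should not be changed. Let's say the schema is given. Can you think of any possible way to access the specified data? – user1949763

user1949763, in that case I would need more of the schema.xml. Preferably the whole '' element, including the 'types'. – cheffe

I have added the 'schema.xml' definition to the original post. But the definitions are vague, because I cannot get past this limitation. – user1949763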
