2011-03-19 55 views
11

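Autocomplete with Hibernate Search

I'm trying to build better autocomplete functionality for my website. I'd like to use Hibernate Search for this, but as far as I've tested, it only finds whole words for me.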

So, my question: is it possible to search on just a few characters?

For example, the user types 3 letters and Hibernate Search shows them every word from my DB objects that contains those 3 letters.

PS. Right now I'm using a "like" query for this, but my DB is growing a lot and I'd also like to extend the search to another table.

Answers

6

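You can use an NGramFilter as suggested here. For best results you should use the EdgeNGramFilter from Apache Solr, which creates n-grams starting from the leading edge of a term and can also be used with Hibernate Search.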
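
To make the difference between the two filters concrete, here is a minimal plain-Java sketch of what each one would emit for a single token. It is an illustration only, not the Lucene/Solr implementation; the class and method names (`NGramSketch`, `nGrams`, `edgeNGrams`) and the sample token are made up for this example, and the real filters may emit the grams in a different order.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain-Java stand-ins for the output of an n-gram filter
// and an edge n-gram filter on one token. Names here are hypothetical.
final class NGramSketch {

    // All substrings of length minGram..maxGram, starting at every position.
    static List<String> nGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < token.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= token.length(); len++) {
                grams.add(token.substring(start, start + len));
            }
        }
        return grams;
    }

    // Only the prefixes of length minGram..maxGram (the leading edge).
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int len = minGram; len <= maxGram && len <= token.length(); len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(nGrams("acid", 2, 3));     // [ac, aci, ci, cid, id]
        System.out.println(edgeNGrams("acid", 2, 3)); // [ac, aci]
    }
}
```

The trade-off this suggests: edge n-grams keep the index smaller and match what the user has typed so far as a prefix, while plain n-grams also match letters in the middle of a word, which is what the question literally asks for.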

11

Major edit: A year on, I was able to improve the original code I posted, resulting in this:

My indexed entity:

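// Imports below assume Hibernate Search 3.x/4.x with the Solr analysis factories;
// later versions moved these factories under org.apache.lucene.analysis
import javax.persistence.Entity;
import org.apache.solr.analysis.LowerCaseFilterFactory;
import org.apache.solr.analysis.NGramFilterFactory;
import org.apache.solr.analysis.WhitespaceTokenizerFactory;
import org.hibernate.search.annotations.Analyzer;
import org.hibernate.search.annotations.AnalyzerDef;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Parameter;
import org.hibernate.search.annotations.TokenFilterDef;
import org.hibernate.search.annotations.TokenizerDef;

@Entity 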
@Indexed 
@AnalyzerDef(name = "myanalyzer", 
// Split input into tokens according to tokenizer 
tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), // 
filters = { // 
// Normalize token text to lowercase, as the user is unlikely to care about casing when searching for matches 
@TokenFilterDef(factory = LowerCaseFilterFactory.class), 
// Index partial words (n-grams taken from any position in the token), so we can provide Autocomplete functionality 
@TokenFilterDef(factory = NGramFilterFactory.class, params = { @Parameter(name = "maxGramSize", value = "1024") }), 
// Close filters & Analyzerdef 
}) 
@Analyzer(definition = "myanalyzer") 
public class Compound extends DomainObject { 
public static String[] getSearchFields(){...} 
... 
} 
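
As a rough mental model of what the "myanalyzer" chain puts into the index, the following plain-Java sketch mimics its three steps (whitespace tokenizing, lowercasing, n-gramming). It does not use Lucene or Hibernate Search; `AnalyzerSketch`, `analyze`, and the sample value "Acetic Acid" are assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustration only: mimics the three analysis steps in plain Java.
// Class, method and sample data are hypothetical.
final class AnalyzerSketch {

    static Set<String> analyze(String text, int minGram, int maxGram) {
        Set<String> terms = new LinkedHashSet<>();
        // 1. Whitespace tokenizer: split on runs of whitespace
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields one empty token; skip it
            }
            // 2. Lowercase filter
            String lower = token.toLowerCase(Locale.ROOT);
            // 3. N-gram filter: every substring of length minGram..maxGram
            for (int start = 0; start < lower.length(); start++) {
                for (int len = minGram; len <= maxGram && start + len <= lower.length(); len++) {
                    terms.add(lower.substring(start, start + len));
                }
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> indexed = analyze("Acetic Acid", 1, 1024);
        System.out.println(indexed.contains("cet")); // true: inside "acetic"
        System.out.println(indexed.contains("c a")); // false: spans two tokens
    }
}
```

Because grams never cross a token boundary, a typed fragment only matches when it sits inside one whitespace-separated word.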

All @Fields are tokenized and stored in the index; this is required for the approach to work:
@Field(index = Index.TOKENIZED, store = Store.YES)

@Transactional(readOnly = true) 
public synchronized List<String> getSuggestions(final String searchTerm) { 
    // Compose query for term over all fields in Compound 
    String lowerCasedSearchTerm = searchTerm.toLowerCase(); 

    // Create a fullTextSession for the sessionFactory.getCurrentSession() 
    FullTextSession fullTextSession = Search.getFullTextSession(getSession()); 

    // New DSL based query composition 
    SearchFactory searchFactory = fullTextSession.getSearchFactory(); 
    QueryBuilder buildQuery = searchFactory.buildQueryBuilder().forEntity(Compound.class).get(); 
    TermContext keyword = buildQuery.keyword(); 
    WildcardContext wildcard = keyword.wildcard(); 
    String[] searchfields = Compound.getSearchFields(); 
    TermMatchingContext onFields = wildcard.onField(searchfields[0]); 
    for (int i = 1; i < searchfields.length; i++) 
     onFields.andField(searchfields[i]); 
    TermTermination matching = onFields.matching(lowerCasedSearchTerm); 
    Query query = matching.createQuery(); 

    // Convert the Search Query into something that provides results: Specify Compound again to be future proof 
    FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(query, Compound.class); 
    fullTextQuery.setMaxResults(20); 

    // Projection does not work on collections or maps which are indexed via @IndexedEmbedded 
    List<String> projectedFields = new ArrayList<String>(); 
    projectedFields.add(ProjectionConstants.DOCUMENT); 
    List<String> embeddedFields = new ArrayList<String>(); 
    for (String fieldName : searchfields) 
     if (fieldName.contains(".")) 
      embeddedFields.add(fieldName); 
     else 
      projectedFields.add(fieldName); 

    @SuppressWarnings("unchecked") 
    List<Object[]> results = fullTextQuery.setProjection(projectedFields.toArray(new String[projectedFields.size()])).list(); 

    // Keep a list of suggestions retrieved by search over all fields 
    List<String> suggestions = new ArrayList<String>(); 
    for (Object[] projectedObjects : results) { 
     // Retrieve the search suggestions for the simple projected field values 
     for (int i = 1; i < projectedObjects.length; i++) { 
      String fieldValue = projectedObjects[i].toString(); 
      if (fieldValue.toLowerCase().contains(lowerCasedSearchTerm)) 
       suggestions.add(fieldValue); 
     } 

     // Extract the search suggestions for the embedded fields from the document 
     Document document = (Document) projectedObjects[0]; 
     for (String fieldName : embeddedFields) 
      for (Field field : document.getFields(fieldName)) 
       if (field.stringValue().toLowerCase().contains(lowerCasedSearchTerm)) 
        suggestions.add(field.stringValue()); 
    } 

    // Return the composed list of suggestions, which might be empty 
    return suggestions; 
} 

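What I do at the end to handle the @IndexedEmbedded fields is open to debate. If you don't have any such fields, you can simplify the code: just project the searchFields and drop the document and embeddedFields handling.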
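
A minimal sketch of that simplified post-processing step, written in plain Java so it can be run without an index. It assumes the projection contains only the search fields (no `ProjectionConstants.DOCUMENT` in slot 0), so the loop starts at index 0; `SuggestionFilterSketch`, `extractSuggestions`, and the sample rows are hypothetical, and the null check assumes a projected value can be missing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: filters projected field values down to suggestions.
// Names and sample data are hypothetical.
final class SuggestionFilterSketch {

    // rows: one Object[] per hit, holding only the projected field values
    static List<String> extractSuggestions(List<Object[]> rows, String searchTerm) {
        String needle = searchTerm.toLowerCase(Locale.ROOT);
        List<String> suggestions = new ArrayList<>();
        for (Object[] row : rows) {
            // No Document in slot 0 in this variant, so every slot is a field value
            for (Object value : row) {
                if (value == null) {
                    continue; // assumption: a projected value may be absent
                }
                String text = value.toString();
                if (text.toLowerCase(Locale.ROOT).contains(needle)) {
                    suggestions.add(text);
                }
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"Acetic acid", "C2H4O2"});
        rows.add(new Object[] {"Glucose", null});
        System.out.println(extractSuggestions(rows, "ACE")); // [Acetic acid]
    }
}
```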

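As before: I hope this is useful to the next person who runs into this problem. If anyone has criticism of, or improvements to, the code posted above, feel free to edit it, and please let me know.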


EDIT3: The project this code was taken from has since been open-sourced; here are the relevant classes:

https://trac.nbic.nl/metidb/browser/trunk/metidb/metidb-core/src/main/java/org/metidb/domain/Compound.java
https://trac.nbic.nl/metidb/browser/trunk/metidb/metidb-core/src/main/java/org/metidb/dao/CompoundDAOImpl.java
https://trac.nbic.nl/metidb/browser/trunk/metidb/metidb-search/src/main/java/org/metidb/search/text/Autocompleter.java

+0

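Another problem is that results only show up for single words: if I search for e.g. "sprain" there are results, but a phrase containing "sprain" gives none. Is there a way around this? – nanospeck 2013-05-08 05:40:31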

+0

TermTermination matching = onFields.matching(input.toLowerCase()); I also think 'input.toLowerCase()' should be 'lowerCasedSearchTerm'. – nanospeck 2013-05-08 05:46:02

+0

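@nanospeck I'm a bit hazy on this, since it has been a while since I worked on it, but regarding the phrase search for "sprain": you may need to change 'WhitespaceTokenizerFactory' to something else, because I think it breaks the input up into individual words. – Tim 2013-05-08 06:16:57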
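
A small plain-Java sketch of why a multi-word query can come back empty under whitespace tokenization: every indexed term is derived from a single word, so a query string that contains a space cannot be a substring of any of them. This is only a model of the behaviour discussed in the comments, not Hibernate Search code; `PhraseTokenSketch`, its methods, and the sample value "Ankle sprain" are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustration only: models matching against whitespace-separated tokens.
// Names and sample data are hypothetical.
final class PhraseTokenSketch {

    // Lowercase and split on whitespace, as a stand-in for the index side
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // True if the raw query is a substring of at least one single token,
    // which is all an n-gram of one token can ever be
    static boolean matchesSingleToken(String fieldValue, String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        for (String token : tokens(fieldValue)) {
            if (token.contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesSingleToken("Ankle sprain", "sprain"));       // true
        System.out.println(matchesSingleToken("Ankle sprain", "ankle sprain")); // false
    }
}
```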

2

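Tim's answer is brilliant and helped me get past the hard part. For me, though, it only worked for single-word queries. If anyone wants to make it work for phrase searches, just replace every 'Term' instance with the corresponding 'Phrase' class. Below are the replacement lines for Tim's code that did the trick for me.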

// New DSL based query composition 
      //org.hibernate.search.query.dsl 
      SearchFactory searchFactory = fullTextSession.getSearchFactory(); 
      QueryBuilder buildQuery = searchFactory.buildQueryBuilder().forEntity(MasterDiagnosis.class).get(); 
      PhraseContext keyword = buildQuery.phrase(); 
      keyword.withSlop(3); 
      //WildcardContext wildcard = keyword.wildcard(); 
      String[] searchfields = MasterDiagnosis.getSearchfields(); 
      PhraseMatchingContext onFields = keyword.onField(searchfields[0]); 
      for (int i = 1; i < searchfields.length; i++) 
       onFields.andField(searchfields[i]); 
      PhraseTermination matching = onFields.sentence(lowerCasedSearchTerm); 
      Query query = matching.createQuery(); 
// Convert the Search Query into something that provides results: Specify Compound again to be future proof