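By default, neither the Lucene nor the Microsoft analyzer for Czech ignores diacritics. The easiest way to get what you want is to use the standardasciifolding.lucene analyzer. Alternatively, you can build a custom analyzer that adds the ASCII folding token filter to the standard analysis chain for Czech. For example: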
{
  "name": "example",
  "fields": [
    {
      "name": "id",
      "type": "Edm.String",
      "key": true
    },
    {
      "name": "text",
      "type": "Edm.String",
      "searchable": true,
      "retrievable": true,
      "analyzer": "my_czech_analyzer"
    }
  ],
  "analyzers": [
    {
      "name": "my_czech_analyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "standard",
      "tokenFilters": [
        "lowercase",
        "czech_stop_filter",
        "czech_stemmer",
        "asciifolding"
      ]
    }
  ],
  "tokenFilters": [
    {
      "name": "czech_stop_filter",
      "@odata.type": "#Microsoft.Azure.Search.StopTokenFilter",
      "stopwords_list": "_czech_"
    },
    {
      "name": "czech_stemmer",
      "@odata.type": "#Microsoft.Azure.Search.StemmerTokenFilter",
      "language": "czech"
    }
  ]
}
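For comparison, the simpler option mentioned at the top needs no custom analyzer at all. A minimal sketch of the field definition, assuming the same "text" field as in the index above and changing only the analyzer name:

```json
{
  "name": "text",
  "type": "Edm.String",
  "searchable": true,
  "retrievable": true,
  "analyzer": "standardasciifolding.lucene"
}
```

This is the general-purpose standard analyzer with ASCII folding, so it should ignore diacritics, but it does not apply the Czech stopword list or the Czech stemmer that the custom chain adds. Which trade-off fits better depends on your data.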
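We realize the current experience is not optimal. We are working to make customizations like this easier.
Let me know if this answers your question.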