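RavenDb: searching for occurrences in text is slow

I want to find the number of occurrences of a word in a text. I have a class like this: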
public class Page
{
    public string Id { get; set; }
    public string BookId { get; set; }
    public string Content { get; set; }
    public int PageNumber { get; set; }
}
My index looks like this:
class Pages_SearchOccurrence : AbstractIndexCreationTask<Page, Pages_SearchOccurrence.ReduceResult>
{
    public class ReduceResult
    {
        public string PageId { get; set; }
        public int Count { get; set; }
        public string Word { get; set; }
        public string Content { get; set; }
    }

    public Pages_SearchOccurrence()
    {
        Map = pages => from page in pages
                       let words = page.Content
                           .ToLower()
                           .Split(new string[] { " ", "\n", ",", ";" }, StringSplitOptions.RemoveEmptyEntries)
                       from w in words
                       select new
                       {
                           page.Content,
                           PageId = page.Id,
                           Count = 1,
                           Word = w
                       };

        Reduce = results => from result in results
                            group result by new { PageId = result.PageId, result.Word } into g
                            select new
                            {
                                Content = g.First().Content,
                                PageId = g.Key.PageId,
                                Word = g.Key.Word,
                                Count = g.ToList().Count()
                            };

        Index(x => x.Content, Raven.Abstractions.Indexing.FieldIndexing.Analyzed);
    }
}
Finally, my query looks like this:
using (var session = documentStore.OpenSession())
{
    RavenQueryStatistics stats;
    var occurence = session.Query<Pages_SearchOccurrence.ReduceResult, Pages_SearchOccurrence>()
        .Statistics(out stats)
        .Where(x => x.Word == "works")
        .ToList();
}
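But I noticed that RavenDb is slow (or my query is bad): stats.IsStale = true, and Raven Studio takes too much time and returns only a few results. I have 1000 "Pages" documents, each with about 1000 words of content.

Why is my query bad, and how can I find the occurrences of a word in the pages? Thanks for your help!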
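
Two things in the index above look like plausible contributors, though this is a guess rather than a measured diagnosis. First, every map entry carries the full page.Content, so a 1000-word page emits roughly 1000 copies of its own text. Second, the reduce counts group items with g.ToList().Count(); since RavenDB may feed reduce output back into the reduce step, summing Count is the safer form. The in-memory sketch below (plain LINQ, no RavenDB; WordCount and OccurrenceSketch are hypothetical names made up for illustration) shows the leaner shape:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for one map/reduce entry (hypothetical type, not a RavenDB class).
public class WordCount
{
    public string PageId { get; set; }
    public string Word { get; set; }
    public int Count { get; set; }
}

public static class OccurrenceSketch
{
    // Same separators as the index in the question.
    static readonly string[] Separators = { " ", "\n", ",", ";" };

    // Map step: one small entry per word, deliberately without the page Content.
    public static IEnumerable<WordCount> Map(string pageId, string content)
    {
        return content.ToLower()
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => new WordCount { PageId = pageId, Word = w, Count = 1 });
    }

    // Reduce step: sums Count, so re-reducing already-reduced entries keeps the totals.
    public static List<WordCount> Reduce(IEnumerable<WordCount> entries)
    {
        return entries
            .GroupBy(e => new { e.PageId, e.Word })
            .Select(g => new WordCount
            {
                PageId = g.Key.PageId,
                Word = g.Key.Word,
                Count = g.Sum(e => e.Count)
            })
            .ToList();
    }
}
```

For the text "It works, it really works", Reduce(Map(...)) yields a Count of 2 for "works", and running Reduce again over that output still yields 2. Counting group items instead of summing would collapse it to 1 on the second pass. The same two changes (dropping Content from the map output and using g.Sum(x => x.Count)) should carry over to the real index, but I have not benchmarked them against your data.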
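Why don't you rely on Lucene to do this? It has the full-text indexing and query capabilities you need. Am I missing something? –
You may find this helpful: http://stackoverflow.com/questions/16774036/search-inside-an-attachment-in-ravendb – NoChance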
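
Following up on the Lucene suggestion in the comments, a minimal sketch of that route could look like the code below. It assumes the same RavenDB client generation as the question (the one that exposes Raven.Abstractions.Indexing.FieldIndexing) and reuses the Page class from the question. Pages_ByContent and PageSearch are hypothetical names. Note that this finds the pages containing a term; it does not return a per-page occurrence count by itself.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Abstractions.Indexing;
using Raven.Client;
using Raven.Client.Indexes;

// Map-only index: Content is marked Analyzed, so Lucene tokenizes it
// and no manual Split or reduce step is needed.
public class Pages_ByContent : AbstractIndexCreationTask<Page>
{
    public Pages_ByContent()
    {
        Map = pages => from page in pages
                       select new { page.Content };

        Index(x => x.Content, FieldIndexing.Analyzed);
    }
}

public static class PageSearch
{
    // Returns the pages whose analyzed Content matches the given term.
    public static List<Page> FindPagesContaining(IDocumentStore documentStore, string term)
    {
        using (var session = documentStore.OpenSession())
        {
            return session.Query<Page, Pages_ByContent>()
                .Search(x => x.Content, term)
                .ToList();
        }
    }
}
```

If the exact count per page is still needed, one option is to compute it client-side on the pages this returns, which keeps the index small at the cost of a little work per query.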