What database design can be used for linkable text?

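If a text mentions a person, the text needs to link to that person entity; likewise, if the text mentions a country, it needs to link to that country entity. What database design can be used for linkable text like this?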

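The only thing I can think of is to remove the raw text from the database entirely and store some kind of parsed facts instead, for example a database column containing something like "[PersonEntityID6] is [CountryEntityID6]".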


2016-03-14 0x4f3759df

That sounds like a fairly big job (how many actual names can a person have? First + Last, First + Middle + Last, or more?). Wouldn't a full-text index be a simpler approach?

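Is your question mainly about the database design for storing these relationships, or about finding which texts contain which names and countries? Either way, both can be solved.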

First, have a schema like:

person(id, name), with an index on name, 
country(id, name), with an index on name 
text(id, title, full text only if needed) 
person_in_text(id, person_id, text_id, position in text if needed) 
country_in_text, similar
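
To make the outline above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite, the column name body, the index names, and the create_schema helper are my own assumptions for illustration; the answer only names the tables and says which columns need an index.

```python
import sqlite3

# Assumed DDL for the tables outlined above; column types are illustrative.
SCHEMA = """
CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE INDEX person_name_idx ON person(name);

CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE INDEX country_name_idx ON country(name);

-- "text" is quoted because it is also a type name
CREATE TABLE "text" (id INTEGER PRIMARY KEY, title TEXT, body TEXT);

-- link tables: one row per mention, position is optional
CREATE TABLE person_in_text (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    text_id   INTEGER NOT NULL REFERENCES "text"(id),
    position  INTEGER
);
CREATE TABLE country_in_text (
    id         INTEGER PRIMARY KEY,
    country_id INTEGER NOT NULL REFERENCES country(id),
    text_id    INTEGER NOT NULL REFERENCES "text"(id),
    position   INTEGER
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create all five tables and the two name indexes."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

Keeping the mentions in separate link tables means the original text can stay untouched, and the same person or country can be linked from any number of texts.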

To analyze a text against the database:

for each word in the text
    select id, name from person where name like 'word%'
    for each person found
        if substring of text starting at current position equals name
            insert text_id, person_id into person_in_text
... same for country
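
The loop above could be sketched roughly as follows in Python with sqlite3. The function names link_people and like_prefix, the sample names, and the cut-down two-table setup are hypothetical. I also added a check that the name ends at a word boundary, which the pseudocode does not have, so that "Ann" is not reported inside "Anna".

```python
import re
import sqlite3


def like_prefix(word: str) -> str:
    """Build a LIKE pattern that matches names starting with word."""
    # \w can match "_", which is a LIKE wildcard, so escape it
    return word.replace("_", "\\_") + "%"


def link_people(conn: sqlite3.Connection, text_id: int, body: str):
    """Insert one person_in_text row per person name found in body."""
    found = []
    for match in re.finditer(r"\w+", body):
        pos = match.start()
        rows = conn.execute(
            "SELECT id, name FROM person WHERE name LIKE ? ESCAPE '\\'",
            (like_prefix(match.group()),),
        ).fetchall()
        for person_id, name in rows:
            end = pos + len(name)
            # full name must appear here and end at a word boundary
            ends_cleanly = end == len(body) or not body[end].isalnum()
            if body.startswith(name, pos) and ends_cleanly:
                conn.execute(
                    "INSERT INTO person_in_text (person_id, text_id, position)"
                    " VALUES (?, ?, ?)",
                    (person_id, text_id, pos),
                )
                found.append((person_id, pos))
    return found


# Minimal setup so the sketch runs on its own
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE INDEX person_name_idx ON person(name);
CREATE TABLE person_in_text (
    id INTEGER PRIMARY KEY, person_id INTEGER, text_id INTEGER, position INTEGER
);
""")
conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Ada Lovelace"), (2, "Alan Turing")],
)
hits = link_people(conn, 1, "Ada Lovelace wrote to Alan Turing.")
```

Note that this issues one query per word, so the cost grows with the length of the text; the country pass would be the same function pointed at the country tables.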

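Depending on the length of the text and on the number of persons and countries, it may be better to load every person and run a substring search for each name in the text; the same goes for countries.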
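
A rough sketch of that alternative, assuming the names have already been loaded from the database into a dict keyed by id. The function name find_mentions and the sample data are hypothetical.

```python
def find_mentions(body: str, names_by_id: dict) -> list:
    """Return (entity_id, position) for every occurrence of every name."""
    mentions = []
    for entity_id, name in names_by_id.items():
        if not name:
            continue  # an empty name would match everywhere
        start = body.find(name)
        while start != -1:
            mentions.append((entity_id, start))
            # keep searching after this hit to catch repeated mentions
            start = body.find(name, start + 1)
    # order by position in the text, then by id for stable output
    return sorted(mentions, key=lambda m: (m[1], m[0]))


# Sample data; in practice these would come from the person and country tables
people = {1: "Ada Lovelace", 2: "Alan Turing"}
countries = {10: "England"}
body = "Ada Lovelace was born in England; so was Alan Turing."

person_hits = find_mentions(body, people)
country_hits = find_mentions(body, countries)
```

This does one pass over the text per name instead of one query per word, so it tends to pay off when the texts are long and the name lists are comparatively short.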


2016-03-15 10:05:53 TAM
