Get Python string equivalence that works like a SQL match

I want to match two strings, Serhat Kılıç and serhat kilic. In SQL this is easy, since I can do:

select name from main_creditperson where name = 'serhat kılıç' 
union all 
select name from main_creditperson where name = 'serhat kilic'; 

=== 
name 
Serhat Kılıç 
Serhat Kılıç

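In other words, both names return the same result. How can I do a string comparison in Python that tells me these two names are "the same" in the SQL sense? I am looking for something like: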

if name1 == name2: 
    do_something()

I tried going the unicodedata.normalize('NFKD', input_str) route, but it didn't get me anywhere. How would I solve this?

2016-08-20 David542

Also, the behavior of the SQL query will be heavily implementation-dependent. –

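If you are fine with reducing everything to ASCII, you can check Where is Python's "best ASCII for this Unicode" database? Unidecode is quite good, but it is GPL-licensed, which may be a problem for some projects. In any case, it works for your example and for quite a few others, and it behaves the same in Python 2 and 3 (the examples below are from Python 3, which makes it easier to see what is going on):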

>>> from unidecode import unidecode 
>>> unidecode('serhat kılıç') 
'serhat kilic' 
>>> unidecode('serhat kilic') 
'serhat kilic' 
>>> # as a bonus it does much more, like 
>>> unidecode('北亰') 
'Bei Jing '
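If the GPL license rules Unidecode out, a rough standard-library-only alternative can be sketched as below. This is only an illustration, not what Unidecode does internally: the names `sql_like_key`, `sql_like_equal` and `_EXTRA_FOLDS` are made up for this example, and the small translation table is a hand-picked assumption that would need extending for other languages.

```python
import unicodedata

# Hypothetical, hand-picked table for letters that have no Unicode
# decomposition (so NFKD alone cannot reduce them to ASCII).
_EXTRA_FOLDS = str.maketrans({
    "ı": "i",  # LATIN SMALL LETTER DOTLESS I
    "ø": "o",
    "đ": "d",
    "ł": "l",
})


def sql_like_key(text):
    """Return a case- and accent-insensitive comparison key (sketch only)."""
    # Fold case first so 'Serhat' and 'serhat' compare equal.
    folded = text.casefold()
    # Split accented letters into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", folded)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Map the remaining special letters by hand.
    return stripped.translate(_EXTRA_FOLDS)


def sql_like_equal(a, b):
    """Compare two strings by their folded keys."""
    return sql_like_key(a) == sql_like_key(b)


print(sql_like_equal("Serhat Kılıç", "serhat kilic"))
```

Whether this matches what a given database considers equal depends on that database's collation, so treat it as an approximation.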

2016-08-20 05:44:05

Thanks, this is a great answer with all the links. – David542

I found this:

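from unidecode import unidecode
def compare_words(str_1, str_2):
    # Python 2: str_1 is a UTF-8 byte string; str_2 is expected to be plain ASCII already
    return unidecode(str_1.decode('utf-8')) == str_2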

Tested with Python 2.7:

In[2]: from unidecode import unidecode 
In[3]: def compare_words (str_1, str_2): 
    return unidecode(str_1.decode('utf-8')) == str_2 
In[4]: print compare_words('serhat kılıç', 'serhat kilic') 
True

2016-08-20 05:16:42

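I have already tried that approach. It doesn't work: >>> remove_accents(u'serhat kılıç') == remove_accents(u'serhat kilic') gives False. Please note that I am not looking to remove accents; the 'ı' character is not an accent. – David542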
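The point about 'ı' can be checked with the standard library. The `remove_accents` function is not shown in this thread, so the sketch below assumes it was the common "NFKD, then drop combining marks" recipe; `strip_combining_marks` is a hypothetical stand-in for it.

```python
import unicodedata


def strip_combining_marks(text):
    """Assumed stand-in for remove_accents: NFKD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for ch in ("ç", "ı"):
    # An empty decomposition string means NFKD leaves the character untouched.
    print(repr(ch), unicodedata.name(ch), repr(unicodedata.decomposition(ch)))

# 'ç' decomposes into 'c' plus a combining cedilla, so it is reduced to 'c',
# while the dotless 'ı' is a letter of its own and passes through unchanged.
result = strip_combining_marks("kılıç")
print(result)
```

Under that assumption the 'ç' is handled but the 'ı' survives, which would explain why the comparison still returns False.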

I have tried something new –
