2012-09-04 23 views
2

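Machine translation using babelize_shell() in NLTK

Hello, I am learning natural language processing with NLTK. I am trying to run the babelize_shell() example from the book. I call babelize_shell(), enter my string, then enter german as described in the book, and then enter run.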

The error I get is:

Traceback (most recent call last): 
    File "<pyshell#148>", line 1, in <module> 
    babelize_shell() 
    File "C:\Python27\lib\site-packages\nltk\misc\babelfish.py", line 175, in babelize_shell 
    for count, new_phrase in enumerate(babelize(phrase, 'english', language)): 
    File "C:\Python27\lib\site-packages\nltk\misc\babelfish.py", line 126, in babelize 
    phrase = translate(phrase, next, flip[next]) 
    File "C:\Python27\lib\site-packages\nltk\misc\babelfish.py", line 106, in translate 
    if not match: raise BabelfishChangedError("Can't recognize translated string.") 
BabelfishChangedError: Can't recognize translated string. 

Here is an example session:

>>> babelize_shell() 
NLTK Babelizer: type 'help' for a list of commands. 
Babel> how long before the next flight to Alice Springs? 
Babel> german 
Babel> run 
0> how long before the next flight to Alice Springs? 
1> wie lang vor dem folgenden Flug zu Alice Springs? 
2> how long before the following flight to Alice jump? 
3> wie lang vor dem folgenden Flug zu Alice springen Sie? 
4> how long before the following flight to Alice do you jump? 
5> wie lang, bevor der folgende Flug zu Alice tun, Sie springen? 
6> how long, before the following flight to Alice does, do you jump? 
7> wie lang bevor der folgende Flug zu Alice tut, tun Sie springen? 
8> how long before the following flight to Alice does, do you jump? 
9> wie lang, bevor der folgende Flug zu Alice tut, tun Sie springen? 
10> how long, before the following flight does to Alice, do do you jump? 
11> wie lang bevor der folgende Flug zu Alice tut, Sie tun Sprung? 
12> how long before the following flight does leap to Alice, does you? 
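The session above comes from repeatedly translating the phrase to German and back to English. A minimal Python 3 sketch of that round-trip loop follows. It is not NLTK's implementation: `babelize_sketch`, `toy_translate`, and `TOY_TABLE` are hypothetical names, and the stopping rule (stop once a phrase repeats, or after `limit` hops) is an assumption.

```python
def babelize_sketch(phrase, translate, source, target, limit=12):
    """Yield the phrases produced by translating back and forth.

    Stops once a phrase repeats or after `limit` translations.
    This stopping rule is an assumption, not NLTK's documented behaviour.
    """
    flip = {source: target, target: source}
    current_lang = source
    seen = {phrase}
    yield phrase
    for _ in range(limit):
        next_lang = flip[current_lang]
        phrase = translate(phrase, current_lang, next_lang)
        yield phrase
        if phrase in seen:
            break
        seen.add(phrase)
        current_lang = next_lang


# Toy stand-in for a real translation service (made-up data).
TOY_TABLE = {
    ("english", "german", "good morning"): "guten Morgen",
    ("german", "english", "guten Morgen"): "good morning",
}


def toy_translate(phrase, source, target):
    return TOY_TABLE[(source, target, phrase)]


for count, new_phrase in enumerate(
        babelize_sketch("good morning", toy_translate, "english", "german")):
    print("%d> %s" % (count, new_phrase))
```

With a real translator plugged in as `translate`, every hop is a network call, so a failure in any single call aborts the whole loop, as in the traceback above.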
+0

I don't have the book in front of me, so could you sketch the sample code in a few lines?

Answer

7
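I have the same problem right now.

I found this: http://nltk.googlecode.com/svn/trunk/doc/api/nltk.misc.babelfish-module.html

It says: BabelfishChangedError is thrown when babelfish.yahoo.com changes some detail of its HTML layout, so that the babelizer no longer submits data in the correct form or can no longer parse the results.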
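To illustrate why a layout change produces this kind of error, here is a small Python 3 sketch of a scraper that extracts a translation with a regular expression. The pattern, the page snippets, and the names `PageLayoutChangedError` and `extract_translation` are hypothetical; this is not the regex or the code NLTK actually used.

```python
import re


class PageLayoutChangedError(Exception):
    """Raised when the scraped page no longer matches the expected layout."""


# Hypothetical pattern for illustration only.
RESULT_PATTERN = re.compile(r'<div id="result">(.*?)</div>', re.DOTALL)


def extract_translation(html):
    """Return the translated text, or raise if the page layout is unknown."""
    match = RESULT_PATTERN.search(html)
    if not match:
        raise PageLayoutChangedError("Can't recognize translated string.")
    return match.group(1).strip()


# Made-up pages: the second one uses a different tag for the result.
old_page = '<html><div id="result"> wie lang </div></html>'
new_page = '<html><span class="translation">wie lang</span></html>'

print(extract_translation(old_page))
try:
    extract_translation(new_page)
except PageLayoutChangedError as err:
    print("scraper is out of date:", err)
```

The translation is still present in the second page, but the scraper cannot find it, so the only safe thing it can do is raise.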

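I am going to see if there is a way to fix this.

The solution I have come up with for now uses the Microsoft Translator web service (SOAP). It is not a simple solution, but it was fun to code.

I followed the instructions at http://msdn.microsoft.com/en-us/library/hh454950 and then modified the babelfish.py found in nltk/misc.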

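  1. Subscribe to the Microsoft Translator API on Azure Marketplace

Subscribe to the Microsoft Translator API on Azure Marketplace for use from babelfish.py; I chose the free subscription.

  2. Register your application on Azure DataMarket

To register your application with Azure DataMarket, visit datamarket.azure.com/developer/applications/ using the LiveID credentials from step 1, and click "Register". Write down your client ID and client secret for later use.

  3. Install suds for Python: fedorahosted.org/suds/

  4. Modify babelfish.py (use your own client_id and secret):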

// imports added

from suds.client import Client 
import httplib 
import ast 

... 

#added function 
def soaped_babelfish(TextToTranslate,codeLangFrom, codeLangTo): 

    #Oauth credentials 
    params = urllib.urlencode({'client_id': 'babelfish_soaped', 'client_secret': '1IkIG3j0ujiSMkTueCZ46iAY4fB1Nzr+rHBciHDCdxw=', 'scope': 'http://api.microsofttranslator.com', 'grant_type': 'client_credentials'}) 


    headers = {"Content-type": "application/x-www-form-urlencoded"} 
    conn = httplib.HTTPSConnection("datamarket.accesscontrol.windows.net") 
    conn.request("POST", "/v2/OAuth2-13/", params, headers) 
    response = conn.getresponse() 
    #print response.status, response.reason 

    data = response.read() 


    #obtain access_token 
    response_dict = ast.literal_eval(data) 
    access_token = response_dict['access_token'] 
    conn.close() 


    #use the webservice with the accesstoken 
    client = Client('http://api.microsofttranslator.com/V2/Soap.svc') 

    result = client.service.Translate('Bearer'+' '+access_token,TextToTranslate,codeLangFrom, codeLangTo, 'text/plain','general') 

    return result 

... 

#modified translate method 
def translate(phrase, source, target): 
    phrase = clean(phrase) 
    try:
        source_code = __languages[source]
        target_code = __languages[target]
    except KeyError as lang:
        raise ValueError("Language %s not available" % lang)

    return clean(soaped_babelfish(phrase,source_code,target_code)) 

And that is all for the SOAPed version! Some day I will try a web-based solution (similar to the current babelfish.py, but adapted to the changes).
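The token step inside soaped_babelfish() can be isolated and checked without a network connection. Below is a hedged Python 3 sketch (the code above is Python 2). The helper names `build_token_request_body`, `parse_access_token`, and `bearer_header` are hypothetical, the credentials and the sample response are made up, and only the `access_token` key is taken from the code above. It parses the response with `json.loads` rather than `ast.literal_eval`, since a JSON body containing `true`, `false`, or `null` would make `ast.literal_eval` fail.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_token_request_body(client_id, client_secret):
    """Form-encode the OAuth client-credentials parameters used above."""
    return urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "http://api.microsofttranslator.com",
        "grant_type": "client_credentials",
    })


def parse_access_token(response_body):
    """Extract the access token from a JSON response body."""
    return json.loads(response_body)["access_token"]


def bearer_header(access_token):
    """Build the value passed as the first argument of Translate()."""
    return "Bearer " + access_token


# Made-up credentials and a made-up sample response, for illustration only.
body = build_token_request_body("my_client_id", "abc+def=")
sample_response = '{"access_token": "TOKEN123", "expires_in": "600"}'

print(body)
print(bearer_header(parse_access_token(sample_response)))
```

Form-encoding matters here because client secrets often contain `+` and `=`, which must be percent-escaped in an application/x-www-form-urlencoded body.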