Importing a user-defined library into a Redshift UDF

To import a library inside my user-defined Python function, I created a library called nltk as follows:

CREATE OR REPLACE LIBRARY nltk LANGUAGE plpythonu FROM 's3://nltk.zip' CREDENTIALS 'aws_access_key_id=*****;aws_secret_access_key=****';

Once it was created, I tried to import it in a function:

CREATE OR REPLACE FUNCTION f_function (sentence varchar) 
    RETURNS VARCHAR STABLE AS $$ 
    from nltk import tokenize 
    token = nltk.word_tokenize(sentence) 
    return token $$ LANGUAGE plpythonu;

tokenize is a subdirectory of the NLTK library.

But when I try to run it by calling it on a table like this:

SELECT f_function(text) from table_txt;

I get an error when the function runs:

Amazon Invalid operation: ImportError: No module named nltk. Please look at svl_udf_log for more information
Details:
-----------------------------------------------
error: ImportError: No module named nltk. Please look at svl_udf_log for more information
code: 10000
context: UDF
query: 69145
location: udf_client.cpp:298
process: query0_21 [pid=3165]

Can anyone tell me what I am doing wrong?

Source

2016-03-07 sachin katarki

Did you manage to get nltk working in Redshift?

First, your Python code has an obvious problem: you never import nltk itself, yet you call nltk.word_tokenize.
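To illustrate that first point, here is a minimal sketch. It uses the standard-library `shlex` module as a stand-in for nltk (an assumption, made only so the snippet runs without nltk installed), and the function names `broken` and `fixed` are hypothetical. `from X import Y` binds only the name `Y`, so a later reference to `X` fails. Joining the tokens into a string is a further assumption on my part: the function is declared RETURNS VARCHAR, while a tokenizer returns a list.

```python
# Stand-in demo: shlex plays the role of nltk so this runs anywhere.

def broken(sentence):
    from shlex import quote        # binds only the name `quote`
    return shlex.split(sentence)   # `shlex` itself was never bound -> NameError

def fixed(sentence):
    import shlex                   # binds the name `shlex`
    tokens = shlex.split(sentence) # tokenizer returns a list of strings
    return " ".join(tokens)        # assumption: return a string for a VARCHAR UDF

try:
    broken("the quick fox")
    broken_error = None
except NameError as exc:
    broken_error = exc

fixed_result = fixed("the quick fox")
print(type(broken_error).__name__, "|", fixed_result)
```

In the UDF from the question, the equivalent change would be either `import nltk` before calling `nltk.word_tokenize`, or `from nltk import word_tokenize` and then calling `word_tokenize(sentence)` directly.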

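Second, after downloading the nltk package, you need to zip the module folder inside the package and upload that zip file to S3 so Redshift can load it.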

nltk-X.Y.zip
├─ setup.py
├─ requirements.txt
├─ nltk   <- This is the folder that should be zipped and uploaded to S3
   ├─ __init__.py
   ├─ tokenize.py
   ...

Redshift can only load modules: your root folder should have an __init__.py file. http://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html
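The zipping step can be sketched in Python as below. This is only one reading of the layout above: it assumes the archive should contain the module folder at its top level (for example `nltk/__init__.py`). The helper name `zip_module_folder` and the temporary stand-in files are hypothetical, so check the result against the Redshift documentation before uploading.

```python
import os
import tempfile
import zipfile

def zip_module_folder(module_dir, zip_path):
    """Zip module_dir so the archive's top-level entry is the module folder itself."""
    module_dir = os.path.abspath(module_dir)
    parent = os.path.dirname(module_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(module_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent, e.g. "nltk/__init__.py"
                zf.write(full, os.path.relpath(full, parent))
    return zip_path

# Demo on an empty stand-in folder (not the real nltk sources)
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "nltk-X.Y", "nltk")
    os.makedirs(pkg)
    for name in ("__init__.py", "tokenize.py"):
        open(os.path.join(pkg, name), "w").close()
    archive = zip_module_folder(pkg, os.path.join(tmp, "nltk.zip"))
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(names)
```

Listing the archive before uploading is a cheap way to confirm that `__init__.py` sits where you expect it.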

Source

2016-03-14 20:31:27 Lilley

I was still struggling with the instructions above, so when I finally figured it out I thought I would pass along how I got it working.

First, create the library:

create or replace library stem 
language plpythonu 
from 's3://[Your Bucket Here]/stem.zip' 
credentials 'aws_access_key_id=[aws key];aws_secret_access_key=[aws secret key]';

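Here is the zipped NLTK stem library that I edited (I pulled in compat to make it self-contained) and then uploaded to S3: https://drive.google.com/file/d/0BzNI6AJdNrJCVThoSXVHY1NyUGM/view?usp=sharing To use it, I had to edit the library's __init__.py to reference the Redshift UDF library created above ("stem").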

Then I created my Python UDF in Redshift:

create or replace function f_lancaster_stem (text varchar) 
returns varchar 
immutable as $$ 
    from stem import LancasterStemmer 
    st = LancasterStemmer() 
    return st.stem(text) 
$$ LANGUAGE plpythonu;

Then just call the UDF!

select f_lancaster_stem('resting') from dual;

Source

2016-05-13 05:03:47
