2011-08-27 75 views

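AppEngine bulkloader: uploading entities by setting key_name

I have spent at least two hours trying to get this to work. I have seen many different questions on SO and Google Groups, but none of the answers seem to apply to my case.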

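Q: How do I bulk upload a CSV file like the one below to the datastore so that it creates entities whose key_name is taken from the CSV file (the same result as using the add function below)?

Here is my model: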

class RegisteredDomain(db.Model): 
    """ 
    Domain object class. It has no fields because its existence is
    proof that it has been registered. Individual registered domains
    can be found using keys. 
    """ 
    pass 

Here is how I normally add and look up domains:

def add(domains):
    """
    Add domains. This function accepts a single domain string or a
    list of domain strings and adds them to the database. The domain(s)
    must be valid unicode strings (a ValueError is raised if the domain
    strings are not valid).
    """
    if not isinstance(domains, list):
        domains = [domains]

    cleaned_domains = []
    for domain in domains:
        clean_domain_ = clean_domain(domain)
        is_valid_domain(clean_domain_)  # raises ValueError on an invalid domain
        cleaned_domains.append(clean_domain_)

    domains = cleaned_domains

    db.put([RegisteredDomain(key_name=make_key(domain)) for domain in domains])


def get(domains): 
    """ 
    Get domains. This function accepts a single domain string or a list 
    of domain strings and queries the database for them. It returns a 
    dictionary containing the domain name and RegisteredDomain object or 
    None if the entity was not found. 
    """ 
    if not isinstance(domains, list): 
        domains = [domains]

    entities = db.get([Key.from_path('RegisteredDomain', make_key(domain)) for domain in domains]) 
    return dict(zip(domains, entities)) 

Note: in the code above, make_key simply lowercases the domain and prepends a 'd'.
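For reference, here is a minimal sketch of what such a helper might look like. The real make_key was not posted, so this is only a reconstruction from the description above (lowercase the domain, prepend 'd'):

```python
def make_key(domain):
    """Build a datastore key name from a domain (hypothetical reconstruction).

    Lowercases the domain and prepends 'd' so that the resulting key
    name never starts with a digit.
    """
    return 'd' + domain.lower()


print(make_key('Google.com'))
```

With this helper, 'Google.com' maps to the key name 'dgoogle.com', which matches the values in the CSV file.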

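That's it. Now I am going crazy trying to upload some RegisteredDomain entities from a CSV file. Here is the CSV file (note that the leading 'd' is there because key names may not start with a digit):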

key 
dgoogle.com 
dgoogle11.com 
dfacebook.com 
dcool.com 
duuuuuuu.com 
dsdsdsds.com 
dffffooo.com 
dgmail.com 

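I have not been able to auto-generate the bulkloader YAML file because App Engine still has not updated my datastore statistics (after one day plus a few hours). So this (and many similar permutations) is what I came up with, mostly varying the import_transform part: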

python_preamble: 
- import: google.appengine.ext.bulkload.transform 
- import: google.appengine.api.datastore 
- import: google.appengine.ext.db 
- import: utils 
- import: bulk_helper 

transformers: 
- kind: RegisteredDomain
  connector: csv
  connector_options:
    encoding: utf-8
  property_map:
    - property: __key__
      external_name: key
      export_transform: bulk_helper.key_to_reverse_str
      import_template: transform.create_foreign_key('RegisteredDomain')

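Now, for some reason, when I try to upload, the tool reports that everything went fine, that X entities were transferred, and so on, but nothing is updated in the datastore (as I can see from the admin console). This is how I upload: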

appcfg.py upload_data --application=domain-sandwich --kind=RegisteredDomain --config_file=bulk.yaml --url=http://domain-sandwich.appspot.com/remote_api --filename=data.csv 

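Finally, this is what my datastore viewer looks like: (screenshot: Datastore Viewer)

Note: I tried this on both the dev server and App Engine (neither works).

Thanks for your help!

Answer

The problem is a bug in the App Engine bulkloader (or in the datastore API). I filed several issues about it (issue 1, issue 2, issue 3, issue 4), but here is the text of the bulkloader bug report for future reference: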

VERSION: 
release: "1.5.2" 
timestamp: 1308730906 
api_versions: ['1'] 

The bulkloader will not import models that have no properties. For example:

class MetaObject(db.Model): 
    """ 
    Property-less object. Identified by application set key. 
    """ 
    pass 

In your application you can use such entities like this:

db.put([MetaObject(key_name=make_key(obj)) for obj in objs]) 
db.get([Key.from_path('MetaObject', make_key(obj)) for obj in objs]) 
db.delete([Key.from_path('MetaObject', make_key(obj)) for obj in objs]) 

The problem appears when I try to import data with the bulkloader. After looking through the bulkloader code, the bug turned out to be in the EncodeContent method (lines 1400-1406):

1365   def EncodeContent(self, rows, loader=None):
1366     """Encodes row data to the wire format.
1367
1368     Args:
1369       rows: A list of pairs of a line number and a list of column values.
1370       loader: Used for dependency injection.
1371
1372     Returns:
1373       A list of datastore.Entity instances.
1374
1375     Raises:
1376       ConfigurationError: if no loader is defined for self.kind
1377     """
1378     if not loader:
1379       try:
1380         loader = Loader.RegisteredLoader(self.kind)
1381       except KeyError:
1382         logger.error('No Loader defined for kind %s.' % self.kind)
1383         raise ConfigurationError('No Loader defined for kind %s.' % self.kind)
1384     entities = []
1385     for line_number, values in rows:
1386       key = loader.generate_key(line_number, values)
1387       if isinstance(key, datastore.Key):
1388         parent = key.parent()
1389         key = key.name()
1390       else:
1391         parent = None
1392       entity = loader.create_entity(values, key_name=key, parent=parent)
1393
1394       def ToEntity(entity):
1395         if isinstance(entity, db.Model):
1396           return entity._populate_entity()
1397         else:
1398           return entity
1399
1400       if not entity:
1401
1402         continue
1403       if isinstance(entity, list):
1404         entities.extend(map(ToEntity, entity))
1405       elif entity:
1406         entities.append(ToEntity(entity))
1407
1408     return entities

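The datastore Entity class (I will file an issue for this as well) is a subclass of dict and does not override __nonzero__, so its truth value is just dict's length check. An Entity that has a key but no properties therefore evaluates to False: "if not entity" is true even when a key has been set, and the entity is never appended to entities.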
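The truthiness problem can be reproduced without the SDK. The sketch below uses a hypothetical FakeEntity stand-in (a dict subclass that carries a key name, not the real datastore.Entity) and a simplified copy of the filtering step at the end of EncodeContent. It is only meant to illustrate why property-less entities are silently dropped.

```python
class FakeEntity(dict):
    """Hypothetical stand-in for datastore.Entity: a dict that also has a key."""

    def __init__(self, key_name, **properties):
        super(FakeEntity, self).__init__(**properties)
        self.key_name = key_name


def encode(entities_in):
    """Simplified version of the filtering done at the end of EncodeContent."""
    entities = []
    for entity in entities_in:
        if not entity:  # an empty dict is falsy, even though a key is set
            continue
        entities.append(entity)
    return entities


empty = FakeEntity('dgoogle.com')            # key only, no properties
full = FakeEntity('dgmail.com', owner='me')  # key plus one property

kept = encode([empty, full])
print([e.key_name for e in kept])  # only the entity with a property survives
```

The entity keyed 'dgoogle.com' never reaches the output list, which matches the symptom in the question: the upload reports success but nothing is stored.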

Here is a diff that fixes this, either in the bulkloader or by overriding __nonzero__ in Entity (either one works):

--- bulkloader.py  2011-08-27 18:21:36.000000000 +0200
+++ bulkloader_fixed.py 2011-08-27 18:22:48.000000000 +0200
@@ -1397,12 +1397,9 @@
         else:
           return entity

-      if not entity:
-
-        continue
       if isinstance(entity, list):
         entities.extend(map(ToEntity, entity))
-      elif entity:
+      else:
         entities.append(ToEntity(entity))

     return entities
--- datastore.py  2011-08-27 18:41:16.000000000 +0200
+++ datastore_fixed.py 2011-08-27 18:40:50.000000000 +0200
@@ -644,6 +644,12 @@

     self.__key = Key._FromPb(ref)

+  def __nonzero__(self):
+    if len(self):
+      return True
+    if self.__key:
+      return True
+
   def app(self):
     """Returns the name of the application that created this entity, a
     string or None if not set.
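The idea behind the datastore.py half of the diff can also be tried outside the SDK. The following is a sketch on a hypothetical KeyedEntity class (again a plain dict subclass, not the real Entity). Unlike the patch above, it returns a boolean on every path, since Python expects __nonzero__ (named __bool__ in Python 3) to return a bool or int; the patched method falls through without a return when both tests fail.

```python
class KeyedEntity(dict):
    """Hypothetical dict-based entity that is truthy when it has a key."""

    def __init__(self, key_name=None, **properties):
        super(KeyedEntity, self).__init__(**properties)
        self.key_name = key_name

    def __bool__(self):
        # Truthy if there is at least one property or a key has been set.
        return len(self) > 0 or self.key_name is not None

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


key_only = KeyedEntity('dgoogle.com')
nothing = KeyedEntity()

print(bool(key_only), bool(nothing))
```

With this override, a key-only entity passes an "if not entity" check, while an entity with neither key nor properties is still treated as empty.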

Bug reports filed:

Issue 1: http://code.google.com/p/googleappengine/issues/detail?id=5712

Issue 2: http://code.google.com/p/googleappengine/issues/detail?id=5713

Issue 3: http://code.google.com/p/googleappengine/issues/detail?id=5714

Issue 4: http://code.google.com/p/googleappengine/issues/detail?id=5715
