What is the best way to do App Engine model memcaching?

Currently my application caches models in memcache like this:

memcache.set("somekey", aModel)

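But Nick's latest post at http://blog.notdot.net/2009/9/Efficient-model-memcaching suggests that converting the model to protobuffers first is a lot more efficient. After running some tests, I found that the result is indeed smaller, but it is actually slower (~10%).

Do others have the same experience, or am I doing something wrong?

Test results: http://1.latest.sofatest.appspot.com/?times=1000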

import pickle
import time
import uuid

from google.appengine.ext import webapp
from google.appengine.ext import db
from google.appengine.ext.webapp import util
from google.appengine.datastore import entity_pb
from google.appengine.api import memcache

class Person(db.Model):
    name = db.StringProperty()

times = 10000

class MainHandler(webapp.RequestHandler):

    def get(self):

        self.response.headers['Content-Type'] = 'text/plain'

        m = Person(name='Koen Bok')

        # Round trip 1: let memcache pickle the model instance itself.
        t1 = time.time()

        for i in xrange(int(self.request.get('times', 1))):
            key = uuid.uuid4().hex
            memcache.set(key, m)
            r = memcache.get(key)

        self.response.out.write('Pickle took: %.2f' % (time.time() - t1))

        # Round trip 2: encode the model as a protobuf string first.
        t1 = time.time()

        for i in xrange(int(self.request.get('times', 1))):
            key = uuid.uuid4().hex
            memcache.set(key, db.model_to_protobuf(m).Encode())
            r = db.model_from_protobuf(entity_pb.EntityProto(memcache.get(key)))

        self.response.out.write('Proto took: %.2f' % (time.time() - t1))


def main():
    application = webapp.WSGIApplication([('/', MainHandler)], debug=True)
    util.run_wsgi_app(application)


if __name__ == '__main__':
    main()

Source

2010-02-19 Koen Bok

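I just tried it with really large and complex models, but the results are roughly the same. – 2010-02-19 21:36:34

Maybe http://docs.python.org/library/timeit.html is available on GAE? That should show more accurate results. Still, after reading the blog entry you linked to, I would expect an order of magnitude difference between the performance of protobuffers and pickle, and time.time() should catch that anyway. – 2010-02-21 23:34:36

I use Java App Engine, so I am too lazy to test this theory: is pickle() caching its results somewhere behind the scenes, while to_protobuf is not? Based on the article, I am not sure I would expect a full order of magnitude increase in speed, since pickle is still being called even with the protobuf version. The space used should certainly be considerably smaller, though. – 2010-02-22 02:45:28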
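As a rough illustration of the timeit suggestion in the comments above, here is a minimal sketch that times two serialization round trips using only the standard library. The sample record, the helper names (`best_time`, `pickle_round_trip`, `literal_round_trip`) and the repeat counts are assumptions made for illustration. It does not touch the App Engine SDK, so the numbers say nothing about protobuf encoding itself; it only shows the measuring pattern.

```python
import ast
import pickle
import timeit


def pickle_round_trip(record):
    # Serialize and immediately deserialize with pickle.
    return pickle.loads(pickle.dumps(record, pickle.HIGHEST_PROTOCOL))


def literal_round_trip(record):
    # Serialize with repr and rebuild with ast.literal_eval
    # (only works for plain literals such as dicts, lists, strings, numbers).
    return ast.literal_eval(repr(record))


def best_time(func, arg, number=1000, repeat=3):
    # Each trial calls func(arg) `number` times; taking the minimum over
    # the trials reduces noise from other activity on the machine.
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == '__main__':
    sample = {'name': 'Koen Bok', 'groups': [], 'age': 30}
    print('pickle:  %.4f s' % best_time(pickle_round_trip, sample))
    print('literal: %.4f s' % best_time(literal_round_trip, sample))
```

Swapping the two round-trip functions for the `memcache.set`/`memcache.get` loops from the question would give a comparison less sensitive to one-off delays than a single `time.time()` difference.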

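The memcache call still pickles the object, with or without protobuf. Pickling is faster on a protobuf object because it has a very simple model.

Plain pickled objects are larger than protobuf+pickle objects, so the protobuf version saves time on the memcache transfer, but more processor time goes into doing the protobuf conversion.

Therefore, in general, either method works out about the same... but

The reason you should use protobuf is that it can handle changes between versions of the models, whereas pickle will raise an error. This problem will bite you one day, so it is best to handle it sooner.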

Source

2010-02-28 22:00:10 TFD

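While a few good points are made, not everything stated is true. If you look at the code, the memcache API only pickles non-strings. So a list of protobuffed models will get pickled, but a single model will not. The output of protobufs is indeed simpler and smaller, but my tests imply that it is not less CPU intensive, hence the original question. The model version point is valid, but not too important for me, since you should have a way to deal with invalid cache results anyway, and it will not occur very often. – 2010-03-02 21:01:19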
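To make the point about strings concrete, here is a simplified illustration of a client that stores string payloads untouched and pickles everything else. This is not the SDK's code: `encode_value`, `decode_value` and the two flag constants are hypothetical names, and `bytes` stands in for the Python 2 `str` that `Encode()` returns.

```python
import pickle

# Hypothetical flag values, not the SDK's actual constants.
FLAG_RAW = 0
FLAG_PICKLED = 1


def encode_value(value):
    """Return (flags, payload): raw for byte strings, pickled otherwise."""
    if isinstance(value, bytes):
        return FLAG_RAW, value
    return FLAG_PICKLED, pickle.dumps(value, pickle.HIGHEST_PROTOCOL)


def decode_value(flags, payload):
    """Reverse encode_value using the stored flags."""
    if flags == FLAG_PICKLED:
        return pickle.loads(payload)
    return payload
```

Under this sketch a single encoded protobuf string skips pickle entirely, while a list of such strings is pickled as a whole.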

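On App Engine, both pickle and protobufs are slow because they are implemented in pure Python. I have found that writing my own simple serialization code with methods like str.join tends to be faster, since most of the work is done in C. But that only works for simple data types.

Source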
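A minimal sketch of what such hand-written serialization might look like for a flat list of text fields. The separator choice and the function names are assumptions for illustration, not the answerer's actual code.

```python
SEPARATOR = u'\x1f'  # ASCII unit separator; assumed never to occur in values


def serialize_fields(values):
    """Join a flat sequence of text values into one string."""
    for value in values:
        if SEPARATOR in value:
            raise ValueError('value contains the separator: %r' % (value,))
    return SEPARATOR.join(values)


def deserialize_fields(data):
    """Split the string produced by serialize_fields back into a list.

    Limitation: an empty list and a list holding one empty string both
    serialize to '', and both come back as [''].
    """
    return data.split(SEPARATOR)
```

This only covers strings. Numbers, dates or nested values would each need their own conversion, which is where the approach stops being simple.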

2010-03-14 22:58:43

Did you do this for model objects too? I would be curious to see your implementation. – 2010-03-15 11:00:28

I used to do this, but Python 2.7 gave us cPickle, which is faster now. – FoxyLad 2012-08-16 00:23:07

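One way to do it faster is to turn your model into a dictionary and use the native eval/repr functions as your (de)serializers. Be careful, of course, as always with the evil eval, but it should be safe here because there is no external step.

Below is an example of a class Fake_entity that implements exactly that. You first create your dictionary through fake = Fake_entity(entity), then you can simply store your data via memcache.set(key, fake.serialize()). serialize() is a simple call to the native dictionary repr, with some additions if you need them (e.g. an identifier at the beginning of the string).

To fetch it back, simply use fake = Fake_entity(memcache.get(key)). The Fake_entity object is a simple dictionary whose keys are also accessible as attributes. You can access your entity properties normally, except that ReferenceProperties give keys instead of fetching the objects (which is actually quite useful). You can also get() the actual entity with fake.get() or, more interestingly, change it and then save it with fake.put().

It does not work with lists (if you fetch multiple entities from a query), but it could easily be adjusted with join/split functions that use an identifier like '### FAKE MODEL ENTITY ###' as the separator. Use it with db.Model only; it would need small adjustments for Expando.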

class Fake_entity(dict):
    def __init__(self, record):
        # simple case: a string, we eval it to rebuild our fake entity
        if isinstance(record, basestring):
            import datetime  # <----- put all relevant eval imports here
            from google.appengine.api import datastore_types
            self.update(eval(record))  # careful with external sources, eval is evil
            return

        # serious case: we build the instance from the actual entity
        for prop_name, prop_ref in record.__class__.properties().items():
            self[prop_name] = prop_ref.get_value_for_datastore(record)  # to avoid fetching entities
        self['_cls'] = record.__class__.__module__ + '.' + record.__class__.__name__
        try:
            self['key'] = str(record.key())
        except Exception:  # the key may not exist if the entity has not been stored
            pass

    def __getattr__(self, k):
        # raise AttributeError (not KeyError) so hasattr() and copy behave as expected
        try:
            return self[k]
        except KeyError:
            raise AttributeError(k)

    def __setattr__(self, k, v):
        self[k] = v

    def key(self):
        from google.appengine.ext import db
        return db.Key(self['key'])

    def get(self):
        from google.appengine.ext import db
        return db.get(self['key'])

    def put(self):
        _cls = self.pop('_cls')  # gets and removes the class name from the passed arguments
        # import xxxxxxx ---> put your model imports here if necessary
        Cls = eval(_cls)  # make sure that your models declarations are in the scope here
        real_entity = Cls(**self)  # creates the entity
        real_entity.put()  # self explanatory
        self['_cls'] = _cls  # puts back the class name afterwards
        return real_entity

    def serialize(self):
        # or simply repr, but I use the initial identifier to test and eval directly when getting from memcache
        return '### FAKE MODEL ENTITY ###\n' + repr(self)

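I would welcome speed tests on this; I would assume it is quite a bit faster than the other approaches. Plus, you run no risk if your models have changed in some way in the meantime.

Below is an example of what the serialized fake entity looks like. Take a particular look at the datetime (created) as well as the reference properties (subdomain):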

### FAKE MODEL ENTITY ###
{'status': u'admin', 'session_expiry': None, 'first_name': u'Louis', 'last_name': u'Le Sieur', 'modified_by': None, 'password_hash': u'a9993e364706816aba3e25717000000000000000', 'language': u'fr', 'created': datetime.datetime(...), 'modified': None, 'created_by': None, 'email': u'[email protected]', 'key': 'agdqZXJlZ2xlcgwLEgVMb2dpbhjmAQw', 'session_ref': None, '_cls': 'models.Login', 'groups': [], 'email___password_hash': u'[email protected]+a9993e364706816aba3e25717000000000000000', 'subdomain': datastore_types.Key.from_path(u'Subdomain', 229L, _app=u'jeregle'), 'allowed': [], 'permissions': []}
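The join/split adjustment for lists mentioned earlier could be sketched roughly as follows. This is an assumption-laden illustration, not part of the original class: it works on plain dicts rather than Fake_entity instances, `serialize_many` and `deserialize_many` are hypothetical names, and the eval namespace only knows about `datetime` (a real version would also need `datastore_types` in scope).

```python
import datetime

SEPARATOR = '\n### FAKE MODEL ENTITY ###\n'

# Names that eval may resolve while rebuilding a record (assumed set).
EVAL_NAMESPACE = {'datetime': datetime}


def serialize_many(records):
    """Join the repr of each dict-like record with the separator."""
    return SEPARATOR.join(repr(dict(record)) for record in records)


def deserialize_many(data):
    """Split on the separator and eval each chunk back into a dict."""
    if not data:
        return []
    # eval is only acceptable here because the data was written by
    # serialize_many; never feed it strings from an external source.
    return [eval(chunk, EVAL_NAMESPACE) for chunk in data.split(SEPARATOR)]
```

Because repr escapes newlines inside strings, a separator that contains real newlines cannot appear inside a chunk made of plain strings, numbers and datetimes.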

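Personally, I also use static variables (faster than memcache) to cache my entities in the short term, and I fetch from the datastore when the server has changed or its memory has been flushed for some reason (which in fact happens pretty often).

Source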
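A minimal sketch of that kind of in-process cache, assuming a module-level dict and a short expiry. The names `_local_cache`, `LOCAL_TTL_SECONDS` and `cached_get`, and the `fetch` callable that stands in for the memcache or datastore lookup, are all hypothetical.

```python
import time

# Lives only as long as this server instance keeps the module loaded.
_local_cache = {}
LOCAL_TTL_SECONDS = 60  # hypothetical value


def cached_get(key, fetch, now=time.time):
    """Return the value for key, calling fetch(key) on a local miss."""
    entry = _local_cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if now() < expires_at:
            return value
    # Missing or expired: go to the slower store and remember the result.
    value = fetch(key)
    _local_cache[key] = (value, now() + LOCAL_TTL_SECONDS)
    return value
```

Each instance holds its own copy, so a short expiry limits how long different instances can disagree about an entity.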

2010-07-20 18:43:26
