
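memcached-session-manager on AWS

I have a website running on Amazon Web Services, deployed with Elastic Beanstalk, on at least 2 EC2 micro instances. An auto scaling policy is in place, so the environment can scale up and down according to the site's traffic. Because of this auto scaling policy I want to avoid sticky sessions, so I use memcached-session-manager. I am using Amazon ElastiCache (a small instance) as the memcached server.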

The configuration in context.xml is as follows:

<Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager" 
    memcachedNodes="sessions.myinstancecode.0001.use1.cache.amazonaws.com:11211" 
    sticky="false" 
    sessionBackupAsync="false" 
    lockingMode="none" 
    transcoderFactoryClass="de.javakaffee.web.msm.serializer.kryo.KryoTranscoderFactory" /> 

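This works fine when traffic is low (i.e., fewer than 10 users online), but it sometimes causes the EC2 instances to restart. As you can imagine, if the website is running on two instances and both decide to restart at the same time, the site becomes unreachable, which is a big problem. These are the last lines of tail_catalina.log, as rotated to Amazon S3, before the EC2 instance decided to restart: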

Jun 13, 2012 12:32:27 AM de.javakaffee.web.msm.BackupSessionTask handleException 
WARNING: Could not store session 42F9761AC24F826E1FC3F2A834FBF442 in memcached. 
Note that this session was relocated to this node because the original node was not available. 
net.spy.memcached.internal.CheckedOperationTimeoutException: Timed out waiting for operation - failing node: sessions.myinstancecode.0001.use1.cache.amazonaws.com/10.194.23.99:11211 
    at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:73) 
    at de.javakaffee.web.msm.BackupSessionTask.storeSessionInMemcached(BackupSessionTask.java:230) 
    at de.javakaffee.web.msm.BackupSessionTask.doBackupSession(BackupSessionTask.java:195) 
    at de.javakaffee.web.msm.BackupSessionTask.call(BackupSessionTask.java:120) 
    at de.javakaffee.web.msm.BackupSessionTask.call(BackupSessionTask.java:51) 
    at de.javakaffee.web.msm.BackupSessionService$SynchronousExecutorService.submit(BackupSessionService.java:339) 
    at de.javakaffee.web.msm.BackupSessionService.backupSession(BackupSessionService.java:198) 
    at de.javakaffee.web.msm.MemcachedSessionService.backupSession(MemcachedSessionService.java:967) 
    at de.javakaffee.web.msm.SessionTrackerValve.backupSession(SessionTrackerValve.java:226) 
    at de.javakaffee.web.msm.SessionTrackerValve.invoke(SessionTrackerValve.java:128) 
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) 
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168) 
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98) 
    at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680) 
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928) 
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) 
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) 
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987) 
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539) 
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) 
    at java.lang.Thread.run(Thread.java:636) 
Jun 13, 2012 12:32:28 AM de.javakaffee.web.msm.LockingStrategy onAfterBackupSession 
WARNING: An error occurred during onAfterBackupSession. 
net.spy.memcached.internal.CheckedOperationTimeoutException: Timed out waiting for operation - failing node: sessions.myinstancecode.0001.use1.cache.amazonaws.com/10.194.23.99:11211 
    at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:73) 
    at de.javakaffee.web.msm.LockingStrategy.onAfterBackupSession(LockingStrategy.java:287) 
    at de.javakaffee.web.msm.MemcachedSessionService.backupSession(MemcachedSessionService.java:970) 
    at de.javakaffee.web.msm.SessionTrackerValve.backupSession(SessionTrackerValve.java:226) 
    at de.javakaffee.web.msm.SessionTrackerValve.invoke(SessionTrackerValve.java:128) 
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) 
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168) 
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98) 
    at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680) 
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928) 
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) 
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) 
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987) 
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539) 
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:298) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) 
    at java.lang.Thread.run(Thread.java:636) 

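It looks like the Amazon ElastiCache node is failing, but when I check Amazon CloudWatch I can see that its CPU utilization never exceeds 8%. Is there any reason the Amazon ElastiCache node would fail even though it is not under much load? Also, when the Amazon ElastiCache node fails, why does Amazon decide to restart the EC2 instance (or rather: terminate it and launch a new one)?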

Any help is much appreciated.

Thanks!

Answer


You should increase memcached-session-manager's sessionBackupTimeout. From the documentation:

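sessionBackupTimeout (optional, default 100)

The timeout in milliseconds after which a session backup is considered to have failed. This property is only evaluated if sessions are stored synchronously (set via sessionBackupAsync). The default value is 100 milliseconds.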
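
As a rough sketch of that change, the Manager element from the question could be extended with a sessionBackupTimeout attribute. The value of 1000 ms is only an illustrative assumption, not a figure from the documentation; all other attributes are copied unchanged from the question's context.xml. Since sessionBackupAsync is "false" there, the timeout should be evaluated.

```xml
<!-- Sketch only: sessionBackupTimeout="1000" is an assumed, illustrative value.
     Tune it to the latency you actually observe between EC2 and ElastiCache. -->
<Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
    memcachedNodes="sessions.myinstancecode.0001.use1.cache.amazonaws.com:11211"
    sticky="false"
    sessionBackupAsync="false"
    sessionBackupTimeout="1000"
    lockingMode="none"
    transcoderFactoryClass="de.javakaffee.web.msm.serializer.kryo.KryoTranscoderFactory" />
```

If the timeout warnings in catalina.log persist at the higher value, that would suggest the delay is not just occasional latency and that the connection to the ElastiCache node is worth investigating separately.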