2012-03-18 50 views
1

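EMR + DynamoDB job flow setup throws Hive.createTable NoSuchMethodError JsonErrorResponseHandler

I am trying to set up an EMR job flow (using DynamoDB and Hive) with the boto Python API. I can run the Hive script manually from the Amazon EMR Console. However, when the job flow is launched through boto, it fails while creating the table.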

Here is the boto script that sets up the EMR job flow:

import boto 
from boto.emr.step import JarStep 

args1 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script', 
     u'--base-path', 
     u's3://us-east-1.elasticmapreduce/libs/hive/', 
     u'--install-hive', 
     u'--hive-versions', 
     u'0.7.1.3'] 
args2 = [u's3://us-east-1.elasticmapreduce/libs/hive/hive-script', 
     u'--base-path', 
     u's3://us-east-1.elasticmapreduce/libs/hive/', 
     u'--hive-versions', 
     u'0.7.1.3', 
     u'--run-hive-script', 
     u'--args', 
     u'-f', 
     u's3://foo/foobar/hiveexample.sql'] 
steps = [] 
for name, args in zip(('Setup Hive','Run Hive Script'),(args1,args2)): 
    step = JarStep(name, 
        's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar', 
        step_args=args 
        ) 
    steps.append(step) 

conn = boto.connect_emr() 
job_id = conn.run_jobflow('EpisodePlay', u's3://foo/foobar/logs/', 
          steps=steps, 
          master_instance_type='m1.small', 
          slave_instance_type='m1.small', 
          num_instances=5, 
          hadoop_version="0.20.205", 
          ami_version="2.0") 

However, the script fails with the following exception.

Hive history file=/mnt/var/lib/hive_07_1/tmp/history/hive_job_log_hadoop_201203161922_1801322338.txt 
java.lang.NoSuchMethodError: com.amazonaws.http.JsonErrorResponseHandler.<init>(Ljava/util/List;)V 
    at com.amazonaws.services.dynamodb.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:663) 
    at com.amazonaws.services.dynamodb.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:525) 
    at org.apache.hadoop.hive.dynamodb.DynamoDBClient$1.call(DynamoDBClient.java:73) 
    at org.apache.hadoop.hive.dynamodb.DynamoDBClient$1.call(DynamoDBClient.java:70) 
    at org.apache.hadoop.hive.dynamodb.DynamoDBFibonacciRetryer.runWithRetry(DynamoDBFibonacciRetryer.java:65) 
    at org.apache.hadoop.hive.dynamodb.DynamoDBClient.describeTable(DynamoDBClient.java:70) 
    at org.apache.hadoop.hive.dynamodb.DynamoDBSerDe.verifyDynamoDBWriteThroughput(DynamoDBSerDe.java:139) 
    at org.apache.hadoop.hive.dynamodb.DynamoDBSerDe.initialize(DynamoDBSerDe.java:52) 
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:199) 
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253) 
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:484) 
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:455) 
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3159) 
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:215) 
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130) 
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) 
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063) 
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900) 
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748) 
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:171) 
    at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:253) 
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:234) 
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:284) 
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:461) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
    at java.lang.reflect.Method.invoke(Method.java:597) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156) 
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.DDLTask 
Command exiting with ret '255' 

Answer

3

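I managed to solve this. I was not using the correct AMI version. When I launch the job flow from the console, it picks up the latest AMI version, which supports DynamoDB connectivity, but that is not the case when the boto script launches it.

Please refer to the following link: http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/EMRforDynamoDB_PreRequisites.html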

job_id = conn.run_jobflow('EpisodePlay', u's3://dfhivescript/episodePlay/logs/', 
          steps=steps, 
          master_instance_type='m1.small', 
          slave_instance_type='m1.small', 
          num_instances=5, 
          hadoop_version="0.20.205", 
          ami_version="2.0.4") # Correct AMI version 
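
To avoid launching a cluster with an AMI that is too old, the version string could be checked before calling run_jobflow. The following is only a minimal sketch: the helper names parse_version and ami_at_least are hypothetical (they are not part of boto), and the default minimum of "2.0.4" is simply the version that worked in this answer, not a documented lower bound. It handles purely numeric dotted versions only.

```python
def parse_version(version):
    """Turn a dotted version string such as "2.0.4" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def ami_at_least(ami_version, minimum="2.0.4"):
    """Return True if ami_version is not older than minimum.

    The default minimum is an assumption taken from this answer.
    """
    have = parse_version(ami_version)
    need = parse_version(minimum)
    # Pad the shorter tuple with zeros so "2.0" compares as (2, 0, 0).
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

With this in place, a script could refuse to start the job flow when ami_at_least(ami_version) is false, instead of failing later inside the Hive step.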

+1 for your solution, thanks for following up! – 2012-03-19 16:55:32

+1, because it worked for me too! – Adauto 2012-12-17 03:07:58