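MRJob: socket.error: [Errno 104] Connection reset by peer

In short: I get a "socket.error: [Errno 104] Connection reset by peer" exception when using MRJob. The script can actually reach S3, because it does create the bucket and upload some small files (I checked manually in the AWS console). But the biggest file, the input, does not get uploaded. Heh, it's only 7GB of test data!
I have already tried 4 times and get the error every time.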
mrjob == 0.4.2
CONFIG
# cat /etc/mrjob.conf
runners:
  inline:
    base_tmp_dir: /home/tmp
  emr:
    base_tmp_dir: /home/tmp
    aws_access_key_id: [VALID KEY HERE]
    aws_secret_access_key: [VALID SECRET HERE]
    aws_region: us-east-1
    ec2_instance_type: m1.medium
    num_ec2_instances: 7
TRACEBACK
# python /home/bigdata/mr_job_1.py -r emr /home/filesystem/INPUT > /home/filesystem/OUTPUT
using configs in /etc/mrjob.conf
creating new scratch bucket mrjob-f02b7cd37b2bfffd
using s3://mrjob-f02b7cd37b2bfffd/tmp/ as our scratch dir on S3
creating tmp directory /home/tmp/mr_job_1.root.20131216.152251.298419
writing master bootstrap script to /home/tmp/mr_job_1.root.20131216.152251.298419/b.py
creating S3 bucket 'mrjob-f02b7cd37b2bfffd' to use as scratch space
Copying non-input files into s3://mrjob-f02b7cd37b2bfffd/tmp/mr_job_1.root.20131216.152251.298419/files/
Traceback (most recent call last):
  File "/home/bigdata/workers/process_data/mr_job_1.py", line 178, in <module>
    MRSwapData().run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 494, in run
    mr_job.execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 512, in execute
    super(MRJob, self).execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 147, in execute
    self.run_job()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 208, in run_job
    runner.run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/runner.py", line 458, in run
    self._run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 806, in _run
    self._prepare_for_launch()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 817, in _prepare_for_launch
    self._upload_local_files_to_s3()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 905, in _upload_local_files_to_s3
    s3_key.set_contents_from_filename(path)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1290, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1221, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 713, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 889, in _send_file_internal
    query_args=query_args
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 547, in make_request
    retry_handler=retry_handler
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 947, in make_request
    retry_handler=retry_handler)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 908, in _mexe
    raise e
socket.error: [Errno 104] Connection reset by peer
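
One possible lead, not a confirmed diagnosis: the traceback ends in boto's `set_contents_from_filename`, which sends the whole file in a single PUT request, and S3 caps a single PUT at about 5 GB. A 7GB input would be over that cap and would need a multipart upload instead. The sketch below only checks a local file's size against that limit. The names `SINGLE_PUT_LIMIT_BYTES`, `exceeds_single_put_limit`, and `needs_multipart` are hypothetical helpers, not part of mrjob or boto, and the exact byte value (5 GiB) is an assumption.

```python
import os

# Assumed value of the S3 single-PUT cap (documented as "5 GB"; taken here as 5 GiB).
SINGLE_PUT_LIMIT_BYTES = 5 * 1024 ** 3


def exceeds_single_put_limit(size_bytes, limit=SINGLE_PUT_LIMIT_BYTES):
    """Return True if an object of this size is too large for one PUT request."""
    return size_bytes > limit


def needs_multipart(path):
    """Hypothetical helper: compare a local file's size with the single-PUT cap."""
    return exceeds_single_put_limit(os.path.getsize(path))
```

Running `needs_multipart("/home/filesystem/INPUT")` on the machine that launches the job would show whether the input is over the cap. If it is, the reset is more likely caused by the upload method than by the credentials or the network.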